Complexity metrics
These metrics appear in fallow health output, both per-function (findings) and per-file (file scores).
Cyclomatic complexity
The number of linearly independent paths through a function’s control flow graph. For structured programs (like JS/TS), this simplifies to 1 + the number of decision points (if/else, switch cases, loops, ternary ?:, logical &&/||, catch).
| Range | Interpretation | Action |
|---|---|---|
| 1–10 | Simple, easy to test exhaustively | No action needed |
| 11–20 | Moderate, most functions should stay here | Review if growing |
| 21–50 | High, hard to test all paths | Split into smaller functions |
| 50+ | Very high, testing all paths is impractical | Refactor urgently |
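As an illustration of how decision points add up (the function and annotations are ours, not fallow output), each counted construct adds 1 to a base of 1:

```typescript
// Illustrative only: each marked construct adds 1 to the base complexity of 1.
function shippingCost(weight: number, express: boolean, country: string): number {
  let cost = weight > 10 ? 20 : 10;           // +1 (ternary)
  if (express) {                              // +1 (if)
    cost *= 2;
  }
  if (country === "US" || country === "CA") { // +1 (if), +1 (||)
    cost -= 5;
  }
  return cost;                                // cyclomatic complexity = 5
}

console.log(shippingCost(12, true, "US")); // 35
```

Five independent paths means at least five test cases to cover every branch.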
Research
Introduced by Thomas McCabe in “A Complexity Measure” (1976). The practical upper bound of 20 comes from NIST SP 500-235 (Watson & McCabe, 1996).
Cognitive complexity
How hard a function is to understand when reading top-to-bottom. Unlike cyclomatic complexity (which counts paths for testing), cognitive complexity measures comprehension difficulty. It penalizes nesting depth, break/continue to labels, recursion, and sequences of logical operators.
| Range | Interpretation | Action |
|---|---|---|
| 0–7 | Easy to understand at a glance | No action needed |
| 8–15 | Moderate, may benefit from extraction | Extract helpers if growing |
| 15+ | Hard to follow, likely mixing concerns | Refactor: extract, simplify, or decompose |
Research
Developed by G. Ann Campbell at SonarSource. Full specification: SonarSource white paper.
Complexity density
Total cyclomatic complexity divided by lines of code. Normalizing by file size lets you compare fairly: a 500-line file with complexity 50 (density 0.1) vs. a 10-line file with complexity 10 (density 1.0).
| Value | Interpretation | Action |
|---|---|---|
| < 0.3 | Low, data files, configs, simple modules | No action needed |
| 0.3–1.0 | Moderate, normal application code | Review the densest functions |
| > 1.0 | Very dense, nearly every line is a decision point | Refactor: extract logic, simplify conditions |
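The arithmetic behind the comparison above can be sketched as (names are ours):

```typescript
// Sketch: complexity density = total cyclomatic complexity / lines of code.
const complexityDensity = (totalComplexity: number, loc: number): number =>
  totalComplexity / loc;

console.log(complexityDensity(50, 500)); // 0.1 — large but sparse file
console.log(complexityDensity(10, 10));  // 1 — every line is a decision point
```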
Cyclomatic vs cognitive: a code example
These two functions illustrate why both metrics matter: the switch has higher cyclomatic complexity (more paths to test) but is trivially readable, while the nested if chain has lower cyclomatic complexity but is harder to follow. Cognitive complexity captures that difference.
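An illustrative pair (our own functions, with approximate counts, not fallow output):

```typescript
// Flat switch: higher cyclomatic complexity (one path per case), low cognitive load.
// Approximate counts: cyclomatic 5, cognitive 1.
function httpCategory(status: number): string {
  switch (Math.floor(status / 100)) {
    case 2: return "success";
    case 3: return "redirect";
    case 4: return "client error";
    case 5: return "server error";
    default: return "unknown";
  }
}

// Nested ifs: fewer paths, but each nesting level raises cognitive complexity.
// Approximate counts: cyclomatic 4, cognitive 6.
function loyaltyDiscount(member: boolean, years: number, orders: number): number {
  if (member) {          // cyclomatic +1, cognitive +1
    if (years > 5) {     // cyclomatic +1, cognitive +2 (nested)
      if (orders > 10) { // cyclomatic +1, cognitive +3 (nested deeper)
        return 0.2;
      }
      return 0.1;
    }
  }
  return 0;
}

console.log(httpCategory(404));            // client error
console.log(loyaltyDiscount(true, 6, 11)); // 0.2
```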
File health scores
Per-file scores from fallow health --file-scores. Zero-function files (barrel files, re-export files) are excluded.
Maintainability Index (MI)
Fallow uses a simplified Maintainability Index adapted for JavaScript/TypeScript module graphs.
| Range | Interpretation | Action |
|---|---|---|
| 70–100 | Good maintainability | Monitor |
| 40–70 | Moderate | Review periodically, address if declining |
| 0–40 | Poor | Prioritize for refactoring |
fallow health --file-scores --top 5
How fallow's MI relates to the classic Maintainability Index
The original MI from Oman and Hagemeister (1992) combines Halstead Volume, cyclomatic complexity, lines of code, and comment percentage. Microsoft adapted it to a 0–100 scale for Visual Studio. Fallow’s variant makes three changes for the JS/TS ecosystem:
- Replaces Halstead Volume with complexity density. Halstead metrics require full type resolution, which fallow doesn’t have (syntactic analysis only). Complexity density captures the same “how complex per unit of code” signal.
- Adds dead code ratio. Unused exports are a JS/TS-specific maintenance burden that the classic MI doesn’t account for.
- Adds fan-out coupling penalty. High import counts are a strong signal of maintenance difficulty in module-based codebases. The penalty uses logarithmic scaling (ln(fan_out + 1) × 4, capped at 15 points), reflecting diminishing marginal risk per additional import.
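The stated penalty formula can be transcribed directly (our own sketch of it):

```typescript
// Sketch of the stated fan-out penalty: ln(fan_out + 1) × 4, capped at 15 points.
function fanOutPenalty(fanOut: number): number {
  return Math.min(15, Math.log(fanOut + 1) * 4);
}

console.log(fanOutPenalty(5).toFixed(2));   // 7.17 — moderate coupling
console.log(fanOutPenalty(100).toFixed(2)); // 15.00 — capped
```

The logarithm means going from 5 to 10 imports costs more than going from 50 to 55, matching the diminishing-risk rationale above.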
Fan-in, fan-out, and dead code ratio
Fan-in: number of files that import this file. High fan-in = high blast radius when changed.
Fan-out: number of files this file imports. High fan-out = high coupling and change propagation risk.
Dead code ratio: fraction of value exports with zero references (0–1). Type-only exports (interfaces, type aliases) are excluded.
| Metric | Signal | Action |
|---|---|---|
| Fan-in > 20 | Many dependents, critical file | Extra review before changes |
| Fan-out > 15 | Many imports, high coupling | Consider splitting into smaller modules |
| Dead code = 0 | All exports used | No action needed |
| Dead code > 0.3 | Significant unused code | Run fallow fix to auto-remove |
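A sketch of the dead code ratio as defined above (the ExportInfo shape is our own, hypothetical):

```typescript
// Sketch: dead code ratio = unreferenced value exports / total value exports.
interface ExportInfo { name: string; references: number; typeOnly: boolean }

function deadCodeRatio(exports: ExportInfo[]): number {
  const values = exports.filter(e => !e.typeOnly); // type-only exports excluded
  if (values.length === 0) return 0;
  return values.filter(e => e.references === 0).length / values.length;
}

console.log(deadCodeRatio([
  { name: "parse", references: 3, typeOnly: false },
  { name: "legacyParse", references: 0, typeOnly: false },
  { name: "Options", references: 5, typeOnly: true }, // ignored: type-only
])); // 0.5
```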
Research: fan-in/fan-out
Fan-in/fan-out coupling metrics originate from Henry and Kafura (1981). Later refined in the Chidamber & Kemerer OO metrics suite (1994), where Coupling Between Objects (CBO) captures the same idea for class-level dependencies.
Hotspot metrics
Hotspots are files that are both complex and frequently changing. Bugs concentrate at this intersection, and refactoring here yields the highest return. Available via fallow health --hotspots.
The core insight: Complex code that never changes is stable (leave it alone). Simple code that changes often is fine (easy to modify). But complex code that changes often is where defects concentrate.
Hotspot score
| Score | Interpretation | Action |
|---|---|---|
| 70–100 | Critical hotspot | Prioritize refactoring in next sprint |
| 40–70 | Moderate hotspot | Schedule review, reduce complexity |
| 0–40 | Low risk | Monitor |
fallow health --hotspots --top 3
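Fallow's exact scoring formula is not given here; one common way to model the churn × complexity intersection is a product of normalized factors, sketched below (entirely hypothetical, not fallow's implementation):

```typescript
// Hypothetical model: normalized complexity × normalized weighted churn, scaled 0–100.
function hotspotScore(complexity: number, maxComplexity: number,
                      weightedCommits: number, maxWeightedCommits: number): number {
  const c = maxComplexity > 0 ? complexity / maxComplexity : 0;
  const w = maxWeightedCommits > 0 ? weightedCommits / maxWeightedCommits : 0;
  return Math.round(c * w * 100);
}

console.log(hotspotScore(80, 100, 40, 50)); // 64 — complex and hot
console.log(hotspotScore(80, 100, 0, 50));  // 0 — complex but never changing
```

A multiplicative model captures the core insight: either factor near zero drives the score to zero, so only the intersection ranks high.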
Weighted commits
Recency-weighted commit count using exponential decay with a 90-day half-life:
- A commit yesterday contributes ~1.0
- A commit 90 days ago contributes ~0.5
- A commit 180 days ago contributes ~0.25
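The decay above follows the standard half-life form; as a sketch (function name is ours):

```typescript
// Sketch: weight = 0.5^(ageDays / 90), i.e. a commit's weight halves every 90 days.
const commitWeight = (ageDays: number): number => Math.pow(0.5, ageDays / 90);

console.log(commitWeight(0));   // 1
console.log(commitWeight(90));  // 0.5
console.log(commitWeight(180)); // 0.25
```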
Research
Recency-weighted change metrics were introduced by Graves et al. (2000). The empirical link between churn and defect density was established by Nagappan and Ball (2005) at Microsoft.
Churn trend
Compares commit frequency in the first half vs second half of the analysis window:
| Trend | Meaning | Action |
|---|---|---|
| accelerating | Recent > 1.5× older half | High priority if complexity is also high |
| stable | Balanced frequency | Normal, monitor score |
| cooling | Recent < 0.67× older half | Lower priority, stabilizing |
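The thresholds in the table translate directly into a classifier (our own sketch, using raw commit counts per half-window):

```typescript
// Sketch of the trend classification: compare recent-half vs older-half commit counts.
function churnTrend(olderHalf: number, recentHalf: number): "accelerating" | "stable" | "cooling" {
  if (recentHalf > 1.5 * olderHalf) return "accelerating";
  if (recentHalf < 0.67 * olderHalf) return "cooling";
  return "stable";
}

console.log(churnTrend(10, 20)); // accelerating
console.log(churnTrend(10, 5));  // cooling
console.log(churnTrend(10, 12)); // stable
```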
Research: churn × complexity methodology
Pioneered by Michael Feathers and systematized by Adam Tornhill in Your Code as a Crime Scene (2015) and Software Design X-Rays (2018).
Duplication metrics
From fallow dupes. Fallow uses token-based clone detection with configurable normalization.
Detection modes
| Mode | Clone type | What is normalized |
|---|---|---|
| strict | Type-1 | Nothing, exact token match |
| mild | Type-2 | Identifiers abstracted |
| weak | Type-2+ | Identifiers + literals abstracted |
| semantic | Type-2+ | Identifiers + literals + types stripped |
fallow dupes --mode mild
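To make the normalization column concrete, here is a hypothetical sketch of Type-2-style abstraction (not fallow's implementation; the keyword set is truncated for brevity):

```typescript
// Hypothetical sketch: identifiers and literals become placeholders so
// structurally identical fragments compare equal under "weak" mode.
const KEYWORDS = new Set(["const", "let", "function", "return", "if", "else"]);

function normalizeTokens(tokens: string[]): string[] {
  const isIdentifier = (t: string) => /^[A-Za-z_$][\w$]*$/.test(t) && !KEYWORDS.has(t);
  const isLiteral = (t: string) => /^(\d|["'`])/.test(t);
  return tokens.map(t => (isIdentifier(t) ? "ID" : isLiteral(t) ? "LIT" : t));
}

// Two renamed-but-identical statements normalize to the same sequence:
console.log(normalizeTokens(["const", "total", "=", "42", ";"]).join(" ")); // const ID = LIT ;
console.log(normalizeTokens(["const", "sum", "=", "7", ";"]).join(" "));    // const ID = LIT ;
```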
Key metrics
Duplication %: fraction of total source tokens in clone groups. Computed over the full file set before --top truncation.
Token / line count: per-group size. Tokens are language-aware (keywords, identifiers, operators). Larger clones have higher refactoring value.
Clone groups: a set of 2+ code fragments with identical normalized token sequences at different locations.
Clone families: multiple clone groups sharing the same files, indicating systematic duplication (e.g., copy-pasted modules). Families suggest extract-module refactoring rather than per-function extraction.
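Grouping by normalized token sequence, as described above, can be sketched like this (the Fragment shape is our own, hypothetical):

```typescript
// Sketch: clone groups = fragments keyed by their normalized token sequence,
// keeping only keys with 2+ fragments.
type Fragment = { file: string; line: number; normalized: string };

function cloneGroups(fragments: Fragment[]): Fragment[][] {
  const byKey = new Map<string, Fragment[]>();
  for (const f of fragments) {
    const group = byKey.get(f.normalized) ?? [];
    group.push(f);
    byKey.set(f.normalized, group);
  }
  // A group needs at least two fragments to count as a clone.
  return [...byKey.values()].filter(g => g.length >= 2);
}

const groups = cloneGroups([
  { file: "a.ts", line: 1, normalized: "const ID = LIT ;" },
  { file: "b.ts", line: 9, normalized: "const ID = LIT ;" },
  { file: "c.ts", line: 4, normalized: "if ( ID ) return ;" },
]);
console.log(groups.length); // 1
```

Families would then cluster groups whose fragments repeatedly co-occur in the same files.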
Research: clone detection
Clone taxonomy surveyed in Roy, Cordy, and Koschke (2009). Token-based detection pioneered by Baker (1995).
Limitations
- Generated files (GraphQL codegen, Prisma client, OpenAPI types) may have high complexity density but shouldn’t be refactored. Use health.ignorepatterns to exclude them.
- Test files with many it()/test() blocks can have high cyclomatic complexity. This is usually fine: test suites should cover many paths.
- Config files (Webpack, ESLint) may appear as hotspots due to frequent changes during setup. Their churn typically cools after initial configuration.
- MI scores are not portable. Fallow’s formula differs from Visual Studio and SonarQube, so scores are not directly comparable across tools.
- Hotspot scores are project-relative. A score of 80 means “worst in your project,” not “objectively bad.” A well-maintained project may have a top score of 30.
- Fan-in/out counts modules, not imports. import { a, b, c } from './utils' counts as 1 edge, not 3.
JSON _meta object
When --explain is passed (or via MCP), each command’s JSON output includes a _meta object: