Complexity metrics
These metrics appear in fallow health output, both per-function (findings) and per-file (file scores).
Cyclomatic complexity
The number of linearly independent paths through a function’s control flow graph. For structured programs (like JS/TS), this simplifies to 1 + the number of decision points (if/else, switch cases, loops, ternary ?:, logical &&/||, catch).
| Range | Interpretation | Action |
|---|---|---|
| 1–10 | Simple, easy to test exhaustively | No action needed |
| 11–20 | Moderate, most functions should stay here | Review if growing |
| 21–50 | High, hard to test all paths | Split into smaller functions |
| 50+ | Very high, testing all paths is impractical | Refactor urgently |
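As an illustration of how decision points add up (the function and annotations are ours, not fallow output), each counted construct adds 1 to a base of 1:

```typescript
// Illustrative only: each marked construct adds 1 to the base complexity of 1.
function shippingCost(weight: number, express: boolean, country: string): number {
  let cost = weight > 10 ? 20 : 10;           // +1 (ternary)
  if (express) {                              // +1 (if)
    cost *= 2;
  }
  if (country === "US" || country === "CA") { // +1 (if), +1 (||)
    cost -= 5;
  }
  return cost;                                // cyclomatic complexity = 5
}

console.log(shippingCost(12, true, "US")); // 35
```

Five independent paths means at least five test cases to cover every branch.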
Research
Introduced by Thomas McCabe in “A Complexity Measure” (1976). The practical upper bound of 20 comes from NIST SP 500-235 (Watson & McCabe, 1996).
Cognitive complexity
How hard a function is to understand when reading top-to-bottom. Unlike cyclomatic complexity (which counts paths for testing), cognitive complexity measures comprehension difficulty. It penalizes nesting depth, break/continue to labels, recursion, and sequences of logical operators.
| Range | Interpretation | Action |
|---|---|---|
| 0–7 | Easy to understand at a glance | No action needed |
| 8–15 | Moderate, may benefit from extraction | Extract helpers if growing |
| 15+ | Hard to follow, likely mixing concerns | Refactor: extract, simplify, or decompose |
Research
Developed by G. Ann Campbell at SonarSource. Full specification: SonarSource white paper.
Complexity density
Total cyclomatic complexity divided by lines of code. Normalizing by file size lets you compare fairly: a 500-line file with complexity 50 (density 0.1) vs. a 10-line file with complexity 10 (density 1.0).
| Value | Interpretation | Action |
|---|---|---|
| < 0.3 | Low, data files, configs, simple modules | No action needed |
| 0.3–1.0 | Moderate, normal application code | Review the densest functions |
| > 1.0 | Very dense, nearly every line is a decision point | Refactor: extract logic, simplify conditions |
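The arithmetic behind the comparison above can be sketched as (names are ours):

```typescript
// Sketch: complexity density = total cyclomatic complexity / lines of code.
const complexityDensity = (totalComplexity: number, loc: number): number =>
  totalComplexity / loc;

console.log(complexityDensity(50, 500)); // 0.1 — large but sparse file
console.log(complexityDensity(10, 10));  // 1 — every line is a decision point
```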
Cyclomatic vs cognitive: a code example
These two functions illustrate why both metrics matter: the switch has higher cyclomatic complexity (more paths to test) but is trivially readable, while the nested if chain has lower cyclomatic complexity but is harder to follow. Cognitive complexity captures that difference.
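An illustrative pair (our own functions, with approximate counts, not fallow output):

```typescript
// Flat switch: higher cyclomatic complexity (one path per case), low cognitive load.
// Approximate counts: cyclomatic 5, cognitive 1.
function httpCategory(status: number): string {
  switch (Math.floor(status / 100)) {
    case 2: return "success";
    case 3: return "redirect";
    case 4: return "client error";
    case 5: return "server error";
    default: return "unknown";
  }
}

// Nested ifs: fewer paths, but each nesting level raises cognitive complexity.
// Approximate counts: cyclomatic 4, cognitive 6.
function loyaltyDiscount(member: boolean, years: number, orders: number): number {
  if (member) {          // cyclomatic +1, cognitive +1
    if (years > 5) {     // cyclomatic +1, cognitive +2 (nested)
      if (orders > 10) { // cyclomatic +1, cognitive +3 (nested deeper)
        return 0.2;
      }
      return 0.1;
    }
  }
  return 0;
}

console.log(httpCategory(404));            // client error
console.log(loyaltyDiscount(true, 6, 11)); // 0.2
```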
File health scores
Per-file scores from fallow health --file-scores. Zero-function files (barrel files, re-export files) are excluded.
Maintainability Index (MI)
Fallow uses a simplified Maintainability Index adapted for JavaScript/TypeScript module graphs.
| Range | Interpretation | Action |
|---|---|---|
| 70–100 | Good maintainability | Monitor |
| 40–70 | Moderate | Review periodically, address if declining |
| 0–40 | Poor | Prioritize for refactoring |
fallow health --file-scores --top 5
How fallow's MI relates to the classic Maintainability Index
The original MI from Oman and Hagemeister (1992) combines Halstead Volume, cyclomatic complexity, lines of code, and comment percentage. Microsoft adapted it to a 0–100 scale for Visual Studio. Fallow’s variant makes three changes for the JS/TS ecosystem:
- Replaces Halstead Volume with complexity density. Halstead metrics require full type resolution, which fallow doesn’t have (syntactic analysis only). Complexity density captures the same “how complex per unit of code” signal.
- Adds dead code ratio. Unused exports are a JS/TS-specific maintenance burden that the classic MI doesn’t account for.
- Adds fan-out coupling penalty. High import counts are a strong signal of maintenance difficulty in module-based codebases. The penalty uses logarithmic scaling (ln(fan_out + 1) × 4, capped at 15 points), reflecting diminishing marginal risk per additional import.
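The stated penalty formula can be transcribed directly (our own sketch of it):

```typescript
// Sketch of the stated fan-out penalty: ln(fan_out + 1) × 4, capped at 15 points.
function fanOutPenalty(fanOut: number): number {
  return Math.min(15, Math.log(fanOut + 1) * 4);
}

console.log(fanOutPenalty(5).toFixed(2));   // 7.17 — moderate coupling
console.log(fanOutPenalty(100).toFixed(2)); // 15.00 — capped
```

The logarithm means going from 5 to 10 imports costs more than going from 50 to 55, matching the diminishing-risk rationale above.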
Fan-in, fan-out, and dead code ratio
Fan-in: number of files that import this file. High fan-in = high blast radius when changed.
Fan-out: number of files this file imports. High fan-out = high coupling and change propagation risk.
Dead code ratio: fraction of value exports with zero references (0–1). Type-only exports (interfaces, type aliases) are excluded.
| Metric | Signal | Action |
|---|---|---|
| Fan-in > 20 | Many dependents, critical file | Extra review before changes |
| Fan-out > 15 | Many imports, high coupling | Consider splitting into smaller modules |
| Dead code = 0 | All exports used | No action needed |
| Dead code > 0.3 | Significant unused code | Run fallow fix to auto-remove |
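A sketch of the dead code ratio as defined above (the ExportInfo shape is our own, hypothetical):

```typescript
// Sketch: dead code ratio = unreferenced value exports / total value exports.
interface ExportInfo { name: string; references: number; typeOnly: boolean }

function deadCodeRatio(exports: ExportInfo[]): number {
  const values = exports.filter(e => !e.typeOnly); // type-only exports excluded
  if (values.length === 0) return 0;
  return values.filter(e => e.references === 0).length / values.length;
}

console.log(deadCodeRatio([
  { name: "parse", references: 3, typeOnly: false },
  { name: "legacyParse", references: 0, typeOnly: false },
  { name: "Options", references: 5, typeOnly: true }, // ignored: type-only
])); // 0.5
```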
Research: fan-in/fan-out
Fan-in/fan-out coupling metrics originate from Henry and Kafura (1981). Later refined in the Chidamber & Kemerer OO metrics suite (1994), where Coupling Between Objects (CBO) captures the same idea for class-level dependencies.
Hotspot metrics
Hotspots are files that are both complex and frequently changing. Bugs concentrate at this intersection, and refactoring here yields the highest return. Available via fallow health --hotspots.
The core insight: Complex code that never changes is stable (leave it alone). Simple code that changes often is fine (easy to modify). But complex code that changes often is where defects concentrate.
Hotspot score
| Score | Interpretation | Action |
|---|---|---|
| 70–100 | Critical hotspot | Prioritize refactoring in next sprint |
| 40–70 | Moderate hotspot | Schedule review, reduce complexity |
| 0–40 | Low risk | Monitor |
fallow health --hotspots --top 3
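Fallow's exact scoring formula is not given here; one common way to model the churn × complexity intersection is a product of normalized factors, sketched below (entirely hypothetical, not fallow's implementation):

```typescript
// Hypothetical model: normalized complexity × normalized weighted churn, scaled 0–100.
function hotspotScore(complexity: number, maxComplexity: number,
                      weightedCommits: number, maxWeightedCommits: number): number {
  const c = maxComplexity > 0 ? complexity / maxComplexity : 0;
  const w = maxWeightedCommits > 0 ? weightedCommits / maxWeightedCommits : 0;
  return Math.round(c * w * 100);
}

console.log(hotspotScore(80, 100, 40, 50)); // 64 — complex and hot
console.log(hotspotScore(80, 100, 0, 50));  // 0 — complex but never changing
```

A multiplicative model captures the core insight: either factor near zero drives the score to zero, so only the intersection ranks high.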
Weighted commits
Recency-weighted commit count using exponential decay with a 90-day half-life:
- A commit yesterday contributes ~1.0
- A commit 90 days ago contributes ~0.5
- A commit 180 days ago contributes ~0.25
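The decay above follows the standard half-life form; as a sketch (function name is ours):

```typescript
// Sketch: weight = 0.5^(ageDays / 90), i.e. a commit's weight halves every 90 days.
const commitWeight = (ageDays: number): number => Math.pow(0.5, ageDays / 90);

console.log(commitWeight(0));   // 1
console.log(commitWeight(90));  // 0.5
console.log(commitWeight(180)); // 0.25
```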
Research
Recency-weighted change metrics were introduced by Graves et al. (2000). The empirical link between churn and defect density was established by Nagappan and Ball (2005) at Microsoft.
Churn trend
Compares commit frequency in the first half vs second half of the analysis window:
| Trend | Meaning | Action |
|---|---|---|
| accelerating | Recent > 1.5× older half | High priority if complexity is also high |
| stable | Balanced frequency | Normal, monitor score |
| cooling | Recent < 0.67× older half | Lower priority, stabilizing |
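The thresholds in the table translate directly into a classifier (our own sketch, using raw commit counts per half-window):

```typescript
// Sketch of the trend classification: compare recent-half vs older-half commit counts.
function churnTrend(olderHalf: number, recentHalf: number): "accelerating" | "stable" | "cooling" {
  if (recentHalf > 1.5 * olderHalf) return "accelerating";
  if (recentHalf < 0.67 * olderHalf) return "cooling";
  return "stable";
}

console.log(churnTrend(10, 20)); // accelerating
console.log(churnTrend(10, 5));  // cooling
console.log(churnTrend(10, 12)); // stable
```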
Research: churn × complexity methodology
Pioneered by Michael Feathers and systematized by Adam Tornhill in Your Code as a Crime Scene (2015) and Software Design X-Rays (2018).
Duplication metrics
From fallow dupes. Fallow uses token-based clone detection with configurable normalization.
Detection modes
| Mode | Clone type | What is normalized |
|---|---|---|
| strict | Type-1 | Nothing, exact token match |
| mild | Type-2 | Identifiers abstracted |
| weak | Type-2+ | Identifiers + literals abstracted |
| semantic | Type-2+ | Identifiers + literals + types stripped |
fallow dupes --mode mild
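To make the normalization column concrete, here is a hypothetical sketch of Type-2-style abstraction (not fallow's implementation; the keyword set is truncated for brevity):

```typescript
// Hypothetical sketch: identifiers and literals become placeholders so
// structurally identical fragments compare equal under "weak" mode.
const KEYWORDS = new Set(["const", "let", "function", "return", "if", "else"]);

function normalizeTokens(tokens: string[]): string[] {
  const isIdentifier = (t: string) => /^[A-Za-z_$][\w$]*$/.test(t) && !KEYWORDS.has(t);
  const isLiteral = (t: string) => /^(\d|["'`])/.test(t);
  return tokens.map(t => (isIdentifier(t) ? "ID" : isLiteral(t) ? "LIT" : t));
}

// Two renamed-but-identical statements normalize to the same sequence:
console.log(normalizeTokens(["const", "total", "=", "42", ";"]).join(" ")); // const ID = LIT ;
console.log(normalizeTokens(["const", "sum", "=", "7", ";"]).join(" "));    // const ID = LIT ;
```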
Key metrics
Duplication %: fraction of total source tokens in clone groups. Computed over the full file set before --top truncation.
Token / line count: per-group size. Tokens are language-aware (keywords, identifiers, operators). Larger clones have higher refactoring value.
Clone groups: a set of 2+ code fragments with identical normalized token sequences at different locations.
Clone families: multiple clone groups sharing the same files, indicating systematic duplication (e.g., copy-pasted modules). Families suggest extract-module refactoring rather than per-function extraction.
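Grouping by normalized token sequence, as described above, can be sketched like this (the Fragment shape is our own, hypothetical):

```typescript
// Sketch: clone groups = fragments keyed by their normalized token sequence,
// keeping only keys with 2+ fragments.
type Fragment = { file: string; line: number; normalized: string };

function cloneGroups(fragments: Fragment[]): Fragment[][] {
  const byKey = new Map<string, Fragment[]>();
  for (const f of fragments) {
    const group = byKey.get(f.normalized) ?? [];
    group.push(f);
    byKey.set(f.normalized, group);
  }
  // A group needs at least two fragments to count as a clone.
  return [...byKey.values()].filter(g => g.length >= 2);
}

const groups = cloneGroups([
  { file: "a.ts", line: 1, normalized: "const ID = LIT ;" },
  { file: "b.ts", line: 9, normalized: "const ID = LIT ;" },
  { file: "c.ts", line: 4, normalized: "if ( ID ) return ;" },
]);
console.log(groups.length); // 1
```

Families would then cluster groups whose fragments repeatedly co-occur in the same files.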
Research: clone detection
Clone taxonomy surveyed in Roy, Cordy, and Koschke (2009). Token-based detection pioneered by Baker (1995).
Limitations
- Generated files (GraphQL codegen, Prisma client, OpenAPI types) may have high complexity density but shouldn’t be refactored. Use health.ignorepatterns to exclude them.
- Test files with many it()/test() blocks can have high cyclomatic complexity. This is usually fine: test suites should cover many paths.
- Config files (Webpack, ESLint) may appear as hotspots due to frequent changes during setup. Their churn typically cools after initial configuration.
- MI scores are not portable. Fallow’s formula differs from Visual Studio and SonarQube, so scores are not directly comparable across tools.
- Hotspot scores are project-relative. A score of 80 means “worst in your project,” not “objectively bad.” A well-maintained project may have a top score of 30.
- Fan-in/out counts modules, not imports. import { a, b, c } from './utils' counts as 1 edge, not 3.
JSON _meta object
When --explain is passed (or via MCP), each command’s JSON output includes a _meta object: