# Code duplication

> Detect copy-pasted code blocks across your TypeScript and JavaScript codebase. Duplication detection is built into the same binary as dead-code analysis, and its suffix-array detection ran 8-26x faster than jscpd in the benchmarks below.

`fallow dupes` finds duplicated code blocks across your entire codebase. On the real-world projects benchmarked below, it runs 8-26x faster than jscpd.

```bash theme={null}
fallow dupes
```

## Why built-in duplication matters

Most dead-code analysis tools stop at finding unused exports and unreachable files. Fallow goes further: it includes duplication detection in the same binary, using the same module graph. This means you can cross-reference dead code with duplication in a single pass.

When you run `fallow dead-code --include-dupes`, fallow identifies code blocks that are both duplicated and unused. These are the highest-value cleanup targets: removing them eliminates dead code and reduces duplication simultaneously.

Running duplication analysis alongside dead-code detection also means:

* **One tool, one config, one CI step**: No need to install and configure a separate duplication detector
* **Shared file discovery**: The same ignore patterns, entry points, and workspace config apply to both analyses
* **Cross-analysis insights**: Clone families that span unused files are flagged as combined findings
* **Consistent output formats**: JSON, SARIF, markdown, compact, and CodeClimate output work the same way for duplication as for dead code

## Detection modes

<Tabs>
  <Tab title="Strict">
    Exact token-for-token clones only. No normalization is applied; the token sequences must be identical, including identifier names and literal values.

    ```bash theme={null}
    fallow dupes --mode strict
    ```

    Best for finding exact copy-paste where nothing was changed.
  </Tab>

  <Tab title="Mild (default)">
    The default mode. Strict and mild modes produce identical results, because AST-based tokenization already strips whitespace and comments before comparison.

    ```bash theme={null}
    fallow dupes --mode mild
    ```

    The recommended starting point for general-purpose detection.
  </Tab>

  <Tab title="Weak">
    Matches clones even when string literal values differ. Variable names must still match.

    ```bash theme={null}
    fallow dupes --mode weak
    ```

    Good for finding code that was duplicated and only had strings (URLs, messages, labels) changed.
  </Tab>

  <Tab title="Semantic">
    Catches clones with renamed variables and different literal values. Uses token-type normalization to match structurally equivalent code.

    ```bash theme={null}
    fallow dupes --mode semantic
    ```

    The most aggressive mode. Finds code that was copied and then adapted with new variable names and values.
  </Tab>
</Tabs>

<Tip>
  Start with `mild` mode (the default). Switch to `semantic` when you want to catch clones with renamed variables.
</Tip>
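
As a rough illustration of how the modes differ, the sketch below maps each token to a comparison key per mode. It is not fallow's implementation: the `Token` shape, the `normalize` and `sameClone` helpers, and the choice to treat numbers as literals in semantic mode are all assumptions made for this example.

```typescript
// Hypothetical token shape for illustration; not fallow's internal model.
type Token = { kind: "keyword" | "identifier" | "string" | "number" | "punct"; text: string };
type Mode = "strict" | "mild" | "weak" | "semantic";

// Map a token to the key that is compared in the given mode.
function normalize(token: Token, mode: Mode): string {
  // weak: string literal values are ignored, identifiers must still match.
  if (mode === "weak" && token.kind === "string") return "<string>";
  // semantic: identifiers and literals are reduced to their token type (assumed to include numbers).
  if (mode === "semantic" && token.kind !== "keyword" && token.kind !== "punct") {
    return `<${token.kind}>`;
  }
  // strict and mild: compare the token as-is.
  return `${token.kind}:${token.text}`;
}

function sameClone(a: Token[], b: Token[], mode: Mode): boolean {
  const key = (tokens: Token[]) => JSON.stringify(tokens.map((t) => normalize(t, mode)));
  return key(a) === key(b);
}

// const url = "/api/users"
const original: Token[] = [
  { kind: "keyword", text: "const" },
  { kind: "identifier", text: "url" },
  { kind: "punct", text: "=" },
  { kind: "string", text: '"/api/users"' },
];
// const url = "/api/teams" (only the string changed)
const stringChanged: Token[] = original.map((t) =>
  t.kind === "string" ? { ...t, text: '"/api/teams"' } : t,
);
// const endpoint = "/api/teams" (string changed and variable renamed)
const renamed: Token[] = stringChanged.map((t) =>
  t.kind === "identifier" ? { ...t, text: "endpoint" } : t,
);
```

Under these assumptions, `stringChanged` matches `original` in `weak` and `semantic` mode but not in `strict` or `mild`, and `renamed` matches only in `semantic` mode.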

Here's what typical output looks like:

```bash title="$ fallow dupes" theme={null}
● Duplicates (3 clone groups)

     57 lines  2 instances
    src/components/Calendar/CalendarMonth.stories.tsx:597-653
    src/components/Calendar/CalendarYear.stories.tsx:818-874

     42 lines  3 instances
    src/features/forecasting/server/procedures/analytics.ts:141-181
    src/features/forecasting/server/procedures/cashflow.ts:153-194
    src/features/forecasting/server/procedures/income.ts:590-631

  Identical code blocks detected via suffix-array analysis — https://docs.fallow.tools/explanations/duplication#clone-groups

✓ 27,255 lines (19.4%) duplicated across 398 files (0.23s)
```

In semantic mode, fallow also reports renamed identifiers:

```bash title="$ fallow dupes --mode semantic" theme={null}
● Duplicates (2 clone groups)

    196 lines  2 instances
    src/lib/dutch-holidays.ts:193-388
    src/lib/dutch-holidays.ts:389-584
    Renamed: holidays2024→holidays2025, year2024→year2025

     42 lines  3 instances
    src/features/forecasting/server/procedures/analytics.ts:141-181
    src/features/forecasting/server/procedures/cashflow.ts:153-194
    src/features/forecasting/server/procedures/income.ts:590-631
    Renamed: analyticsData→cashflowData→incomeData

  Identical code blocks detected via suffix-array analysis — https://docs.fallow.tools/explanations/duplication#clone-groups

✗ 94,457 lines (67.2%) duplicated across 775 files (3.74s)
```

## Thresholds and limits

```bash theme={null}
fallow dupes --min-tokens 50   # Minimum tokens per clone (default: 50)
fallow dupes --min-lines 5     # Minimum lines per clone (default: 5)
fallow dupes --threshold 5     # Fail if duplication exceeds 5%
fallow dupes --skip-local      # Only cross-directory duplicates
```
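
The `--threshold` check can be pictured as a simple percentage gate. The sketch below assumes the percentage is duplicated lines divided by total lines, which matches the shape of the summary line in the output above but is not confirmed as fallow's exact formula; `exceedsThreshold` and the sample numbers are hypothetical.

```typescript
// Hypothetical helper: returns true when the run should fail the threshold check.
function exceedsThreshold(
  duplicatedLines: number,
  totalLines: number,
  thresholdPercent: number,
): boolean {
  if (totalLines === 0) return false; // nothing analyzed, nothing to fail on
  const percent = (duplicatedLines * 100) / totalLines;
  return percent > thresholdPercent; // "exceeds" is read as strictly greater than
}

// Illustrative numbers only: 60 duplicated lines out of 1,000 is 6%.
const failsAtFivePercent = exceedsThreshold(60, 1000, 5);
const passesAtTenPercent = !exceedsThreshold(60, 1000, 10);
```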

## Clone families

Clone groups that share the same set of files are combined into **clone families**, each with a refactoring suggestion:

* **Extract function**: clones are in the same file
* **Extract module**: clones span multiple files
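
The grouping rule above can be sketched as follows. The `CloneGroup` and `CloneFamily` shapes and the `toFamilies` helper are assumptions for illustration, not fallow's output schema; the file paths are taken from the sample output earlier on this page.

```typescript
// Hypothetical shapes for illustration only.
type CloneGroup = { lines: number; instances: { file: string; start: number; end: number }[] };
type CloneFamily = {
  files: string[];
  groups: CloneGroup[];
  suggestion: "extract function" | "extract module";
};

function toFamilies(groups: CloneGroup[]): CloneFamily[] {
  const byFileSet = new Map<string, CloneFamily>();
  for (const group of groups) {
    // The family key is the sorted, de-duplicated set of files the group touches.
    const files = Array.from(new Set(group.instances.map((i) => i.file))).sort();
    const key = JSON.stringify(files);
    let family = byFileSet.get(key);
    if (family === undefined) {
      const suggestion = files.length === 1 ? "extract function" : "extract module";
      family = { files, groups: [], suggestion };
      byFileSet.set(key, family);
    }
    family.groups.push(group);
  }
  return Array.from(byFileSet.values());
}

const families = toFamilies([
  {
    lines: 196,
    instances: [
      { file: "src/lib/dutch-holidays.ts", start: 193, end: 388 },
      { file: "src/lib/dutch-holidays.ts", start: 389, end: 584 },
    ],
  },
  {
    lines: 42,
    instances: [
      { file: "src/features/forecasting/server/procedures/analytics.ts", start: 141, end: 181 },
      { file: "src/features/forecasting/server/procedures/cashflow.ts", start: 153, end: 194 },
      { file: "src/features/forecasting/server/procedures/income.ts", start: 590, end: 631 },
    ],
  },
]);
```

In this sketch the first family gets the "extract function" suggestion because both instances live in one file, and the second gets "extract module" because it spans three files.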

## Cross-language detection

Compare TypeScript and JavaScript files by stripping type annotations:

```bash theme={null}
fallow dupes --cross-language
```

Fallow normalizes `.ts` files to their `.js` equivalent for comparison. This catches clones where one copy was converted from TypeScript to JavaScript or vice versa.

## Ignoring imports

Files that share the same imports often produce false-positive clones, especially when a formatter sorts the imports alphabetically. To avoid this, strip `import` declarations from the token stream:

```bash theme={null}
fallow dupes --ignore-imports
```

Or set it permanently in config:

```jsonc theme={null}
{
  "duplicates": {
    "ignoreImports": true
  }
}
```

This option only affects ES `import` statements. CommonJS `require()` calls and re-exports are not filtered.

## Incremental analysis

Check duplication only in files changed since a Git ref:

```bash theme={null}
fallow dupes --changed-since main
```

This is useful in CI for reporting only the duplication that a pull request introduces.

## Baseline comparison

Adopt duplication limits incrementally by saving the current duplication as a baseline and failing only on duplication added after it:

```bash theme={null}
# Save current duplication as baseline
fallow dupes --save-baseline fallow-baselines/dupes.json

# Fail only on new duplication
fallow dupes --baseline fallow-baselines/dupes.json
```

## Debugging

Trace all clones of a specific code location:

```bash theme={null}
fallow dupes --trace src/utils.ts:42
```

## Benchmarks vs jscpd

| Project                                       |  Files |    fallow |  jscpd | Speedup |
| :-------------------------------------------- | -----: | --------: | -----: | ------: |
| [fastify](https://github.com/fastify/fastify) |    286 |  **76ms** |  1.96s | **26x** |
| [vue/core](https://github.com/vuejs/core)     |    522 | **124ms** |  3.11s | **25x** |
| [next.js](https://github.com/vercel/next.js)  | 20,416 | **2.89s** | 24.37s |  **8x** |

Fallow uses a <Tooltip tip="A sorted array of all suffixes of a string, enabling efficient pattern matching without pairwise comparison">suffix array</Tooltip> with <Tooltip tip="Longest Common Prefix array, used alongside suffix arrays to find the longest shared sequences between code blocks">LCP</Tooltip> for clone detection, avoiding quadratic pairwise comparison.
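
To make the idea concrete, here is a minimal sketch of suffix-array clone detection over a token sequence. It is a simplified illustration, not fallow's implementation: the suffix array is built with a plain comparison sort, LCP values are computed by direct comparison rather than a linear-time algorithm, overlapping matches are not handled, and `repeatedRuns`, `commonPrefix`, and `Run` are hypothetical names.

```typescript
type Run = { length: number; positions: [number, number] };

// Length of the shared prefix of the suffixes starting at a and b.
function commonPrefix(seq: number[], a: number, b: number): number {
  let k = 0;
  while (a + k < seq.length && b + k < seq.length && seq[a + k] === seq[b + k]) k++;
  return k;
}

function repeatedRuns(tokens: string[], minTokens: number): Run[] {
  // Intern tokens as integers so comparisons are cheap.
  const ids = new Map<string, number>();
  const seq = tokens.map((t) => {
    if (!ids.has(t)) ids.set(t, ids.size);
    return ids.get(t) ?? 0;
  });

  // Suffix array: all suffix start positions, sorted by the suffix they begin.
  const suffixArray = seq.map((_, i) => i).sort((a, b) => {
    const k = commonPrefix(seq, a, b);
    const x = seq[a + k];
    const y = seq[b + k];
    if (x === undefined) return y === undefined ? 0 : -1; // shorter suffix sorts first
    if (y === undefined) return 1;
    return x - y;
  });

  // Sorting places similar suffixes next to each other, so only neighbours are compared.
  const runs: Run[] = [];
  for (let rank = 1; rank < suffixArray.length; rank++) {
    const p = suffixArray[rank - 1] ?? 0;
    const q = suffixArray[rank] ?? 0;
    const lcp = commonPrefix(seq, p, q); // one entry of the LCP array
    if (lcp < minTokens) continue;
    // Skip matches that extend to the left; a longer match already covers them.
    if (p > 0 && q > 0 && seq[p - 1] === seq[q - 1]) continue;
    runs.push({ length: lcp, positions: [Math.min(p, q), Math.max(p, q)] });
  }
  return runs;
}

const tokens =
  "const x = load ( ) ; log ( x ) ; const x = load ( ) ; save ( x ) ;".split(" ");
const runs = repeatedRuns(tokens, 5);
// runs → [{ length: 7, positions: [0, 12] }]
```

The seven-token run `const x = load ( ) ;` is found at token offsets 0 and 12 without comparing every pair of positions against each other.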

## See also

<CardGroup cols={3}>
  <Card title="CLI: dupes" icon="terminal" href="/cli/dupes">
    Full reference for the `fallow dupes` command and its flags.
  </Card>

  <Card title="Configuration" icon="gear" href="/configuration/overview">
    Set default duplication thresholds and modes in your config file.
  </Card>

  <Card title="Migrating from jscpd" icon="arrow-right" href="/migration/from-jscpd">
    Replace jscpd with fallow in your project.
  </Card>
</CardGroup>
