# jwt-differential-fuzzer

Differential JWT verification harness. Feeds the same
`(token, key, alg-allowlist)` triple into N JWT libraries simultaneously
and surfaces any disagreement in the `valid` field. Disagreements at the
verification boundary are auth-bypass primitives.

A JWT library that accepts a token another major library rejects, given
identical inputs, is either misimplementing the spec or interpreting it
differently than the rest of the ecosystem. Either way, applications that
share tokens across services written in different languages can be split
between accepting and rejecting verifiers, and that asymmetry is
exploitable.

Wycheproof has static test vectors. This harness runs the libraries live,
in matched containers, against a corpus that grows over time.

See [PLAN.md](PLAN.md) for the full architecture writeup. See
[findings/](findings/) for advisories produced by this harness.

## Libraries under test (v1)

| ID        | Library                | Language | Why                                     |
| --------- | ---------------------- | -------- | --------------------------------------- |
| `nodejwt` | `jsonwebtoken` (Auth0) | Node     | ~10M weekly npm downloads               |
| `pyjwt`   | `PyJWT`                | Python   | Historical alg-confusion CVEs           |
| `pyjose`  | `python-jose`          | Python   | Looser parser, CVE-2024-33663 territory |
| `panva`   | `jose` (panva)         | Node     | Most spec-compliant JS lib; oracle      |
| `gojwt`   | `golang-jwt/jwt` v5    | Go       | Used in K8s, Helm, etc.                 |

Each runs as an HTTP server inside a minimal Docker container exposing a
single `POST /verify` endpoint that returns
`{"lib": "...", "valid": bool, "error": "..."}`.
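
To make the endpoint contract concrete, here is a minimal sketch of a target written against the Python standard library only. It is not one of the five real targets: the toy HS256 verifier stands in for a real library call, and the request field names (`token`, `key`, `algs`), the `toyhs256` ID, the empty-string `error` on success, and the port in the usage note are all assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

LIB_ID = "toyhs256"  # hypothetical target ID, not one of the real targets


def b64url_decode(segment: str) -> bytes:
    # JWS segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def verify(token: str, key: str, algs: list) -> None:
    """Toy HS256-only verifier standing in for a real library. Raises on rejection."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    alg = header.get("alg")
    if alg not in algs or alg != "HS256":
        raise ValueError(f"alg not allowed: {alg}")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(key.encode("utf-8"), signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("signature mismatch")


def handle_verify(raw_body: bytes) -> dict:
    """Map every verifier outcome onto the {"lib", "valid", "error"} shape."""
    try:
        req = json.loads(raw_body)
        verify(req["token"], req["key"], req["algs"])  # assumed field names
        return {"lib": LIB_ID, "valid": True, "error": ""}  # "" on success is assumed
    except Exception as exc:  # hostile input must become a verdict, not a crash
        return {"lib": LIB_ID, "valid": False, "error": f"{type(exc).__name__}: {exc}"}


class VerifyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/verify":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = json.dumps(handle_verify(self.rfile.read(length))).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int) -> None:
    """Blocking entry point; the port is chosen by the caller."""
    HTTPServer(("0.0.0.0", port), VerifyHandler).serve_forever()
```

Calling `serve(7099)` (an arbitrary port outside the documented 7001-7005 range) would expose the toy target. Catching every exception inside `handle_verify` is deliberate in this sketch: a target that crashes on malformed input produces no verdict to compare.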

## Architecture

```
                  +------------------------------+
                  |    orchestrator/differ.py    |
                  | (corpus -> fanout -> compare)|
                  +---------------+--------------+
                                 |
                                 | HTTP /verify (parallel fanout)
                                 v
+----------+  +----------+  +----------+  +----------+  +----------+
| nodejwt  |  |  pyjwt   |  |  pyjose  |  |  panva   |  |  gojwt   |
|  :7001   |  |  :7002   |  |  :7003   |  |  :7004   |  |  :7005   |
+----------+  +----------+  +----------+  +----------+  +----------+
                                 |
                                 v
               +-----------------------------------+
               |            BYPASS rows            |
               |     (libs disagree on valid)      |
               +-----------------------------------+
```

The orchestrator submits every corpus case to every running target in
parallel, then collapses the responses by `valid`. If the set of "accept"
verifiers and the set of "reject" verifiers are both non-empty, the row
is a BYPASS-class disagreement. Errors are bucketed (not literal-string
compared) so different wording across libs doesn't cause false positives.
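
The fan-out and comparison described above can be sketched in a few functions. This is an illustration under stated assumptions, not the code in `orchestrator/differ.py`: the function names, the `targets` mapping of library ID to URL, and especially the regex bucket rules are hypothetical, since the harness's real buckets are not documented here.

```python
import json
import re
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Hypothetical bucket rules; the first matching pattern wins.
ERROR_BUCKETS = [
    ("alg", re.compile(r"alg", re.I)),
    ("signature", re.compile(r"signature|verif", re.I)),
    ("crit", re.compile(r"crit", re.I)),
    ("parse", re.compile(r"pars|decod|malformed|segment|json", re.I)),
]


def bucket_error(message: str) -> str:
    """Collapse library-specific wording into a coarse error class."""
    for name, pattern in ERROR_BUCKETS:
        if pattern.search(message or ""):
            return name
    return "other"


def post_json(url: str, case: dict) -> dict:
    """POST one case to a target and return its parsed JSON verdict."""
    req = urllib.request.Request(
        url,
        data=json.dumps(case).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)


def fan_out(case: dict, targets: dict, post=post_json) -> list:
    """Send one case to every target concurrently; `targets` maps lib ID to URL."""
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        futures = {lib: pool.submit(post, url, case) for lib, url in targets.items()}
        return [futures[lib].result() for lib in targets]


def classify(responses: list) -> dict:
    """Split verdicts by `valid`; a row is a BYPASS when both sides are non-empty."""
    accept = sorted(r["lib"] for r in responses if r["valid"])
    reject = sorted(r["lib"] for r in responses if not r["valid"])
    buckets = {
        r["lib"]: bucket_error(r.get("error", ""))
        for r in responses
        if not r["valid"]
    }
    return {
        "accept": accept,
        "reject": reject,
        "error_buckets": buckets,
        "bypass": bool(accept) and bool(reject),
    }
```

Note that in this sketch only `valid` decides whether a row is a BYPASS; the error buckets are kept alongside the verdict so that two rejecting libraries with different wording are not reported as disagreeing.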

## Test corpus

`corpus/seed.json` ships with baseline positive controls (RS256, HS256,
ES256 happy paths) plus a growing set of bug-class cases:

- **alg confusion** - HS256 token signed against the RSA public key
- **kid injection** - SQL-i/path traversal patterns in `kid`
- **jku spoof** - external `jku` URL pointing at attacker-controlled JWKS
- **crit handling** - RFC 7515 §4.1.11 critical-header enforcement
- **JWE/JWS confusion** - JWE token sent into a JWS verifier
- **ECDSA edge cases** - r/s of zero, n, n-1
- **header JSON quirks** - duplicate keys, NUL bytes, BOM, unicode
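
As an illustration of the first bug class, the sketch below forges an alg-confusion token: the header claims `HS256`, and the HMAC secret is the raw bytes of the RSA public key PEM that the verifier was configured with for RS256. The case dictionary layout (`id`, `token`, `key`, `algs`, `expect_valid`) is a hypothetical schema, not the actual format of `corpus/seed.json`, and the PEM is a placeholder string rather than a real key; HMAC treats it as opaque bytes either way.

```python
import base64
import hashlib
import hmac
import json

# Placeholder only: not a parseable RSA key.
PLACEHOLDER_PUBLIC_PEM = (
    b"-----BEGIN PUBLIC KEY-----\n"
    b"PLACEHOLDER\n"
    b"-----END PUBLIC KEY-----\n"
)


def b64url(data: bytes) -> str:
    """base64url without padding, as used for JWS segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def forge_hs256_with_public_key(public_key_pem: bytes, claims: dict) -> str:
    """Sign with HMAC-SHA256, using the public key bytes as the shared secret."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = (
        b64url(json.dumps(header, separators=(",", ":")).encode("utf-8"))
        + "."
        + b64url(json.dumps(claims, separators=(",", ":")).encode("utf-8"))
    )
    signature = hmac.new(
        public_key_pem, signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return signing_input + "." + b64url(signature)


def alg_confusion_case(public_key_pem: bytes, claims: dict) -> dict:
    """Build one corpus case (hypothetical schema)."""
    return {
        "id": "alg-confusion-hs256-rsa-pub",
        "token": forge_hs256_with_public_key(public_key_pem, claims),
        "key": public_key_pem.decode("ascii"),
        "algs": ["RS256"],  # the verifier is told to expect RS256 only
        "expect_valid": False,  # a verifier honoring the allowlist rejects this
    }
```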

`scripts/build_corpus.py` can extend the corpus from generators.
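
A generator for the ECDSA edge-case class might look like the sketch below. It is not taken from `scripts/build_corpus.py`; the function names and case ID format are hypothetical. It relies on two facts: a JWS ES256 signature is the fixed-width concatenation R || S of two 32-byte big-endian integers (RFC 7518 §3.4), and valid ECDSA requires both r and s to lie in the range 1 to n-1, where n is the order of the P-256 group.

```python
import base64

# Order of the NIST P-256 group, the curve used by ES256.
P256_N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551


def b64url(data: bytes) -> str:
    """base64url without padding, as used for JWS segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def es256_signature(r: int, s: int) -> str:
    """Encode (r, s) as the 64-byte R || S form that JWS uses for ES256."""
    return b64url(r.to_bytes(32, "big") + s.to_bytes(32, "big"))


def ecdsa_edge_tokens(signing_input: str):
    """Yield (case_id, token) for every r/s pairing of zero, n, and n-1.

    `signing_input` is an already-encoded "header.payload" string whose
    header declares ES256. Zero and n are out of range and must fail the
    range check; n-1 is in range, so a verifier has to reject it through
    the signature equation itself.
    """
    values = {"zero": 0, "n": P256_N, "n-1": P256_N - 1}
    for r_name, r in values.items():
        for s_name, s in values.items():
            case_id = f"ecdsa-r-{r_name}-s-{s_name}"
            yield case_id, signing_input + "." + es256_signature(r, s)
```

Every token produced this way carries a signature that no key should validate, so any target returning `valid: true` for one of the nine cases is worth a closer look.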

## Running

```bash
git clone https://github.com/zionboggan/jwt-differential-fuzzer
cd jwt-differential-fuzzer

scripts/up.sh
python3 orchestrator/differ.py --corpus corpus/seed.json
```

`scripts/up.sh` brings the five targets up via Docker Compose; the
orchestrator prints one row per case with the per-library verdict and flags
any BYPASS-class disagreements. `scripts/down.sh` tears the targets down.

For environments without Docker, `scripts/up_native.sh` runs each target
natively against a managed Python venv / npm install / go build under
`.native/`.

Single case:

```bash
python3 orchestrator/differ.py --corpus corpus/seed.json --only crit-crit-eca
```

Run against a subset of targets:

```bash
python3 orchestrator/differ.py --corpus corpus/seed.json --targets nodejwt,panva
```

## Findings

Each disagreement that reproduces with a working spec citation gets a
write-up in [findings/](findings/) and a coordinated disclosure attempt
upstream. The `findings/` directory is the audit trail of confirmed
issues, with PoC code, sister-advisory comparisons, and a disclosure
timeline section.

Filing follows responsible disclosure norms:

1. Confirm the disagreement is reproducible against the latest released
   version of each affected library.
2. Confirm a spec citation that picks a winner (i.e., the RFC says X,
   library Y does not implement X).
3. File a GitHub Security Advisory at the affected repository.
4. Request a CVE via the repository's CNA or MITRE.
5. Wait for the upstream patch or the embargo window expiration before
   broadening publication.

The advisories currently in `findings/` are at the public-disclosure stage;
their sister advisories for other libraries already have CVEs assigned.

## License

MIT. See [LICENSE](LICENSE).