# jwt-differential-fuzzer
 
Differential JWT verification harness. It feeds the same
`(token, key, alg-allowlist)` triple into N JWT libraries in parallel
and surfaces any disagreement in the `valid` field. A disagreement at the
verification boundary is a potential auth-bypass primitive.
 
A JWT library that accepts a token another major library rejects, given
identical inputs, is either misimplementing the spec or interpreting it
differently than the rest of the ecosystem. Either way, applications that
share tokens across services written in different languages can be split
between accepting and rejecting verifiers, and that asymmetry is
exploitable.
 
Project Wycheproof provides static test vectors. This harness instead runs
the libraries live, in matched containers, against a corpus that grows over time.
 
See [PLAN.md](PLAN.md) for the full architecture writeup. See
[findings/](findings/) for advisories produced by this harness.
 
## Libraries under test (v1)
 
| ID         | Library                          | Language | Why                            |
| ---------- | -------------------------------- | -------- | ------------------------------ |
| `nodejwt`  | `jsonwebtoken` (Auth0)           | Node     | ~10M weekly npm downloads      |
| `pyjwt`    | `PyJWT`                          | Python   | Historical alg-confusion CVEs  |
| `pyjose`   | `python-jose`                    | Python   | Looser parser, CVE-2024-33663 territory |
| `panva`    | `jose` (panva)                   | Node     | Most spec-compliant JS lib; oracle |
| `gojwt`    | `golang-jwt/jwt` v5              | Go       | Used in K8s, Helm, etc.        |
 
Each runs as an HTTP server inside a minimal Docker container exposing a
single `POST /verify` endpoint that returns
`{"lib": "...", "valid": bool, "error": "..."}`.
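
As a rough illustration of this contract, a client call might look like the sketch below. The response shape comes from the description above; the request field names (`token`, `key`, `algs`), the `Verdict` type, and the helper names are assumptions made for illustration, not the harness's actual schema.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Verdict:
    lib: str
    valid: bool
    error: Optional[str]


def parse_verdict(body: bytes) -> Verdict:
    """Parse a /verify response of the form {"lib", "valid", "error"}."""
    doc = json.loads(body)
    valid = doc["valid"]
    if not isinstance(valid, bool):
        # Refuse truthy non-booleans so "valid": "false" is never read as accept.
        raise ValueError(f"non-boolean valid field: {valid!r}")
    # An empty or missing error string is normalized to None.
    return Verdict(lib=doc["lib"], valid=valid, error=doc.get("error") or None)


def verify(base_url: str, token: str, key: str, algs: List[str],
           timeout: float = 5.0) -> Verdict:
    """POST one (token, key, alg-allowlist) triple to a target's /verify endpoint."""
    # Request field names are hypothetical; adjust them to the real target schema.
    payload = json.dumps({"token": token, "key": key, "algs": algs}).encode()
    req = urllib.request.Request(
        base_url.rstrip("/") + "/verify",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_verdict(resp.read())
```

With the targets running, a call such as `verify("http://localhost:7001", token, pem, ["RS256"])` would query one target, assuming it is reachable on localhost at the port shown in the architecture diagram.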
 
## Architecture
 
```
                +------------------------------+
                |    orchestrator/differ.py    |
                | (corpus -> fanout -> compare)|
                +---------------+--------------+
                                |
                                | HTTP /verify (parallel fanout)
                                v
       +----------+ +----------+ +----------+ +----------+ +----------+
       | nodejwt  | | pyjwt    | | pyjose   | | panva    | | gojwt    |
       | :7001    | | :7002    | | :7003    | | :7004    | | :7005    |
       +----------+ +----------+ +----------+ +----------+ +----------+
                                |
                                v
              +-----------------------------------+
              | BYPASS rows                       |
              | (libs disagree on valid)          |
              +-----------------------------------+
```
 
The orchestrator submits every corpus case to every running target in
parallel, then groups the responses by `valid`. If the set of "accept"
verifiers and the set of "reject" verifiers are both non-empty, the row
is a BYPASS-class disagreement. Error messages are bucketed into coarse
categories rather than compared as literal strings, so differences in
wording across libraries do not cause false positives.
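
The fanout-and-compare step described above can be sketched as follows. This is a minimal illustration, not the code in `orchestrator/differ.py`: the bucket names, the `AGREE` label, and the idea of passing each target as a callable are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional, Tuple

# One library's answer: (valid, error_message).
Verdict = Tuple[bool, Optional[str]]

# Hypothetical buckets: lowercase substring -> coarse error class.
ERROR_BUCKETS = [
    ("signature", "bad_signature"),
    ("alg", "bad_algorithm"),
    ("crit", "unsupported_crit"),
    ("expired", "expired"),
]


def bucket_error(message: Optional[str]) -> str:
    """Map a free-form library error message to a coarse bucket."""
    if not message:
        return "none"
    lowered = message.lower()
    for needle, bucket in ERROR_BUCKETS:
        if needle in lowered:
            return bucket
    return "other"


def fan_out(case: dict,
            targets: Dict[str, Callable[[dict], Verdict]]) -> Dict[str, Verdict]:
    """Send one corpus case to every target in parallel and collect verdicts."""
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        futures = {lib: pool.submit(send, case) for lib, send in targets.items()}
        return {lib: fut.result() for lib, fut in futures.items()}


def classify(verdicts: Dict[str, Verdict]) -> Tuple[str, List[str], List[str]]:
    """Return (label, accepting libs, rejecting libs) for one corpus case."""
    accept = sorted(lib for lib, (valid, _) in verdicts.items() if valid)
    reject = sorted(lib for lib, (valid, _) in verdicts.items() if not valid)
    # BYPASS only when both camps are non-empty; unanimous rows are not flagged.
    label = "BYPASS" if accept and reject else "AGREE"
    return label, accept, reject
```

Only the `valid` field decides whether a row is flagged. In this sketch the buckets just let differently worded rejections be treated as the same kind of rejection.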
 
## Test corpus
 
`corpus/seed.json` ships with baseline positive controls (RS256, HS256,
ES256 happy paths) plus a growing set of bug-class cases:
 
- **alg confusion** - HS256 token signed using the RSA public key as the HMAC secret
- **kid injection** - SQL injection and path traversal patterns in `kid`
- **jku spoof** - external `jku` URL pointing at an attacker-controlled JWKS
- **crit handling** - RFC 7515 §4.1.11 critical-header (`crit`) enforcement
- **JWE/JWS confusion** - JWE token sent into a JWS verifier
- **ECDSA edge cases** - `r`/`s` values of 0, n, and n-1, where n is the curve order
- **header JSON quirks** - duplicate keys, NUL bytes, BOM, Unicode
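
To make the first bug class concrete, the sketch below builds an alg-confusion token using only the Python standard library. The function name and the placeholder PEM in the usage note are illustrative assumptions; the corpus may construct its cases differently.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url without padding, as JWS (RFC 7515) requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Inverse of b64url: restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def forge_alg_confusion(public_key_pem: bytes, claims: dict) -> str:
    """Build an HS256 token whose HMAC secret is the RSA public key bytes."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    # The public key is not secret, so anyone can compute this MAC.
    mac = hmac.new(public_key_pem, signing_input.encode("ascii"),
                   hashlib.sha256).digest()
    return signing_input + "." + b64url(mac)
```

For example, `forge_alg_confusion(pem_bytes, {"sub": "admin"})` returns a three-part compact token. A verifier restricted to RS256 should reject it, while one that trusts the header's `alg` and treats the PEM bytes as an HMAC secret would accept it.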
 
`scripts/build_corpus.py` can extend the corpus from generators.
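
A generator of the kind such a script might host is sketched below for the ECDSA edge cases. The case fields (`id`, `token`, `algs`) and the ID naming are hypothetical and do not reflect the actual `corpus/seed.json` schema. The constant is the order of the NIST P-256 group used by ES256.

```python
import base64
from typing import Dict, Iterator

# Order n of the NIST P-256 group (the curve behind ES256).
P256_N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551


def b64url(data: bytes) -> str:
    """Base64url without padding, as JWS (RFC 7515) requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def es256_edge_cases(signing_input: str) -> Iterator[Dict[str, object]]:
    """Yield corpus cases whose ES256 signature uses boundary r/s values."""
    edges = {"zero": 0, "n": P256_N, "n-minus-1": P256_N - 1}
    for r_name, r in edges.items():
        for s_name, s in edges.items():
            # A JWS ES256 signature is r and s as 32-byte big-endian
            # integers, concatenated (64 bytes total), not DER.
            sig = r.to_bytes(32, "big") + s.to_bytes(32, "big")
            yield {
                "id": f"ecdsa-r-{r_name}-s-{s_name}",  # hypothetical naming
                "token": f"{signing_input}.{b64url(sig)}",
                "algs": ["ES256"],
            }
```

Here `signing_input` is the base64url header and payload joined by a dot. The generator yields nine cases, one per `(r, s)` combination. No expected verdict is attached, since the harness compares libraries against each other rather than against a fixed answer.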
 
## Running
 
```bash
git clone https://github.com/zionboggan/jwt-differential-fuzzer
cd jwt-differential-fuzzer
 
scripts/up.sh
python3 orchestrator/differ.py --corpus corpus/seed.json
```
 
`scripts/up.sh` brings the five targets up via Docker Compose. The
orchestrator prints one row per case with the per-library verdict and flags
any BYPASS-class disagreements. `scripts/down.sh` tears the targets down.
 
For environments without Docker, `scripts/up_native.sh` runs each target
natively, using a managed Python venv, npm install, or Go build under
`.native/`.
 
Single case:
 
```bash
python3 orchestrator/differ.py --corpus corpus/seed.json --only crit-crit-eca
```
 
Run against a subset of targets:
 
```bash
python3 orchestrator/differ.py --corpus corpus/seed.json --targets nodejwt,panva
```
 
## Findings
 
Each disagreement that reproduces and is backed by a spec citation gets a
write-up in [findings/](findings/) and a coordinated disclosure attempt
upstream. The `findings/` directory is the audit trail of confirmed
issues; each entry includes PoC code, comparisons with sister advisories,
and a disclosure timeline.
 
Filing follows responsible disclosure norms:
 
1. Confirm the disagreement is reproducible against the latest released
   version of each affected library.
2. Confirm a spec citation that picks a winner (i.e., the RFC says X,
   library Y does not implement X).
3. File a GitHub Security Advisory at the affected repository.
4. Request a CVE via the repository's CNA or MITRE.
5. Wait for the upstream patch or the embargo window expiration before
   broadening publication.
 
The advisories currently in `findings/` have reached the public-disclosure
stage; their sister advisories against other libraries have already been
assigned CVEs.
 
## License
 
MIT. See [LICENSE](LICENSE).
