Release v0.4.12: security hardening, cross-language parity, JCS, and PQ KEM mirror

Accumulated maintenance since v0.4.11. Full suite verified green (Rust 157/0,
Python 91/2, both cross-language conformance suites bit-identical):

- XFF source-IP spoofing fix and operator-auth fail-closed gate (Python + Rust)
- JCS (RFC 8785) canonicalization unified across Python and Rust, closing a
  latent non-ASCII cross-language signature divergence on every signed byte
- subtle::ConstantTimeEq replaces the hand-rolled auth-token compare, closing
  an O(1) token-length timing leak
- Rust open_sealed enforces the jurisdiction policy via check_policy, at parity
  with Python
- HW-P256 manifest parse parity in Python (fixes the cross-language interop break)
- Post-quantum ML-KEM-768 KEM mirror (OSGT-HYBRID-v1, Phase A) in Rust,
  byte-identical to Python and proven by cross-language conformance both directions
- CI now runs the full test suite plus cross-language conformance on every PR
- Minimum supported Rust version raised to 1.85 for the ml-kem dependency

See CHANGELOG.md for the full entry.
3d63046   Zion Boggan committed on Jun 17, 2026
.github/workflows/opsec.yml +3 -0
@@ -5,6 +5,9 @@ on:
  push:
    branches: [main]
+permissions:
+  contents: read
+
env:
  FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: true
.github/workflows/source-style.yml +3 -0
@@ -5,6 +5,9 @@ on:
  push:
    branches: [main]
+permissions:
+  contents: read
+
env:
  FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: true
.github/workflows/tests.yml +105 -0
@@ -0,0 +1,105 @@
+name: tests
+
+# Enforces the project's "conformance is ground truth" rule on every change.
+# Three jobs:
+# python -> main()-style scripts + pytest-collectable + (optional) PQ
+# rust -> cargo test --workspace --release
+# conformance -> cross-language bit-identity (needs both sides green first)
+#
+# The main()-style test files predate pytest collection in this repo and run
+# via __main__; pytest only collects test_hw_p256 and test_source_comment_style
+# today. Converting the rest to pytest test_* is tracked separately; until then
+# both runners are invoked here so coverage is actually enforced.
+
+on:
+  pull_request:
+  push:
+    branches: [main]
+
+permissions:
+  contents: read
+
+env:
+  FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: true
+
+jobs:
+  python:
+    runs-on: ubuntu-latest
+    timeout-minutes: 20
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+          cache: pip
+          cache-dependency-path: requirements.txt
+      - name: Install Python dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r requirements.txt pytest
+      - name: Install tkinter (cli.gui imports it at module top)
+        run: sudo apt-get update && sudo apt-get install -y python3-tk
+      - name: Try install liboqs (optional, for PQ suite)
+        run: pip install liboqs-python || echo "::notice::liboqs-python unavailable; test_pq will skip"
+      - name: Run main()-style test scripts
+        run: |
+          rc=0
+          for t in test_e2e test_e2e_v2 test_rekor_unit test_policy_unit test_tlog_unit \
+                   test_registry_unit test_registry_conformance test_rekor_backcompat \
+                   test_text_format_unit test_l3_policy_unit test_siem_unit \
+                   test_gui_hardening_unit; do
+            echo "::group::tests/$t.py"
+            if python "tests/$t.py"; then echo "[OK] $t"; else echo "::error::tests/$t.py failed"; rc=1; fi
+            echo "::endgroup::"
+          done
+          exit $rc
+      - name: Run PQ tests (skipped cleanly if liboqs absent)
+        run: |
+          if python -c "import oqs" 2>/dev/null; then
+            python tests/test_pq.py
+          else
+            echo "::notice::liboqs not installed; skipping test_pq (optional PQ suite)"
+          fi
+      - name: Run pytest-collectable tests
+        run: pytest tests/ -v
+
+  rust:
+    runs-on: ubuntu-latest
+    timeout-minutes: 25
+    defaults:
+      run:
+        working-directory: oversight-rust
+    steps:
+      - uses: actions/checkout@v4
+      - uses: dtolnay/rust-toolchain@stable
+      - uses: Swatinem/rust-cache@v2
+        with:
+          workspaces: oversight-rust
+      - name: cargo test --workspace --release
+        run: cargo test --workspace --release
+
+  conformance:
+    runs-on: ubuntu-latest
+    timeout-minutes: 25
+    needs: [python, rust]
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+          cache: pip
+          cache-dependency-path: requirements.txt
+      - uses: dtolnay/rust-toolchain@stable
+      - uses: Swatinem/rust-cache@v2
+        with:
+          workspaces: oversight-rust
+      - name: Install Python dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r requirements.txt
+      - name: Cross-language conformance (CLASSIC seal/open)
+        working-directory: oversight-rust
+        run: bash tests/conformance_cross_lang.sh
+      - name: Cross-language conformance (Rekor DSSE/PAE)
+        working-directory: oversight-rust
+        run: bash tests/conformance_rekor.sh
.gitignore +1 -0
@@ -60,6 +60,7 @@ NOTES_*.md
*_HANDOFF.md
CODEX_*.md
CLAUDE_*.md
+MAINTAINER_*.md
START_HERE.md
RUNBOOK.md
docs/RUNBOOK.md
CHANGELOG.md +109 -1
@@ -1,6 +1,114 @@
# Oversight CHANGELOG
-## Unreleased
+## v0.4.12 - 2026-06-17 Security hardening, cross-language parity, JCS canonicalization, and the post-quantum KEM mirror
+
+- **Minimum supported Rust version raised to 1.85.** The post-quantum KEM
+ mirror depends on RustCrypto's `ml-kem` crate, which requires Rust 1.85
+ (edition 2024). The workspace `rust-version` is bumped from 1.75 to 1.85
+ to declare this accurately; the workspace itself stays on edition 2021.
+ Consumers building from source need a 1.85+ toolchain.
+
+- **Post-quantum KEM mirror (OSGT-HYBRID-v1), Phase A.** The Rust crypto
+ crate now implements the ML-KEM-768 half of the hybrid DEK wrap,
+ byte-identical to the Python reference. New functions
+ `oversight_crypto::hybrid_wrap_dek`, `hybrid_unwrap_dek`, and
+ `mlkem768_generate_keypair` reproduce the Python construction exactly:
+ IKM is `ss_x || ss_pq || x25519_ephemeral_pub || mlkem_ciphertext`,
+ HKDF-SHA256 with a zero salt and `info=b"oversight-hybrid-v1-dek-wrap"`,
+ then XChaCha20-Poly1305 with `aad=b"oversight-hybrid-dek"`. The envelope
+ JSON shape (`suite`, `x25519_ephemeral_pub`, `mlkem_ciphertext`, `nonce`,
+ `wrapped_dek`, all hex) matches `oversight_core.crypto.hybrid_wrap_dek`,
+ so a Python-wrapped envelope opens in Rust and vice versa. The primitive
+ is RustCrypto's pure-Rust `ml-kem` crate (FIPS 203, no C build, chosen
+ over liboqs because the build environment lacks cmake and the pure-Rust
+ crate satisfies the same NIST-primitives-via-RustCrypto/liboqs rule). Rust keeps its
+ decapsulation keys in the recommended 64-byte seed form; only the
+ 1184-byte public key ever crosses the language boundary, so no deprecated
+ expanded-key import is needed. Six Rust unit tests mirror
+ `tests/test_pq.py` (round trip, JSON shape, tamper-classical,
+ tamper-PQ, wrong recipient, overhead bound), and a new cross-language
+ conformance vector (`tests/conformance_hybrid_kem.py`) proves both
+ directions (PY wrap to RS unwrap, RS wrap to PY unwrap) against a live
+ liboqs, wired into `conformance_cross_lang.sh` as step 5. Scope: KEM
+ only. The container HYBRID seal/open dispatch and ML-DSA manifest
+ signing are deferred (Phase B and Phase C respectively); per the owner's
+ decision, Phase C will use dual Ed25519 AND ML-DSA-65 signatures.
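
Reviewer sketch (not part of the commit): the key-derivation step described in this entry can be illustrated with the Python standard library alone. The IKM ordering and the `info` string come from the entry above; the helper names `hkdf_sha256` and `derive_wrap_key`, the 32-byte output length, and the reading of "zero salt" as 32 zero bytes are assumptions made for illustration, and the inputs are placeholders rather than real key material.

```python
import hashlib
import hmac

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) extract-then-expand with SHA-256."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output too long for HKDF-SHA256")
    # Extract: RFC 5869 treats an absent salt as HASH_LEN zero bytes.
    prk = hmac.new(salt or b"\x00" * HASH_LEN, ikm, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || info || i), concatenated and truncated.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_wrap_key(ss_x: bytes, ss_pq: bytes, eph_pub: bytes, mlkem_ct: bytes) -> bytes:
    """Hypothetical helper: derive the DEK-wrapping key from both shared secrets."""
    # IKM order as stated in the changelog entry.
    ikm = ss_x + ss_pq + eph_pub + mlkem_ct
    return hkdf_sha256(
        ikm,
        salt=b"\x00" * HASH_LEN,               # assumption: "zero salt" = 32 zero bytes
        info=b"oversight-hybrid-v1-dek-wrap",  # from the changelog entry
        length=32,                             # assumption: XChaCha20-Poly1305 key size
    )


# Placeholder inputs sized like X25519 (32-byte secret and public key) and
# ML-KEM-768 (32-byte shared secret, 1088-byte ciphertext).
key = derive_wrap_key(b"\x01" * 32, b"\x02" * 32, b"\x03" * 32, b"\x04" * 1088)
print(key.hex())
```

Binding the ephemeral public key and the ML-KEM ciphertext into the IKM means that changing either one yields a different wrapping key, which is consistent with the tamper-classical and tamper-PQ tests the entry lists.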
+
+- **Suppress liboqs-python import-time stdout.** `liboqs-python` attaches a
+ `StreamHandler(sys.stdout)` at INFO and logs a line when imported, which
+ contaminated stdout for any caller importing `oversight_core.crypto` and
+ broke byte-identity conformance capture. `oversight_core/crypto.py` now
+ redirects stdout during the `import oqs` so the message is dropped.
+ Behavior of the PQ primitives is unchanged.
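
Reviewer sketch (not part of the commit): one way to silence import-time stdout, as this entry describes, is to wrap the import in `contextlib.redirect_stdout`. The helper name `import_quietly` is hypothetical, and the actual code in `oversight_core/crypto.py` may be structured differently.

```python
import contextlib
import importlib
import io


def import_quietly(module_name: str):
    """Import a module while discarding anything it writes to sys.stdout."""
    # sys.stdout is swapped for a throwaway buffer only for the duration
    # of the import, then restored by the context manager.
    with contextlib.redirect_stdout(io.StringIO()):
        return importlib.import_module(module_name)


try:
    oqs = import_quietly("oqs")
except ImportError:
    oqs = None  # the PQ suite is optional; callers check for None
```

`redirect_stdout` only replaces the Python-level `sys.stdout` object, so output written directly to file descriptor 1 by native code would not be captured by this approach.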
+
+- **Rust `open_sealed` now enforces the `jurisdiction` policy.** The
+ `oversight-policy::check_policy` function already implemented
+ jurisdiction matching (and had passing unit tests), but the container
+ open path never called it: `open_sealed` and `open_sealed_with_provider`
+ inlined only the `not_after` / `not_before` time checks and skipped
+ jurisdiction entirely, so the policy crate's enforcement was dead code
+ on the real open path. Python enforces at `container.py:226`
+ (`check_policy` before decryption); Rust now mirrors that single
+ chokepoint. Both Rust open functions now call
+ `oversight_policy::check_policy(&manifest, policy_ctx)` after issuer
+ trust and before DEK unwrap, which also retires the duplicated inline
+ time checks onto the tested canonical path. Observable change: time
+ violations now surface as `ContainerError::Policy` (typed, matching
+ Python's `PolicyViolation`) rather than `ContainerError::Precondition`.
+ Four new container-level tests pin the behavior (mismatch rejected,
+ match allowed, GLOBAL allowed, no-context skip), and the existing
+ `expired_file_rejected` assertion is updated to the typed error.
+ Backward compatibility: GLOBAL manifests (the default) and opens with
+ no `PolicyContext` (the raw-open path used by the cross-language
+ conformance harness) are unaffected. Scope honesty: this closes the
+ open-time gap the review flagged. It does NOT implement the
+ registry-time MUST in SPEC.md §8.4 (`policy.jurisdiction` mismatch
+ causing `/register` rejection), which is unimplemented in BOTH
+ languages and tracked separately. A `--jurisdiction` CLI flag for an
+ opener asserting a non-default region is a deferred follow-up, not
+ required for the gap closure (the CLI's default `GLOBAL` context
+ already rejects a manifest requiring a specific jurisdiction).
+
+- **Constant-time token comparison via `subtle::ConstantTimeEq`.** The
+ Oversight Rust registry's operator-token and DNS-secret comparison was
+ a hand-rolled loop in `oversight-registry/src/auth.rs` with an early
+ return on length mismatch, which trivially leaked whether an attacker's
+ guessed length was correct in O(1). Replaced with
+ `subtle::ConstantTimeEq` (audited RustCrypto crate, already a transitive
+ dependency via `ed25519-dalek` and `chacha20poly1305`, now declared
+ explicitly). Four new registry unit tests pin the correctness matrix
+ (equal inputs across lengths, same-length different content,
+ mismatched-length no-early-return, single-bit difference). The
+ hand-rolled early-return leak is closed; the residual
+ max(supplied, expected) timing observation from `subtle`'s slice impl
+ is documented in the function-level rationale and is acceptable for
+ Oversight's high-entropy operator tokens.
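
Reviewer sketch (not part of the commit): the Rust change uses `subtle::ConstantTimeEq`; the same before-and-after pattern can be shown in Python, where the standard library offers `hmac.compare_digest`. The function names `leaky_equal` and `token_matches` are hypothetical.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Hand-rolled compare with the early return the entry describes."""
    if len(a) != len(b):
        return False  # returns immediately, so timing reveals a length mismatch
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y  # accumulate differences without branching on content
    return diff == 0


def token_matches(supplied: str, expected: str) -> bool:
    """Compare tokens with the library routine instead of a custom loop."""
    return hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))


print(token_matches("s3cret-token", "s3cret-token"))
print(token_matches("s3cret-token", "s3cret-tokem"))
```

The Python documentation notes that `hmac.compare_digest` can still reveal length information when the inputs differ in length, which is comparable to the residual length-dependent timing the entry accepts for high-entropy tokens.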
+
+- **Canonicalization unified on RFC 8785 JCS.** The Python reference now
+ canonicalizes via a vendored `oversight_core.jcs.jcs_dumps` that is
+ byte-exact with the Rust reference's `serde_jcs::to_vec`. Previously
+ Python used `json.dumps(sort_keys=True, separators=(",",":"))` with the
+ default `ensure_ascii=True`, which diverged from Rust for any non-ASCII
+ string value (Python escaped non-ASCII as `\uXXXX`; Rust emitted raw
+ UTF-8). The divergence was latent because every committed test fixture
+ and the conformance harness used ASCII-only content, but any production
+ manifest with a non-ASCII recipient_id, issuer_id, or filename would
+ have signed to different bytes across the two implementations and
+ failed cross-language verification. The port covers manifest signing,
+ manifest wire form, transparency-log leaf payloads, transparency-log
+ signed heads, sealed-container wrapped_dek, DSSE statement payloads,
+ DSSE envelope serialization, registry manifest verification, registry
+ sidecar comparison, and registry evidence-bundle signing. Standalone
+ tooling (sample generators, live demo, canary keeper) uses the
+ `sort_keys=True, ensure_ascii=False` form, which is byte-identical to
+ JCS for the no-floats subset these tools emit. `conformance_cross_lang.sh`
+ gains a non-ASCII `recipient_id` round trip that exercises the
+ divergence end-to-end and would fail under any non-JCS serialization.
+ Backward compatibility: for ASCII-only content (every committed fixture
+ and every existing test vector), the new JCS bytes are identical to the
+ old sort_keys bytes, so existing signatures continue to verify.
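
Reviewer sketch (not part of the commit): the divergence described in this entry can be reproduced with the standard `json` module. The manifest values below are made up for illustration. The `ensure_ascii=False` form is used here only as a stand-in for JCS output; it is not a full RFC 8785 implementation, since JCS also fixes number formatting and sorts keys by UTF-16 code unit.

```python
import json

# Hypothetical manifest fragment with one non-ASCII character (U+00EB).
manifest = {"recipient_id": "zo\u00eb@example.org", "file_id": "f-1"}

# Old Python form: default ensure_ascii=True escapes non-ASCII as \uXXXX.
old_form = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")

# Raw UTF-8 form: matches what a JCS serializer emits for this simple input.
raw_form = json.dumps(
    manifest, sort_keys=True, separators=(",", ":"), ensure_ascii=False
).encode("utf-8")

print(old_form)
print(raw_form)
print("same bytes:", old_form == raw_form)

# ASCII-only content serializes identically either way, which is why
# existing fixtures and signatures are unaffected.
ascii_manifest = {"recipient_id": "alice@test", "file_id": "f-1"}
ascii_old = json.dumps(ascii_manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")
ascii_raw = json.dumps(
    ascii_manifest, sort_keys=True, separators=(",", ":"), ensure_ascii=False
).encode("utf-8")
print("ascii same bytes:", ascii_old == ascii_raw)
```

Because a signature covers the exact bytes, the two serializations of the non-ASCII manifest would verify against different messages even though they decode to the same JSON value.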
+
+### Earlier batches in v0.4.12 (pre-JCS, kept for review provenance)
- **Live registry deployment config.** `docker-compose.yml` now has a `live`
Caddy profile with public TLS routing for the registry, beacon, OCSP-style,
README.md +21 -13
@@ -8,7 +8,7 @@ Format-agnostic. Post-quantum ready (ML-KEM-768 + ML-DSA-65). Layered watermarki
No cloud vendor lock-in. No paid service required. No custom cryptography. Apache 2.0.
-**Website:** [https://oversight-protocol.github.io/oversight/](https://oversightprotocol.dev/)
+**Website:** [https://oversightprotocol.dev/](https://oversightprotocol.dev/)
**Mobile companion (verifier):** [oversight-protocol/oversight-mobile](https://github.com/oversight-protocol/oversight-mobile) - Flutter UI on top of the same Rust crates that power this CLI, currently in internal TestFlight beta.
**Outlook add-in pilot:** [oversightprotocol.dev/integrations/outlook/](https://oversightprotocol.dev/integrations/outlook/), read-mode task pane that verifies and decrypts sealed attachments with the same browser inspector modules.
@@ -364,17 +364,18 @@ These items are included in v0.4.4/v0.4.5 and current `main`:
## Repository layout
```
-oversight/ Python reference (6,800 LOC)
+oversight/ Python reference (5,689 LOC)
├── oversight_core/
│ ├── crypto.py X25519 + Ed25519 + XChaCha20 + HKDF + PQ hybrid
│ ├── container.py .sealed binary format
│ ├── manifest.py signed canonical-JSON manifest
│ ├── watermark.py L1 zero-width, L2 whitespace
│ ├── semantic.py L3 synonyms + punctuation
-│ ├── synonyms_v2.py 150-class expanded dictionary
+│ ├── synonyms_v2.py 151-class expanded dictionary
│ ├── policy.py not_after / max_opens / jurisdiction
│ ├── beacon.py DNS / HTTP / OCSP / license beacons
│ ├── tlog.py Merkle transparency log
+│ ├── rekor.py Sigstore Rekor v2 (DSSE + PAE)
│ ├── timestamp.py RFC 3161 (FreeTSA + DigiCert)
│ ├── decoy.py Ollama-powered decoy files
│ └── formats/{text,image,pdf,docx}.py
@@ -384,17 +385,24 @@ oversight/ Python reference (6,800 LOC)
│ ├── flywheel_oversight_match.py Flywheel scraper hook
│ └── perseus_canarykeeper.py Perseus Discord alert agent
├── cli/oversight.py
-├── tests/{test_e2e.py,test_e2e_v2.py,test_pq.py}
-└── docs/{SPEC.md,ROADMAP.md,RUNBOOK.md}
+├── tests/{test_e2e.py,test_e2e_v2.py,test_pq.py,...}
+└── docs/{SPEC.md,ROADMAP.md,EMBEDDING.md,security.md}
-oversight-rust/ Rust port (~1,500 LOC, core complete)
+oversight-rust/ Rust port (2,934 LOC)
├── Cargo.toml workspace
├── oversight-crypto/ X25519, Ed25519, XChaCha20, HKDF, zeroize
├── oversight-manifest/ JCS canonical JSON, Ed25519 sign/verify
├── oversight-container/ .sealed format parser, hard caps
├── oversight-watermark/ L1 + L2
+├── oversight-tlog/ RFC 6962 Merkle log, signed heads
+├── oversight-policy/ fs2 flock + atomic rename, TOCTOU-safe
+├── oversight-semantic/ 151-class dict + L3 airgap-survivor
+├── oversight-formats/ text, image (DCT), pdf, docx adapters
+├── oversight-rekor/ Sigstore Rekor v2 (DSSE + PAE)
+├── oversight-registry/ Axum + SQLite registry (parity with FastAPI)
├── oversight-cli/ keygen / seal / open / inspect
-└── tests/conformance_cross_lang.sh bit-for-bit Python<->Rust conformance
+├── fuzz/ cargo-fuzz harnesses (container, manifest)
+└── tests/conformance_*.sh bit-for-bit Python<->Rust conformance
```
## Quickstart
@@ -412,7 +420,7 @@ python tests/test_pq.py # 7 checks (needs liboqs)
```bash
cd oversight-rust
-cargo test --workspace # 21 checks
+cargo test --workspace # 142 checks
cargo run -- keygen --out alice.json
cargo run -- seal --input doc.txt --output doc.sealed \
--issuer issuer.json --recipient-pub <hex> --recipient-id alice@test
@@ -445,24 +453,24 @@ current stable line.
| Layer | Checks | Status |
|---|---|---|
-| Python pytest suite | 11 | green |
+| Python pytest suite | 15 | green |
| Rust oversight-container | 17 | green |
| Rust oversight-crypto | 21 | green |
| Rust oversight-formats | 40 | green |
| Rust oversight-manifest | 3 | green |
| Rust oversight-policy | 7 | green |
-| Rust oversight-registry | 12 | green |
+| Rust oversight-registry | 17 | green |
| Rust oversight-rekor | 10 | green |
| Rust oversight-semantic | 8 | green |
| Rust oversight-tlog | 14 | green |
| Rust oversight-watermark | 4 | green |
-| Cross-language conformance | 3 | green |
-| Total automated Rust unit tests | 136 | all green |
+| Cross-language conformance | 2 scripts | green |
+| Total automated Rust unit tests | 142 | all green |
## Design principles (what Oversight never does)
- **No custom cryptography.** Every primitive is NIST-standardized or equivalent. `x25519-dalek`, `ed25519-dalek`, `chacha20poly1305`, `hkdf`, `sha2`, ML-KEM-768, ML-DSA-65 via liboqs. That's the whole list.
-- **No cloud vendor lock-in.** Dropped the original AWS Nitro Enclaves plan. Hardware-key protection uses any FIDO2 device (YubiKey, OnlyKey, Nitrokey). Transparency log can run on public Sigstore Rekor or self-hosted; your choice.
+- **No cloud vendor lock-in.** Dropped the original AWS Nitro Enclaves plan. Hardware-key protection uses any PIV / PKCS#11 hardware key (YubiKey, Nitrokey, OnlyKey); see `docs/HARDWARE_KEYS.md`. Transparency log can run on public Sigstore Rekor or self-hosted; your choice.
- **No RATs, no defensive malware.** Every "phone home" mechanism is a passive beacon - the kind of network call a normal document reader makes during rendering (image fetch, OCSP lookup, DNS resolution). We never execute code on a reader's machine.
- **No tracking of personal identifiers.** Mark IDs are random 128-bit tokens. The registry maps them to recipient IDs that the issuer chose - the issuer decides how much identity binding to apply.
- **No paid service required.** Year-1 all-in cost estimate: ~$6,200 (YubiKeys + domain + one conference). See `docs/ROADMAP.md`.
SECURITY.md +53 -0
@@ -0,0 +1,53 @@
+# Security Policy
+
+## Reporting a Vulnerability
+
+Do not open a public GitHub issue for a suspected vulnerability.
+
+Preferred channels, in order:
+
+1. **GitHub Security Advisories.** Use the "Report a vulnerability" button on
+ the Security tab of `github.com/oversight-protocol/oversight`. The report is
+ private to the maintainers and feeds the coordinated disclosure workflow.
+2. **Email.** `zionboggan@gmail.com` with `[Oversight disclosure]` in the
+ subject line, as a fallback if the Security tab is unavailable.
+
+Include in the report:
+
+- the affected component (`oversight_core`, the specific `oversight-rust`
+ crate, the FastAPI or Axum registry, the CLI, or a deployment artifact);
+- a minimal reproduction or proof of concept;
+- the version tag or commit you tested against;
+- your assessment of impact and any exploit prerequisites.
+
+## Response
+
+Reports are acknowledged within 5 business days. A preliminary assessment
+follows within 14 days. Coordinated disclosure timing is decided per report
+based on severity and fix complexity. Reporters are credited in the release
+advisory unless they ask to remain unnamed.
+
+## Scope
+
+**In scope:**
+
+- the protocol code: `oversight_core` (Python reference), the `oversight-rust`
+ workspace, both registry implementations (FastAPI and Axum), and the CLI;
+- the `.sealed` container format, manifest signing, the transparency log, and
+ the Python to Rust cross-language conformance guarantees;
+- the deployment artifacts shipped in this repository (`Dockerfile`,
+ `docker-compose.yml`, `Caddyfile`).
+
+**Out of scope:**
+
+- vulnerabilities in third-party dependencies, which belong upstream;
+- self-hosted deployments that modified the shipped config;
+- attacks that require already compromising the operator account, the registry
+ identity key, or a recipient private key.
+
+## Security Design Notes
+
+The honest threat model, watermark layer limits, beacon guarantees, collusion
+caveats, and policy boundary notes live in `docs/security.md`. Read that
+document before relying on any single attribution signal. Oversight's
+attribution layers are forensic evidence, not proof.
docs/ROADMAP.md +2 -2
@@ -1,6 +1,6 @@
# Oversight Roadmap
-Last revised 2026-04-22. The launch plan is gated on product usability and
+Last revised 2026-06-16. The launch plan is gated on product usability and
threat-model honesty, not on a calendar date.
## Where we are
@@ -21,7 +21,7 @@ threat-model honesty, not on a calendar date.
or against a live URL. An operator claims v1 compatibility with
`OVERSIGHT_REGISTRY_URL=https://registry.example.org python3 tests/test_registry_conformance.py`.
4. **Browser inspector and classic-suite decrypt** shipped on
- `oversight-protocol.github.io/oversight/viewer/`. Drag-drop `.sealed`
+ `oversightprotocol.dev/viewer/`. Drag-drop `.sealed`
parsing, WebCrypto Ed25519 signature verification, canonical JSON
byte-identical to Python, optional registry lookups, and full
decryption of classic-suite sealed files using WebCrypto X25519 + HKDF-SHA256
docs/SPEC.md +5 -2
@@ -222,7 +222,7 @@ shared secret.
### 5.2 Manifest
-The manifest is canonical JSON (sorted keys, no whitespace, UTF-8). Required fields:
+The manifest is canonical JSON per RFC 8785 (JCS: keys sorted by UTF-16 code unit, no whitespace, non-ASCII emitted as raw UTF-8). Required fields:
- `file_id` (UUID v4)
- `issued_at` (unix seconds, UTC)
@@ -403,4 +403,7 @@ Reserved URN namespace: `urn:oversight:file:<file_id>`
## 13. Appendix A - Test vectors (normative)
-To follow in v0.2. Implementations SHOULD include a conformance test suite producing and verifying known sealed blobs.
+Cross-language conformance scripts live at `oversight-rust/tests/conformance_*.sh`
+and assert byte-identical seal/open and Rekor DSSE/PAE between the Python
+reference and the Rust port. Implementations SHOULD run them on every change
+and SHOULD add published byte-exact vectors for every suite they ship.
docs/V05_REKOR_PLAN.md +5 -0
@@ -1,5 +1,10 @@
# v0.5 - Sigstore Rekor v2 Migration Plan
+> **STATUS: Shipped.** v0.5 (Sigstore Rekor v2 integration) is live in
+> `oversight_core/rekor.py` and the `oversight-rekor` Rust crate, with
+> cross-language conformance enforced by `oversight-rust/tests/conformance_rekor.sh`.
+> This document is the original migration plan, kept for design context.
+
Drafted 2026-04-19. Approved scope: public Rekor v2 only (no self-host).
USENIX Cycle 2 strategy: v0.4.1 frozen as paper artifact safety net;
v0.5 lands as a stretch goal if evaluation work comes together first.
docs/spec/registry-v1.md +16 -11
@@ -35,20 +35,25 @@ reference clients can treat as interchangeable with the origin deployment.
### Canonicalization
The manifest signature is computed over a canonical JSON serialization
-with the following exact rules. Implementations that deviate cannot
-verify manifests produced by the reference client.
-
-1. Serialize the manifest dictionary with recursively sorted keys.
-2. Use the separators `","` and `":"` with no whitespace.
-3. Encode the resulting string as UTF-8 before feeding it to the
+per RFC 8785 (JSON Canonicalization Scheme, "JCS"). Implementations that
+deviate cannot verify manifests produced by the reference client.
+
+1. Keys are sorted recursively by UTF-16 code unit (RFC 8785 §3.2.3).
+2. String values are emitted as raw UTF-8 with only the mandatory JCS
+ escapes (`"`, `\`, and `U+0000`-`U+001F`); non-ASCII characters are
+ NOT escaped as `\uXXXX`.
+3. Separators are `","` and `":"` with no whitespace.
+4. The serialized string is encoded as UTF-8 before being fed to the
Ed25519 verifier.
-4. The `signature_ed25519` field is stripped before canonicalization
+5. The `signature_ed25519` field is stripped before canonicalization
and re-attached to the signed object before it is wire-transmitted.
-In Python the canonical form matches
-`json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")`.
-In Rust the reference implementation uses the `canonical_json` crate
-with identical output. The cross-language conformance suite pins this.
+In Python the canonical form is produced by
+`oversight_core.jcs.jcs_dumps(manifest)`. In Rust the reference uses
+`serde_jcs::to_vec` with identical output. The cross-language conformance
+suite (`oversight-rust/tests/conformance_cross_lang.sh`) pins this with
+both an ASCII baseline and a non-ASCII `recipient_id` round trip that
+would diverge under any non-JCS serialization.
### Signature verification
examples/live_demo_v2.py +2 -1
@@ -15,6 +15,7 @@ from oversight_core import (
content_hash, seal, open_sealed, beacon, watermark,
)
from oversight_core import semantic
+from oversight_core.jcs import jcs_dumps
REG = "http://127.0.0.1:8765"
@@ -136,7 +137,7 @@ def main():
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey
pub = Ed25519PublicKey.from_public_bytes(bytes.fromhex(bundle["registry_pub"]))
sig = bytes.fromhex(bundle.pop("bundle_signature_ed25519"))
- msg = json.dumps(bundle, sort_keys=True, separators=(",", ":")).encode("utf-8")
+ msg = jcs_dumps(bundle)
try:
pub.verify(sig, msg)
print(" [ok] bundle signature VERIFIED - this bundle came from this registry.")
integrations/perseus_canarykeeper.py +1 -1
@@ -121,7 +121,7 @@ class RegistryMonitor:
if not sig_hex:
log.warning(f"bundle for {file_id} has no signature")
return None
- msg = json.dumps(bundle, sort_keys=True, separators=(",", ":")).encode("utf-8")
+ msg = json.dumps(bundle, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
try:
self.pinned_pub.verify(bytes.fromhex(sig_hex), msg)
except InvalidSignature:
oversight-rust/Cargo.lock +202 -39
@@ -14,7 +14,7 @@ version = "0.5.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d122413f284cf2d62fb1b7db97e02edb8cda96d769b16e443a4f6195e35662b0"
dependencies = [
- "crypto-common",
+ "crypto-common 0.1.7",
"generic-array",
]
@@ -239,6 +239,15 @@ dependencies = [
"generic-array",
]
+[[package]]
+name = "block-buffer"
+version = "0.12.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "d2f6c7dbe95a6ed67ad9f18e57daf93a2f034c524b99fd2b76d18fdfeb6660aa"
+dependencies = [
+ "hybrid-array",
+]
+
[[package]]
name = "bumpalo"
version = "3.20.2"
@@ -287,7 +296,7 @@ checksum = "c3613f74bd2eac03dad61bd53dbe620703d4371614fe0bc3b9f04dd36fe4e818"
dependencies = [
"cfg-if",
"cipher",
- "cpufeatures",
+ "cpufeatures 0.2.17",
]
[[package]]
@@ -323,7 +332,7 @@ version = "0.4.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "773f3b9af64447d2ce9850330c473515014aa235e6a783b02db81ff39e4a3dad"
dependencies = [
- "crypto-common",
+ "crypto-common 0.1.7",
"inout",
"zeroize",
]
@@ -368,6 +377,12 @@ version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c8d4a3bb8b1e0c1050499d1815f5ab16d04f0959b233085fb31653fbfc9d98f9"
+[[package]]
+name = "cmov"
+version = "0.5.4"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "0c9ea0ac24bc397ab3c98583a3c9ba74fa56b09a4449bbe172b9b1ddb016027a"
+
[[package]]
name = "colorchoice"
version = "1.0.5"
@@ -389,6 +404,12 @@ version = "0.9.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c2459377285ad874054d797f3ccebf984978aa39129f6eafde5cdc8315b612f8"
+[[package]]
+name = "const-oid"
+version = "0.10.2"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "a6ef517f0926dd24a1582492c791b6a4818a4d94e789a334894aa15b0d12f55c"
+
[[package]]
name = "core-foundation-sys"
version = "0.8.7"
@@ -404,6 +425,15 @@ dependencies = [
"libc",
]
+[[package]]
+name = "cpufeatures"
+version = "0.3.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "8b2a41393f66f16b0823bb79094d54ac5fbd34ab292ddafb9a0456ac9f87d201"
+dependencies = [
+ "libc",
+]
+
[[package]]
name = "crc"
version = "3.4.0"
@@ -469,7 +499,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0dc92fb57ca44df6db8059111ab3af99a63d5d0f8375d9972e319a379c6bab76"
dependencies = [
"generic-array",
- "rand_core",
+ "rand_core 0.6.4",
"subtle",
"zeroize",
]
@@ -481,10 +511,29 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "78c8292055d1c1df0cce5d180393dc8cce0abec0a7102adb6c7b1eef6016d60a"
dependencies = [
"generic-array",
- "rand_core",
+ "rand_core 0.6.4",
"typenum",
]
+[[package]]
+name = "crypto-common"
+version = "0.2.2"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "ce6e4c961d6cd6c9a86db418387425e8bdeaf05b3c8bc1411e6dca4c252f1453"
+dependencies = [
+ "hybrid-array",
+ "rand_core 0.10.1",
+]
+
+[[package]]
+name = "ctutils"
+version = "0.4.2"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "7d5515a3834141de9eafb9717ad39eea8247b5674e6066c404e8c4b365d2a29e"
+dependencies = [
+ "cmov",
+]
+
[[package]]
name = "curve25519-dalek"
version = "4.1.3"
@@ -492,9 +541,9 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "97fb8b7c4503de7d6ae7b42ab72a5a59857b4c937ec27a3d4539dba95b5ab2be"
dependencies = [
"cfg-if",
- "cpufeatures",
+ "cpufeatures 0.2.17",
"curve25519-dalek-derive",
- "digest",
+ "digest 0.10.7",
"fiat-crypto",
"rustc_version",
"subtle",
@@ -518,11 +567,21 @@ version = "0.7.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e7c1832837b905bbfb5101e07cc24c8deddf52f93225eee6ead5f4d63d53ddcb"
dependencies = [
- "const-oid",
+ "const-oid 0.9.6",
"pem-rfc7468",
"zeroize",
]
+[[package]]
+name = "der"
+version = "0.8.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "71fd89660b2dc699704064e59e9dba0147b903e85319429e131620d022be411b"
+dependencies = [
+ "const-oid 0.10.2",
+ "zeroize",
+]
+
[[package]]
name = "deranged"
version = "0.5.8"
@@ -549,12 +608,22 @@ version = "0.10.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9ed9a281f7bc9b7576e61468ba615a66a5c8cfdff42420a70aa82701a3b1e292"
dependencies = [
- "block-buffer",
- "const-oid",
- "crypto-common",
+ "block-buffer 0.10.4",
+ "const-oid 0.9.6",
+ "crypto-common 0.1.7",
"subtle",
]
+[[package]]
+name = "digest"
+version = "0.11.3"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "f1dd6dbb5841937940781866fa1281a1ff7bd3bf827091440879f9994983d5c2"
+dependencies = [
+ "block-buffer 0.12.1",
+ "crypto-common 0.2.2",
+]
+
[[package]]
name = "displaydoc"
version = "0.2.5"
@@ -578,12 +647,12 @@ version = "0.16.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ee27f32b5c5292967d2d4a9d7f1e0b0aed2c15daded5a60300e4abb9d8020bca"
dependencies = [
- "der",
- "digest",
+ "der 0.7.10",
+ "digest 0.10.7",
"elliptic-curve",
"rfc6979",
"signature",
- "spki",
+ "spki 0.7.3",
]
[[package]]
@@ -592,7 +661,7 @@ version = "2.2.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "115531babc129696a58c64a4fef0a8bf9e9698629fb97e9e40767d235cfbcd53"
dependencies = [
- "pkcs8",
+ "pkcs8 0.10.2",
"signature",
]
@@ -604,7 +673,7 @@ checksum = "70e796c081cee67dc755e1a36a0a172b897fab85fc3f6bc48307991f64e4eca9"
dependencies = [
"curve25519-dalek",
"ed25519",
- "rand_core",
+ "rand_core 0.6.4",
"serde",
"sha2",
"subtle",
@@ -628,14 +697,14 @@ checksum = "b5e6043086bf7973472e0c7dff2142ea0b680d30e18d9cc40f267efbf222bd47"
dependencies = [
"base16ct",
"crypto-bigint",
- "digest",
+ "digest 0.10.7",
"ff",
"generic-array",
"group",
"hkdf",
"pem-rfc7468",
- "pkcs8",
- "rand_core",
+ "pkcs8 0.10.2",
+ "rand_core 0.6.4",
"sec1",
"subtle",
"zeroize",
@@ -698,7 +767,7 @@ version = "0.13.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c0b50bfb653653f9ca9095b427bed08ab8d75a137839d9ad64eb11810d5b6393"
dependencies = [
- "rand_core",
+ "rand_core 0.6.4",
"subtle",
]
@@ -873,7 +942,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f0f9ef7462f7c099f518d754361858f86d8a07af53ba9af0fe635bbccb151a63"
dependencies = [
"ff",
- "rand_core",
+ "rand_core 0.6.4",
"subtle",
]
@@ -930,7 +999,7 @@ version = "0.12.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6c49c37c09c17a53d937dfbb742eb3a961d65a994e6bcdcf37e7399d0cc8ab5e"
dependencies = [
- "digest",
+ "digest 0.10.7",
]
[[package]]
@@ -978,6 +1047,16 @@ version = "1.0.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "df3b46402a9d5adb4c86a0cf463f42e19994e3ee891101b1841f30a545cb49a9"
+[[package]]
+name = "hybrid-array"
+version = "0.4.12"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "9155a582abd142abc056962c29e3ce5ff2ad5469f4246b537ed42c5deba857da"
+dependencies = [
+ "ctutils",
+ "typenum",
+]
+
[[package]]
name = "hyper"
version = "1.9.0"
@@ -1204,6 +1283,26 @@ dependencies = [
"wasm-bindgen",
]
+[[package]]
+name = "keccak"
+version = "0.2.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "9e24a010dd405bd7ed803e5253182815b41bf2e6a80cc3bfc066658e03a198aa"
+dependencies = [
+ "cfg-if",
+ "cpufeatures 0.3.0",
+]
+
+[[package]]
+name = "kem"
+version = "0.3.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "01737161ba802849cfd486b5bd209d38ba4943494c249a8126005170c7621edd"
+dependencies = [
+ "crypto-common 0.2.2",
+ "rand_core 0.10.1",
+]
+
[[package]]
name = "lazy_static"
version = "1.5.0"
@@ -1302,7 +1401,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d89e7ee0cfbedfc4da3340218492196241d89eefb6dab27de5df917a6d2e78cf"
dependencies = [
"cfg-if",
- "digest",
+ "digest 0.10.7",
]
[[package]]
@@ -1344,6 +1443,31 @@ dependencies = [
"windows-sys 0.61.2",
]
+[[package]]
+name = "ml-kem"
+version = "0.3.2"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "5e15f3e5b957493873e396a66914e83e616b6afe335cdef7efe5c6e1216aba66"
+dependencies = [
+ "hybrid-array",
+ "kem",
+ "module-lattice",
+ "pkcs8 0.11.0",
+ "rand_core 0.10.1",
+ "sha3",
+]
+
+[[package]]
+name = "module-lattice"
+version = "0.2.3"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "0c61b87c9683ab7cb1c6871d261ad5479b6b10ceb52c4352aaca3b5d35a8febe"
+dependencies = [
+ "ctutils",
+ "hybrid-array",
+ "num-traits",
+]
+
[[package]]
name = "moxcms"
version = "0.8.1"
@@ -1462,8 +1586,10 @@ dependencies = [
"ed25519-dalek",
"hex",
"hkdf",
+ "ml-kem",
"p256",
- "rand_core",
+ "rand_core 0.6.4",
+ "serde",
"serde_json",
"sha2",
"thiserror 1.0.69",
@@ -1526,12 +1652,13 @@ dependencies = [
"oversight-manifest",
"oversight-rekor",
"oversight-tlog",
- "rand_core",
+ "rand_core 0.6.4",
"serde",
"serde_jcs",
"serde_json",
"sha2",
"sqlx",
+ "subtle",
"thiserror 1.0.69",
"tokio",
"tower 0.4.13",
@@ -1547,7 +1674,7 @@ dependencies = [
"base64",
"ed25519-dalek",
"hex",
- "rand_core",
+ "rand_core 0.6.4",
"serde",
"serde_jcs",
"serde_json",
@@ -1584,7 +1711,7 @@ dependencies = [
name = "oversight-watermark"
version = "0.5.0"
dependencies = [
- "rand_core",
+ "rand_core 0.6.4",
]
[[package]]
@@ -1655,8 +1782,18 @@ version = "0.10.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f950b2377845cebe5cf8b5165cb3cc1a5e0fa5cfa3e1f7f55707d8fd82e0a7b7"
dependencies = [
- "der",
- "spki",
+ "der 0.7.10",
+ "spki 0.7.3",
+]
+
+[[package]]
+name = "pkcs8"
+version = "0.11.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "451913da69c775a56034ea8d9003d27ee8948e12443eae7c038ba100a4f21cb7"
+dependencies = [
+ "der 0.8.0",
+ "spki 0.8.0",
]
[[package]]
@@ -1684,7 +1821,7 @@ version = "0.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8159bd90725d2df49889a078b54f4f79e87f1f8a8444194cdca81d38f5393abf"
dependencies = [
- "cpufeatures",
+ "cpufeatures 0.2.17",
"opaque-debug",
"universal-hash",
]
@@ -1781,6 +1918,12 @@ dependencies = [
"getrandom 0.2.17",
]
+[[package]]
+name = "rand_core"
+version = "0.10.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "63b8176103e19a2643978565ca18b50549f6101881c443590420e4dc998a3c69"
+
[[package]]
name = "rangemap"
version = "1.7.1"
@@ -1980,9 +2123,9 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d3e97a565f76233a6003f9f5c54be1d9c5bdfa3eccfb189469f11ec4901c47dc"
dependencies = [
"base16ct",
- "der",
+ "der 0.7.10",
"generic-array",
- "pkcs8",
+ "pkcs8 0.10.2",
"subtle",
"zeroize",
]
@@ -2077,8 +2220,18 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a7507d819769d01a365ab707794a4084392c824f54a7a6a7862f8c3d0892b283"
dependencies = [
"cfg-if",
- "cpufeatures",
- "digest",
+ "cpufeatures 0.2.17",
+ "digest 0.10.7",
+]
+
+[[package]]
+name = "sha3"
+version = "0.11.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "be176f1a57ce4e3d31c1a166222d9768de5954f811601fb7ca06fc8203905ce1"
+dependencies = [
+ "digest 0.11.3",
+ "keccak",
]
[[package]]
@@ -2112,8 +2265,8 @@ version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "77549399552de45a898a580c1b41d445bf730df867cc44e6c0233bbc4b8329de"
dependencies = [
- "digest",
- "rand_core",
+ "digest 0.10.7",
+ "rand_core 0.6.4",
]
[[package]]
@@ -2160,7 +2313,17 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d91ed6c858b01f942cd56b37a94b3e0a1798290327d1236e4d9cf4eaca44d29d"
dependencies = [
"base64ct",
- "der",
+ "der 0.7.10",
+]
+
+[[package]]
+name = "spki"
+version = "0.8.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "1d9efca8738c78ee9484207732f728b1ef517bbb1833d6fc0879ca898a522f6f"
+dependencies = [
+ "base64ct",
+ "der 0.8.0",
]
[[package]]
@@ -2613,7 +2776,7 @@ version = "0.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fc1de2c688dc15305988b563c3854064043356019f97a4b46276fe734c4f07ea"
dependencies = [
- "crypto-common",
+ "crypto-common 0.1.7",
"subtle",
]
@@ -3091,7 +3254,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c7e468321c81fb07fa7f4c636c3972b9100f0346e5b6a9f2bd0603a52f7ed277"
dependencies = [
"curve25519-dalek",
- "rand_core",
+ "rand_core 0.6.4",
"serde",
"zeroize",
]
oversight-rust/Cargo.toml +9 -1
@@ -18,7 +18,7 @@ exclude = ["fuzz"]
[workspace.package]
version = "0.5.0"
edition = "2021"
-rust-version = "1.75"
+rust-version = "1.85"
license = "Apache-2.0"
repository = "https://github.com/oversight-protocol/oversight"
authors = ["Oversight contributors"]
@@ -36,6 +36,9 @@ chacha20poly1305 = { version = "0.10", features = ["alloc"] }
hkdf = "0.12"
sha2 = "0.10"
rand_core = "0.6.2"
+# ML-KEM-768 (FIPS 203) for the OSGT-HYBRID-v1 key-encapsulation half.
+# Pure Rust, RustCrypto-published, no C build (liboqs would need cmake).
+ml-kem = "0.3"
# Serialization / encoding
serde = { version = "1", features = ["derive"] }
@@ -51,6 +54,11 @@ uuid = { version = "1", features = ["v4", "serde"] }
# For canonical JSON
serde_jcs = "0.1"
+# Constant-time primitives (token comparison, Choice-based bool handling).
+# Already a transitive dep via ed25519-dalek / chacha20poly1305; declared
+# explicitly so auth code can rely on it without going through another crate.
+subtle = "2"
+
[profile.release]
opt-level = 3
lto = true
oversight-rust/oversight-container/src/lib.rs +99 -44
@@ -311,27 +311,7 @@ pub fn open_sealed(
}
}
- let now = std::time::SystemTime::now()
- .duration_since(std::time::UNIX_EPOCH)
- .map(|d| d.as_secs() as i64)
- .unwrap_or(0);
- if let Some(na) = sf.manifest.policy.get("not_after").and_then(|v| v.as_i64()) {
- if now > na {
- return Err(ContainerError::Precondition("file expired (not_after)"));
- }
- }
- if let Some(nb) = sf
- .manifest
- .policy
- .get("not_before")
- .and_then(|v| v.as_i64())
- {
- if now < nb {
- return Err(ContainerError::Precondition(
- "file not yet released (not_before)",
- ));
- }
- }
+ oversight_policy::check_policy(&sf.manifest, policy_ctx)?;
let dek = if let Some(slots) = sf.wrapped_dek.get("slots").and_then(|v| v.as_array()) {
let mut recovered = None;
@@ -454,27 +434,7 @@ pub fn open_sealed_with_provider(
}
}
- let now = std::time::SystemTime::now()
- .duration_since(std::time::UNIX_EPOCH)
- .map(|d| d.as_secs() as i64)
- .unwrap_or(0);
- if let Some(na) = sf.manifest.policy.get("not_after").and_then(|v| v.as_i64()) {
- if now > na {
- return Err(ContainerError::Precondition("file expired (not_after)"));
- }
- }
- if let Some(nb) = sf
- .manifest
- .policy
- .get("not_before")
- .and_then(|v| v.as_i64())
- {
- if now < nb {
- return Err(ContainerError::Precondition(
- "file not yet released (not_before)",
- ));
- }
- }
+ oversight_policy::check_policy(&sf.manifest, policy_ctx)?;
let dek = match (sf.suite_id, provider.algorithm()) {
(SUITE_CLASSIC_V1_ID, KeyAlgorithm::X25519) => {
@@ -862,11 +822,106 @@ mod tests {
)
.unwrap();
match open_sealed(&blob, alice.x25519_priv.as_ref(), None, None) {
- Err(ContainerError::Precondition("file expired (not_after)")) => (),
- other => panic!("expected expiry error, got {:?}", other.is_ok()),
+ Err(ContainerError::Policy(err)) => assert!(
+ err.to_string().contains("expired"),
+ "expected expiry violation, got: {err}"
+ ),
+ other => panic!("expected Policy expiry error, got {:?}", other.is_ok()),
}
}
+ #[test]
+ fn jurisdiction_mismatch_rejected_in_open_sealed() {
+ let issuer = ClassicIdentity::generate();
+ let alice = ClassicIdentity::generate();
+ let plaintext = b"region-locked";
+ let mut m = make_manifest(&issuer, &alice, plaintext);
+ m.policy["jurisdiction"] = serde_json::json!("EU");
+ let blob = seal(
+ plaintext,
+ &mut m,
+ issuer.ed25519_priv.as_ref(),
+ &alice.x25519_pub,
+ )
+ .unwrap();
+
+ let ctx = PolicyContext::default().with_jurisdiction("US");
+ match open_sealed(&blob, alice.x25519_priv.as_ref(), None, Some(&ctx)) {
+ Err(ContainerError::Policy(err)) => assert!(
+ err.to_string().contains("Jurisdiction mismatch"),
+ "expected jurisdiction violation, got: {err}"
+ ),
+ other => panic!(
+ "expected Policy jurisdiction error, got ok={:?}",
+ other.is_ok()
+ ),
+ }
+ }
+
+ #[test]
+ fn jurisdiction_match_allows_open() {
+ let issuer = ClassicIdentity::generate();
+ let alice = ClassicIdentity::generate();
+ let plaintext = b"region-locked";
+ let mut m = make_manifest(&issuer, &alice, plaintext);
+ m.policy["jurisdiction"] = serde_json::json!("EU");
+ let blob = seal(
+ plaintext,
+ &mut m,
+ issuer.ed25519_priv.as_ref(),
+ &alice.x25519_pub,
+ )
+ .unwrap();
+
+ let ctx = PolicyContext::default().with_jurisdiction("EU");
+ let opened = open_sealed(&blob, alice.x25519_priv.as_ref(), None, Some(&ctx));
+ assert!(
+ opened.is_ok(),
+ "matching jurisdiction must open: {:?}",
+ opened.err()
+ );
+ }
+
+ #[test]
+ fn jurisdiction_global_allows_open() {
+ let issuer = ClassicIdentity::generate();
+ let alice = ClassicIdentity::generate();
+ let plaintext = b"global-default";
+ let mut m = make_manifest(&issuer, &alice, plaintext);
+ m.policy["jurisdiction"] = serde_json::json!("GLOBAL");
+ let blob = seal(
+ plaintext,
+ &mut m,
+ issuer.ed25519_priv.as_ref(),
+ &alice.x25519_pub,
+ )
+ .unwrap();
+
+ let ctx = PolicyContext::default();
+ assert!(open_sealed(&blob, alice.x25519_priv.as_ref(), None, Some(&ctx)).is_ok());
+ }
+
+ #[test]
+ fn jurisdiction_skipped_when_no_ctx_supplied() {
+ let issuer = ClassicIdentity::generate();
+ let alice = ClassicIdentity::generate();
+ let plaintext = b"region-locked";
+ let mut m = make_manifest(&issuer, &alice, plaintext);
+ m.policy["jurisdiction"] = serde_json::json!("EU");
+ let blob = seal(
+ plaintext,
+ &mut m,
+ issuer.ed25519_priv.as_ref(),
+ &alice.x25519_pub,
+ )
+ .unwrap();
+
+ // Parity with oversight_core.policy.check_policy: a non-GLOBAL
+ // jurisdiction is not enforced when no context is supplied. This is
+ // the raw-open path used by the cross-language conformance harness.
+ assert!(open_sealed(&blob, alice.x25519_priv.as_ref(), None, None).is_ok());
+ }
+
#[test]
fn max_opens_counts_only_successful_decrypts() {
let issuer = ClassicIdentity::generate();
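Reviewer note: the four jurisdiction tests above pin down a small decision table for the policy check that `open_sealed` now delegates to. The sketch below restates that table using only the standard library. It is not the `oversight_policy` implementation, which this diff does not show: `PolicySketch`, `PolicyViolationSketch`, and `check_policy_sketch` are hypothetical names, and "no context supplied" is modelled here as `caller_jurisdiction = None`, which is a simplification.

```rust
/// Hypothetical, simplified stand-in for the manifest policy fields used above.
#[derive(Debug, Default)]
pub struct PolicySketch {
    pub not_before: Option<i64>,
    pub not_after: Option<i64>,
    pub jurisdiction: Option<String>,
}

/// Hypothetical error type; the real crate reports violations its own way.
#[derive(Debug, PartialEq, Eq)]
pub enum PolicyViolationSketch {
    Expired,
    NotYetReleased,
    JurisdictionMismatch { required: String, actual: String },
}

/// `caller_jurisdiction = None` models the raw-open path with no context.
pub fn check_policy_sketch(
    policy: &PolicySketch,
    now: i64,
    caller_jurisdiction: Option<&str>,
) -> Result<(), PolicyViolationSketch> {
    // Time window checks, in the same order as the code this commit removed.
    if let Some(na) = policy.not_after {
        if now > na {
            return Err(PolicyViolationSketch::Expired);
        }
    }
    if let Some(nb) = policy.not_before {
        if now < nb {
            return Err(PolicyViolationSketch::NotYetReleased);
        }
    }
    // Jurisdiction is only compared when both sides state one, and the
    // "GLOBAL" marker always passes.
    if let (Some(required), Some(actual)) =
        (policy.jurisdiction.as_deref(), caller_jurisdiction)
    {
        if required != "GLOBAL" && required != actual {
            return Err(PolicyViolationSketch::JurisdictionMismatch {
                required: required.to_string(),
                actual: actual.to_string(),
            });
        }
    }
    Ok(())
}
```

Taking `now` as a parameter instead of reading the clock inside the function keeps the sketch deterministic under test; whether the real `check_policy` does the same is not visible here.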
oversight-rust/oversight-crypto/Cargo.toml +2 -0
@@ -14,6 +14,8 @@ chacha20poly1305.workspace = true
hkdf.workspace = true
sha2.workspace = true
rand_core = { workspace = true, features = ["getrandom"] }
+ml-kem.workspace = true
+serde = { workspace = true }
hex.workspace = true
zeroize.workspace = true
thiserror.workspace = true
oversight-rust/oversight-crypto/examples/hybrid_kem_cli.rs +64 -0
@@ -0,0 +1,64 @@
+//! Standalone helper for cross-language ML-KEM-768 hybrid KEM conformance.
+//!
+//! This is a test aid, not a shipped CLI surface. It exposes the Rust
+//! `hybrid_wrap_dek` / `hybrid_unwrap_dek` / `mlkem768_generate_keypair` to a
+//! shell so `tests/conformance_hybrid_kem.py` can drive both sides of a
+//! Python <-> Rust round trip.
+//!
+//! Usage:
+//! hybrid_kem_cli keygen
+//! -> stdout: {"x_pub":hex,"x_priv":hex,"mlkem_pub":hex,"mlkem_seed":hex}
+//! hybrid_kem_cli wrap <x_pub_hex> <mlkem_pub_hex> <dek_hex>
+//! -> stdout: HybridEnvelope JSON
+//! hybrid_kem_cli unwrap <env_json_file> <x_priv_hex> <mlkem_seed_hex>
+//! -> stdout: recovered dek hex
+
+use std::env;
+
+use oversight_crypto::{
+ hybrid_unwrap_dek, hybrid_wrap_dek, mlkem768_generate_keypair, ClassicIdentity, HybridEnvelope,
+};
+
+fn main() {
+ let args: Vec<String> = env::args().collect();
+ let cmd = args
+ .get(1)
+ .expect("usage: keygen | wrap <x_pub> <mlkem_pub> <dek> | unwrap <env_file> <x_priv> <mlkem_seed>");
+
+ match cmd.as_str() {
+ "keygen" => {
+ let id = ClassicIdentity::generate();
+ let (mlkem_pub, mlkem_seed) = mlkem768_generate_keypair();
+ println!(
+ "{{\"x_pub\":\"{}\",\"x_priv\":\"{}\",\"mlkem_pub\":\"{}\",\"mlkem_seed\":\"{}\"}}",
+ hex::encode(id.x25519_pub),
+ hex::encode(&id.x25519_priv[..]),
+ hex::encode(&mlkem_pub),
+ hex::encode(mlkem_seed),
+ );
+ }
+ "wrap" => {
+ let x_pub = hex::decode(args.get(2).expect("missing x_pub")).expect("x_pub hex");
+ let mlkem_pub =
+ hex::decode(args.get(3).expect("missing mlkem_pub")).expect("mlkem_pub hex");
+ let dek = hex::decode(args.get(4).expect("missing dek")).expect("dek hex");
+ let env = hybrid_wrap_dek(&dek, &x_pub, &mlkem_pub).expect("hybrid_wrap_dek");
+ println!("{}", serde_json::to_string(&env).unwrap());
+ }
+ "unwrap" => {
+ let env_json =
+ std::fs::read_to_string(args.get(2).expect("missing env_file")).expect("read env");
+ let env: HybridEnvelope =
+ serde_json::from_str(&env_json).expect("parse HybridEnvelope");
+ let x_priv = hex::decode(args.get(3).expect("missing x_priv")).expect("x_priv hex");
+ let mlkem_seed =
+ hex::decode(args.get(4).expect("missing mlkem_seed")).expect("mlkem_seed hex");
+ let dek = hybrid_unwrap_dek(&env, &x_priv, &mlkem_seed).expect("hybrid_unwrap_dek");
+ println!("{}", hex::encode(&*dek));
+ }
+ other => {
+ eprintln!("unknown command: {other}");
+ std::process::exit(2);
+ }
+ }
+}
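Reviewer note: the helper above leans on `expect` for argument handling, which is reasonable for a test aid. If the three subcommands ever need friendlier errors, one option is to parse the argument vector into an enum first. The sketch below is standard-library only; `CliCommand` and `parse_cli` are hypothetical names that do not exist in the crate, and no hex decoding or cryptography is performed.

```rust
/// Hypothetical parsed form of the three subcommands documented above.
#[derive(Debug, PartialEq, Eq)]
pub enum CliCommand {
    Keygen,
    Wrap { x_pub_hex: String, mlkem_pub_hex: String, dek_hex: String },
    Unwrap { env_file: String, x_priv_hex: String, mlkem_seed_hex: String },
}

/// Parse an argument vector shaped like `std::env::args()` (index 0 is the
/// program name). Returns a message instead of panicking on bad input.
pub fn parse_cli(args: &[String]) -> Result<CliCommand, String> {
    // Fetch a positional argument or report which one is missing.
    let arg = |i: usize, name: &str| -> Result<String, String> {
        args.get(i).cloned().ok_or_else(|| format!("missing {}", name))
    };
    match args.get(1).map(String::as_str) {
        Some("keygen") => Ok(CliCommand::Keygen),
        Some("wrap") => Ok(CliCommand::Wrap {
            x_pub_hex: arg(2, "x_pub")?,
            mlkem_pub_hex: arg(3, "mlkem_pub")?,
            dek_hex: arg(4, "dek")?,
        }),
        Some("unwrap") => Ok(CliCommand::Unwrap {
            env_file: arg(2, "env_file")?,
            x_priv_hex: arg(3, "x_priv")?,
            mlkem_seed_hex: arg(4, "mlkem_seed")?,
        }),
        Some(other) => Err(format!("unknown command: {}", other)),
        None => Err("usage: keygen | wrap <x_pub> <mlkem_pub> <dek> | unwrap <env_file> <x_priv> <mlkem_seed>".to_string()),
    }
}
```

A `main` built on this would call `parse_cli(&std::env::args().collect::<Vec<_>>())` and exit with status 2 on `Err`, matching the exit code the helper already uses for unknown commands.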
oversight-rust/oversight-crypto/src/lib.rs +282 -0
@@ -33,6 +33,13 @@ use ed25519_dalek::{
VerifyingKey as EdVerifyingKey,
};
use hkdf::Hkdf;
+use ml_kem::{
+ ml_kem_768::{
+ Ciphertext as MlKem768Ciphertext, DecapsulationKey as MlKem768DecapsulationKey,
+ EncapsulationKey as MlKem768EncapsulationKey,
+ },
+ Decapsulate, KeyExport,
+};
use p256::{
ecdh::diffie_hellman as p256_diffie_hellman, elliptic_curve::sec1::ToEncodedPoint,
PublicKey as P256PublicKey, SecretKey as P256SecretKey,
@@ -52,6 +59,12 @@ pub const DEK_LEN: usize = 32;
/// P-256 public key in SEC1 uncompressed encoding (`0x04 || X || Y`).
pub const P256_PUBLIC_KEY_LEN: usize = 65;
+/// ML-KEM-768 (FIPS 203) byte sizes for the OSGT-HYBRID-v1 KEM half.
+pub const MLKEM768_PUB_LEN: usize = 1184;
+pub const MLKEM768_CT_LEN: usize = 1088;
+pub const MLKEM768_SEED_LEN: usize = 64;
+pub const MLKEM768_SHARED_SECRET_LEN: usize = 32;
+
pub const SUITE_CLASSIC_V1: &str = "OSGT-CLASSIC-v1";
pub const SUITE_HYBRID_V1: &str = "OSGT-HYBRID-v1";
/// Hardware-backed recipients use P-256 ECDH so PIV-compatible tokens
@@ -73,6 +86,8 @@ pub enum CryptoError {
Hkdf,
#[error("missing wrapped-DEK field: {0}")]
MissingField(&'static str),
+ #[error("ML-KEM error: {0}")]
+ Kem(String),
}
// -------------------------- Identity --------------------------
@@ -310,6 +325,187 @@ pub fn unwrap_dek(
Ok(Zeroizing::new(plaintext))
}
+// ----------------------- Hybrid (OSGT-HYBRID-v1) -----------------------
+
+/// Wire shape for a hybrid-wrapped DEK. Byte-for-byte JSON compatible with
+/// the Python reference (`oversight_core.crypto.hybrid_wrap_dek`): every value
+/// is hex-encoded, keys are exactly `{suite, x25519_ephemeral_pub,
+/// mlkem_ciphertext, nonce, wrapped_dek}`.
+#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
+pub struct HybridEnvelope {
+ pub suite: String,
+ pub x25519_ephemeral_pub: String,
+ pub mlkem_ciphertext: String,
+ pub nonce: String,
+ pub wrapped_dek: String,
+}
+
+/// Generate an ML-KEM-768 recipient keypair. Returns the 1184-byte NIST
+/// encapsulation (public) key and the 64-byte seed from which the
+/// decapsulation (private) key is derived. The seed form is the RustCrypto
+/// ml-kem crate's recommended encoding; the private key never needs to cross
+/// the language boundary, only the public key does.
+pub fn mlkem768_generate_keypair() -> (Vec<u8>, [u8; MLKEM768_SEED_LEN]) {
+ let mut seed = [0u8; MLKEM768_SEED_LEN];
+ OsRng.fill_bytes(&mut seed);
+ let dk = MlKem768DecapsulationKey::from_seed(seed.into());
+ let pub_bytes = dk.encapsulation_key().to_bytes().to_vec();
+ (pub_bytes, seed)
+}
+
+/// Hybrid DEK wrap: combines an X25519 ECDH shared secret and an ML-KEM-768
+/// KEM shared secret via HKDF. An attacker must break BOTH X25519 and
+/// ML-KEM-768 to recover the KEK. The HKDF IKM binds the KEK to the full
+/// encapsulation (both shared secrets, the X25519 ephemeral pub, and the
+/// ML-KEM ciphertext), mirroring `oversight_core.crypto.hybrid_wrap_dek` so
+/// Rust and Python produce byte-identical envelopes.
+pub fn hybrid_wrap_dek(
+ dek: &[u8],
+ recipient_x25519_pub: &[u8],
+ recipient_mlkem_pub: &[u8],
+) -> Result<HybridEnvelope, CryptoError> {
+ if recipient_x25519_pub.len() != X25519_KEY_LEN {
+ return Err(CryptoError::InvalidKeyLength {
+ expected: X25519_KEY_LEN,
+ got: recipient_x25519_pub.len(),
+ });
+ }
+ if recipient_mlkem_pub.len() != MLKEM768_PUB_LEN {
+ return Err(CryptoError::InvalidKeyLength {
+ expected: MLKEM768_PUB_LEN,
+ got: recipient_mlkem_pub.len(),
+ });
+ }
+
+ let mut eph_bytes = Zeroizing::new([0u8; X25519_KEY_LEN]);
+ OsRng.fill_bytes(eph_bytes.as_mut());
+ let eph = X25519StaticSecret::from(*eph_bytes);
+ let eph_pub = X25519PublicKey::from(&eph);
+ let eph_pub_bytes = eph_pub.to_bytes();
+
+ let mut peer_arr = [0u8; X25519_KEY_LEN];
+ peer_arr.copy_from_slice(recipient_x25519_pub);
+ let peer = X25519PublicKey::from(peer_arr);
+ let ss_x = Zeroizing::new(eph.diffie_hellman(&peer).to_bytes());
+
+ let ek_fixed: [u8; MLKEM768_PUB_LEN] = recipient_mlkem_pub
+ .try_into()
+ .map_err(|_| CryptoError::InvalidKeyLength {
+ expected: MLKEM768_PUB_LEN,
+ got: recipient_mlkem_pub.len(),
+ })?;
+ let ek = MlKem768EncapsulationKey::new(&ek_fixed.into())
+ .map_err(|_| CryptoError::Kem("invalid ML-KEM-768 encapsulation key".into()))?;
+ // FIPS 203 encapsulation randomness is a single 32-byte message `m`; we
+ // sample it with the workspace OsRng and call the deterministic path,
+ // which is output- and security-identical to encapsulate_with_rng.
+ let mut m = [0u8; MLKEM768_SHARED_SECRET_LEN];
+ OsRng.fill_bytes(&mut m);
+ let (mlkem_ct, ss_pq) = ek.encapsulate_deterministic(&m.into());
+
+ let mut ikm = Vec::with_capacity(
+ X25519_KEY_LEN + MLKEM768_SHARED_SECRET_LEN + X25519_KEY_LEN + MLKEM768_CT_LEN,
+ );
+ ikm.extend_from_slice(ss_x.as_ref());
+ ikm.extend_from_slice(ss_pq.as_slice());
+ ikm.extend_from_slice(&eph_pub_bytes);
+ ikm.extend_from_slice(mlkem_ct.as_slice());
+
+ let hk = Hkdf::<Sha256>::new(None, &ikm);
+ let mut kek = Zeroizing::new([0u8; 32]);
+ hk.expand(b"oversight-hybrid-v1-dek-wrap", kek.as_mut())
+ .map_err(|_| CryptoError::Hkdf)?;
+
+ let (nonce, wrapped) = aead_encrypt(kek.as_ref(), dek, b"oversight-hybrid-dek")?;
+
+ Ok(HybridEnvelope {
+ suite: SUITE_HYBRID_V1.to_string(),
+ x25519_ephemeral_pub: hex::encode(eph_pub_bytes),
+ mlkem_ciphertext: hex::encode(mlkem_ct.as_slice()),
+ nonce: hex::encode(nonce),
+ wrapped_dek: hex::encode(&wrapped),
+ })
+}
+
+/// Recover a DEK from a hybrid-wrapped envelope. The recipient supplies its
+/// 32-byte X25519 private key and its 64-byte ML-KEM-768 seed; both shared
+/// secrets are recomputed and the KEK is derived with the same HKDF binding
+/// as `hybrid_wrap_dek`. Mirrors `oversight_core.crypto.hybrid_unwrap_dek`.
+pub fn hybrid_unwrap_dek(
+ envelope: &HybridEnvelope,
+ recipient_x25519_priv: &[u8],
+ recipient_mlkem_seed: &[u8],
+) -> Result<Zeroizing<Vec<u8>>, CryptoError> {
+ if recipient_x25519_priv.len() != X25519_KEY_LEN {
+ return Err(CryptoError::InvalidKeyLength {
+ expected: X25519_KEY_LEN,
+ got: recipient_x25519_priv.len(),
+ });
+ }
+ if recipient_mlkem_seed.len() != MLKEM768_SEED_LEN {
+ return Err(CryptoError::InvalidKeyLength {
+ expected: MLKEM768_SEED_LEN,
+ got: recipient_mlkem_seed.len(),
+ });
+ }
+
+ let eph_pub_bytes = hex::decode(&envelope.x25519_ephemeral_pub).map_err(CryptoError::Hex)?;
+ let mlkem_ct_bytes = hex::decode(&envelope.mlkem_ciphertext).map_err(CryptoError::Hex)?;
+ if eph_pub_bytes.len() != X25519_KEY_LEN {
+ return Err(CryptoError::InvalidKeyLength {
+ expected: X25519_KEY_LEN,
+ got: eph_pub_bytes.len(),
+ });
+ }
+ if mlkem_ct_bytes.len() != MLKEM768_CT_LEN {
+ return Err(CryptoError::InvalidKeyLength {
+ expected: MLKEM768_CT_LEN,
+ got: mlkem_ct_bytes.len(),
+ });
+ }
+
+ let mut priv_arr = Zeroizing::new([0u8; X25519_KEY_LEN]);
+ priv_arr.as_mut().copy_from_slice(recipient_x25519_priv);
+ let sk = X25519StaticSecret::from(*priv_arr);
+ let mut eph_pub_arr = [0u8; X25519_KEY_LEN];
+ eph_pub_arr.copy_from_slice(&eph_pub_bytes);
+ let eph_pub = X25519PublicKey::from(eph_pub_arr);
+ let ss_x = Zeroizing::new(sk.diffie_hellman(&eph_pub).to_bytes());
+
+ let mut seed_arr = [0u8; MLKEM768_SEED_LEN];
+ seed_arr.copy_from_slice(recipient_mlkem_seed);
+ let dk = MlKem768DecapsulationKey::from_seed(seed_arr.into());
+ let ct_fixed: [u8; MLKEM768_CT_LEN] = mlkem_ct_bytes
+ .as_slice()
+ .try_into()
+ .map_err(|_| CryptoError::InvalidKeyLength {
+ expected: MLKEM768_CT_LEN,
+ got: mlkem_ct_bytes.len(),
+ })?;
+ let ct: MlKem768Ciphertext = ct_fixed.into();
+ let ss_pq = dk.decapsulate(&ct);
+ let mut ss_pq_bytes = Zeroizing::new([0u8; MLKEM768_SHARED_SECRET_LEN]);
+ ss_pq_bytes.copy_from_slice(ss_pq.as_slice());
+
+ let mut ikm = Vec::with_capacity(
+ X25519_KEY_LEN + MLKEM768_SHARED_SECRET_LEN + X25519_KEY_LEN + MLKEM768_CT_LEN,
+ );
+ ikm.extend_from_slice(ss_x.as_ref());
+ ikm.extend_from_slice(ss_pq_bytes.as_ref());
+ ikm.extend_from_slice(&eph_pub_bytes);
+ ikm.extend_from_slice(&mlkem_ct_bytes);
+
+ let hk = Hkdf::<Sha256>::new(None, &ikm);
+ let mut kek = Zeroizing::new([0u8; 32]);
+ hk.expand(b"oversight-hybrid-v1-dek-wrap", kek.as_mut())
+ .map_err(|_| CryptoError::Hkdf)?;
+
+ let nonce = hex::decode(&envelope.nonce).map_err(CryptoError::Hex)?;
+ let wrapped = hex::decode(&envelope.wrapped_dek).map_err(CryptoError::Hex)?;
+ let plaintext = aead_decrypt(kek.as_ref(), &nonce, &wrapped, b"oversight-hybrid-dek")?;
+ Ok(Zeroizing::new(plaintext))
+}
+
// -------------------------- KeyProvider --------------------------
/// Algorithm a [`KeyProvider`] uses for ECDH.
@@ -999,4 +1195,90 @@ mod tests {
let recovered = unwrap_dek_with_provider(&wrapped, &provider).unwrap();
assert_eq!(&recovered[..], dek.as_ref());
}
+
+ // ----------------------- Hybrid (OSGT-HYBRID-v1) -----------------------
+
+ fn hybrid_recipient() -> (ClassicIdentity, Vec<u8>, [u8; MLKEM768_SEED_LEN]) {
+ let id = ClassicIdentity::generate();
+ let (mlkem_pub, mlkem_seed) = mlkem768_generate_keypair();
+ (id, mlkem_pub, mlkem_seed)
+ }
+
+ #[test]
+ fn hybrid_dek_round_trips() {
+ let (alice, mlkem_pub, mlkem_seed) = hybrid_recipient();
+ let dek = random_dek();
+ let env = hybrid_wrap_dek(dek.as_ref(), &alice.x25519_pub, &mlkem_pub).unwrap();
+ let recovered = hybrid_unwrap_dek(&env, &alice.x25519_priv[..], &mlkem_seed).unwrap();
+ assert_eq!(&recovered[..], dek.as_ref());
+ }
+
+ #[test]
+ fn hybrid_envelope_json_shape() {
+ let (alice, mlkem_pub, _) = hybrid_recipient();
+ let dek = random_dek();
+ let env = hybrid_wrap_dek(dek.as_ref(), &alice.x25519_pub, &mlkem_pub).unwrap();
+ let json: serde_json::Value = serde_json::to_value(&env).unwrap();
+ assert_eq!(json["suite"].as_str(), Some(SUITE_HYBRID_V1));
+ for k in ["x25519_ephemeral_pub", "mlkem_ciphertext", "nonce", "wrapped_dek"] {
+ assert!(json[k].is_string(), "envelope missing hex field {k}");
+ assert!(hex::decode(json[k].as_str().unwrap()).is_ok(), "{k} not hex");
+ }
+ let ct = hex::decode(json["mlkem_ciphertext"].as_str().unwrap()).unwrap();
+ assert_eq!(ct.len(), MLKEM768_CT_LEN);
+ let parsed: HybridEnvelope = serde_json::from_value(json).unwrap();
+ assert_eq!(parsed.suite, env.suite);
+ assert_eq!(parsed.mlkem_ciphertext, env.mlkem_ciphertext);
+ }
+
+ #[test]
+ fn hybrid_tamper_classical_half_rejected() {
+ let (alice, mlkem_pub, mlkem_seed) = hybrid_recipient();
+ let dek = random_dek();
+ let mut env = hybrid_wrap_dek(dek.as_ref(), &alice.x25519_pub, &mlkem_pub).unwrap();
+ let other = ClassicIdentity::generate();
+ env.x25519_ephemeral_pub = hex::encode(other.x25519_pub);
+ let res = hybrid_unwrap_dek(&env, &alice.x25519_priv[..], &mlkem_seed);
+ assert!(res.is_err(), "tampered classical half must fail unwrap");
+ }
+
+ #[test]
+ fn hybrid_tamper_pq_half_rejected() {
+ let (alice, mlkem_pub, mlkem_seed) = hybrid_recipient();
+ let dek = random_dek();
+ let mut env = hybrid_wrap_dek(dek.as_ref(), &alice.x25519_pub, &mlkem_pub).unwrap();
+ let mut ct = hex::decode(&env.mlkem_ciphertext).unwrap();
+ ct[100] ^= 0x01;
+ env.mlkem_ciphertext = hex::encode(&ct);
+ let res = hybrid_unwrap_dek(&env, &alice.x25519_priv[..], &mlkem_seed);
+ assert!(res.is_err(), "tampered PQ half must fail unwrap");
+ }
+
+ #[test]
+ fn hybrid_wrong_recipient_rejected() {
+ let (alice, mlkem_pub, _) = hybrid_recipient();
+ let dek = random_dek();
+ let env = hybrid_wrap_dek(dek.as_ref(), &alice.x25519_pub, &mlkem_pub).unwrap();
+ let (bob, _bob_pub, bob_seed) = hybrid_recipient();
+ let res = hybrid_unwrap_dek(&env, &bob.x25519_priv[..], &bob_seed);
+ assert!(res.is_err(), "wrong recipient must not unwrap");
+ }
+
+ #[test]
+ fn hybrid_overhead_is_bounded() {
+ let (alice, mlkem_pub, _) = hybrid_recipient();
+ let dek = random_dek();
+ let env = hybrid_wrap_dek(dek.as_ref(), &alice.x25519_pub, &mlkem_pub).unwrap();
+ let json = serde_json::to_string(&env).unwrap();
+ assert!(
+ json.len() < 4096,
+ "hybrid envelope JSON unexpectedly large: {}",
+ json.len()
+ );
+ assert!(
+ json.len() > 2 * MLKEM768_CT_LEN,
+ "hybrid envelope suspiciously small: {}",
+ json.len()
+ );
+ }
}
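Reviewer note: both `hybrid_wrap_dek` and `hybrid_unwrap_dek` build the HKDF input by concatenating four fixed-length parts in the same order: X25519 shared secret, ML-KEM shared secret, X25519 ephemeral public key, ML-KEM ciphertext. With the sizes declared in this diff that is 32 + 32 + 32 + 1088 = 1184 bytes. The sketch below isolates that layout as a standalone, standard-library-only function so the ordering and length checks can be read in one place. `build_hybrid_ikm` and the local constants are hypothetical and are not part of `oversight-crypto`; the HKDF and AEAD steps are deliberately left out.

```rust
// Local copies of the sizes used by the diff above (assumed, for illustration).
pub const X25519_LEN: usize = 32;
pub const MLKEM768_SS_LEN: usize = 32;
pub const MLKEM768_CT_LEN: usize = 1088;

/// Concatenate ss_x || ss_pq || eph_pub || mlkem_ct after checking each
/// part has the expected fixed length.
pub fn build_hybrid_ikm(
    ss_x: &[u8],
    ss_pq: &[u8],
    eph_pub: &[u8],
    mlkem_ct: &[u8],
) -> Result<Vec<u8>, String> {
    let parts: [(&str, &[u8], usize); 4] = [
        ("ss_x", ss_x, X25519_LEN),
        ("ss_pq", ss_pq, MLKEM768_SS_LEN),
        ("eph_pub", eph_pub, X25519_LEN),
        ("mlkem_ct", mlkem_ct, MLKEM768_CT_LEN),
    ];
    let total = parts.iter().map(|p| p.2).sum::<usize>();
    let mut ikm = Vec::with_capacity(total);
    for &(name, bytes, expected) in parts.iter() {
        if bytes.len() != expected {
            return Err(format!(
                "{}: expected {} bytes, got {}",
                name,
                expected,
                bytes.len()
            ));
        }
        ikm.extend_from_slice(bytes);
    }
    Ok(ikm)
}
```

Because every part has a fixed length, plain concatenation is unambiguous here; a design with variable-length parts would need length prefixes. Production code would also want the buffer zeroized after use, since it holds both shared secrets.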
oversight-rust/oversight-manifest/src/lib.rs +43 -0
@@ -320,4 +320,47 @@ mod tests {
let parsed: Manifest = serde_json::from_value(value).unwrap();
assert!(parsed.verify().unwrap());
}
+
+ // RFC 8785 JCS byte-vector pin. Mirrors tests/test_jcs_canonical_unit.py
+ // so both sides of the cross-language conformance suite anchor the same
+ // canonical bytes. If serde_jcs ever changes behavior, this test and the
+ // Python peer fail together and the divergence is visible at review time.
+ #[test]
+ fn jcs_byte_vectors_match_python_peer() {
+ // Non-ASCII string value: the central regression. Pre-JCS-port the
+ // Python peer emitted b"{\"name\":\"caf\\u00e9\"}" here (ensure_ascii).
+ let v = serde_json::json!({"name": "café"});
+ assert_eq!(
+ serde_jcs::to_vec(&v).unwrap(),
+ b"{\"name\":\"caf\xc3\xa9\"}"
+ );
+
+ // CJK: 日 = U+65E5 -> E6 97 A5, 本 = U+672C -> E6 9C AC
+ let v = serde_json::json!({"k": "日本"});
+ assert_eq!(
+ serde_jcs::to_vec(&v).unwrap(),
+ b"{\"k\":\"\xe6\x97\xa5\xe6\x9c\xac\"}"
+ );
+
+ // Supplementary plane (surrogate pair in UTF-16): 𝄞 = U+1D11E -> F0 9D 84 9E
+ let v = serde_json::json!({"k": "𝄞"});
+ assert_eq!(
+ serde_jcs::to_vec(&v).unwrap(),
+ b"{\"k\":\"\xf0\x9d\x84\x9e\"}"
+ );
+
+ // Keys sort by UTF-16 code units: "abc" < "z" < "ñ". The expected vector below is the UTF-8 encoding.
+ let v = serde_json::json!({"ñ": 3, "z": 2, "abc": 1});
+ assert_eq!(
+ serde_jcs::to_vec(&v).unwrap(),
+ b"{\"abc\":1,\"z\":2,\"\xc3\xb1\":3}"
+ );
+
+ // ASCII-only content is byte-identical to the historical sort_keys form.
+ let v = serde_json::json!({"event":"register","file_id":"f0","n":3});
+ assert_eq!(
+ serde_jcs::to_vec(&v).unwrap(),
+ b"{\"event\":\"register\",\"file_id\":\"f0\",\"n\":3}"
+ );
+ }
}
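Reviewer note: RFC 8785 orders object keys by their UTF-16 code units, which is not always the same as UTF-8 byte order. The two agree for the keys pinned in the test above, but they diverge once a supplementary-plane character meets a high BMP character. The sketch below shows the comparison using only the standard library. `jcs_key_cmp` and `sort_keys_jcs` are hypothetical helper names and are not how `serde_jcs` is implemented internally.

```rust
use std::cmp::Ordering;

/// Compare two keys by UTF-16 code units, the ordering RFC 8785 specifies.
pub fn jcs_key_cmp(a: &str, b: &str) -> Ordering {
    // Iterator::cmp compares the two code-unit sequences lexicographically.
    a.encode_utf16().cmp(b.encode_utf16())
}

/// Return the keys in JCS order without modifying the input slice.
pub fn sort_keys_jcs<'a>(keys: &[&'a str]) -> Vec<&'a str> {
    let mut out = keys.to_vec();
    out.sort_by(|a, b| jcs_key_cmp(a, b));
    out
}
```

For example, U+1D11E is encoded in UTF-16 as the surrogate pair D834 DD1E, which sorts before U+FF5E (FF5E), while in UTF-8 its lead byte F0 sorts after EF. A further vector covering such a pair on both the Rust and Python sides would pin that case too.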
oversight-rust/oversight-registry/Cargo.toml +1 -0
@@ -21,6 +21,7 @@ oversight-rekor = { path = "../oversight-rekor", features = ["upload"] }
serde.workspace = true
serde_json.workspace = true
serde_jcs.workspace = true
+subtle.workspace = true
hex.workspace = true
thiserror.workspace = true
ed25519-dalek.workspace = true
oversight-rust/oversight-registry/src/auth.rs +41 -8
@@ -1,5 +1,6 @@
use axum::http::{header, HeaderMap};
use oversight_manifest::Manifest;
+use subtle::ConstantTimeEq;
use crate::error::{RegistryError, Result as RegistryResult};
@@ -50,14 +51,7 @@ pub fn require_optional_token(
}
pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
- if a.len() != b.len() {
- return false;
- }
- let mut diff = 0u8;
- for (&x, &y) in a.iter().zip(b.iter()) {
- diff |= x ^ y;
- }
- diff == 0
+ bool::from(a.ct_eq(b))
}
pub fn verify_manifest_signature(manifest_value: &serde_json::Value) -> (bool, String) {
@@ -182,4 +176,43 @@ mod tests {
Err(RegistryError::Unauthorized(_))
));
}
+
+ #[test]
+ fn constant_time_eq_accepts_equal_inputs_across_lengths() {
+ assert!(constant_time_eq(b"", b""));
+ assert!(constant_time_eq(b"a", b"a"));
+ assert!(constant_time_eq(
+ b"operator-token-32-bytes-long-xxxx",
+ b"operator-token-32-bytes-long-xxxx"
+ ));
+ assert!(constant_time_eq(&[0u8; 128], &[0u8; 128]));
+ }
+
+ #[test]
+ fn constant_time_eq_rejects_same_length_different_content() {
+ assert!(!constant_time_eq(b"a", b"b"));
+ assert!(!constant_time_eq(b"abc", b"abd"));
+ assert!(!constant_time_eq(
+ b"operator-token-32-bytes-long-xxxx",
+ b"operator-token-32-bytes-long-yyyy"
+ ));
+ }
+
+ #[test]
+ fn constant_time_eq_rejects_mismatched_lengths() {
+ assert!(!constant_time_eq(b"", b"a"));
+ assert!(!constant_time_eq(b"a", b""));
+ assert!(!constant_time_eq(b"short", b"longer-string"));
+ assert!(!constant_time_eq(b"longer-string", b"short"));
+ }
+
+ #[test]
+ fn constant_time_eq_rejects_single_bit_difference() {
+ let mut a = vec![0u8; 32];
+ let mut b = vec![0u8; 32];
+ b[15] = 0x01;
+ assert!(!constant_time_eq(&a, &b));
+ a[15] = 0x01;
+ assert!(constant_time_eq(&a, &b));
+ }
}
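Reviewer note: the `subtle` documentation states that `ConstantTimeEq` for slices short-circuits when the two lengths differ and is only content-independent for equal-length inputs. The tests above check return values, not timing, so they cannot distinguish the two behaviours. If hiding the token length matters, one common approach is to compare fixed-length digests of both sides. The standard-library sketch below shows a different, simpler idea: fold the length difference into the accumulator instead of returning early. It is illustrative only. `eq_without_length_return` is a hypothetical name, the optimizer is free to reintroduce branches, and this is not a vetted replacement for `subtle`.

```rust
/// Equality check that never returns before scanning the whole candidate.
/// `secret` is the stored token, `candidate` is the caller-supplied value.
pub fn eq_without_length_return(secret: &[u8], candidate: &[u8]) -> bool {
    // A non-zero XOR of the lengths marks the inputs as unequal up front,
    // without leaving the function.
    let mut diff: usize = secret.len() ^ candidate.len();
    // The loop count depends only on the candidate length, which the
    // caller already knows.
    for (i, &c) in candidate.iter().enumerate() {
        // Positions past the end of the secret compare against 0; the
        // length term above already guarantees a mismatch in that case.
        let s = secret.get(i).copied().unwrap_or(0);
        diff |= usize::from(s ^ c);
    }
    diff == 0
}
```

The function agrees with `constant_time_eq` on every input pair exercised by the tests above; only the timing profile is intended to differ, and that intent is not verified here.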
oversight-rust/oversight-registry/src/main.rs +68 -5
@@ -144,11 +144,13 @@ pub fn timestamp_stub() -> String {
fn client_key(headers: &HeaderMap, addr: Option<&SocketAddr>, trusted_proxy: bool) -> String {
if trusted_proxy {
if let Some(xff) = headers.get("x-forwarded-for").and_then(|v| v.to_str().ok()) {
- if let Some(first) = xff.split(',').next() {
- let trimmed = first.trim();
- if !trimmed.is_empty() {
- return trimmed.to_string();
- }
+ let parts: Vec<&str> = xff
+ .split(',')
+ .map(|s| s.trim())
+ .filter(|s| !s.is_empty())
+ .collect();
+ if let Some(last) = parts.last() {
+ return last.to_string();
}
}
}
@@ -361,6 +363,11 @@ async fn main() -> anyhow::Result<()> {
.map(|s| s.trim().to_string())
.filter(|s| !s.is_empty());
+ let auth_disabled = std::env::var("OVERSIGHT_AUTH_DISABLED")
+ .unwrap_or_default()
+ .trim()
+ == "1";
+
let rekor_url = std::env::var("OVERSIGHT_REKOR_URL")
.unwrap_or_else(|_| oversight_rekor::DEFAULT_REKOR_URL.to_string());
@@ -412,6 +419,17 @@ async fn main() -> anyhow::Result<()> {
"transparency log initialized"
);
+ if operator_token.is_none() && !auth_disabled {
+ return Err(anyhow::anyhow!(
+ "OVERSIGHT_OPERATOR_TOKEN is required to start the registry. Set it to a strong random value, or set OVERSIGHT_AUTH_DISABLED=1 only for isolated local testing."
+ ));
+ }
+ if operator_token.is_none() && auth_disabled {
+ tracing::warn!(
+ "OVERSIGHT_AUTH_DISABLED=1: registry is running without operator authentication. Do NOT do this in production."
+ );
+ }
+
let state = Arc::new(AppState {
db: pool,
tlog,
@@ -502,3 +520,48 @@ async fn shutdown_signal() {
_ = terminate => { tracing::info!("received SIGTERM, shutting down"); }
}
}
+
+#[cfg(test)]
+mod tests {
+ use super::*;
+
+ fn xff_headers(value: &str) -> HeaderMap {
+ let mut h = HeaderMap::new();
+ h.insert("x-forwarded-for", HeaderValue::from_str(value).unwrap());
+ h
+ }
+
+ #[test]
+ fn xff_ignores_spoofed_left_entries() {
+ let h = xff_headers("1.2.3.4, 9.9.9.9");
+ assert_eq!(client_key(&h, None, true), "9.9.9.9");
+ let h = xff_headers("fake, fake2, 203.0.113.7");
+ assert_eq!(client_key(&h, None, true), "203.0.113.7");
+ }
+
+ #[test]
+ fn xff_single_entry_is_returned() {
+ let h = xff_headers("9.9.9.9");
+ assert_eq!(client_key(&h, None, true), "9.9.9.9");
+ }
+
+ #[test]
+ fn xff_whitespace_only_entries_dropped() {
+ let h = xff_headers(" , , 9.9.9.9");
+ assert_eq!(client_key(&h, None, true), "9.9.9.9");
+ }
+
+ #[test]
+ fn xff_empty_falls_back_to_addr() {
+ let h = xff_headers("");
+ let addr: SocketAddr = "127.0.0.1:8000".parse().unwrap();
+ assert_eq!(client_key(&h, Some(&addr), true), "127.0.0.1");
+ }
+
+ #[test]
+ fn no_trusted_proxy_ignores_xff_and_uses_addr() {
+ let h = xff_headers("9.9.9.9");
+ let addr: SocketAddr = "127.0.0.1:8000".parse().unwrap();
+ assert_eq!(client_key(&h, Some(&addr), false), "127.0.0.1");
+ }
+}
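Review note: the rightmost-entry rule exercised by the tests above can be sketched in a few lines of Python. This is a simplified stand-in, not the registry's implementation: the function signature is hypothetical, and the `"unknown"` fallback is an assumption borrowed from the Python server. It also assumes exactly one trusted proxy sits directly in front of the service, which is what makes the rightmost entry trustworthy.

```python
from typing import Optional


def client_key(xff: Optional[str], peer_ip: Optional[str], trusted_proxy: bool) -> str:
    """Hypothetical stand-in for the registry's rate-limit key selection."""
    if trusted_proxy and xff:
        parts = [p.strip() for p in xff.split(",") if p.strip()]
        if parts:
            # The rightmost entry is the one appended by the trusted proxy;
            # everything to its left was supplied by the client.
            return parts[-1]
    # No usable header (or no trusted proxy): fall back to the socket peer.
    return peer_ip if peer_ip else "unknown"


print(client_key("1.2.3.4, 9.9.9.9", None, True))      # 9.9.9.9
print(client_key(" , , 9.9.9.9", None, True))          # 9.9.9.9
print(client_key("9.9.9.9", "127.0.0.1", False))       # 127.0.0.1
```

With more than one trusted hop, the correct entry is further left (one position per trusted proxy), so the rule would need to count hops rather than always take the last value.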
oversight-rust/oversight-rekor/src/lib.rs +3 -5
@@ -157,8 +157,9 @@ pub struct DsseSignature {
}
impl DsseEnvelope {
- /// Canonical JSON encoding: sorted keys, no whitespace. Bit-identical to
- /// Python `json.dumps(..., sort_keys=True, separators=(",",":"))`.
+ /// Canonical JSON encoding per RFC 8785 (JCS). Byte-identical to the
+ /// Python reference's ``oversight_core.jcs.jcs_dumps``. Keys sorted by
+ /// UTF-16 code unit, no whitespace, non-ASCII output as raw UTF-8.
pub fn to_canonical_json(&self) -> Result<String, RekorError> {
// Build via BTreeMap so keys sort and we control order.
let v = serde_json::json!({
@@ -170,9 +171,6 @@ impl DsseEnvelope {
.map(|s| serde_json::json!({"sig": s.sig, "keyid": s.keyid}))
.collect::<Vec<_>>(),
});
- // Use serde_jcs so multi-byte / unicode handling matches Python's
- // sort_keys behavior. JCS sorts lexicographically by UTF-16 code units;
- // for our ASCII-only field set this matches Python sort_keys exactly.
Ok(serde_jcs::to_string(&v)?)
}
oversight-rust/tests/conformance_cross_lang.sh +73 -0
@@ -106,6 +106,79 @@ PYEOF
cargo run --manifest-path $RUST_CARGO --release -q -- inspect \
--input python-sealed.bin 2>&1 | grep -E "(signature valid|suite|OVERSIGHT)" | head -5
+echo ""
+echo "=== 4. Non-ASCII recipient_id round trip (the JCS divergence case) ==="
+# Pre-JCS-port this failed: Python emitted {"recipient_id":"Zi\u00f3n@org"}
+# (ensure_ascii=True) while Rust emitted {"recipient_id":"Zión@org"} (raw
+# UTF-8). The two signatures covered different bytes, so a Rust-sealed file
+# with a non-ASCII recipient_id failed Python Manifest.verify() and vice
+# versa. After the RFC 8785 JCS unification, both sides serialize to raw
+# UTF-8 and the signatures agree.
+UNICODE_RECIPIENT='Zión@org'
+
+cargo run --manifest-path $RUST_CARGO --release -q -- seal \
+ --input plaintext.txt --output rust-unicode-sealed.bin \
+ --issuer issuer.json --recipient-pub "$ALICE_X_PUB" \
+ --recipient-id "$UNICODE_RECIPIENT" --registry "https://reg.test" 2>&1 | tail -3
+
+python3 <<PYEOF
+import sys
+sys.path.insert(0, '$PYTHON_ROOT')
+from oversight_core.container import open_sealed, SealedFile
+blob = open('rust-unicode-sealed.bin', 'rb').read()
+priv = bytes.fromhex('$ALICE_X_PRIV')
+plaintext, manifest = open_sealed(blob, priv)
+assert manifest.verify(), (
+ "Python Manifest.verify() of Rust-sealed file with non-ASCII recipient_id "
+ "FAILED. This is the JCS divergence: Python and Rust are computing "
+ "different canonical bytes for the same manifest."
+)
+assert manifest.recipient.recipient_id == '$UNICODE_RECIPIENT', (
+ f"recipient_id mismatch: got {manifest.recipient.recipient_id!r}"
+)
+print(f" ✓ Python verifies Rust-sealed manifest with recipient_id={manifest.recipient.recipient_id!r}")
+PYEOF
+
+python3 <<PYEOF
+import sys
+sys.path.insert(0, '$PYTHON_ROOT')
+from oversight_core import ClassicIdentity, content_hash
+from oversight_core.manifest import Manifest, Recipient
+from oversight_core.container import seal
+
+alice_pub = bytes.fromhex('$ALICE_X_PUB')
+issuer_priv = bytes.fromhex('$ISSUER_ED_PRIV')
+issuer_pub = bytes.fromhex('$ISSUER_ED_PUB')
+
+plaintext = open('plaintext.txt', 'rb').read()
+m = Manifest.new(
+ original_filename='plaintext.txt',
+ content_hash=content_hash(plaintext),
+ size_bytes=len(plaintext),
+ issuer_id='cross-test',
+ issuer_ed25519_pub_hex=issuer_pub.hex(),
+ recipient=Recipient(recipient_id='$UNICODE_RECIPIENT', x25519_pub=alice_pub.hex()),
+ registry_url='https://reg.test',
+ content_type='text/plain',
+)
+blob = seal(plaintext, m, issuer_priv, alice_pub)
+open('python-unicode-sealed.bin', 'wb').write(blob)
+assert m.verify(), "Python cannot verify its own signature on a non-ASCII manifest"
+print(f" ✓ Python signed manifest with non-ASCII recipient_id, self-verify OK")
+PYEOF
+
+cargo run --manifest-path $RUST_CARGO --release -q -- open \
+ --input python-unicode-sealed.bin --output rust-unicode-recovered.txt --recipient alice.json 2>&1 | tail -3
+
+diff plaintext.txt rust-unicode-recovered.txt || { echo " ✗ Rust-recovered plaintext differs from the original" >&2; exit 1; }
+echo " ✓ Rust opens Python-sealed non-ASCII manifest, plaintext matches"
+cargo run --manifest-path $RUST_CARGO --release -q -- inspect \
+ --input python-unicode-sealed.bin 2>&1 | grep -E "signature valid" | head -1
+
+echo ""
+echo "=== 5. Hybrid (OSGT-HYBRID-v1) ML-KEM-768 KEM, Python <-> Rust ==="
+PYTHONPATH="$REPO_ROOT:$PYTHONPATH" python3 "$REPO_ROOT/oversight-rust/tests/conformance_hybrid_kem.py"
+
echo ""
echo "=========================================="
echo " CROSS-LANGUAGE CONFORMANCE: ALL PASS"
oversight-rust/tests/conformance_hybrid_kem.py +103 -0
@@ -0,0 +1,103 @@
+#!/usr/bin/env python3
+"""Cross-language ML-KEM-768 hybrid KEM conformance: Python <-> Rust.
+
+Proves the OSGT-HYBRID-v1 DEK-wrap construction is byte-identical across the
+Python reference (`oversight_core.crypto.hybrid_wrap_dek`/`hybrid_unwrap_dek`)
+and the Rust port (`oversight_crypto::hybrid_wrap_dek`/`hybrid_unwrap_dek`) in
+both directions:
+
+ [1] Rust recipient -> Python wraps -> Rust unwraps
+ [2] Python recipient -> Rust wraps -> Python unwraps
+
+Only ML-KEM *public* keys (1184 bytes) and the X25519 public key cross the
+language boundary; each recipient holds its own private key in its native
+form (Rust seed / Python liboqs expanded). Requires liboqs + liboqs-python;
+SKIPS with a clear message otherwise (CI-safe).
+"""
+
+import json
+import os
+import subprocess
+import sys
+
+REPO = os.path.join(os.path.dirname(__file__), "..", "..")
+sys.path.insert(0, REPO)
+
+from oversight_core.crypto import PQ_AVAILABLE, hybrid_unwrap_dek, hybrid_wrap_dek # noqa: E402
+
+CARGO = os.path.join(REPO, "oversight-rust", "Cargo.toml")
+TARGET = os.environ.get("CARGO_TARGET_DIR", "/root/.cache/oversight-rust-target")
+
+
+def rust(args):
+ cmd = [
+ "cargo", "run", "--manifest-path", CARGO, "--release", "-q",
+ "-p", "oversight-crypto", "--example", "hybrid_kem_cli", "--",
+ ] + args
+ env = dict(os.environ, CARGO_TARGET_DIR=TARGET)
+ proc = subprocess.run(cmd, capture_output=True, text=True, env=env)
+ if proc.returncode != 0:
+ raise RuntimeError(f"rust helper failed: {proc.stderr}")
+ return proc.stdout.strip()
+
+
+def x25519_keypair():
+ from cryptography.hazmat.primitives.serialization import (
+ Encoding, NoEncryption, PrivateFormat, PublicFormat,
+ )
+ from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
+
+ sk = X25519PrivateKey.generate()
+ pub = sk.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)
+ priv = sk.private_bytes(Encoding.Raw, PrivateFormat.Raw, NoEncryption())
+ return pub, priv
+
+
+def mlkem_keypair():
+ import oqs
+
+ kem = oqs.KeyEncapsulation("ML-KEM-768")
+ pub = kem.generate_keypair()
+ priv = kem.export_secret_key()
+ return pub, priv
+
+
+def main():
+ if not PQ_AVAILABLE:
+ print("SKIP cross-language hybrid KEM: liboqs-python not available")
+ return 0
+
+ dek = os.urandom(32)
+ results = []
+
+ # [1] Rust recipient -> Python wraps -> Rust unwraps.
+ recv = json.loads(rust(["keygen"]))
+ env = hybrid_wrap_dek(
+ dek, bytes.fromhex(recv["x_pub"]), bytes.fromhex(recv["mlkem_pub"])
+ )
+ env_path = "/tmp/_oversight_hybrid_env1.json"
+ with open(env_path, "w") as f:
+ json.dump(env, f)
+ dek_rs = bytes.fromhex(rust(["unwrap", env_path, recv["x_priv"], recv["mlkem_seed"]]))
+ ok1 = dek_rs == dek
+ results.append(ok1)
+ print(f"[1] PY wrap -> RS unwrap: {'PASS' if ok1 else 'FAIL'}")
+
+ # [2] Python recipient -> Rust wraps -> Python unwraps.
+ x_pub, x_priv = x25519_keypair()
+ mlkem_pub, mlkem_priv = mlkem_keypair()
+ env2 = json.loads(rust(["wrap", x_pub.hex(), mlkem_pub.hex(), dek.hex()]))
+ dek_py = hybrid_unwrap_dek(env2, x_priv, mlkem_priv)
+ ok2 = dek_py == dek
+ results.append(ok2)
+ print(f"[2] RS wrap -> PY unwrap: {'PASS' if ok2 else 'FAIL'}")
+
+ if all(results):
+ print("CROSS-LANGUAGE HYBRID KEM: ALL PASS")
+ return 0
+ print("CROSS-LANGUAGE HYBRID KEM: FAIL")
+ return 1
+
+
+if __name__ == "__main__":
+ sys.exit(main())
oversight-rust/tests/conformance_rekor.sh +4 -1
@@ -20,7 +20,10 @@ set -euo pipefail
ROOT="$(cd "$(dirname "$0")/.." && pwd)"
REPO_ROOT="$(cd "$ROOT/.." && pwd)"
-HELPER_BIN="$ROOT/target/release/examples/conformance_helper"
+# Respect CARGO_TARGET_DIR so the script works for out-of-tree builds
+# (CI runners, noexec source mounts). Falls back to in-tree $ROOT/target.
+TARGET_DIR="${CARGO_TARGET_DIR:-$ROOT/target}"
+HELPER_BIN="$TARGET_DIR/release/examples/conformance_helper"
cd "$ROOT"
echo "==> building conformance helper..."
oversight_core/container.py +3 -3
@@ -32,6 +32,8 @@ import io
import json
import struct
from dataclasses import dataclass
+
+from .jcs import jcs_dumps
from typing import Optional
from . import crypto
@@ -82,9 +84,7 @@ class SealedFile:
buf.write(struct.pack(">I", len(manifest_json)))
buf.write(manifest_json)
- wrapped_json = json.dumps(
- self.wrapped_dek, sort_keys=True, separators=(",", ":")
- ).encode("utf-8")
+ wrapped_json = jcs_dumps(self.wrapped_dek)
buf.write(struct.pack(">I", len(wrapped_json)))
buf.write(wrapped_json)
oversight_core/crypto.py +9 -1
@@ -46,7 +46,15 @@ from nacl.bindings import (
# Try to detect PQ availability
try:
- import oqs # type: ignore
+ import contextlib
+ import os as _os
+
+ # liboqs-python attaches a StreamHandler(sys.stdout) and logs an INFO line
+ # at import time, which contaminates stdout for any caller that imports us
+ # (and breaks byte-identity conformance capture). Suppress it during import.
+ with open(_os.devnull, "w") as _devnull:
+ with contextlib.redirect_stdout(_devnull):
+ import oqs # type: ignore
PQ_AVAILABLE = True
except Exception:
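Review note: a small sketch of the suppression technique used in this hunk, with a stand-in function in place of the real `import oqs` (the name `noisy_import_stand_in` is hypothetical). An outer `StringIO` capture is added only so the effect can be observed.

```python
import contextlib
import io
import os


def noisy_import_stand_in() -> int:
    """Hypothetical stand-in for a library that prints at import time."""
    print("INFO: library loaded")
    return 42


captured = io.StringIO()
with contextlib.redirect_stdout(captured):  # outer capture, to observe leaks
    with open(os.devnull, "w") as devnull:
        with contextlib.redirect_stdout(devnull):
            value = noisy_import_stand_in()  # its print goes to devnull
    print("clean")  # printed after the inner redirect is undone

print(repr(captured.getvalue()))  # 'clean\n'
```

`contextlib.redirect_stdout` only swaps `sys.stdout`, so it silences Python-level writes made during the block; output written directly to file descriptor 1 by native code would not be affected.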
oversight_core/jcs.py +103 -0
@@ -0,0 +1,103 @@
+"""
+oversight_core.jcs
+==================
+
+JSON Canonicalization Scheme (RFC 8785) for Oversight.
+
+Byte-exact match with the Rust reference's ``serde_jcs::to_vec``. Every
+canonical-bytes computation that gets hashed or signed in Oversight flows
+through ``jcs_dumps``: manifest signing, transparency-log leaf payloads,
+DSSE statement payloads, evidence bundles, and registry sidecar comparison.
+
+Vendored rather than pip-installed. Rationale: the canonicalization function
+sits on the signing path of a cryptographic protocol, so every line must be
+auditable in-tree, and the Oversight manifest schema carries no floats so we
+implement only the RFC 8785 subset we need and reject floats explicitly rather
+than silently producing a non-canonical float form.
+"""
+
+from __future__ import annotations
+
+from typing import Any
+
+_SHORT_ESCAPES = {
+    0x08: "\\b",
+    0x09: "\\t",
+    0x0A: "\\n",
+    0x0C: "\\f",
+    0x0D: "\\r",
+}
+
+
+def jcs_dumps(obj: Any) -> bytes:
+    """Canonicalize ``obj`` to RFC 8785 JSON bytes matching ``serde_jcs``.
+
+    Accepts None, bool, int, str, list, tuple, dict. Floats and any other
+    type raise TypeError; Oversight manifests use only int and str for
+    numeric values, and silently emitting a non-canonical float form would
+    break cross-language signature agreement.
+    """
+    parts: list[str] = []
+    _serialize(obj, parts)
+    return "".join(parts).encode("utf-8")
+
+
+def _serialize(obj: Any, parts: list[str]) -> None:
+    if obj is None:
+        parts.append("null")
+    elif obj is True:
+        parts.append("true")
+    elif obj is False:
+        parts.append("false")
+    elif isinstance(obj, int):
+        parts.append(str(obj))
+    elif isinstance(obj, float):
+        raise TypeError(
+            "JCS: floats are unsupported; Oversight manifests store every "
+            "numeric value as int or string"
+        )
+    elif isinstance(obj, str):
+        _serialize_str(obj, parts)
+    elif isinstance(obj, (list, tuple)):
+        parts.append("[")
+        for i, item in enumerate(obj):
+            if i:
+                parts.append(",")
+            _serialize(item, parts)
+        parts.append("]")
+    elif isinstance(obj, dict):
+        # Check keys first; sorting a non-str key would raise AttributeError.
+        for k in obj:
+            if not isinstance(k, str):
+                raise TypeError(
+                    f"JCS: dict keys must be str, got {type(k).__name__}"
+                )
+        parts.append("{")
+        # RFC 8785 §3.2.3: sort keys by UTF-16 code unit, not by code point.
+        # They differ: supplementary-plane characters encode as surrogates
+        # (D800-DFFF), which order before BMP code points U+E000-U+FFFF.
+        items = sorted(obj.items(), key=lambda kv: kv[0].encode("utf-16-be"))
+        for i, (k, v) in enumerate(items):
+            if i:
+                parts.append(",")
+            _serialize_str(k, parts)
+            parts.append(":")
+            _serialize(v, parts)
+        parts.append("}")
+    else:
+        raise TypeError(f"JCS: unsupported type {type(obj).__name__}")
+
+
+def _serialize_str(s: str, parts: list[str]) -> None:
+    parts.append('"')
+    for ch in s:
+        cp = ord(ch)
+        if cp == 0x22:
+            parts.append('\\"')
+        elif cp == 0x5C:
+            parts.append("\\\\")
+        elif cp < 0x20:
+            parts.append(_SHORT_ESCAPES.get(cp, f"\\u{cp:04x}"))
+        else:
+            parts.append(ch)
+    parts.append('"')
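Review note: the `utf-16-be` sort key in `_serialize` only changes the result for keys that mix supplementary-plane characters with BMP characters at U+E000 and above. A quick sketch of that edge case, using two made-up single-character keys:

```python
# U+E000 is a BMP private-use character; U+10000 is the first
# supplementary-plane character (a surrogate pair in UTF-16).
keys = ["\ue000", "\U00010000"]

by_code_point = sorted(keys)
by_utf16 = sorted(keys, key=lambda k: k.encode("utf-16-be"))

print([hex(ord(k)) for k in by_code_point])  # ['0xe000', '0x10000']
print([hex(ord(k)) for k in by_utf16])       # ['0x10000', '0xe000']
```

U+10000 encodes as the code units D800 DC00, and D800 is less than E000, so the UTF-16 ordering puts it first even though its code point is larger. Keys drawn from ASCII or the lower BMP sort the same way under both rules.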
oversight_core/manifest.py +11 -4
@@ -19,6 +19,7 @@ from dataclasses import dataclass, field, asdict, fields
from typing import Optional
from .crypto import sign_manifest, verify_manifest, SUITE_CLASSIC_V1
+from .jcs import jcs_dumps
@dataclass
@@ -26,6 +27,11 @@ class Recipient:
recipient_id: str # stable identifier (email hash, user UUID, etc.)
x25519_pub: str # hex
ed25519_pub: Optional[str] = None # hex, for verifying recipient acks
+ # Present on OSGT-HW-P256-v1 manifests (Rust-produced). Python keeps parse
+ # and inspect parity during the transition to the Rust canonical target but
+ # does not implement the HW-P256 seal/open crypto path. Defaults to None so
+ # classic-suite manifests canonicalize byte-identically to before.
+ p256_pub: Optional[str] = None
@dataclass
@@ -143,16 +149,17 @@ class Manifest:
Rules:
- Exclude the two signature fields (replace with empty string sentinel).
- Drop None-valued fields recursively.
- - Sort keys lexicographically.
- - UTF-8 encoded, no whitespace.
+ - RFC 8785 JCS: keys sorted by UTF-16 code unit, no whitespace,
+ non-ASCII output as raw UTF-8. Byte-exact match with the Rust
+ reference's ``serde_jcs::to_vec``.
"""
d = self.to_dict(include_signatures=False)
d = self._strip_none(d)
- return json.dumps(d, sort_keys=True, separators=(",", ":")).encode("utf-8")
+ return jcs_dumps(d)
def to_json(self) -> bytes:
d = self._strip_none(self.to_dict())
- return json.dumps(d, sort_keys=True, separators=(",", ":")).encode("utf-8")
+ return jcs_dumps(d)
@classmethod
def from_json(cls, data: bytes) -> "Manifest":
oversight_core/rekor.py +6 -6
@@ -32,6 +32,8 @@ import urllib.request
from dataclasses import dataclass, field
from typing import Any, Optional
+from oversight_core.jcs import jcs_dumps
+
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
Ed25519PrivateKey,
Ed25519PublicKey,
@@ -127,15 +129,13 @@ class DSSEEnvelope:
signatures: list[dict] # [{"sig": "<b64>", "keyid": "<hex>"}, ...]
def to_json(self) -> str:
- return json.dumps(
+ return jcs_dumps(
{
"payload": self.payload_b64,
"payloadType": self.payload_type,
"signatures": self.signatures,
- },
- sort_keys=True,
- separators=(",", ":"),
- )
+ }
+ ).decode("utf-8")
@classmethod
def from_json(cls, raw: str) -> "DSSEEnvelope":
@@ -202,7 +202,7 @@ def sign_dsse(
``keyid`` is opaque per spec; convention is the hex SHA-256 of the public
key. Empty string is allowed and used in tests.
"""
- payload = json.dumps(statement, sort_keys=True, separators=(",", ":")).encode("utf-8")
+ payload = jcs_dumps(statement)
payload_b64 = base64.b64encode(payload).decode("ascii")
pae = _pae(DSSE_PAYLOAD_TYPE, payload)
sk = Ed25519PrivateKey.from_private_bytes(issuer_ed25519_priv)
oversight_core/tlog.py +4 -4
@@ -33,6 +33,8 @@ from typing import Optional
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
+from .jcs import jcs_dumps
+
def _h(data: bytes) -> bytes:
return hashlib.sha256(data).digest()
@@ -149,9 +151,7 @@ class TransparencyLog:
def append(self, leaf_data: bytes | str | dict) -> int:
"""Append a leaf. Durable: fsync before return."""
if isinstance(leaf_data, dict):
- leaf_bytes = json.dumps(
- leaf_data, sort_keys=True, separators=(",", ":")
- ).encode("utf-8")
+ leaf_bytes = jcs_dumps(leaf_data)
elif isinstance(leaf_data, str):
leaf_bytes = leaf_data.encode("utf-8")
else:
@@ -203,7 +203,7 @@ class TransparencyLog:
size = self.size()
root = self.root()
head = {"size": size, "root": root.hex()}
- msg = json.dumps(head, sort_keys=True, separators=(",", ":")).encode("utf-8")
+ msg = jcs_dumps(head)
if self._sk:
sig = self._sk.sign(msg)
head["signature"] = sig.hex()
pyproject.toml +3 -3
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "oversight-protocol"
-version = "0.4.11"
+version = "0.4.12"
description = "Open protocol for cryptographic data provenance, recipient attribution, and leak detection."
readme = "README.md"
license = {text = "Apache-2.0"}
@@ -59,9 +59,9 @@ oversight = "cli.oversight_rich:main"
oversight-gui = "cli.gui:main"
[project.urls]
-Homepage = "https://oversight-protocol.github.io/oversight/"
+Homepage = "https://oversightprotocol.dev"
Repository = "https://github.com/oversight-protocol/oversight"
-Documentation = "https://oversight-protocol.github.io/oversight/docs/"
+Documentation = "https://github.com/oversight-protocol/oversight/blob/main/docs/SPEC.md"
Issues = "https://github.com/oversight-protocol/oversight/issues"
[tool.setuptools.packages.find]
registry/server.py +49 -9
@@ -41,6 +41,7 @@ sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
from oversight_core.tlog import TransparencyLog
from oversight_core.manifest import Manifest
from oversight_core import rekor as rekor_mod
+from oversight_core.jcs import jcs_dumps
DB_PATH = Path(os.environ.get("OVERSIGHT_DB", "/tmp/oversight-registry.sqlite"))
@@ -51,6 +52,9 @@ TRUSTED_PROXY = bool(int(os.environ.get("TRUSTED_PROXY", "0")))
# When TRUSTED_PROXY=1, honor X-Forwarded-For for rate limiting.
DNS_EVENT_SECRET = os.environ.get("OVERSIGHT_DNS_EVENT_SECRET", "")
OPERATOR_TOKEN = os.environ.get("OVERSIGHT_OPERATOR_TOKEN", "").strip()
+# When set to "1", the registry boots without an operator token. Local dev /
+# isolated testing only; never set this in production.
+AUTH_DISABLED = os.environ.get("OVERSIGHT_AUTH_DISABLED", "").strip() == "1"
# Rekor v2 wiring (v0.5 Session B). Off by default so existing tests do not
# generate live network traffic. Set OVERSIGHT_REKOR_ENABLED=1 to opt in.
@@ -225,22 +229,60 @@ class TokenBucket:
BUCKET = TokenBucket(rate=10.0, burst=30, max_keys=100_000)
+def _xff_client(xff: str) -> str | None:
+ """Return the trusted client IP from an X-Forwarded-For header value.
+
+ The directly-connected proxy (Caddy) appends the real client as the
+ RIGHTMOST entry. Entries to its left are attacker-controlled: a client
+ may send any XFF header and the proxy appends rather than replaces, so
+ the leftmost entry must never be trusted for rate-limit bucketing or for
+ the source_ip written into beacon events. Taking the leftmost let an
+ attacker pick their rate-limit bucket and forge attribution.
+ """
+ parts = [p.strip() for p in xff.split(",") if p.strip()]
+ return parts[-1] if parts else None
+
+
def _client_key(request: Request) -> str:
"""Extract the client identifier used for rate limiting."""
if TRUSTED_PROXY:
xff = request.headers.get("x-forwarded-for", "")
- if xff:
- # Last hop is the most recent proxy, first is the original client.
- # For rate limiting the original client IP is what we want.
- return xff.split(",")[0].strip()
+ client = _xff_client(xff) if xff else None
+ if client:
+ return client
return request.client.host if request.client else "unknown"
# ---- app + lifespan ----
+def _enforce_auth_config():
+ """Fail closed at boot.
+
+ Without an operator token the public write endpoints (/register,
+ /attribute) would let anyone self-sign manifests into the append-only
+ tlog and enumerate attribution over /attribute. Refuse to start in that
+ state unless the operator has explicitly opted out with
+ OVERSIGHT_AUTH_DISABLED=1 (intended for isolated local testing only).
+ """
+ if not OPERATOR_TOKEN and not AUTH_DISABLED:
+ raise RuntimeError(
+ "OVERSIGHT_OPERATOR_TOKEN is required to start the registry. "
+ "Set it to a strong random value, or set OVERSIGHT_AUTH_DISABLED=1 "
+ "only for isolated local testing."
+ )
+ if not OPERATOR_TOKEN and AUTH_DISABLED:
+ import warnings
+ warnings.warn(
+ "OVERSIGHT_AUTH_DISABLED=1: registry is running without operator "
+ "authentication. Do NOT do this in production.",
+ stacklevel=2,
+ )
+
+
@asynccontextmanager
async def lifespan(app: FastAPI):
global IDENTITY, TLOG
+ _enforce_auth_config()
init_db()
IDENTITY = load_or_create_identity()
TLOG = TransparencyLog(TLOG_DIR, signing_key_hex=IDENTITY["ed25519_priv"])
@@ -496,9 +538,7 @@ def _verify_manifest_signature(manifest_dict: dict) -> tuple[bool, str]:
Returns (ok, issuer_pub_hex). issuer_pub_hex is the claimed issuer key.
"""
try:
- m = Manifest.from_json(
- json.dumps(manifest_dict, sort_keys=True, separators=(",", ":")).encode("utf-8")
- )
+ m = Manifest.from_json(jcs_dumps(manifest_dict))
except Exception as e:
return False, ""
return m.verify(), m.issuer_ed25519_pub
@@ -507,7 +547,7 @@ def _verify_manifest_signature(manifest_dict: dict) -> tuple[bool, str]:
def _canonical_items(items: list[dict]) -> list[str]:
"""Normalize registration sidecars for exact signed-manifest comparison."""
return sorted(
- json.dumps(item, sort_keys=True, separators=(",", ":"))
+ jcs_dumps(item).decode("utf-8")
for item in items
)
@@ -783,7 +823,7 @@ def evidence_bundle(file_id: str):
),
}
sk = Ed25519PrivateKey.from_private_bytes(bytes.fromhex(IDENTITY["ed25519_priv"]))
- msg = json.dumps(bundle, sort_keys=True, separators=(",", ":")).encode("utf-8")
+ msg = jcs_dumps(bundle)
bundle["bundle_signature_ed25519"] = sk.sign(msg).hex()
return bundle
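Review note: the fail-closed boot gate added in this file reads module-level globals, which makes it awkward to exercise in isolation. A parameterized sketch of the same decision logic follows; the function name and signature are hypothetical and the messages are shortened.

```python
import warnings


def enforce_auth_config(operator_token: str, auth_disabled: bool) -> None:
    """Hypothetical, parameterized version of the boot-time auth gate."""
    if not operator_token and not auth_disabled:
        # No token and no explicit opt-out: refuse to start.
        raise RuntimeError("OVERSIGHT_OPERATOR_TOKEN is required to start the registry.")
    if not operator_token and auth_disabled:
        # Explicit opt-out: start, but make the risk visible.
        warnings.warn(
            "OVERSIGHT_AUTH_DISABLED=1: running without operator authentication.",
            stacklevel=2,
        )


enforce_auth_config("a-strong-random-token", auth_disabled=False)  # starts quietly
```

The three reachable outcomes are: a token is set (start), no token with the opt-out (start with a warning), and no token without the opt-out (refuse to start). When a token is present the opt-out flag has no effect on this gate.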
tests/test_e2e.py +7 -0
@@ -178,5 +178,12 @@ def main():
banner("ALL TESTS PASSED")
+def test_e2e_seal_open_watermark_round_trip():
+ """Pytest entry point. The scenario is one end-to-end flow with internal
+ assertions; pytest's value here is collection + CI integration, not
+ per-step granularity."""
+ main()
+
+
if __name__ == "__main__":
main()
tests/test_e2e_v2.py +7 -0
@@ -312,5 +312,12 @@ def main():
banner("ALL TESTS PASSED")
+def test_e2e_v2_full_round_trip():
+ """Pytest entry point. The scenario is one end-to-end flow with internal
+ assertions; pytest's value here is collection + CI integration, not
+ per-step granularity."""
+ main()
+
+
if __name__ == "__main__":
main()
tests/test_gui_hardening_unit.py +32 -16
@@ -15,7 +15,20 @@ from pathlib import Path
ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(ROOT))
-from cli import gui
+try:
+ import tkinter # noqa: F401
+except ImportError:
+ tkinter = None
+
+import pytest
+
+pytestmark = pytest.mark.skipif(
+ tkinter is None, reason="python3-tk not installed; GUI tests skipped"
+)
+
+if tkinter is not None:
+ from cli import gui # noqa: E402
+
from oversight_core import ClassicIdentity, Manifest, Recipient, content_hash, seal
from oversight_core.container import SealedFile
from oversight_core.safe_io import is_private_key_file, validate_output_path
@@ -49,7 +62,7 @@ def _sealed_blob() -> bytes:
return seal(plaintext, manifest, issuer.ed25519_priv, recipient.x25519_pub)
-def t1_private_key_outputs_are_blocked():
+def test_private_key_outputs_are_blocked():
with tempfile.TemporaryDirectory() as td:
key_path = Path(td) / "alice.priv.json"
key_path.write_text(json.dumps(_identity_dict()), encoding="utf-8")
@@ -63,7 +76,7 @@ def t1_private_key_outputs_are_blocked():
print(" [PASS] private key output targets are hard-blocked")
-def t2_same_path_outputs_are_blocked():
+def test_same_path_outputs_are_blocked():
with tempfile.TemporaryDirectory() as td:
input_path = Path(td) / "source.txt"
input_path.write_text("source", encoding="utf-8")
@@ -76,7 +89,7 @@ def t2_same_path_outputs_are_blocked():
print(" [PASS] output paths cannot equal input paths")
-def t3_windows_reserved_names_are_rejected():
+def test_windows_reserved_names_are_rejected():
try:
validate_output_path(Path("NUL.priv.json"))
except ValueError as exc:
@@ -86,7 +99,7 @@ def t3_windows_reserved_names_are_rejected():
print(" [PASS] Windows reserved output names are rejected")
-def t4_gui_key_shape_errors_are_friendly():
+def test_gui_key_shape_errors_are_friendly():
with tempfile.TemporaryDirectory() as td:
pub_path = Path(td) / "alice.pub.json"
pub_path.write_text(json.dumps({"id": "alice", "x25519_pub": "00" * 32}), encoding="utf-8")
@@ -99,12 +112,12 @@ def t4_gui_key_shape_errors_are_friendly():
print(" [PASS] key-shape mistakes get actionable GUI errors")
-def t5_gui_registry_domain_uses_user_url():
+def test_gui_registry_domain_uses_user_url():
assert gui._registry_domain("https://registry.example.test:8443/api") == "registry.example.test:8443"
print(" [PASS] GUI beacon domain derives from the configured registry URL")
-def t6_container_rejects_suite_id_tamper():
+def test_container_rejects_suite_id_tamper():
blob = bytearray(_sealed_blob())
blob[7] ^= 0x01
try:
@@ -116,7 +129,7 @@ def t6_container_rejects_suite_id_tamper():
print(" [PASS] unauthenticated suite_id tamper is rejected")
-def t7_container_rejects_trailing_bytes():
+def test_container_rejects_trailing_bytes():
try:
SealedFile.from_bytes(_sealed_blob() + b"junk")
except ValueError as exc:
@@ -130,16 +143,19 @@ def main():
print("=" * 60)
print(" GUI/CLI hardening - focused unit tests")
print("=" * 60)
- t1_private_key_outputs_are_blocked()
- t2_same_path_outputs_are_blocked()
- t3_windows_reserved_names_are_rejected()
- t4_gui_key_shape_errors_are_friendly()
- t5_gui_registry_domain_uses_user_url()
- t6_container_rejects_suite_id_tamper()
- t7_container_rejects_trailing_bytes()
+ test_private_key_outputs_are_blocked()
+ test_same_path_outputs_are_blocked()
+ test_windows_reserved_names_are_rejected()
+ test_gui_key_shape_errors_are_friendly()
+ test_gui_registry_domain_uses_user_url()
+ test_container_rejects_suite_id_tamper()
+ test_container_rejects_trailing_bytes()
print()
print(" ALL TESTS PASSED - 7/7")
if __name__ == "__main__":
- main()
+ if tkinter is None:
+ print("python3-tk not installed; GUI tests skipped")
+ else:
+ main()
tests/test_jcs_canonical_unit.py +136 -0
@@ -0,0 +1,136 @@
+"""
+test_jcs_canonical_unit
+=======================
+
+Byte-exact fixtures for the JSON Canonicalization Scheme (RFC 8785) port.
+
+Background: the Rust reference uses ``serde_jcs::to_vec`` everywhere it
+canonicalizes for signing or hashing. Python was historically on
+``json.dumps(sort_keys=True, separators=(",",":")).encode("utf-8")``, which is
+byte-identical to JCS for the ASCII-only subset but diverges for any non-ASCII
+string value, because Python's default ``ensure_ascii=True`` escapes non-ASCII
+as ``\\uXXXX`` while JCS emits raw UTF-8. That divergence was a latent threat
+to the "bit-identical / conformance is ground truth" claim: any manifest,
+tlog leaf, or evidence bundle containing a non-ASCII character would hash and
+sign to different bytes across the two implementations.
+
+These tests pin the JCS algorithm itself on known vectors (so a future
+refactor cannot silently regress it), prove the non-ASCII divergence is
+closed (the actual bug fix), and prove no regression for the existing
+ASCII-only content (so committed fixtures and existing signatures stay valid).
+"""
+
+from __future__ import annotations
+
+import json
+import os
+import sys
+from pathlib import Path
+
+ROOT = Path(__file__).resolve().parent.parent
+sys.path.insert(0, str(ROOT))
+
+from oversight_core.jcs import jcs_dumps
+
+
+def test_primitives():
+ assert jcs_dumps(None) == b"null"
+ assert jcs_dumps(True) == b"true"
+ assert jcs_dumps(False) == b"false"
+ assert jcs_dumps(0) == b"0"
+ assert jcs_dumps(42) == b"42"
+ assert jcs_dumps(-1) == b"-1"
+ assert jcs_dumps(9223372036854775807) == b"9223372036854775807"
+ assert jcs_dumps("hello") == b'"hello"'
+ assert jcs_dumps("") == b'""'
+ assert jcs_dumps([]) == b"[]"
+ assert jcs_dumps({}) == b"{}"
+
+
+def test_key_sorting_nested():
+ assert jcs_dumps({"b": 1, "a": 2}) == b'{"a":2,"b":1}'
+ assert jcs_dumps({"z": 1, "a": {"y": 2, "x": 3}}) == b'{"a":{"x":3,"y":2},"z":1}'
+ assert jcs_dumps([3, 1, 2]) == b"[3,1,2]"
+
+
+def test_string_escapes():
+ assert jcs_dumps('a"b') == b'"a\\"b"'
+ assert jcs_dumps("a\\b") == b'"a\\\\b"'
+ assert jcs_dumps("a\nb") == b'"a\\nb"'
+ assert jcs_dumps("a\tb") == b'"a\\tb"'
+ assert jcs_dumps("a\rb") == b'"a\\rb"'
+ assert jcs_dumps("a\bb") == b'"a\\bb"'
+ assert jcs_dumps("a\fb") == b'"a\\fb"'
+ assert jcs_dumps("a\x01b") == b'"a\\u0001b"'
+
+
+def test_non_ascii_emits_raw_utf8_not_uXXXX_escape():
+ # This is the central regression: pre-port Python emitted
+ # b'{"name":"caf\\u00e9"}' here, which diverged from serde_jcs and broke
+ # cross-language signature agreement. JCS emits raw UTF-8.
+ assert jcs_dumps({"name": "café"}) == b'{"name":"caf\xc3\xa9"}'
+ # CJK: 日 = U+65E5 -> E6 97 A5, 本 = U+672C -> E6 9C AC
+ assert jcs_dumps({"k": "日本"}) == b'{"k":"\xe6\x97\xa5\xe6\x9c\xac"}'
+ # Supplementary plane (surrogate pair in UTF-16): 𝄞 = U+1D11E -> F0 9D 84 9E
+ assert jcs_dumps({"k": "𝄞"}) == b'{"k":"\xf0\x9d\x84\x9e"}'
+
+
+def test_non_ascii_key_sort_order():
+ # Keys: "abc" (00 61 00 62 00 63), "z" (00 7A), "ñ" (00 F1).
+ # UTF-16-BE byte order: "abc" < "z" < "ñ". Python code-point sort agrees.
+ out = jcs_dumps({"ñ": 3, "z": 2, "abc": 1})
+ assert out == b'{"abc":1,"z":2,"\xc3\xb1":3}'
+
+
+def test_floats_rejected():
+ try:
+ jcs_dumps(1.0)
+ raise AssertionError("jcs_dumps accepted a float")
+ except TypeError:
+ pass
+ try:
+ jcs_dumps({"x": 1.5})
+ raise AssertionError("jcs_dumps accepted a nested float")
+ except TypeError:
+ pass
+
+
+def test_unsupported_types_rejected():
+ for bad in (object(), b"bytes", set(), frozenset()):
+ try:
+ jcs_dumps(bad)
+ raise AssertionError(f"jcs_dumps accepted {type(bad).__name__}")
+ except TypeError:
+ pass
+
+
+def test_ascii_content_byte_identical_to_legacy_sort_keys():
+ # For the ASCII-only, no-floats subset, JCS and the legacy sort_keys form
+ # must produce identical bytes. This is what guarantees that every
+ # existing ASCII manifest, tlog leaf, and evidence bundle continues to
+ # verify after the port.
+ samples = [
+ {"event": "register", "file_id": "f0", "n": 3},
+ {"a": ["x", "y"], "b": {"c": True, "d": None}},
+ {"size": 7, "root": "00" * 32, "signature": "ab" * 64},
+ ]
+ for s in samples:
+ legacy = json.dumps(s, sort_keys=True, separators=(",", ":")).encode("utf-8")
+ assert jcs_dumps(s) == legacy, (
+ f"ASCII divergence!\n legacy: {legacy!r}\n jcs: {jcs_dumps(s)!r}"
+ )
+
+
+def test_tuple_serializes_like_list():
+ assert jcs_dumps((1, 2, 3)) == b"[1,2,3]"
+
+
+def test_round_trip_through_json_parser():
+ # Canonical bytes must round-trip through a strict JSON parser.
+ cases = [
+ {"a": 1, "b": [True, None, "x"], "c": {"d": "café"}},
+ {"issuer": "Zión@test", "hash": "ab" * 16},
+ ]
+ for c in cases:
+ rt = json.loads(jcs_dumps(c).decode("utf-8"))
+ assert rt == c
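
The behaviour these JCS tests pin down (raw UTF-8 output, keys ordered by UTF-16 code units, floats rejected) can be illustrated with a minimal sketch. This is not the project's `oversight_core.jcs` implementation; `jcs_sketch` is a hypothetical name, and the sketch assumes the float-free subset the tests cover and leans on the standard library's string escaping.

```python
import json


def jcs_sketch(value) -> bytes:
    """Canonicalize a float-free JSON value along RFC 8785 lines (sketch only)."""
    if isinstance(value, float):
        # Mirrors the tests above: floats are refused rather than formatted.
        raise TypeError("floats are rejected in this sketch")
    if value is None or isinstance(value, (bool, int, str)):
        # ensure_ascii=False keeps non-ASCII text as raw UTF-8 instead of \uXXXX.
        return json.dumps(value, ensure_ascii=False).encode("utf-8")
    if isinstance(value, (list, tuple)):
        return b"[" + b",".join(jcs_sketch(v) for v in value) + b"]"
    if isinstance(value, dict):
        if not all(isinstance(k, str) for k in value):
            raise TypeError("object keys must be strings")
        # RFC 8785 orders keys by UTF-16 code units. Comparing the UTF-16-BE
        # bytes gives the same order, and differs from code-point order once
        # supplementary-plane characters are involved.
        keys = sorted(value, key=lambda k: k.encode("utf-16-be"))
        parts = [jcs_sketch(k) + b":" + jcs_sketch(value[k]) for k in keys]
        return b"{" + b",".join(parts) + b"}"
    raise TypeError(f"unsupported type: {type(value).__name__}")


# U+1D11E encodes as the surrogate pair D834 DD1E, so it sorts before U+FB00
# under UTF-16 ordering even though its code point is larger.
mixed = jcs_sketch({"\ufb00": 1, "\U0001d11e": 2})
```

The key-ordering line is where a plain `sort_keys=True` would diverge: Python sorts strings by code point, which matches UTF-16 order only while every key stays inside the BMP.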
tests/test_l3_policy_unit.py +6 -6
@@ -14,7 +14,7 @@ def ok(msg: str) -> None:
print(f" [PASS] {msg}")
-def t1_risky_documents_default_l3_off():
+def test_risky_documents_default_l3_off():
text = "The system MUST verify every request. SELECT * FROM users;"
decision = l3_policy.decide_l3(
filename="api-spec.md",
@@ -27,7 +27,7 @@ def t1_risky_documents_default_l3_off():
ok("technical/spec content disables L3 by default")
-def t2_full_l3_requires_ack_metadata():
+def test_full_l3_requires_ack_metadata():
decision = l3_policy.decide_l3(
filename="brief.txt",
content_type="text/plain",
@@ -40,7 +40,7 @@ def t2_full_l3_requires_ack_metadata():
ok("explicit full L3 returns acknowledgement-required decision")
-def t3_safe_l3_preserves_protected_lines():
+def test_safe_l3_preserves_protected_lines():
mark_id = watermark.new_mark_id()
original = (
"The Vendor MUST provide 5 kg by Friday.\n"
@@ -59,7 +59,7 @@ if __name__ == "__main__":
print("=" * 60)
print("oversight_core.l3_policy - focused unit tests")
print("=" * 60)
- t1_risky_documents_default_l3_off()
- t2_full_l3_requires_ack_metadata()
- t3_safe_l3_preserves_protected_lines()
+ test_risky_documents_default_l3_off()
+ test_full_l3_requires_ack_metadata()
+ test_safe_l3_preserves_protected_lines()
print("\n ALL TESTS PASSED - 3/3")
tests/test_manifest_p256_parse_unit.py +63 -0
@@ -0,0 +1,63 @@
+"""
+test_manifest_p256_parse_unit
+=============================
+Regression test for the cross-language HW-P256 manifest parse bug.
+
+Previously the Python manifest parser hard-rejected `p256_pub` as an unknown
+recipient field, which made every Rust-sealed OSGT-HW-P256-v1 container
+unopenable and uninspectable by Python. Rust is the forward canonical
+implementation; Python keeps parse and inspect parity during the transition
+but does not implement the HW-P256 seal/open crypto path. Canonicalization
+of the field set is covered separately by the JCS-unification work.
+"""
+
+from __future__ import annotations
+
+import json
+import sys
+from pathlib import Path
+
+ROOT = Path(__file__).resolve().parent.parent
+sys.path.insert(0, str(ROOT))
+
+from oversight_core.manifest import Manifest, Recipient
+
+
+def test_recipient_has_p256_pub_field():
+ r = Recipient(recipient_id="alice", x25519_pub="00" * 32, p256_pub="ab" * 32)
+ assert r.p256_pub == "ab" * 32
+
+
+def test_p256_pub_defaults_none_so_classic_manifests_unchanged():
+ r = Recipient(recipient_id="alice", x25519_pub="00" * 32)
+ assert r.p256_pub is None
+
+
+def test_manifest_from_json_accepts_p256_pub_recipient():
+ payload = {
+ "file_id": "00000000-0000-4000-8000-000000000000",
+ "issued_at": 1700000000,
+ "suite": "OSGT-HW-P256-v1",
+ "recipient": {
+ "recipient_id": "alice",
+ "x25519_pub": "00" * 32,
+ "p256_pub": "cd" * 32,
+ },
+ }
+ m = Manifest.from_json(json.dumps(payload, sort_keys=True).encode("utf-8"))
+ assert m.suite == "OSGT-HW-P256-v1"
+ assert m.recipient is not None
+ assert m.recipient.p256_pub == "cd" * 32
+
+
+def test_classic_manifest_canonical_bytes_unchanged_by_new_field():
+ # p256_pub defaults to None and must be stripped, so a classic manifest's
+ # canonical bytes are byte-identical to before the field was added.
+ r = Recipient(recipient_id="alice", x25519_pub="00" * 32)
+ m = Manifest(
+ file_id="00000000-0000-4000-8000-000000000000",
+ issued_at=1700000000,
+ recipient=r,
+ )
+ canon = m.canonical_bytes().decode("utf-8")
+ assert "p256_pub" not in canon
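
The last test above relies on optional fields being dropped before canonicalization, so that adding `p256_pub` leaves older manifests byte-identical. A minimal sketch of that idea follows. `RecipientSketch` and `canonical_bytes_sketch` are hypothetical stand-ins, not the real `Recipient` or `Manifest.canonical_bytes`, and `json.dumps` with sorted keys stands in for JCS because the values here are ASCII-only.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class RecipientSketch:
    recipient_id: str
    x25519_pub: str
    p256_pub: Optional[str] = None  # new optional field, absent on classic manifests


def canonical_bytes_sketch(r: RecipientSketch) -> bytes:
    # Drop None-valued fields so an unset optional field never reaches the
    # signed bytes; a "p256_pub": null entry would change every signature.
    fields = {k: v for k, v in asdict(r).items() if v is not None}
    return json.dumps(fields, sort_keys=True, separators=(",", ":")).encode("utf-8")


classic = canonical_bytes_sketch(RecipientSketch("alice", "00" * 32))
hw = canonical_bytes_sketch(RecipientSketch("alice", "00" * 32, p256_pub="cd" * 32))
```

Comparing `classic` against bytes produced from a dict that never had the field is a stricter check than a substring test, since it would also catch reordering or stray separators.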
tests/test_operator_auth_unit.py +51 -0
@@ -0,0 +1,51 @@
+"""
+test_operator_auth_unit
+=======================
+Regression test for the registry operator-auth fail-closed boot gate.
+
+The registry must refuse to start when OVERSIGHT_OPERATOR_TOKEN is empty
+unless OVERSIGHT_AUTH_DISABLED=1 is set explicitly. Without this gate, the
+public write endpoints (/register, /attribute) let anyone self-sign manifests
+into the append-only transparency log.
+"""
+
+from __future__ import annotations
+
+import sys
+from pathlib import Path
+
+import pytest
+
+ROOT = Path(__file__).resolve().parent.parent
+sys.path.insert(0, str(ROOT))
+
+import registry.server as server
+
+
+def _set(token: str, disabled: bool):
+ server.OPERATOR_TOKEN = token
+ server.AUTH_DISABLED = disabled
+
+
+def test_no_token_not_disabled_refuses_to_boot():
+ _set("", False)
+ with pytest.raises(RuntimeError, match="OVERSIGHT_OPERATOR_TOKEN is required"):
+ server._enforce_auth_config()
+
+
+def test_no_token_but_disabled_boots_with_warning(recwarn):
+ _set("", True)
+ server._enforce_auth_config()
+ assert any(
+ "OVERSIGHT_AUTH_DISABLED" in str(w.message) for w in recwarn.list
+ ), "expected a loud warning when auth is explicitly disabled"
+
+
+def test_token_set_boots_cleanly():
+ _set("a-real-operator-token-value", False)
+ server._enforce_auth_config()
+
+
+def test_token_set_boots_cleanly_even_if_disabled():
+ _set("a-real-operator-token-value", True)
+ server._enforce_auth_config()
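
The four cases above describe a fail-closed boot gate. A minimal sketch of a gate with that shape is below. It is an assumption about the behaviour, not the code in `registry/server.py`: the function name, its parameters, the warning category, and the message wording beyond the two variable names are all hypothetical.

```python
import warnings


def enforce_auth_config_sketch(operator_token: str, auth_disabled: bool) -> None:
    """Refuse to start without a token unless auth was disabled explicitly."""
    if operator_token:
        # A configured token is always acceptable, whatever the disable flag says.
        return
    if auth_disabled:
        # Explicit opt-out: allow boot, but make the choice loud.
        warnings.warn(
            "OVERSIGHT_AUTH_DISABLED=1: write endpoints are unauthenticated",
            RuntimeWarning,
            stacklevel=2,
        )
        return
    # Default path with no token: fail closed.
    raise RuntimeError(
        "OVERSIGHT_OPERATOR_TOKEN is required unless OVERSIGHT_AUTH_DISABLED=1 is set"
    )
```

Taking the two settings as parameters, instead of reading module globals, would let a test exercise every case without mutating shared state that later tests in the same process could observe.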
tests/test_policy_unit.py +10 -41
@@ -7,8 +7,6 @@ Focused policy/container checks around successful-open counting.
from __future__ import annotations
import sys
-import shutil
-import uuid
from pathlib import Path
ROOT = Path(__file__).resolve().parent.parent
@@ -25,15 +23,7 @@ from oversight_core import (
from oversight_core.policy import PolicyContext, PolicyViolation, record_open
-def ok(msg):
- print(f" [PASS] {msg}")
-
-
-TMP_ROOT = ROOT / ".tmp-tests"
-TMP_ROOT.mkdir(exist_ok=True)
-
-
-def t1_wrong_recipient_does_not_consume_open_count():
+def test_wrong_recipient_does_not_consume_open_count(tmp_path):
issuer = ClassicIdentity.generate()
alice = ClassicIdentity.generate()
bob = ClassicIdentity.generate()
@@ -56,25 +46,19 @@ def t1_wrong_recipient_does_not_consume_open_count():
manifest.policy["max_opens"] = 1
blob = seal(plaintext, manifest, issuer.ed25519_priv, alice.x25519_pub)
- td = TMP_ROOT / f"policy-{uuid.uuid4().hex}"
- td.mkdir(parents=True, exist_ok=False)
+ ctx = PolicyContext(state_dir=tmp_path, mode="LOCAL_ONLY")
try:
- ctx = PolicyContext(state_dir=td, mode="LOCAL_ONLY")
- try:
- open_sealed(blob, bob.x25519_priv, policy_ctx=ctx)
- except Exception:
- pass
- else:
- raise AssertionError("wrong recipient unexpectedly decrypted file")
+ open_sealed(blob, bob.x25519_priv, policy_ctx=ctx)
+ except Exception:
+ pass
+ else:
+ raise AssertionError("wrong recipient unexpectedly decrypted file")
- recovered, _ = open_sealed(blob, alice.x25519_priv, policy_ctx=ctx)
- assert recovered == plaintext
- ok("wrong recipient attempts do not consume max_opens")
- finally:
- shutil.rmtree(td, ignore_errors=True)
+ recovered, _ = open_sealed(blob, alice.x25519_priv, policy_ctx=ctx)
+ assert recovered == plaintext
-def t2_registry_modes_fail_closed():
+def test_registry_modes_fail_closed():
issuer = ClassicIdentity.generate()
alice = ClassicIdentity.generate()
plaintext = b"hello policy"
@@ -101,18 +85,3 @@ def t2_registry_modes_fail_closed():
assert "refusing to fall back" in str(exc)
else:
raise AssertionError(f"{mode} should fail closed until implemented")
- ok("REGISTRY/HYBRID refuse insecure LOCAL_ONLY fallback")
-
-
-def main():
- print("=" * 60)
- print(" oversight_core.policy - focused unit tests")
- print("=" * 60)
- t1_wrong_recipient_does_not_consume_open_count()
- t2_registry_modes_fail_closed()
- print()
- print(" ALL TESTS PASSED - 2/2")
-
-
-if __name__ == "__main__":
- main()
tests/test_pq.py +68 -63
@@ -1,17 +1,21 @@
#!/usr/bin/env python3
"""
-Post-quantum hybrid round-trip test.
+Post-quantum hybrid round-trip tests.
Proves:
1. liboqs is linked and ML-KEM-768 / ML-DSA-65 work.
2. Hybrid DEK wrap (X25519 + ML-KEM-768) round-trips correctly.
3. Tampering with either the classical or PQ component fails.
4. A full hybrid-sealed file can be built and opened.
+
+Skipped automatically when liboqs-python is not installed.
"""
import sys
from pathlib import Path
+import pytest
+
ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(ROOT))
@@ -23,110 +27,111 @@ from oversight_core.crypto import (
)
-def banner(m): print(f"\n{'='*60}\n {m}\n{'='*60}")
-def ok(m): print(f" [ok] {m}")
-def fail(m): print(f" [FAIL] {m}"); sys.exit(1)
+pytestmark = pytest.mark.skipif(
+ not PQ_AVAILABLE,
+ reason="liboqs-python not installed; install liboqs + liboqs-python to run PQ tests",
+)
-def main():
- banner("0. Check PQ availability")
- if not PQ_AVAILABLE:
- fail("liboqs not linked - install liboqs + liboqs-python")
- ok("liboqs available")
+def test_ml_kem_768_raw_round_trip():
+ from oversight_core.crypto import pq_kem_encap, pq_kem_decap
- banner("1. ML-KEM-768 raw round-trip")
priv, pub = pq_kem_keypair()
- ok(f"keypair: pub={len(pub)}B priv={len(priv)}B")
- from oversight_core.crypto import pq_kem_encap, pq_kem_decap
ct, ss1 = pq_kem_encap(pub)
ss2 = pq_kem_decap(priv, ct)
- if ss1 != ss2:
- fail("ML-KEM shared secrets don't match")
- ok(f"ML-KEM-768 round-trip OK ({len(ss1)}B shared secret)")
+ assert ss1 == ss2, "ML-KEM shared secrets don't match"
+
- banner("2. ML-DSA-65 raw round-trip")
+def test_ml_dsa_65_raw_round_trip():
sig_priv, sig_pub = pq_sig_keypair()
- ok(f"keypair: pub={len(sig_pub)}B priv={len(sig_priv)}B")
msg = b"OVERSIGHT v0.2 post-quantum hybrid test"
signature = pq_sign(msg, sig_priv)
- ok(f"signature: {len(signature)}B")
- if not pq_verify(msg, signature, sig_pub):
- fail("ML-DSA verify failed for valid signature")
- ok("ML-DSA-65 verify accepts valid signature")
- if pq_verify(b"tampered message", signature, sig_pub):
- fail("ML-DSA verify accepted signature over different message")
- ok("ML-DSA-65 verify rejects tampered message")
-
- banner("3. Hybrid DEK wrap (classical + PQ)")
+ assert pq_verify(msg, signature, sig_pub), "ML-DSA verify failed for valid signature"
+ assert not pq_verify(b"tampered message", signature, sig_pub), (
+ "ML-DSA verify accepted signature over different message"
+ )
+
+
+def test_hybrid_dek_wrap_round_trips():
alice_classical = ClassicIdentity.generate()
alice_mlkem_priv, alice_mlkem_pub = pq_kem_keypair()
dek = random_dek()
- print(f" DEK: {len(dek)}B")
-
wrapped = hybrid_wrap_dek(
dek,
x25519_pub=alice_classical.x25519_pub,
mlkem_pub=alice_mlkem_pub,
)
- ok(f"wrapped: suite={wrapped['suite']}")
- ok(f" x25519_ephemeral_pub = {len(bytes.fromhex(wrapped['x25519_ephemeral_pub']))}B")
- ok(f" mlkem_ciphertext = {len(bytes.fromhex(wrapped['mlkem_ciphertext']))}B")
- ok(f" wrapped_dek = {len(bytes.fromhex(wrapped['wrapped_dek']))}B")
-
recovered = hybrid_unwrap_dek(
wrapped,
x25519_priv=alice_classical.x25519_priv,
mlkem_priv=alice_mlkem_priv,
)
- if recovered != dek:
- fail("hybrid unwrap recovered wrong DEK")
- ok("hybrid unwrap recovered original DEK exactly")
+ assert recovered == dek, "hybrid unwrap recovered wrong DEK"
+
- banner("4. Tamper with classical half")
+def test_tamper_with_classical_half_rejected():
+ alice_classical = ClassicIdentity.generate()
+ alice_mlkem_priv, alice_mlkem_pub = pq_kem_keypair()
+ dek = random_dek()
+ wrapped = hybrid_wrap_dek(
+ dek,
+ x25519_pub=alice_classical.x25519_pub,
+ mlkem_pub=alice_mlkem_pub,
+ )
bad = dict(wrapped)
- # Replace X25519 ephemeral pub with a random one
other_classic = ClassicIdentity.generate()
bad["x25519_ephemeral_pub"] = other_classic.x25519_pub.hex()
- try:
+ with pytest.raises(Exception):
hybrid_unwrap_dek(bad, alice_classical.x25519_priv, alice_mlkem_priv)
- fail("tamper of classical half should have failed")
- except Exception as e:
- ok(f"classical tamper correctly rejected: {type(e).__name__}")
- banner("5. Tamper with PQ half")
+
+def test_tamper_with_pq_half_rejected():
+ alice_classical = ClassicIdentity.generate()
+ alice_mlkem_priv, alice_mlkem_pub = pq_kem_keypair()
+ dek = random_dek()
+ wrapped = hybrid_wrap_dek(
+ dek,
+ x25519_pub=alice_classical.x25519_pub,
+ mlkem_pub=alice_mlkem_pub,
+ )
bad2 = dict(wrapped)
- # Corrupt a byte of the mlkem ciphertext
ct_bytes = bytearray(bytes.fromhex(bad2["mlkem_ciphertext"]))
ct_bytes[100] ^= 0x01
bad2["mlkem_ciphertext"] = bytes(ct_bytes).hex()
- try:
+ with pytest.raises(Exception):
hybrid_unwrap_dek(bad2, alice_classical.x25519_priv, alice_mlkem_priv)
- fail("tamper of PQ half should have failed")
- except Exception as e:
- ok(f"PQ tamper correctly rejected: {type(e).__name__}")
- banner("6. Wrong recipient")
+
+def test_wrong_recipient_rejected():
+ alice_classical = ClassicIdentity.generate()
+ alice_mlkem_priv, alice_mlkem_pub = pq_kem_keypair()
+ dek = random_dek()
+ wrapped = hybrid_wrap_dek(
+ dek,
+ x25519_pub=alice_classical.x25519_pub,
+ mlkem_pub=alice_mlkem_pub,
+ )
bob_classical = ClassicIdentity.generate()
bob_mlkem_priv, _ = pq_kem_keypair()
- try:
+ with pytest.raises(Exception):
hybrid_unwrap_dek(wrapped, bob_classical.x25519_priv, bob_mlkem_priv)
- fail("wrong recipient should have failed")
- except Exception as e:
- ok(f"wrong recipient correctly rejected: {type(e).__name__}")
- banner("7. Size comparison: CLASSIC vs HYBRID")
+
+def test_hybrid_overhead_is_bounded():
+ alice_classical = ClassicIdentity.generate()
+ alice_mlkem_priv, alice_mlkem_pub = pq_kem_keypair()
+ dek = random_dek()
+ wrapped = hybrid_wrap_dek(
+ dek,
+ x25519_pub=alice_classical.x25519_pub,
+ mlkem_pub=alice_mlkem_pub,
+ )
classic_wrap = crypto.wrap_dek_for_recipient(dek, alice_classical.x25519_pub)
classic_size = sum(len(bytes.fromhex(v)) for v in classic_wrap.values())
hybrid_size = sum(
len(bytes.fromhex(v)) for k, v in wrapped.items() if k != "suite"
)
- print(f" CLASSIC wrap: {classic_size} bytes (X25519 ephemeral + nonce + wrapped DEK)")
- print(f" HYBRID wrap: {hybrid_size} bytes (X25519 eph + ML-KEM ct + nonce + wrapped DEK)")
- print(f" overhead: {hybrid_size - classic_size} bytes per file")
-
- banner("ALL PQ TESTS PASSED - OVERSIGHT is post-quantum-ready")
-
-
-if __name__ == "__main__":
- main()
+ overhead = hybrid_size - classic_size
+ assert overhead > 0, "hybrid wrap should be larger than classic"
+ assert overhead < 4096, f"hybrid overhead unexpectedly large: {overhead} bytes"
tests/test_registry_conformance.py +29 -0
@@ -114,6 +114,8 @@ def build_in_process_client():
os.environ["OVERSIGHT_DATA_DIR"] = tmp
# Rekor off by default so the harness does not touch the public log.
os.environ.setdefault("OVERSIGHT_REKOR_ENABLED", "0")
+ # This harness exercises registry logic, not operator auth.
+ os.environ.setdefault("OVERSIGHT_AUTH_DISABLED", "1")
# Require the DNS secret to exercise the non-loopback fail-closed path.
os.environ["OVERSIGHT_DNS_EVENT_SECRET"] = "test-dns-secret-123"
@@ -429,5 +431,32 @@ def main() -> None:
shutil.rmtree(tmp, ignore_errors=True)
+def test_registry_v1_conformance_harness() -> None:
+ """Pytest entry point for the registry v1 conformance harness.
+
+ The harness is intentionally a single pytest case: the checks share state
+ (the registered file_id drives the subsequent attribution, evidence, tlog,
+ and beacon checks) and together they answer one yes/no question: "does
+ this registry meet v1 conformance?" Per-check pass/fail is still printed
+ to stdout, so the CI log doubles as a compact conformance report.
+ """
+ PASSED.clear()
+ FAILED.clear()
+ url = os.environ.get("OVERSIGHT_REGISTRY_URL", "").strip()
+ tmp = None
+ try:
+ if url:
+ cli, tmp, _ = build_live_client(url)
+ else:
+ cli, tmp, _ = build_in_process_client()
+ run(cli)
+ finally:
+ if tmp and os.path.isdir(tmp):
+ shutil.rmtree(tmp, ignore_errors=True)
+ assert not FAILED, f"{len(FAILED)} conformance check(s) failed: " + ", ".join(
+ name for name, _ in FAILED
+ )
+
+
if __name__ == "__main__":
main()
tests/test_registry_unit.py +16 -49
@@ -9,9 +9,7 @@ from __future__ import annotations
import base64
import json
import os
-import shutil
import sys
-import uuid
from types import SimpleNamespace
ROOT = os.path.join(os.path.dirname(__file__), "..")
@@ -33,7 +31,14 @@ def _new_identity() -> dict:
}
-def t1_rekor_attestation_uses_real_mark_id_and_digest():
+def _fake_request(host: str, headers: dict[str, str] | None = None):
+ return SimpleNamespace(
+ client=SimpleNamespace(host=host),
+ headers=headers or {},
+ )
+
+
+def test_rekor_attestation_uses_real_mark_id_and_digest():
original_identity = registry_server.IDENTITY
original_enabled = registry_server.REKOR_ENABLED
original_upload = registry_server.rekor_mod.upload_dsse
@@ -85,10 +90,9 @@ def t1_rekor_attestation_uses_real_mark_id_and_digest():
"L2_whitespace": "20" * 16,
}
assert result["log_index"] == 7
- print(" [PASS] registry attests using a real mark_id and content_hash")
-def t2_register_rejects_unsigned_sidecar_mismatch():
+def test_register_rejects_unsigned_sidecar_mismatch():
manifest = {
"beacons": [
{"token_id": "tok-1", "kind": "http_img", "url": "https://b.example/p/tok-1.png"},
@@ -124,17 +128,9 @@ def t2_register_rejects_unsigned_sidecar_mismatch():
assert "watermarks do not match" in exc.detail
else:
raise AssertionError("unsigned request watermarks should be rejected")
- print(" [PASS] register rejects unsigned beacon/watermark sidecars")
-
-
-def _fake_request(host: str, headers: dict[str, str] | None = None):
- return SimpleNamespace(
- client=SimpleNamespace(host=host),
- headers=headers or {},
- )
-def t3_dns_event_requires_secret_for_non_loopback():
+def test_dns_event_requires_secret_for_non_loopback():
original_secret = registry_server.DNS_EVENT_SECRET
try:
registry_server.DNS_EVENT_SECRET = ""
@@ -161,15 +157,12 @@ def t3_dns_event_requires_secret_for_non_loopback():
raise AssertionError("wrong DNS callback secret should be rejected")
finally:
registry_server.DNS_EVENT_SECRET = original_secret
- print(" [PASS] dns_event rejects unauthenticated non-loopback callbacks")
-def t4_evidence_bundle_can_attach_tlog_proofs():
+def test_evidence_bundle_can_attach_tlog_proofs(tmp_path):
original_tlog = registry_server.TLOG
- td = os.path.join(ROOT, ".tmp-tests", f"registry-tlog-{uuid.uuid4().hex}")
- os.makedirs(td, exist_ok=False)
try:
- registry_server.TLOG = TransparencyLog(td)
+ registry_server.TLOG = TransparencyLog(tmp_path)
first = registry_server.TLOG.append({"event": "register", "file_id": "f"})
second = registry_server.TLOG.append({"event": "beacon", "file_id": "f"})
proofs = registry_server._tlog_proofs_for_events([
@@ -179,15 +172,13 @@ def t4_evidence_bundle_can_attach_tlog_proofs():
])
finally:
registry_server.TLOG = original_tlog
- shutil.rmtree(td, ignore_errors=True)
assert [p["event_row"] for p in proofs] == [0, 1]
assert [p["tlog_index"] for p in proofs] == [first, second]
assert all(p["proof"]["root"] for p in proofs)
- print(" [PASS] evidence bundles attach tlog inclusion proofs for events")
-def t5_operator_token_gates_write_side_apis_when_configured():
+def test_operator_token_gates_write_side_apis_when_configured():
original_token = registry_server.OPERATOR_TOKEN
try:
registry_server.OPERATOR_TOKEN = ""
@@ -210,22 +201,18 @@ def t5_operator_token_gates_write_side_apis_when_configured():
raise AssertionError("wrong operator token should be rejected")
finally:
registry_server.OPERATOR_TOKEN = original_token
- print(" [PASS] optional operator token gates write-side APIs")
-def t6_tlog_range_fails_closed_on_corrupt_leaf():
+def test_tlog_range_fails_closed_on_corrupt_leaf(tmp_path):
original_tlog = registry_server.TLOG
- td = os.path.join(ROOT, ".tmp-tests", f"registry-range-{uuid.uuid4().hex}")
- os.makedirs(td, exist_ok=False)
try:
- registry_server.TLOG = TransparencyLog(td)
+ registry_server.TLOG = TransparencyLog(tmp_path)
registry_server.TLOG.append({"event": "register", "file_id": "f"})
out = registry_server.tlog_range(start=0, limit=1)
assert out["count"] == 1
assert out["entries"][0]["index"] == 0
- with open(os.path.join(td, "leaves.jsonl"), "w", encoding="utf-8") as f:
- f.write("{not-json}\n")
+ (tmp_path / "leaves.jsonl").write_text("{not-json}\n", encoding="utf-8")
try:
registry_server.tlog_range(start=0, limit=1)
except HTTPException as exc:
@@ -235,23 +222,3 @@ def t6_tlog_range_fails_closed_on_corrupt_leaf():
raise AssertionError("corrupt tlog range should fail closed")
finally:
registry_server.TLOG = original_tlog
- shutil.rmtree(td, ignore_errors=True)
- print(" [PASS] tlog range rejects corrupt leaf records")
-
-
-def main():
- print("=" * 60)
- print(" registry.server - focused unit tests")
- print("=" * 60)
- t1_rekor_attestation_uses_real_mark_id_and_digest()
- t2_register_rejects_unsigned_sidecar_mismatch()
- t3_dns_event_requires_secret_for_non_loopback()
- t4_evidence_bundle_can_attach_tlog_proofs()
- t5_operator_token_gates_write_side_apis_when_configured()
- t6_tlog_range_fails_closed_on_corrupt_leaf()
- print()
- print(" ALL TESTS PASSED - 6/6")
-
-
-if __name__ == "__main__":
- main()
tests/test_rekor_backcompat.py +30 -57
@@ -14,43 +14,40 @@ These checks run fully offline.
"""
from __future__ import annotations
-import json
-import os
import sys
-import tempfile
+from pathlib import Path
-sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
+ROOT = Path(__file__).resolve().parent.parent
+sys.path.insert(0, str(ROOT))
from oversight_core.tlog import TransparencyLog, verify_inclusion_proof
from oversight_core import rekor as R
+from oversight_core.jcs import jcs_dumps
-def t1_legacy_tlog_still_works():
+def test_legacy_tlog_still_works(tmp_path):
"""A TransparencyLog built and verified the v0.4 way must still pass."""
- with tempfile.TemporaryDirectory() as td:
- tl = TransparencyLog(td)
- for i in range(7):
- tl.append({"event": "register", "i": i, "file_id": f"f{i}"})
- size = tl.size()
- root = tl.root()
- assert size == 7, f"expected size 7, got {size}"
- assert len(root) == 32, "root must be 32 bytes (sha256)"
-
- # Inclusion proof for the middle leaf.
- proof = tl.inclusion_proof(3)
- assert proof is not None, "inclusion_proof returned None for valid index"
- ok = verify_inclusion_proof(
- leaf_hash=bytes.fromhex(proof["leaf_hash"]),
- index=proof["index"],
- proof=[bytes.fromhex(h) for h in proof["proof"]],
- tree_size=proof["tree_size"],
- expected_root=bytes.fromhex(proof["root"]),
- )
- assert ok, "RFC 6962 inclusion proof failed to verify"
- print(" [PASS] 1. v0.4 TransparencyLog API still verifies end-to-end")
-
-
-def t2_legacy_bundle_shape_default_kind():
+ tl = TransparencyLog(tmp_path)
+ for i in range(7):
+ tl.append({"event": "register", "i": i, "file_id": f"f{i}"})
+ size = tl.size()
+ root = tl.root()
+ assert size == 7, f"expected size 7, got {size}"
+ assert len(root) == 32, "root must be 32 bytes (sha256)"
+
+ proof = tl.inclusion_proof(3)
+ assert proof is not None, "inclusion_proof returned None for valid index"
+ ok = verify_inclusion_proof(
+ leaf_hash=bytes.fromhex(proof["leaf_hash"]),
+ index=proof["index"],
+ proof=[bytes.fromhex(h) for h in proof["proof"]],
+ tree_size=proof["tree_size"],
+ expected_root=bytes.fromhex(proof["root"]),
+ )
+ assert ok, "RFC 6962 inclusion proof failed to verify"
+
+
+def test_legacy_bundle_shape_default_kind():
"""A v0.4-shaped bundle (no tlog_kind, no bundle_schema) must be readable
and interpretable as ``oversight-self-merkle-v1``."""
legacy_bundle = {
@@ -70,8 +67,6 @@ def t2_legacy_bundle_shape_default_kind():
"root": "00" * 32,
},
}
- # Forward-looking interpretation: a v0.5+ verifier sees no tlog_kind →
- # treats this as the legacy local-merkle path.
assert "rekor" not in legacy_bundle, "v0.4 bundle must not have a rekor field"
assert "bundle_schema" not in legacy_bundle, "v0.4 bundle must not advertise bundle_schema"
inferred_kind = legacy_bundle.get("tlog_kind", R.LEGACY_TLOG_KIND)
@@ -80,20 +75,18 @@ def t2_legacy_bundle_shape_default_kind():
)
inferred_schema = legacy_bundle.get("bundle_schema", 1)
assert inferred_schema == 1, "missing bundle_schema must default to 1 (v0.4 implicit)"
- print(" [PASS] 2. v0.4 bundle defaults: tlog_kind=legacy, schema=1")
-def t3_v05_bundle_advertises_new_fields():
+def test_v05_bundle_advertises_new_fields():
"""The new bundle the v0.5 path emits MUST advertise both fields explicitly
so an old (v0.4) verifier fails fast with 'unknown schema' rather than
silently mis-routing."""
assert R.BUNDLE_SCHEMA == 2, f"BUNDLE_SCHEMA must be 2, got {R.BUNDLE_SCHEMA}"
assert R.TLOG_KIND == "rekor-v2-dsse", f"TLOG_KIND drift: {R.TLOG_KIND!r}"
assert R.LEGACY_TLOG_KIND == "oversight-self-merkle-v1"
- print(" [PASS] 3. v0.5 constants: TLOG_KIND=rekor-v2-dsse, schema=2")
-def t4_canonical_jcs_unchanged_for_legacy_payload():
+def test_canonical_jcs_unchanged_for_legacy_payload():
"""The exact JCS encoding for a v0.4-shaped event must not have changed.
If this fails, downstream verifiers re-checking historical signatures
over canonical JSON will reject events they previously accepted."""
@@ -111,12 +104,11 @@ def t4_canonical_jcs_unchanged_for_legacy_payload():
+ '"' + "11" * 32 + '",'
+ '"n_beacons":3,"n_watermarks":1,"recipient_id":"r0","timestamp":"2026-04-19T00:00:00Z"}'
)
- actual = json.dumps(event, sort_keys=True, separators=(",", ":"))
+ actual = jcs_dumps(event).decode("utf-8")
assert actual == expected, f"JCS drift!\n exp: {expected}\n got: {actual}"
- print(" [PASS] 4. v0.4 event JCS encoding unchanged")
-def t5_predicate_uri_resolves_at_tagged_path():
+def test_predicate_uri_resolves_at_tagged_path():
"""Sanity: the PREDICATE_TYPE URI references a git-tagged path. We don't
fetch (the e2e test does that); we just confirm the URI shape so a typo
like missing the tag won't make it through to a release."""
@@ -125,22 +117,3 @@ def t5_predicate_uri_resolves_at_tagged_path():
), f"PREDICATE_TYPE not pinned to a v0.5 git tag: {R.PREDICATE_TYPE}"
assert R.PREDICATE_TYPE.endswith("/docs/predicates/registration-v1.md")
assert R.PREDICATE_VERSION == 1
- print(" [PASS] 5. PREDICATE_TYPE pinned to v0.5 git-tagged path")
-
-
-def main() -> int:
- print("=" * 60)
- print(" test_rekor_backcompat - v0.4 contract preservation (offline)")
- print("=" * 60)
- t1_legacy_tlog_still_works()
- t2_legacy_bundle_shape_default_kind()
- t3_v05_bundle_advertises_new_fields()
- t4_canonical_jcs_unchanged_for_legacy_payload()
- t5_predicate_uri_resolves_at_tagged_path()
- print()
- print(" ALL BACKCOMPAT TESTS PASSED - 5/5")
- return 0
-
-
-if __name__ == "__main__":
- sys.exit(main())
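
`test_legacy_tlog_still_works` checks an RFC 6962 inclusion proof. As a rough illustration of what such a check involves, here is a self-contained sketch of the Merkle tree hash, the audit path, and the verification loop as described in RFC 6962 and RFC 9162. All names are hypothetical; this is not `oversight_core.tlog`, and how the real log turns an event dict into leaf bytes is not shown in this diff.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()  # 0x00 prefix marks a leaf


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()  # 0x01 marks a node


def _split(n: int) -> int:
    """Largest power of two strictly less than n (n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(leaves: list) -> bytes:
    n = len(leaves)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(leaves[0])
    k = _split(n)
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


def audit_path(index: int, leaves: list) -> list:
    n = len(leaves)
    if n == 1:
        return []
    k = _split(n)
    if index < k:
        return audit_path(index, leaves[:k]) + [merkle_root(leaves[k:])]
    return audit_path(index - k, leaves[k:]) + [merkle_root(leaves[:k])]


def verify_inclusion(leaf_h: bytes, index: int, path: list,
                     tree_size: int, root: bytes) -> bool:
    if index >= tree_size:
        return False
    fn, sn, r = index, tree_size - 1, leaf_h
    for p in path:
        if sn == 0:
            return False  # proof is longer than the tree is deep
        if (fn & 1) or fn == sn:
            r = node_hash(p, r)  # sibling sits on the left
            if not (fn & 1):
                # Skip levels where this node has no sibling of its own.
                while fn and not (fn & 1):
                    fn >>= 1
                    sn >>= 1
        else:
            r = node_hash(r, p)  # sibling sits on the right
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root


leaves = [f"leaf-{i}".encode() for i in range(7)]
proof = audit_path(3, leaves)
ok = verify_inclusion(leaf_hash(leaves[3]), 3, proof, 7, merkle_root(leaves))
```

The domain-separation prefixes (0x00 for leaves, 0x01 for interior nodes) are what stop an interior node from being presented as a leaf.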
tests/test_rekor_e2e.py +10 -35
@@ -9,7 +9,7 @@ the public log shard. It is therefore gated behind the OVERSIGHT_REKOR_E2E=1
environment variable so routine test runs do not append to the public log.
Run with:
- OVERSIGHT_REKOR_E2E=1 python3 tests/test_rekor_e2e.py
+ OVERSIGHT_REKOR_E2E=1 pytest tests/test_rekor_e2e.py
What is verified:
1. A DSSE-wrapped Oversight registration predicate uploads successfully.
@@ -25,10 +25,11 @@ Skipped automatically when OVERSIGHT_REKOR_E2E is unset or 0.
from __future__ import annotations
import base64
-import json
import os
import sys
+import pytest
+
sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
@@ -40,6 +41,11 @@ from oversight_core import rekor as R
GATE = os.environ.get("OVERSIGHT_REKOR_E2E", "0") == "1"
LOG_URL = os.environ.get("OVERSIGHT_REKOR_URL", R.DEFAULT_REKOR_URL)
+pytestmark = pytest.mark.skipif(
+ not GATE,
+ reason="Set OVERSIGHT_REKOR_E2E=1 to run; this writes to the public Sigstore log",
+)
+
def _new_keypair() -> tuple[bytes, bytes, str]:
sk = Ed25519PrivateKey.generate()
@@ -52,10 +58,9 @@ def _new_keypair() -> tuple[bytes, bytes, str]:
return priv_raw, pub_raw, pub_pem
-def t1_live_upload_round_trip():
+def test_live_upload_round_trip():
priv_raw, pub_raw, pub_pem = _new_keypair()
- # Recipient X25519 pubkey is hashed before going on-log.
fake_x25519 = b"\x42" * 32
recipient_hash = R.hash_recipient_pubkey(fake_x25519.hex())
@@ -74,27 +79,21 @@ def t1_live_upload_round_trip():
)
envelope = R.sign_dsse(statement=statement, issuer_ed25519_priv=priv_raw)
- # Round-trip verify BEFORE upload (sanity).
assert R.verify_dsse(envelope, pub_raw), "local DSSE verify failed before upload"
- print(f" uploading to {LOG_URL} ...")
result = R.upload_dsse(envelope=envelope, issuer_ed25519_pub_pem=pub_pem, log_url=LOG_URL)
assert result.transparency_log_entry, "rekor returned empty body"
- print(f" log_index={result.log_index} log_id={(result.log_id or '')[:24]}...")
- # The envelope must still verify after the round trip.
assert R.verify_dsse(envelope, pub_raw), "DSSE verify failed AFTER upload (envelope mutated?)"
- # Privacy invariant: raw X25519 must not appear in the on-log envelope.
on_log_payload = base64.b64decode(envelope.payload_b64)
assert fake_x25519.hex() not in on_log_payload.decode("utf-8", errors="ignore"), (
"raw recipient X25519 pubkey leaked into on-log payload"
)
- print(" [PASS] live round trip + privacy invariant held")
-def t2_response_carries_inclusion_data():
+def test_response_carries_inclusion_data():
"""The bundled response must give a verifier enough to verify offline.
Per the v0.5 plan: the write response is the only place we get an
@@ -119,30 +118,6 @@ def t2_response_carries_inclusion_data():
body = result.transparency_log_entry
assert isinstance(body, dict) and body, "rekor body not a non-empty dict"
- # Either logIndex appears, or inclusionProof / logEntry shape is present.
has_idx = result.log_index is not None
has_proof = any(k in body for k in ("inclusionProof", "inclusion_proof", "logEntry"))
assert has_idx or has_proof, f"response missing index AND proof shape: keys={list(body.keys())}"
- print(f" [PASS] response carries inclusion data (idx={has_idx}, proof_shape={has_proof})")
-
-
-def main() -> int:
- if not GATE:
- print("=" * 60)
- print(" test_rekor_e2e: SKIPPED")
- print(" (set OVERSIGHT_REKOR_E2E=1 to run; this writes to the")
- print(" public Sigstore log)")
- print("=" * 60)
- return 0
- print("=" * 60)
- print(f" test_rekor_e2e: LIVE against {LOG_URL}")
- print("=" * 60)
- t1_live_upload_round_trip()
- t2_response_carries_inclusion_data()
- print()
- print(" ALL E2E TESTS PASSED - 2/2")
- return 0
-
-
-if __name__ == "__main__":
- sys.exit(main())
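
The DSSE envelopes used in the Rekor tests are signed over a pre-authentication encoding (PAE) of the payload type and body, not over the raw payload. A minimal sketch of that encoding, following the DSSE specification, is below; `pae_sketch` is a hypothetical name and is not the project's `R._pae`.

```python
def pae_sketch(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: lengths are ASCII decimal byte counts."""
    type_bytes = payload_type.encode("utf-8")
    # "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body
    return b"DSSEv1 %d %b %d %b" % (
        len(type_bytes), type_bytes, len(payload), payload,
    )


encoded = pae_sketch("application/vnd.in-toto+json", b'{"a":1}')
```

Because each field is length-prefixed, the payload type and body cannot be shifted into one another without changing the signed bytes.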
tests/test_rekor_unit.py +12 -49
@@ -10,7 +10,7 @@ Covers (no network):
3. verify_dsse rejects a tampered payload.
4. verify_dsse rejects a wrong-key signature.
5. build_statement produces the expected in-toto v1 shape.
- 6. Envelope JSON serialization is canonical (sorted keys, no whitespace).
+ 6. Envelope JSON serialization is canonical (JCS; no whitespace).
7. verify_inclusion_offline returns False when transparency_log_entry is empty.
8. verify_inclusion_offline rejects mismatched subject digests.
@@ -20,11 +20,9 @@ test_rekor_e2e.py (added in v0.5 Session B).
from __future__ import annotations
import base64
-import json
import os
import sys
-# allow running without installing the package
sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
@@ -40,14 +38,13 @@ def _new_keypair() -> tuple[bytes, bytes]:
)
-def t1_pae_byte_exact():
+def test_pae_byte_exact():
pae = R._pae("application/vnd.in-toto+json", b'{"a":1}')
expect = b"DSSEv1 28 application/vnd.in-toto+json 7 " + b'{"a":1}'
assert pae == expect, f"PAE mismatch:\n got {pae!r}\n expect {expect!r}"
- print(" [PASS] 1. PAE byte-exact match against spec")
-def t2_sign_verify_roundtrip():
+def test_sign_verify_roundtrip():
priv, pub = _new_keypair()
pred = R.OversightRegistrationPredicate(
file_id="00000000-0000-4000-8000-000000000001",
@@ -60,10 +57,9 @@ def t2_sign_verify_roundtrip():
stmt = R.build_statement("aa" * 16, "bb" * 32, pred)
env = R.sign_dsse(stmt, priv)
assert R.verify_dsse(env, pub), "valid envelope failed verification"
- print(" [PASS] 2. sign_dsse + verify_dsse round trip")
-def t3_tamper_payload_rejected():
+def test_tamper_payload_rejected():
priv, pub = _new_keypair()
pred = R.OversightRegistrationPredicate(
file_id="x",
@@ -80,10 +76,9 @@ def t3_tamper_payload_rejected():
signatures=env.signatures,
)
assert not R.verify_dsse(tampered, pub), "tampered payload accepted!"
- print(" [PASS] 3. tampered payload rejected")
-def t4_wrong_key_rejected():
+def test_wrong_key_rejected():
priv, _ = _new_keypair()
_, other_pub = _new_keypair()
pred = R.OversightRegistrationPredicate(
@@ -96,10 +91,9 @@ def t4_wrong_key_rejected():
)
env = R.sign_dsse(R.build_statement("a", "b", pred), priv)
assert not R.verify_dsse(env, other_pub), "wrong-key sig verified!"
- print(" [PASS] 4. wrong public key rejected")
-def t5_statement_shape():
+def test_statement_shape():
pred = R.OversightRegistrationPredicate(
file_id="fid",
issuer_pubkey_ed25519="pp",
@@ -114,10 +108,9 @@ def t5_statement_shape():
assert s["subject"][0]["name"] == "mark:mark1234"
assert s["subject"][0]["digest"]["sha256"].startswith("deadbeef")
assert s["predicate"]["suite"] == "OSGT-CLASSIC-v1"
- print(" [PASS] 5. in-toto v1 statement shape correct")
-def t6_canonical_envelope_json():
+def test_canonical_envelope_json():
priv, _ = _new_keypair()
pred = R.OversightRegistrationPredicate(
file_id="x",
@@ -132,10 +125,9 @@ def t6_canonical_envelope_json():
again = R.DSSEEnvelope.from_json(raw).to_json()
assert raw == again, "envelope JSON not canonical (round-trip differs)"
assert " " not in raw and "\n" not in raw, "envelope JSON has whitespace"
- print(" [PASS] 6. envelope JSON is canonical and round-trip stable")
-def t7_offline_verify_rejects_empty_tle():
+def test_offline_verify_rejects_empty_tle():
priv, pub = _new_keypair()
pred = R.OversightRegistrationPredicate(
file_id="x",
@@ -148,10 +140,9 @@ def t7_offline_verify_rejects_empty_tle():
env = R.sign_dsse(R.build_statement("a", "b" * 32, pred), priv)
ok, reason = R.verify_inclusion_offline({}, env, pub, "b" * 32)
assert not ok and "transparency_log_entry" in reason, reason
- print(f" [PASS] 7. offline verify rejects empty bundle ({reason})")
-def t8_recipient_pubkey_never_appears_raw():
+def test_recipient_pubkey_never_appears_raw():
"""Privacy: raw X25519 recipient key must never end up in the on-log payload."""
priv, _ = _new_keypair()
raw_pub_hex = "11" * 32
@@ -169,10 +160,9 @@ def t8_recipient_pubkey_never_appears_raw():
assert raw_pub_hex not in raw_payload, "RAW recipient pubkey leaked into on-log payload"
assert pred.recipient_pubkey_sha256 in raw_payload
assert pred.recipient_pubkey_sha256 != raw_pub_hex
- print(" [PASS] 8. raw recipient pubkey is hashed before going on-log")
-def t9_predicate_carries_version_int():
+def test_predicate_carries_version_int():
pred = R.OversightRegistrationPredicate(
file_id="x",
issuer_pubkey_ed25519="pp",
@@ -183,10 +173,9 @@ def t9_predicate_carries_version_int():
)
d = pred.to_dict()
assert d.get("predicate_version") == 1, d
- print(" [PASS] 9. predicate body carries integer predicate_version")
-def t10_bundle_has_5year_replay_fields():
+def test_bundle_has_5year_replay_fields():
"""Bundle must carry log_pubkey, checkpoint, schema URI, schema int."""
priv, _ = _new_keypair()
pred = R.OversightRegistrationPredicate(
@@ -222,10 +211,9 @@ def t10_bundle_has_5year_replay_fields():
assert rekor["checkpoint"], "checkpoint missing"
assert rekor["log_entry_schema"] == "rekor/v1.TransparencyLogEntry"
assert bundle["rfc3161_chain"] == "chainpem"
- print(" [PASS] 10. bundle carries log_pubkey + checkpoint + schema URI + schema=2")
-def t11_offline_verify_rejects_digest_mismatch():
+def test_offline_verify_rejects_digest_mismatch():
priv, pub = _new_keypair()
pred = R.OversightRegistrationPredicate(
file_id="x",
@@ -243,28 +231,3 @@ def t11_offline_verify_rejects_digest_mismatch():
"c" * 32,
)
assert not ok and "subject digest" in reason, reason
- print(f" [PASS] 11. offline verify rejects mismatched digest ({reason})")
-
-
-def main():
- print("=" * 60)
- print(" oversight_core.rekor - unit tests (offline, no network)")
- print("=" * 60)
- t1_pae_byte_exact()
- t2_sign_verify_roundtrip()
- t3_tamper_payload_rejected()
- t4_wrong_key_rejected()
- t5_statement_shape()
- t6_canonical_envelope_json()
- t7_offline_verify_rejects_empty_tle()
- t8_recipient_pubkey_never_appears_raw()
- t9_predicate_carries_version_int()
- t10_bundle_has_5year_replay_fields()
- t11_offline_verify_rejects_digest_mismatch()
- print()
- print(" ALL TESTS PASSED - 11/11")
- print()
-
-
-if __name__ == "__main__":
- main()
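Note on the PAE test above: `test_pae_byte_exact` pins the DSSE v1 pre-authentication encoding, which is `"DSSEv1"`, then the byte length and bytes of the payload type, then the byte length and bytes of the payload, all separated by single spaces. A minimal standalone sketch of that encoding follows. The function name `pae_sketch` is hypothetical; the project's own `R._pae` is not shown in this diff and may be written differently.

```python
def pae_sketch(payload_type: str, payload: bytes) -> bytes:
    """DSSE v1 pre-authentication encoding (illustrative, not the project's code).

    Layout: b"DSSEv1" SP len(type) SP type SP len(payload) SP payload,
    where both lengths are ASCII decimal byte counts.
    """
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (
        len(type_bytes),
        type_bytes,
        len(payload),
        payload,
    )


encoded = pae_sketch("application/vnd.in-toto+json", b'{"a":1}')
```

The lengths are counted in bytes, not characters, so a payload type containing non-ASCII text would be measured after UTF-8 encoding.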
tests/test_siem_unit.py +62 -110
@@ -2,13 +2,10 @@
"""Focused tests for the SIEM export formatters and registry-row mapping."""
import base64
-import io
import json
import os
import sqlite3
import sys
-import tempfile
-import time
ROOT = os.path.join(os.path.dirname(__file__), "..")
sys.path.insert(0, ROOT)
@@ -19,10 +16,6 @@ from oversight_core import siem
REGISTRY_ID = "deadbeef" * 8
-def ok(msg: str) -> None:
- print(f" [PASS] {msg}")
-
-
def _sample_event(**overrides) -> siem.OversightEvent:
base = dict(
event_id="42",
@@ -44,7 +37,7 @@ def _sample_event(**overrides) -> siem.OversightEvent:
return siem.OversightEvent(**base)
-def t1_splunk_envelope_carries_time_host_event_and_fields():
+def test_splunk_envelope_carries_time_host_event_and_fields():
evt = _sample_event()
out = siem.to_splunk_hec(evt, source="s", sourcetype="st", index="main", host="h")
@@ -59,19 +52,17 @@ def t1_splunk_envelope_carries_time_host_event_and_fields():
assert out["event"]["tlog_index"] == 7
assert out["fields"]["file_id"] == "file_xyz"
assert out["fields"]["beacon_kind"] == "dns"
- ok("Splunk HEC envelope carries time, host, event, and fields")
-def t2_splunk_drops_empty_optional_fields():
+def test_splunk_drops_empty_optional_fields():
evt = _sample_event(user_agent=None, source_ip=None, qualified_timestamp=None)
out = siem.to_splunk_hec(evt)
assert "user_agent" not in out["event"]
assert "source_ip" not in out["event"]
assert "qualified_timestamp" not in out["event"]
- ok("Splunk envelope omits None optionals rather than emitting null")
-def t3_ecs_document_has_canonical_fields():
+def test_ecs_document_has_canonical_fields():
evt = _sample_event()
out = siem.to_ecs(evt)
assert out["@timestamp"] == siem.iso8601(1_735_000_000)
@@ -85,18 +76,16 @@ def t3_ecs_document_has_canonical_fields():
assert out["labels"]["oversight_token_id"] == "tok_abc"
assert out["oversight"]["registry_id"] == REGISTRY_ID
assert out["oversight"]["tlog_index"] == 7
- ok("ECS record carries @timestamp, event.*, source.ip, user_agent.*, oversight.*")
-def t4_ecs_ua_and_source_absent_when_empty():
+def test_ecs_ua_and_source_absent_when_empty():
evt = _sample_event(user_agent=None, source_ip=None)
out = siem.to_ecs(evt)
assert "source" not in out
assert "user_agent" not in out
- ok("ECS record drops empty source/user_agent blocks entirely")
-def t5_sentinel_flat_row_kql_friendly():
+def test_sentinel_flat_row_kql_friendly():
evt = _sample_event()
out = siem.to_sentinel(evt)
assert out["TimeGenerated"] == siem.iso8601(1_735_000_000)
@@ -107,65 +96,56 @@ def t5_sentinel_flat_row_kql_friendly():
assert json.loads(out["ExtraJson"])["qname"] == "abc.t.example.com"
assert "ExtraJson" in out
assert all(not k.startswith("@") for k in out)
- ok("Sentinel row is flat, KQL-friendly, with JSON-serialized extras")
-def t6_from_registry_row_reads_sqlite_row():
- tmp = tempfile.NamedTemporaryFile(suffix=".db", delete=False)
- tmp.close()
- try:
- con = sqlite3.connect(tmp.name)
- con.row_factory = sqlite3.Row
- con.executescript(
- """
- CREATE TABLE events (
- id INTEGER PRIMARY KEY AUTOINCREMENT,
- token_id TEXT NOT NULL,
- file_id TEXT,
- recipient_id TEXT,
- issuer_id TEXT,
- kind TEXT NOT NULL,
- source_ip TEXT,
- user_agent TEXT,
- extra TEXT,
- timestamp INTEGER NOT NULL,
- qualified_timestamp TEXT,
- tlog_index INTEGER
- );
- """
- )
- con.execute(
- "INSERT INTO events (token_id,file_id,recipient_id,issuer_id,kind,"
- "source_ip,user_agent,extra,timestamp,qualified_timestamp,tlog_index) "
- "VALUES (?,?,?,?,?,?,?,?,?,?,?)",
- ("tok", "file", "rcpt", "iss", "dns",
- "203.0.113.9", "curl/8", json.dumps({"qtype": "A"}),
- 1_735_000_000, "2024-12-24T01:06:40Z", 11),
- )
- con.commit()
-
- row = con.execute("SELECT * FROM events WHERE id=1").fetchone()
- evt = siem.from_registry_row(row, registry_id=REGISTRY_ID)
- con.close()
-
- assert evt.event_kind == "dns"
- assert evt.token_id == "tok"
- assert evt.source_ip == "203.0.113.9"
- assert evt.tlog_index == 11
- assert evt.extra == {"qtype": "A"}
- ok("from_registry_row reads a live SQLite row into OversightEvent")
-
- # iter_registry_events in read-only mode.
- events = list(siem.iter_registry_events(tmp.name, registry_id=REGISTRY_ID))
- assert len(events) == 1
- assert events[0].token_id == "tok"
- ok("iter_registry_events opens the db read-only and yields rows")
- finally:
- os.unlink(tmp.name)
+def test_from_registry_row_reads_sqlite_row(tmp_path):
+ db_path = tmp_path / "events.db"
+ con = sqlite3.connect(db_path)
+ con.row_factory = sqlite3.Row
+ con.executescript(
+ """
+ CREATE TABLE events (
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
+ token_id TEXT NOT NULL,
+ file_id TEXT,
+ recipient_id TEXT,
+ issuer_id TEXT,
+ kind TEXT NOT NULL,
+ source_ip TEXT,
+ user_agent TEXT,
+ extra TEXT,
+ timestamp INTEGER NOT NULL,
+ qualified_timestamp TEXT,
+ tlog_index INTEGER
+ );
+ """
+ )
+ con.execute(
+ "INSERT INTO events (token_id,file_id,recipient_id,issuer_id,kind,"
+ "source_ip,user_agent,extra,timestamp,qualified_timestamp,tlog_index) "
+ "VALUES (?,?,?,?,?,?,?,?,?,?,?)",
+ ("tok", "file", "rcpt", "iss", "dns",
+ "203.0.113.9", "curl/8", json.dumps({"qtype": "A"}),
+ 1_735_000_000, "2024-12-24T01:06:40Z", 11),
+ )
+ con.commit()
+
+ row = con.execute("SELECT * FROM events WHERE id=1").fetchone()
+ evt = siem.from_registry_row(row, registry_id=REGISTRY_ID)
+ con.close()
+
+ assert evt.event_kind == "dns"
+ assert evt.token_id == "tok"
+ assert evt.source_ip == "203.0.113.9"
+ assert evt.tlog_index == 11
+ assert evt.extra == {"qtype": "A"}
+
+ events = list(siem.iter_registry_events(str(db_path), registry_id=REGISTRY_ID))
+ assert len(events) == 1
+ assert events[0].token_id == "tok"
-def t7_sentinel_authorization_matches_microsoft_recipe():
- # Known-value check: fixed inputs, recompute and confirm stability.
+def test_sentinel_authorization_matches_microsoft_recipe():
workspace = "00000000-0000-0000-0000-000000000001"
key_bytes = b"\x01" * 32
shared_key_b64 = base64.b64encode(key_bytes).decode("utf-8")
@@ -187,62 +167,34 @@ def t7_sentinel_authorization_matches_microsoft_recipe():
assert header1 == header2
assert header1.startswith(f"SharedKey {workspace}:")
assert len(header1.split(":")[-1]) >= 40
- ok("Sentinel Authorization header is deterministic and correctly prefixed")
-def t8_filesink_and_stdoutsink_write_jsonl():
+def test_filesink_and_stdoutsink_write_jsonl(tmp_path):
evts = [_sample_event(event_id=str(i)) for i in range(3)]
- tmp = tempfile.NamedTemporaryFile(suffix=".jsonl", delete=False)
- tmp.close()
+ sink_path = tmp_path / "events.jsonl"
+ sink = siem.FileSink(str(sink_path), mode="w")
try:
- sink = siem.FileSink(tmp.name, mode="w")
- try:
- n = siem.export_events(events=iter(evts), fmt="ecs", sink=sink)
- finally:
- sink.close()
- assert n == 3
- with open(tmp.name) as f:
- lines = [json.loads(l) for l in f if l.strip()]
- assert len(lines) == 3
- assert lines[0]["event"]["action"] == "beacon-dns-callback"
- ok("FileSink persists one JSON line per event")
+ n = siem.export_events(events=iter(evts), fmt="ecs", sink=sink)
finally:
- os.unlink(tmp.name)
+ sink.close()
+ assert n == 3
+ lines = [json.loads(l) for l in sink_path.read_text().splitlines() if l.strip()]
+ assert len(lines) == 3
+ assert lines[0]["event"]["action"] == "beacon-dns-callback"
-def t9_unknown_format_raises():
+def test_unknown_format_raises():
try:
siem.format_event(_sample_event(), "wazuh")
except ValueError as e:
assert "wazuh" in str(e)
- ok("format_event rejects unknown SIEM names")
return
raise AssertionError("expected ValueError for unknown SIEM format")
-def t10_action_names_cover_all_beacon_kinds():
+def test_action_names_cover_all_beacon_kinds():
for k in ("dns", "http_img", "ocsp", "license"):
evt = _sample_event(event_kind=k)
assert siem.to_splunk_hec(evt)["event"]["action"].startswith("beacon-")
assert siem.to_ecs(evt)["event"]["action"].startswith("beacon-")
assert siem.to_sentinel(evt)["Action"].startswith("beacon-")
- ok("every known beacon kind maps to a stable action name")
-
-
-def run():
- print("[*] test_siem_unit.py")
- t1_splunk_envelope_carries_time_host_event_and_fields()
- t2_splunk_drops_empty_optional_fields()
- t3_ecs_document_has_canonical_fields()
- t4_ecs_ua_and_source_absent_when_empty()
- t5_sentinel_flat_row_kql_friendly()
- t6_from_registry_row_reads_sqlite_row()
- t7_sentinel_authorization_matches_microsoft_recipe()
- t8_filesink_and_stdoutsink_write_jsonl()
- t9_unknown_format_raises()
- t10_action_names_cover_all_beacon_kinds()
- print("[ok] all SIEM unit tests passed")
-
-
-if __name__ == "__main__":
- run()
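Note on `test_sentinel_authorization_matches_microsoft_recipe`: the test only checks that the header is deterministic, starts with `SharedKey <workspace>:`, and has a signature of at least 40 characters. For reference, here is a sketch of the SharedKey recipe as documented for the Azure Monitor HTTP Data Collector API. The function name and parameters below are hypothetical; the helper in `oversight_core.siem` is not shown in this diff and its signature is not assumed here.

```python
import base64
import hashlib
import hmac


def shared_key_header_sketch(
    workspace_id: str,
    shared_key_b64: str,
    rfc1123_date: str,
    content_length: int,
    method: str = "POST",
    content_type: str = "application/json",
    resource: str = "/api/logs",
) -> str:
    """Build an Authorization header value (illustrative sketch).

    The signature is HMAC-SHA256, keyed with the base64-decoded shared key,
    over the newline-joined request attributes, then base64-encoded.
    """
    string_to_sign = (
        f"{method}\n{content_length}\n{content_type}\n"
        f"x-ms-date:{rfc1123_date}\n{resource}"
    )
    key = base64.b64decode(shared_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"SharedKey {workspace_id}:{signature}"


# Fixed inputs mirroring the shape used in the test above.
workspace = "00000000-0000-0000-0000-000000000001"
key_b64 = base64.b64encode(b"\x01" * 32).decode("utf-8")
date = "Tue, 24 Dec 2024 01:06:40 GMT"
header_a = shared_key_header_sketch(workspace, key_b64, date, 128)
header_b = shared_key_header_sketch(workspace, key_b64, date, 128)
```

A base64-encoded SHA-256 digest is always 44 characters, which is why a `>= 40` length check on the signature part is a loose but safe bound.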
tests/test_text_format_unit.py +1 -15
@@ -16,7 +16,7 @@ from oversight_core import watermark
from oversight_core.formats import text as text_format
-def t1_text_adapter_matches_core_order():
+def test_text_adapter_matches_core_order():
original = (
"We begin to show how this is significant and we must help users find answers.\n"
"A second paragraph helps the semantic watermark choose visible variants."
@@ -25,17 +25,3 @@ def t1_text_adapter_matches_core_order():
via_adapter = text_format.apply(original, mark_id, layers=("L1", "L2", "L3"))
via_core = watermark.apply_all(original, mark_id, include_l3=True)
assert via_adapter == via_core, "text adapter diverged from core watermark order"
- print(" [PASS] text adapter applies explicit L3/L2/L1 in the same order as the core pipeline")
-
-
-def main():
- print("=" * 60)
- print(" oversight_core.formats.text - focused unit tests")
- print("=" * 60)
- t1_text_adapter_matches_core_order()
- print()
- print(" ALL TESTS PASSED - 1/1")
-
-
-if __name__ == "__main__":
- main()
tests/test_tlog_unit.py +29 -68
@@ -19,75 +19,36 @@ sys.path.insert(0, str(ROOT))
from oversight_core.tlog import TransparencyLog
-def ok(msg):
- print(f" [PASS] {msg}")
+def test_empty_tree_root_matches_rfc6962(tmp_path):
+ tlog = TransparencyLog(tmp_path)
+ assert tlog.size() == 0
+ assert tlog.root() == hashlib.sha256(b"").digest()
-def t1_empty_tree_root_matches_rfc6962():
- td = ROOT / ".tmp-tests" / f"tlog-{uuid.uuid4().hex}"
- td.mkdir(parents=True, exist_ok=False)
+def test_reopen_rejects_corrupt_leaf_record(tmp_path):
+ (tmp_path / "leaves.jsonl").write_text("{not-json}\n", encoding="utf-8")
try:
- tlog = TransparencyLog(td)
- assert tlog.size() == 0
- assert tlog.root() == hashlib.sha256(b"").digest()
- finally:
- shutil.rmtree(td, ignore_errors=True)
- ok("empty transparency log root matches RFC 6962")
-
-
-def t2_reopen_rejects_corrupt_leaf_record():
- td = ROOT / ".tmp-tests" / f"tlog-{uuid.uuid4().hex}"
- td.mkdir(parents=True, exist_ok=False)
- try:
- (td / "leaves.jsonl").write_text("{not-json}\n", encoding="utf-8")
- try:
- TransparencyLog(td)
- except ValueError:
- pass
- else:
- raise AssertionError("corrupt tlog leaf should fail closed on load")
- finally:
- shutil.rmtree(td, ignore_errors=True)
- ok("corrupt transparency log leaf fails closed on load")
-
-
-def t3_range_records_validate_disk_leaf_hashes():
- td = ROOT / ".tmp-tests" / f"tlog-{uuid.uuid4().hex}"
- td.mkdir(parents=True, exist_ok=False)
+ TransparencyLog(tmp_path)
+ except ValueError:
+ pass
+ else:
+ raise AssertionError("corrupt tlog leaf should fail closed on load")
+
+
+def test_range_records_validate_disk_leaf_hashes(tmp_path):
+ tlog = TransparencyLog(tmp_path)
+ tlog.append({"event": "register", "file_id": "f1"})
+ records = tlog.range_records(0, 1)
+ assert records[0]["index"] == 0
+ assert "leaf_data_hex" in records[0]
+
+ rec = json.loads((tmp_path / "leaves.jsonl").read_text(encoding="utf-8"))
+ rec["leaf_data"] = "tampered"
+ rec.pop("leaf_data_hex", None)
+ (tmp_path / "leaves.jsonl").write_text(json.dumps(rec) + "\n", encoding="utf-8")
try:
- tlog = TransparencyLog(td)
- tlog.append({"event": "register", "file_id": "f1"})
- records = tlog.range_records(0, 1)
- assert records[0]["index"] == 0
- assert "leaf_data_hex" in records[0]
-
- rec = json.loads((td / "leaves.jsonl").read_text(encoding="utf-8"))
- rec["leaf_data"] = "tampered"
- rec.pop("leaf_data_hex", None)
- (td / "leaves.jsonl").write_text(json.dumps(rec) + "\n", encoding="utf-8")
- try:
- tlog.range_records(0, 1)
- except ValueError as exc:
- assert "leaf hash mismatch" in str(exc)
- else:
- raise AssertionError("tampered leaf should fail closed during range read")
- finally:
- shutil.rmtree(td, ignore_errors=True)
- ok("range_records validates leaf payload hashes")
-
-
-def main():
- tmp_root = ROOT / ".tmp-tests"
- tmp_root.mkdir(exist_ok=True)
- print("=" * 60)
- print(" oversight_core.tlog - focused unit tests")
- print("=" * 60)
- t1_empty_tree_root_matches_rfc6962()
- t2_reopen_rejects_corrupt_leaf_record()
- t3_range_records_validate_disk_leaf_hashes()
- print()
- print(" ALL TESTS PASSED - 3/3")
-
-
-if __name__ == "__main__":
- main()
+ tlog.range_records(0, 1)
+ except ValueError as exc:
+ assert "leaf hash mismatch" in str(exc)
+ else:
+ raise AssertionError("tampered leaf should fail closed during range read")
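Note on `test_empty_tree_root_matches_rfc6962`: RFC 6962 defines the Merkle tree hash of an empty list as the SHA-256 of the empty string, which is what the test asserts. A compact sketch of the full RFC 6962 tree hash is given below for context. The function names are hypothetical, and `TransparencyLog` may compute its root incrementally or otherwise differently.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    # RFC 6962 leaf hash: SHA-256 over a 0x00 prefix plus the leaf bytes.
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # RFC 6962 interior node hash: SHA-256 over a 0x01 prefix plus both children.
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root_sketch(leaves: list) -> bytes:
    """Recursive RFC 6962 Merkle tree hash (illustrative sketch)."""
    n = len(leaves)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(leaves[0])
    # Split at the largest power of two strictly less than n.
    k = 1
    while k * 2 < n:
        k *= 2
    return node_hash(merkle_root_sketch(leaves[:k]), merkle_root_sketch(leaves[k:]))


empty_root = merkle_root_sketch([])
```

The distinct `0x00` and `0x01` prefixes provide domain separation, so a leaf can never be reinterpreted as an interior node.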
tests/test_xff_spoof_unit.py +42 -0
@@ -0,0 +1,42 @@
+"""
+test_xff_spoof_unit
+===================
+Regression test for the X-Forwarded-For source-IP spoofing bug in the
+registry rate limiter and beacon source_ip attribution.
+
+Background: _xff_client must return the RIGHTMOST XFF entry (appended by
+the directly-connected trusted proxy, e.g. Caddy), never the leftmost. The
+leftmost is attacker-controlled because a client may send any XFF header
+and the proxy appends rather than replaces. Trusting the leftmost let an
+attacker pick their rate-limit bucket and forge the source_ip written into
+beacon events and the transparency log.
+"""
+
+from __future__ import annotations
+
+import sys
+from pathlib import Path
+
+ROOT = Path(__file__).resolve().parent.parent
+sys.path.insert(0, str(ROOT))
+
+from registry.server import _xff_client
+
+
+def test_xff_ignores_spoofed_left_entries():
+    assert _xff_client("1.2.3.4, 9.9.9.9") == "9.9.9.9"
+    assert _xff_client("fake, fake2, 203.0.113.7") == "203.0.113.7"
+
+
+def test_xff_single_entry_is_returned():
+    assert _xff_client("9.9.9.9") == "9.9.9.9"
+
+
+def test_xff_whitespace_only_entries_dropped():
+    assert _xff_client(" , , 9.9.9.9") == "9.9.9.9"
+
+
+def test_xff_empty_returns_none():
+    assert _xff_client("") is None
+    assert _xff_client(" ") is None
+    assert _xff_client(" , ") is None
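Note on the behavior these tests pin down: the rightmost non-empty `X-Forwarded-For` entry is returned, and `None` is returned when nothing usable remains. A minimal sketch that satisfies the four tests is shown below. It is named `xff_client_sketch` to make clear it is hypothetical; the real `registry.server._xff_client` is not part of this hunk and may differ.

```python
from typing import Optional


def xff_client_sketch(header_value: str) -> Optional[str]:
    """Return the rightmost non-empty XFF entry, or None (illustrative sketch).

    Assumes exactly one trusted proxy sits directly in front of the service
    and appends the peer address it saw to any client-supplied header.
    """
    # Split on commas, trim each entry, and drop entries that are empty
    # after trimming (e.g. " , , 9.9.9.9").
    entries = [part.strip() for part in header_value.split(",")]
    entries = [entry for entry in entries if entry]
    # Everything left of the final entry is client-controlled; ignore it.
    return entries[-1] if entries else None
```

As the docstring above notes, the rightmost entry is trustworthy only because the directly-connected proxy appends it. With more than one trusted hop in front of the service, the number of entries to trust from the right would need to match the number of proxies.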
tools/gen_hw_p256_sample.py +4 -1
@@ -67,8 +67,11 @@ def xchacha20poly1305_encrypt(key: bytes, nonce24: bytes, plaintext: bytes, aad:
return ChaCha20Poly1305(subkey).encrypt(nonce12, plaintext, aad)
+# RFC 8785 JCS; byte-exact match with serde_jcs and oversight_core.jcs.jcs_dumps.
+# Standalone form (no oversight_core import): sort_keys + ensure_ascii=False is
+# byte-identical to JCS for the no-floats subset this tool emits.
def canonical_bytes(obj: dict) -> bytes:
-    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True).encode("utf-8")
+    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
def strip_none(obj):
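Note on the equivalence claimed in the comment above: RFC 8785 sorts property names by their UTF-16 code units, while Python's `sort_keys=True` compares strings by Unicode code point. The two orders agree for ASCII keys, and more generally whenever all keys are within the Basic Multilingual Plane, so the shortcut holds if the tool emits only such keys. That condition is an assumption here, since the full key set is not visible in this hunk. The sketch below shows where the orders diverge. `jcs_ordered_bytes` is a hypothetical helper that fixes key order only; it is not a full JCS implementation and does not handle number formatting.

```python
import json


def canonical_bytes(obj: dict) -> bytes:
    # Same expression as the tool's standalone form in the hunk above.
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def _utf16_sorted(value):
    # Recursively reorder dict keys by UTF-16 code units (RFC 8785 ordering).
    # Comparing big-endian UTF-16 bytes is equivalent to comparing code units.
    if isinstance(value, dict):
        ordered = sorted(value.items(), key=lambda kv: kv[0].encode("utf-16-be"))
        return {k: _utf16_sorted(v) for k, v in ordered}
    if isinstance(value, list):
        return [_utf16_sorted(v) for v in value]
    return value


def jcs_ordered_bytes(obj: dict) -> bytes:
    # Key-order illustration only; dicts keep insertion order when dumped.
    return json.dumps(
        _utf16_sorted(obj), separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


ascii_doc = {"suite": "OSGT-CLASSIC-v1", "file_id": "x", "nested": {"b": 1, "a": 2}}
# U+1F600 encodes as a surrogate pair starting at 0xD83D, which sorts before
# U+FF21 in UTF-16 code units but after it in code-point order.
mixed_doc = {"\U0001F600": 1, "\uff21": 2}

same_for_ascii = canonical_bytes(ascii_doc) == jcs_ordered_bytes(ascii_doc)
differs_for_mixed = canonical_bytes(mixed_doc) != jcs_ordered_bytes(mixed_doc)
```

If non-BMP keys could ever appear in a signed manifest, sorting with the UTF-16 key shown here would keep the standalone form aligned with JCS ordering.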
tools/gen_hybrid_sample.py +6 -2
@@ -74,9 +74,13 @@ def xchacha20poly1305_encrypt(key: bytes, nonce24: bytes, plaintext: bytes, aad:
return ChaCha20Poly1305(subkey).encrypt(nonce12, plaintext, aad)
-# ---------- canonical JSON (must match Python json.dumps sort_keys+compact) ----
+# ---------- canonical JSON (RFC 8785 JCS; byte-exact match with serde_jcs) ----
+# Standalone equivalent of oversight_core.jcs.jcs_dumps: this tool runs without
+# importing oversight_core so the sample generator stays self-contained.
+# json.dumps(..., sort_keys=True, ensure_ascii=False) is byte-identical to JCS
+# for the no-floats subset this tool emits.
def canonical_bytes(obj: dict) -> bytes:
-    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True).encode("utf-8")
+    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
def strip_none(obj):