Zion Boggan (zionboggan.com)

Harden Python tlog range validation

Match the Rust fail-closed transparency-log path in the Python reference registry. Recovered leaves and /tlog/range now validate index continuity and leaf hashes, new leaves preserve exact bytes via leaf_data_hex, and the registry conformance harness checks /tlog/range shape.

Co-authored-by: Codex (GPT-5.4) <noreply@openai.com>
d8b067c Zion Boggan committed on May 28, 2026
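To illustrate why `leaf_data_hex` matters, here is a minimal sketch; it is not the repository's code. It assumes the `_h` helper in `oversight_core/tlog.py` is SHA-256, the hash RFC 6962 specifies and one consistent with the 32-byte length check in this commit. The `leaf_hash` function and the sample bytes are hypothetical. The sketch shows that a leaf containing non-UTF-8 bytes cannot be re-hashed from the lossy `leaf_data` string, while the hex field round-trips exactly.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    # RFC 6962 leaf hash: hash of a 0x00 prefix followed by the leaf bytes.
    # Assumption: the project's _h helper is SHA-256.
    return hashlib.sha256(b"\x00" + data).digest()


# Hypothetical leaf payload that is not valid UTF-8.
raw = b"event:\xff\xfe"

# Same record layout that append() writes in this commit.
record = {
    "index": 0,
    "leaf_hash": leaf_hash(raw).hex(),
    "leaf_data": raw.decode("utf-8", errors="replace"),
    "leaf_data_hex": raw.hex(),
}

# Recompute the hash from each field, as a validator would.
from_text = record["leaf_data"].encode("utf-8")
from_hex = bytes.fromhex(record["leaf_data_hex"])

text_matches = leaf_hash(from_text).hex() == record["leaf_hash"]
hex_matches = leaf_hash(from_hex).hex() == record["leaf_hash"]
print(text_matches, hex_matches)  # False True
```

For payloads that are already valid UTF-8, such as JSON-encoded events, both fields recompute the same hash, so the `leaf_data` fallback in `_leaf_data_bytes` can still validate records written before this change.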
CHANGELOG.md +5 -1
@@ -39,7 +39,11 @@
ignoring corrupted lines during startup or validation. `/tlog/range` now
reads through the same validated tlog API, so malformed or hash-mismatched
records fail the range request instead of being silently omitted from
- monitor responses.
+ monitor responses. The Python reference tlog now matches that behavior:
+ startup and `/tlog/range` fail closed on corrupt leaf records, and new
+ leaves carry `leaf_data_hex` so exact leaf bytes survive recovery. The
+ registry v1 conformance harness now checks `/tlog/range` response shape,
+ raising the live/in-process harness to 34 checks.
- **GitHub Actions runtime hygiene.** Main CI workflows opt into the GitHub
Actions Node 24 runtime before the hosted runner default changes.
- **Rust policy test parity.** Fixed the `oversight-policy` crate's manifest
README.md +5 -2
@@ -107,6 +107,9 @@ malformed, out of sequence, or hash-mismatched.
The Rust registry's `/tlog/range` endpoint uses those validated leaf records
too, so federated monitors cannot receive a partial range with corrupted lines
silently skipped.
+The Python reference registry now has matching tlog recovery and range
+validation, including exact `leaf_data_hex` persistence for newly appended
+leaves.
The next Rust-registry gate is operational burn-in: longer-running deployment
tests against real operator databases and a final wire-format stability
@@ -233,7 +236,7 @@ now exposes the full read-only and beacon surface
`/v/{token_id}`, `/candidates/semantic`) and ships strict CORS
restricted to the public browser-inspector origins with GET and
OPTIONS only. The Axum server now passes `tests/test_registry_conformance.py`
-(33/33) in live-URL mode. `oversight-rust/oversight-manifest` learned
+(34/34) in live-URL mode. `oversight-rust/oversight-manifest` learned
to verify Python-signed v0.4.5+ manifests by carrying
`canonical_content_hash` and `l3_policy` in the signed model, with
a fallback path for older manifests that lack those fields.
@@ -439,7 +442,7 @@ current stable line.
| Layer | Checks | Status |
|---|---|---|
-| Python pytest suite | 10 | green |
+| Python pytest suite | 11 | green |
| Rust oversight-container | 17 | green |
| Rust oversight-crypto | 21 | green |
| Rust oversight-formats | 40 | green |
docs/REGISTRY_DEPLOYMENT.md +3 -0
@@ -145,6 +145,9 @@ the Rust service.
The Rust `/tlog/range` route also reads through validated tlog records, so
malformed or hash-mismatched local leaf data blocks the range response instead
of disappearing from monitor output.
+The Python reference registry uses the same fail-closed local tlog validation
+for startup recovery and `/tlog/range`; newly appended records include
+`leaf_data_hex` so exact event bytes can be recomputed by monitors.
## Rust Registry Burn-In Checklist
docs/ROADMAP.md +5 -3
@@ -17,7 +17,7 @@ threat-model honesty, not on a calendar date.
at `docs/spec/registry-v1.md` is aligned against the reference server:
canonical-JSON algorithm, uniform error envelope, normative endpoint and
beacon paths, `/evidence` bundle shape, and `/tlog/head|proof|range` are
- pinned. `tests/test_registry_conformance.py` runs 33 checks in-process
+ pinned. `tests/test_registry_conformance.py` runs 34 checks in-process
or against a live URL. An operator claims v1 compatibility with
`OVERSIGHT_REGISTRY_URL=https://registry.example.org python3 tests/test_registry_conformance.py`.
4. **Browser inspector and classic-suite decrypt** shipped on
@@ -143,7 +143,7 @@ the reference server actually serves. The spec now pins:
- `/evidence/{file_id}` bundle fields
- `/tlog/head|proof|range` for federated verifiers
-`tests/test_registry_conformance.py` is a 33-check harness with two
+`tests/test_registry_conformance.py` is a 34-check harness with two
modes. In-process against a FastAPI TestClient for CI, or against a
live URL when `OVERSIGHT_REGISTRY_URL` is set. An independent operator
who passes the harness claims v1 compatibility.
@@ -235,7 +235,7 @@ require hardware backing for sensitive material.
`oversight-rust/oversight-registry` is scaffolded with all endpoints
implemented under `#![forbid(unsafe_code)]`. As of 2026-05-14, the Axum
-server passes the existing 33-check `tests/test_registry_conformance.py`
+server passes the existing 34-check `tests/test_registry_conformance.py`
harness in live-URL mode against the registry v1 surface with
`OVERSIGHT_OPERATOR_TOKEN` enabled. The Rust registry now matches the Python
reference for write-side operator-token auth and DNS bridge bearer/header
@@ -256,6 +256,8 @@ corrupted lines.
As of 2026-05-28, `/tlog/range` reads through the validated tlog record API
instead of parsing `leaves.jsonl` directly, so monitor responses fail closed
when an on-disk leaf is malformed or hash-mismatched.
+The Python reference registry now mirrors that fail-closed tlog recovery and
+range behavior, with `leaf_data_hex` on newly appended local tlog records.
Remaining work: longer-running deployment tests and a wire-format stability
declaration before declaring v1.0 ready.
oversight_core/tlog.py +61 -4
@@ -80,6 +80,36 @@ def _rfc6962_path(leaf_hashes: list[bytes], m: int) -> list[bytes]:
return _rfc6962_path(leaf_hashes[k:], m - k) + [_rfc6962_mth(leaf_hashes[:k])]
+def _leaf_data_bytes(rec: dict) -> bytes:
+    if rec.get("leaf_data_hex") is not None:
+        return bytes.fromhex(rec["leaf_data_hex"])
+    leaf_data = rec.get("leaf_data")
+    if not isinstance(leaf_data, str):
+        raise ValueError("leaf_data must be a string")
+    return leaf_data.encode("utf-8")
+
+
+def _parse_leaf_record(line: str, expected_index: int) -> tuple[dict, bytes]:
+    rec = json.loads(line)
+    if not isinstance(rec, dict):
+        raise ValueError("leaf record must be an object")
+    found_index = rec.get("index")
+    if type(found_index) is not int:
+        raise ValueError("leaf index must be an integer")
+    if found_index != expected_index:
+        raise ValueError(
+            f"leaf index mismatch: expected {expected_index}, got {found_index}"
+        )
+    leaf_hash = bytes.fromhex(rec["leaf_hash"])
+    if len(leaf_hash) != 32:
+        raise ValueError(
+            f"leaf hash length for index {found_index}: expected 32, got {len(leaf_hash)}"
+        )
+    if leaf_hash != _h(b"\x00" + _leaf_data_bytes(rec)):
+        raise ValueError(f"leaf hash mismatch at index {found_index}")
+    return rec, leaf_hash
+
+
class TransparencyLog:
"""Append-only Merkle log with signed tree heads.
@@ -108,12 +138,13 @@ class TransparencyLog:
         if not self.leaves_path.exists():
             return
         with self.leaves_path.open("r") as f:
+            expected_index = 0
             for line in f:
-                try:
-                    rec = json.loads(line)
-                    self._leaves.append(bytes.fromhex(rec["leaf_hash"]))
-                except (ValueError, KeyError):
+                if not line.strip():
                     continue
+                _, leaf_hash = _parse_leaf_record(line, expected_index)
+                self._leaves.append(leaf_hash)
+                expected_index += 1
     def append(self, leaf_data: bytes | str | dict) -> int:
         """Append a leaf. Durable: fsync before return."""
@@ -135,6 +166,7 @@ class TransparencyLog:
"index": index,
"leaf_hash": leaf_hash.hex(),
"leaf_data": leaf_bytes.decode("utf-8", errors="replace"),
+ "leaf_data_hex": leaf_bytes.hex(),
}) + "\n"
with self.leaves_path.open("a") as f:
f.write(record)
@@ -196,6 +228,31 @@ class TransparencyLog:
"tree_size": len(self._leaves),
}
+    def range_records(self, start: int = 0, limit: int = 500) -> list[dict]:
+        if start < 0:
+            raise ValueError("start must be non-negative")
+        if limit <= 0:
+            return []
+        with self._lock:
+            if start >= len(self._leaves):
+                return []
+            end = min(start + limit, len(self._leaves))
+            records: list[dict] = []
+            expected_index = 0
+            with self.leaves_path.open("r") as f:
+                for line in f:
+                    if not line.strip():
+                        continue
+                    if expected_index >= end:
+                        break
+                    rec, _ = _parse_leaf_record(line, expected_index)
+                    if expected_index >= start:
+                        records.append(rec)
+                    expected_index += 1
+            if expected_index < end:
+                raise ValueError(f"leaf record missing at index {expected_index}")
+            return records
+
def verify_inclusion_proof(
leaf_hash: bytes,
registry/server.py +6 -14
@@ -768,21 +768,13 @@ def tlog_range(start: int = 0, limit: int = 500):
     """Return tlog leaf entries in [start, start+limit). For CanaryKeeper polling."""
     if not TLOG:
         raise HTTPException(503, "tlog not initialized")
+    if start < 0:
+        raise HTTPException(400, "start must be non-negative")
     limit = min(max(1, limit), 1000)
-    leaves_path = TLOG.leaves_path
-    if not leaves_path.exists():
-        return {"start": start, "count": 0, "entries": []}
-    entries = []
-    with leaves_path.open("r") as f:
-        for i, line in enumerate(f):
-            if i < start:
-                continue
-            if len(entries) >= limit:
-                break
-            try:
-                entries.append(json.loads(line))
-            except ValueError:
-                continue
+    try:
+        entries = TLOG.range_records(start, limit)
+    except ValueError as exc:
+        raise HTTPException(500, f"tlog range validation failed: {exc}") from exc
     return {"start": start, "count": len(entries), "entries": entries}
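A monitor consuming `/tlog/range` could apply the same checks on its side. The sketch below is hypothetical: the `verify_range_body` name is not from the repository, and it assumes the leaf hash is SHA-256 over a `0x00` prefix plus the leaf bytes. It takes an already-parsed response body with the `start`, `count`, and `entries` fields returned by `tlog_range`.

```python
import hashlib


def verify_range_body(body: dict) -> int:
    """Hypothetical monitor-side check; returns the number of verified entries."""
    start = body["start"]
    entries = body["entries"]
    if body["count"] != len(entries):
        raise ValueError("count does not match number of entries")
    for offset, entry in enumerate(entries):
        expected = start + offset
        # Indices must be contiguous, beginning at the requested start.
        if type(entry.get("index")) is not int or entry["index"] != expected:
            raise ValueError(f"index mismatch at position {offset}")
        # Prefer the exact bytes; fall back to the UTF-8 string for older records.
        if entry.get("leaf_data_hex") is not None:
            data = bytes.fromhex(entry["leaf_data_hex"])
        else:
            data = entry["leaf_data"].encode("utf-8")
        # Assumption: leaf hash is SHA-256(0x00 || leaf bytes), as in RFC 6962.
        if hashlib.sha256(b"\x00" + data).hexdigest() != entry["leaf_hash"]:
            raise ValueError(f"leaf hash mismatch at index {expected}")
    return len(entries)


# Usage with a hand-built body shaped like the /tlog/range response.
data = b'{"event": "register"}'
body = {
    "start": 0,
    "count": 1,
    "entries": [{
        "index": 0,
        "leaf_hash": hashlib.sha256(b"\x00" + data).hexdigest(),
        "leaf_data": data.decode("utf-8"),
        "leaf_data_hex": data.hex(),
    }],
}
verified = verify_range_body(body)
```

Raising on the first bad entry mirrors the fail-closed behavior of the server route: a monitor gets either a fully verified range or an error, never a partial one.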
tests/test_registry_conformance.py +20 -0
@@ -272,6 +272,25 @@ def check_tlog_head(cli: Client) -> None:
check("tlog-head-200", r.status_code == 200, f"status={r.status_code}")
+def check_tlog_range(cli: Client) -> None:
+    r = cli.get("/tlog/range?start=0&limit=10")
+    body = r.json() if r.status_code == 200 else {}
+    entries = body.get("entries")
+    range_ok = (
+        r.status_code == 200
+        and isinstance(entries, list)
+        and body.get("count") == len(entries)
+        and all(
+            isinstance(entry, dict)
+            and isinstance(entry.get("index"), int)
+            and isinstance(entry.get("leaf_hash"), str)
+            and isinstance(entry.get("leaf_data"), str)
+            for entry in entries
+        )
+    )
+    check("tlog-range-200-shape", range_ok, f"status={r.status_code} body={body}")
+
+
def check_dns_event_requires_secret(cli: Client) -> None:
token = "t-" + uuid.uuid4().hex
# Non-loopback is the semantic concern. For in-process TestClient the
@@ -346,6 +365,7 @@ def run(cli: Client) -> None:
     print("\n[*] Transparency log")
     check_tlog_head(cli)
+    check_tlog_range(cli)
     print("\n[*] CORS")
     check_cors_headers(cli)
tests/test_registry_unit.py +28 -1
@@ -213,6 +213,32 @@ def t5_operator_token_gates_write_side_apis_when_configured():
print(" [PASS] optional operator token gates write-side APIs")
+def t6_tlog_range_fails_closed_on_corrupt_leaf():
+    original_tlog = registry_server.TLOG
+    td = os.path.join(ROOT, ".tmp-tests", f"registry-range-{uuid.uuid4().hex}")
+    os.makedirs(td, exist_ok=False)
+    try:
+        registry_server.TLOG = TransparencyLog(td)
+        registry_server.TLOG.append({"event": "register", "file_id": "f"})
+        out = registry_server.tlog_range(start=0, limit=1)
+        assert out["count"] == 1
+        assert out["entries"][0]["index"] == 0
+
+        with open(os.path.join(td, "leaves.jsonl"), "w", encoding="utf-8") as f:
+            f.write("{not-json}\n")
+        try:
+            registry_server.tlog_range(start=0, limit=1)
+        except HTTPException as exc:
+            assert exc.status_code == 500
+            assert "tlog range validation failed" in exc.detail
+        else:
+            raise AssertionError("corrupt tlog range should fail closed")
+    finally:
+        registry_server.TLOG = original_tlog
+        shutil.rmtree(td, ignore_errors=True)
+    print(" [PASS] tlog range rejects corrupt leaf records")
+
+
def main():
print("=" * 60)
print(" registry.server - focused unit tests")
@@ -222,8 +248,9 @@ def main():
     t3_dns_event_requires_secret_for_non_loopback()
     t4_evidence_bundle_can_attach_tlog_proofs()
     t5_operator_token_gates_write_side_apis_when_configured()
+    t6_tlog_range_fails_closed_on_corrupt_leaf()
     print()
-    print(" ALL TESTS PASSED - 5/5")
+    print(" ALL TESTS PASSED - 6/6")
if __name__ == "__main__":
tests/test_tlog_unit.py +45 -1
@@ -7,6 +7,7 @@ Focused transparency-log checks around RFC 6962 behavior.
from __future__ import annotations
import hashlib
+import json
import shutil
import sys
import uuid
@@ -34,6 +35,47 @@ def t1_empty_tree_root_matches_rfc6962():
ok("empty transparency log root matches RFC 6962")
+def t2_reopen_rejects_corrupt_leaf_record():
+    td = ROOT / ".tmp-tests" / f"tlog-{uuid.uuid4().hex}"
+    td.mkdir(parents=True, exist_ok=False)
+    try:
+        (td / "leaves.jsonl").write_text("{not-json}\n", encoding="utf-8")
+        try:
+            TransparencyLog(td)
+        except ValueError:
+            pass
+        else:
+            raise AssertionError("corrupt tlog leaf should fail closed on load")
+    finally:
+        shutil.rmtree(td, ignore_errors=True)
+    ok("corrupt transparency log leaf fails closed on load")
+
+
+def t3_range_records_validate_disk_leaf_hashes():
+    td = ROOT / ".tmp-tests" / f"tlog-{uuid.uuid4().hex}"
+    td.mkdir(parents=True, exist_ok=False)
+    try:
+        tlog = TransparencyLog(td)
+        tlog.append({"event": "register", "file_id": "f1"})
+        records = tlog.range_records(0, 1)
+        assert records[0]["index"] == 0
+        assert "leaf_data_hex" in records[0]
+
+        rec = json.loads((td / "leaves.jsonl").read_text(encoding="utf-8"))
+        rec["leaf_data"] = "tampered"
+        rec.pop("leaf_data_hex", None)
+        (td / "leaves.jsonl").write_text(json.dumps(rec) + "\n", encoding="utf-8")
+        try:
+            tlog.range_records(0, 1)
+        except ValueError as exc:
+            assert "leaf hash mismatch" in str(exc)
+        else:
+            raise AssertionError("tampered leaf should fail closed during range read")
+    finally:
+        shutil.rmtree(td, ignore_errors=True)
+    ok("range_records validates leaf payload hashes")
+
+
def main():
tmp_root = ROOT / ".tmp-tests"
tmp_root.mkdir(exist_ok=True)
@@ -41,8 +83,10 @@ def main():
     print(" oversight_core.tlog - focused unit tests")
     print("=" * 60)
     t1_empty_tree_root_matches_rfc6962()
+    t2_reopen_rejects_corrupt_leaf_record()
+    t3_range_records_validate_disk_leaf_hashes()
     print()
-    print(" ALL TESTS PASSED - 1/1")
+    print(" ALL TESTS PASSED - 3/3")
if __name__ == "__main__":