Harden Rust tlog range reads

Route /tlog/range through validated transparency-log records so malformed or hash-mismatched leaves fail closed instead of being skipped. Add tlog range tests and update registry deployment/spec docs.

Co-authored-by: Codex (GPT-5.4) <noreply@openai.com>
c7c1041   Zion Boggan committed on May 28, 2026
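
To illustrate the behavior change this commit describes, here is a minimal sketch contrasting a reader that skips unparseable lines with one that fails closed. It is not the registry's code: the `index:payload` line format, `Record`, `RangeError`, `range_lenient`, and `range_strict` are all hypothetical stand-ins, and the leaf-hash check is left out so the sketch needs only the standard library.

```rust
// Hypothetical simplified leaf record: one "<index>:<payload>" per line.
#[derive(Debug, PartialEq)]
struct Record {
    index: usize,
    payload: String,
}

#[derive(Debug, PartialEq)]
enum RangeError {
    Malformed { line: usize },
    IndexMismatch { expected: usize, found: usize },
}

// Parse one line; None means the line is malformed.
fn parse_fields(line: &str) -> Option<Record> {
    let (idx, payload) = line.split_once(':')?;
    Some(Record {
        index: idx.trim().parse().ok()?,
        payload: payload.to_string(),
    })
}

// Lenient reader: unparseable lines silently vanish from the result.
fn range_lenient(lines: &[&str], start: usize, limit: usize) -> Vec<Record> {
    lines
        .iter()
        .skip(start)
        .filter_map(|l| parse_fields(l))
        .take(limit)
        .collect()
}

// Strict reader: every line up to the end of the window must parse and
// carry the expected index, otherwise the whole request fails.
fn range_strict(
    lines: &[&str],
    start: usize,
    limit: usize,
) -> Result<Vec<Record>, RangeError> {
    if limit == 0 || start >= lines.len() {
        return Ok(Vec::new());
    }
    let end = start.saturating_add(limit).min(lines.len());
    let mut out = Vec::with_capacity(end - start);
    for (expected, line) in lines.iter().enumerate().take(end) {
        let rec = parse_fields(line).ok_or(RangeError::Malformed { line: expected })?;
        if rec.index != expected {
            return Err(RangeError::IndexMismatch {
                expected,
                found: rec.index,
            });
        }
        if rec.index >= start {
            out.push(rec);
        }
    }
    Ok(out)
}
```

Given the lines `"0:a"`, `"garbage"`, `"2:c"`, the lenient reader returns two records with no sign that one is missing, while the strict reader returns an error naming the bad line.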
CHANGELOG.md +4 -1
@@ -36,7 +36,10 @@
tlog leaf payload so an index cannot point at unrelated evidence and still
pass burn-in checks. Local tlog recovery now rejects malformed records,
non-contiguous indexes, and leaf-hash mismatches instead of silently
- ignoring corrupted lines during startup or validation.
+ ignoring corrupted lines during startup or validation. `/tlog/range` now
+ reads through the same validated tlog API, so malformed or hash-mismatched
+ records fail the range request instead of being silently omitted from
+ monitor responses.
- **GitHub Actions runtime hygiene.** Main CI workflows opt into the GitHub
Actions Node 24 runtime before the hosted runner default changes.
- **Rust policy test parity.** Fixed the `oversight-policy` crate's manifest
README.md +5 -2
@@ -104,6 +104,9 @@ The validator also checks that event rows point at matching tlog leaf payloads,
not just in-range indexes.
The local transparency log now fails closed when recovered leaf records are
malformed, out of sequence, or hash-mismatched.
+The Rust registry's `/tlog/range` endpoint uses those validated leaf records
+too, so federated monitors cannot receive a partial range with corrupted lines
+silently skipped.
The next Rust-registry gate is operational burn-in: longer-running deployment
tests against real operator databases and a final wire-format stability
@@ -445,10 +448,10 @@ current stable line.
| Rust oversight-registry | 11 | green |
| Rust oversight-rekor | 10 | green |
| Rust oversight-semantic | 8 | green |
-| Rust oversight-tlog | 12 | green |
+| Rust oversight-tlog | 14 | green |
| Rust oversight-watermark | 4 | green |
| Cross-language conformance | 3 | green |
-| Total automated Rust unit tests | 133 | all green |
+| Total automated Rust unit tests | 135 | all green |
## Design principles (what Oversight never does)
docs/REGISTRY_DEPLOYMENT.md +6 -3
@@ -142,6 +142,9 @@ mismatches, malformed manifest JSON, invalid manifest signatures, and
manifest/file ID divergence. Keep the Python database as a rollback artifact
until validation, live conformance, and evidence-bundle checks pass against
the Rust service.
+The Rust `/tlog/range` route also reads through validated tlog records, so
+malformed or hash-mismatched local leaf data blocks the range response instead
+of disappearing from monitor output.
## Rust Registry Burn-In Checklist
@@ -157,8 +160,8 @@ reference registry to the Rust Axum registry:
5. Start the Rust registry on loopback with `OVERSIGHT_OPERATOR_TOKEN` and
`OVERSIGHT_DNS_EVENT_SECRET` set.
6. Run the live registry v1 conformance harness against the Rust endpoint.
-7. Fetch `/.well-known/oversight-registry`, `/tlog/head`, and at least one
- `/evidence/{file_id}` bundle, then verify the evidence bundle with an
- independent client.
+7. Fetch `/.well-known/oversight-registry`, `/tlog/head`, `/tlog/range`,
+ and at least one `/evidence/{file_id}` bundle, then verify the evidence
+ bundle with an independent client.
8. Keep the Python database and tlog as rollback artifacts until the Rust
service has completed the operator's burn-in window.
docs/ROADMAP.md +3 -0
@@ -253,6 +253,9 @@ matches the event row rather than unrelated evidence.
As of 2026-05-25, local tlog recovery rejects malformed leaf records,
non-contiguous indexes, and leaf-hash mismatches instead of silently ignoring
corrupted lines.
+As of 2026-05-28, `/tlog/range` reads through the validated tlog record API
+instead of parsing `leaves.jsonl` directly, so monitor responses fail closed
+when an on-disk leaf is malformed or hash-mismatched.
Remaining work: longer-running deployment tests and a wire-format stability
declaration before declaring v1.0 ready.
docs/spec/registry-v1.md +2 -0
@@ -285,6 +285,8 @@ carry `leaf_data_hex`. `leaf_data_hex`, when present, is the exact leaf
bytes encoded as lowercase hex. Verifiers MUST recompute
`SHA-256(0x00 || leaf_bytes)` and compare it to `leaf_hash`; legacy
entries without `leaf_data_hex` use the UTF-8 bytes of `leaf_data`.
+Registries MUST fail a range request rather than omit malformed,
+non-contiguous, or hash-mismatched records from the requested window.
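
The rule above binds registries, but a monitor could also check a range response defensively. The following is a rough sketch of such a client-side contiguity check, assuming the monitor has already extracted the `index` of each returned entry. `check_contiguous` and its error tuple are hypothetical and not part of the spec or the Rust crates.

```rust
// Hypothetical monitor-side check: entries returned for a window that
// begins at `start` should carry indexes start, start + 1, start + 2, ...
// On the first deviation, returns Err((expected, found)).
fn check_contiguous(start: usize, indexes: &[usize]) -> Result<(), (usize, usize)> {
    for (offset, &found) in indexes.iter().enumerate() {
        // saturating_add avoids an overflow panic for extreme `start` values.
        let expected = start.saturating_add(offset);
        if found != expected {
            return Err((expected, found));
        }
    }
    Ok(())
}
```

An empty response passes trivially, so a monitor would still want to compare the returned count against the tree size reported by `/tlog/head`.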
## Beacon endpoints
oversight-rust/oversight-registry/src/routes/tlog.rs +4 -30
@@ -1,7 +1,6 @@
use axum::extract::{Path, Query, State};
use axum::Json;
use serde::Deserialize;
-use std::io::{BufRead, BufReader};
use std::sync::Arc;
use crate::error::{RegistryError, Result};
@@ -44,35 +43,10 @@ pub async fn tlog_range(
Query(params): Query<RangeParams>,
) -> Result<Json<serde_json::Value>> {
let limit = params.limit.clamp(1, 1000);
- let leaves_path = state.tlog.data_dir().join("leaves.jsonl");
- if !leaves_path.exists() {
- return Ok(Json(serde_json::json!({
- "start": params.start,
- "count": 0,
- "entries": [],
- })));
- }
-
- let file = std::fs::File::open(&leaves_path)
- .map_err(|e| RegistryError::Internal(format!("could not open tlog leaves: {e}")))?;
- let reader = BufReader::new(file);
- let mut entries = Vec::new();
- for (idx, line) in reader.lines().enumerate() {
- if idx < params.start {
- continue;
- }
- if entries.len() >= limit {
- break;
- }
- let line =
- line.map_err(|e| RegistryError::Internal(format!("could not read tlog leaf: {e}")))?;
- if line.trim().is_empty() {
- continue;
- }
- if let Ok(value) = serde_json::from_str::<serde_json::Value>(&line) {
- entries.push(value);
- }
- }
+ let entries = state
+ .tlog
+ .range_records(params.start, limit)
+ .map_err(|e| RegistryError::Internal(format!("could not read tlog range: {e}")))?;
Ok(Json(serde_json::json!({
"start": params.start,
oversight-rust/oversight-tlog/src/lib.rs +95 -24
@@ -52,6 +52,8 @@ pub enum TlogError {
BadLeafHashLength { index: usize, len: usize },
#[error("leaf hash mismatch at index {0}")]
LeafHashMismatch(usize),
+ #[error("leaf record missing at index {0}")]
+ LeafRecordMissing(usize),
}
pub type Result<T> = std::result::Result<T, TlogError>;
@@ -217,27 +219,8 @@ impl TransparencyLog {
if line.trim().is_empty() {
continue;
}
- let rec: LeafRecord = serde_json::from_str(&line)?;
- let expected_index = leaves.len();
- if rec.index != expected_index {
- return Err(TlogError::LeafIndexMismatch {
- expected: expected_index,
- found: rec.index,
- });
- }
- let bytes = hex::decode(&rec.leaf_hash)?;
- if bytes.len() != 32 {
- return Err(TlogError::BadLeafHashLength {
- index: rec.index,
- len: bytes.len(),
- });
- }
- let mut arr = [0u8; 32];
- arr.copy_from_slice(&bytes);
- if arr != leaf_hash_for_data(&leaf_data_bytes(&rec)?) {
- return Err(TlogError::LeafHashMismatch(rec.index));
- }
- leaves.push(arr);
+ let rec = parse_leaf_record(&line, leaves.len())?;
+ leaves.push(leaf_hash_bytes(&rec)?);
}
}
@@ -365,22 +348,55 @@ impl TransparencyLog {
}
pub fn leaf_record(&self, index: usize) -> Result<Option<LeafRecord>> {
- if index >= self.size() {
+ let leaves = self.leaves.lock().unwrap();
+ if index >= leaves.len() {
return Ok(None);
}
let f = File::open(&self.leaves_path)?;
let reader = BufReader::new(f);
+ let mut expected_index = 0usize;
for line in reader.lines() {
let line = line?;
if line.trim().is_empty() {
continue;
}
- let rec: LeafRecord = serde_json::from_str(&line)?;
+ let rec = parse_leaf_record(&line, expected_index)?;
if rec.index == index {
return Ok(Some(rec));
}
+ expected_index += 1;
+ }
+ Err(TlogError::LeafRecordMissing(index))
+ }
+
+ pub fn range_records(&self, start: usize, limit: usize) -> Result<Vec<LeafRecord>> {
+ let leaves = self.leaves.lock().unwrap();
+ if limit == 0 || start >= leaves.len() {
+ return Ok(Vec::new());
+ }
+ let end = start.saturating_add(limit).min(leaves.len());
+ let f = File::open(&self.leaves_path)?;
+ let reader = BufReader::new(f);
+ let mut expected_index = 0usize;
+ let mut records = Vec::with_capacity(end - start);
+ for line in reader.lines() {
+ let line = line?;
+ if line.trim().is_empty() {
+ continue;
+ }
+ if expected_index >= end {
+ break;
+ }
+ let rec = parse_leaf_record(&line, expected_index)?;
+ if rec.index >= start {
+ records.push(rec);
+ }
+ expected_index += 1;
+ }
+ if expected_index < end {
+ return Err(TlogError::LeafRecordMissing(expected_index));
}
- Ok(None)
+ Ok(records)
}
pub fn data_dir(&self) -> &Path {
@@ -395,6 +411,19 @@ fn leaf_hash_for_data(leaf_data: &[u8]) -> [u8; 32] {
h(&prefixed)
}
+fn leaf_hash_bytes(rec: &LeafRecord) -> Result<[u8; 32]> {
+ let bytes = hex::decode(&rec.leaf_hash)?;
+ if bytes.len() != 32 {
+ return Err(TlogError::BadLeafHashLength {
+ index: rec.index,
+ len: bytes.len(),
+ });
+ }
+ let mut arr = [0u8; 32];
+ arr.copy_from_slice(&bytes);
+ Ok(arr)
+}
+
fn leaf_data_bytes(rec: &LeafRecord) -> Result<Vec<u8>> {
rec.leaf_data_hex
.as_deref()
@@ -404,6 +433,21 @@ fn leaf_data_bytes(rec: &LeafRecord) -> Result<Vec<u8>> {
.map_err(TlogError::Hex)
}
+fn parse_leaf_record(line: &str, expected_index: usize) -> Result<LeafRecord> {
+ let rec: LeafRecord = serde_json::from_str(line)?;
+ if rec.index != expected_index {
+ return Err(TlogError::LeafIndexMismatch {
+ expected: expected_index,
+ found: rec.index,
+ });
+ }
+ let leaf_hash = leaf_hash_bytes(&rec)?;
+ if leaf_hash != leaf_hash_for_data(&leaf_data_bytes(&rec)?) {
+ return Err(TlogError::LeafHashMismatch(rec.index));
+ }
+ Ok(rec)
+}
+
// serde_json needs this little helper for custom errors
trait JsonErrorExt {
fn custom(msg: &'static str) -> Self;
@@ -619,4 +663,31 @@ mod tests {
);
assert!(tl.leaf_record(1).unwrap().is_none());
}
+
+ #[test]
+ fn range_records_returns_requested_records() {
+ let (_d, tl) = mktlog();
+ for event in ["event_a", "event_b", "event_c"] {
+ tl.append(event.as_bytes()).unwrap();
+ }
+ let records = tl.range_records(1, 2).unwrap();
+ assert_eq!(records.len(), 2);
+ assert_eq!(records[0].index, 1);
+ assert_eq!(records[0].leaf_data, "event_b");
+ assert_eq!(records[1].index, 2);
+ assert_eq!(records[1].leaf_data, "event_c");
+ assert!(tl.range_records(3, 10).unwrap().is_empty());
+ }
+
+ #[test]
+ fn range_records_rejects_corrupted_disk_record() {
+ let (dir, tl) = mktlog();
+ tl.append(b"event_a").unwrap();
+ std::fs::write(dir.path().join("leaves.jsonl"), "{not-json}\n").unwrap();
+ let err = match tl.range_records(0, 1) {
+ Ok(_) => panic!("corrupted tlog range succeeded"),
+ Err(err) => err,
+ };
+ assert!(matches!(err, TlogError::Json(_)));
+ }
}
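
The window arithmetic used by `range_records`, combined with the route's `clamp(1, 1000)`, can be exercised in isolation. This is a small sketch under the assumption that the two steps compose as shown in the diff. `clamp_limit` and `window` are hypothetical helper names introduced only for illustration.

```rust
// Mirrors the route's `params.limit.clamp(1, 1000)`.
fn clamp_limit(requested: usize) -> usize {
    requested.clamp(1, 1000)
}

// Mirrors the bounds logic in `range_records`: returns the half-open
// window [start, end) to read, or None when the result should be empty.
fn window(start: usize, limit: usize, len: usize) -> Option<(usize, usize)> {
    if limit == 0 || start >= len {
        return None;
    }
    // saturating_add keeps a huge `start + limit` from overflowing;
    // min(len) caps the window at the current log size.
    let end = start.saturating_add(limit).min(len);
    Some((start, end))
}
```

With three leaves, `window(1, 2, 3)` covers indexes 1 and 2, matching the first new test, and `window(3, 10, 3)` is empty, matching its final assertion.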