Zion Boggan (zionboggan.com)

Extend Rust registry validation

Co-authored-by: Codex (GPT-5.4) <noreply@openai.com>
db85803   Zion Boggan committed on May 21, 2026
CHANGELOG.md +2 -1
@@ -28,7 +28,8 @@
now checks migrated Rust registry databases for orphaned attribution rows,
identity mismatches, malformed manifest JSON, invalid manifest signatures,
and manifest/file ID divergence before operators declare migration burn-in
- complete.
+ complete. It also validates event/corpus JSON sidecars and tlog index
+ uniqueness so corrupted migrated evidence cannot look clean.
- **Rust policy test parity.** Fixed the `oversight-policy` crate's manifest
fixture after the v0.4.11 `Recipient.p256_pub` schema addition so the full
Rust workspace test suite compiles again.
README.md +6 -5
@@ -95,8 +95,8 @@ secrets, and shared write-side token enforcement across the Python FastAPI
and Rust Axum registries. The Rust registry also has Python-to-Rust SQLite
migration tooling (`--migrate-from`, `--migrate-dry-run`) and a native
`--validate-db` integrity report so operators can preflight, copy, and verify
-attribution rows without treating the Python reference as a permanent
-production dependency.
+attribution rows, event metadata, corpus metadata, and tlog indexes without
+treating the Python reference as a permanent production dependency.
The next Rust-registry gate is operational burn-in: longer-running deployment
tests against real operator databases and a final wire-format stability
@@ -420,9 +420,10 @@ a desktop opens the same way on a phone with the same answer.
The full integration contract, including the seven verifier-safe crates,
the crates that are explicitly out of scope for downstream embedding, the
git-plus-tag pin pattern, and the minimum versions for 32-bit mobile
-support, is documented at [`docs/EMBEDDING.md`](docs/EMBEDDING.md). v0.4.8
-is the recommended pin for any new embedder; older tags work but the
-project does not backport fixes below the current stable line.
+support, is documented at [`docs/EMBEDDING.md`](docs/EMBEDDING.md). v0.4.11
+is the recommended pin for any new embedder; v0.4.8 remains the minimum for
+32-bit Android portability, but the project does not backport fixes below the
+current stable line.
## Test coverage
docs/EMBEDDING.md +2 -1
@@ -122,4 +122,5 @@ to four Android ABIs and one iOS ABI.
- [`oversight-protocol/oversight-mobile`](https://github.com/oversight-protocol/oversight-mobile)
- Flutter + Rust verifier; embeds the seven verifier-safe crates via
- `flutter_rust_bridge`. As of mobile `v0.1.12`, pinned to oversight `v0.4.8`.
+ `flutter_rust_bridge`. Mobile `v0.1.13` tagged the `v0.4.9` pin; current
+ mobile `main` pins the same seven crates to oversight `v0.4.11`.
docs/REGISTRY_DEPLOYMENT.md +25 -3
@@ -134,6 +134,28 @@ oversight-registry \
The validation command prints JSON counts plus integrity failures for orphaned
beacons, watermarks, events, corpus rows, identity mismatches, malformed
-manifest JSON, invalid manifest signatures, and manifest/file ID divergence.
-Keep the Python database as a rollback artifact until validation, live
-conformance, and evidence-bundle checks pass against the Rust service.
+event `extra` JSON, malformed corpus metadata JSON, duplicate or negative
+tlog indexes, malformed manifest JSON, invalid manifest signatures, and
+manifest/file ID divergence. Keep the Python database as a rollback artifact
+until validation, live conformance, and evidence-bundle checks pass against
+the Rust service.
+
+## Rust Registry Burn-In Checklist
+
+Run this checklist before switching production traffic from the Python
+reference registry to the Rust Axum registry:
+
+1. Take a cold copy of the Python SQLite database and keep the original
+ mounted read-only during migration testing.
+2. Run `--migrate-dry-run` and compare all row counts against the source
+ database.
+3. Run the real `--migrate-from` into a fresh Rust database.
+4. Run `--validate-db` and treat any nonzero integrity-failure count as a
+ deployment blocker; the plain row counts are expected to be nonzero.
+5. Start the Rust registry on loopback with `OVERSIGHT_OPERATOR_TOKEN` and
+ `OVERSIGHT_DNS_EVENT_SECRET` set.
+6. Run the live registry v1 conformance harness against the Rust endpoint.
+7. Fetch `/.well-known/oversight-registry`, `/tlog/head`, and at least one
+ `/evidence/{file_id}` bundle, then verify the evidence bundle with an
+ independent client.
+8. Keep the Python database and tlog as rollback artifacts until the Rust
+ service has completed the operator's burn-in window.
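Step 4 of the checklist can be sketched as a small gate over the failure counters in the validation report. This is a minimal illustration, not part of the commit: the helper name `deployment_blockers` and the idea of passing the counters as `(name, value)` pairs are assumptions, and the field names used in the usage note are the ones that appear in `RegistryIntegrityReport` in this diff.

```rust
/// Hypothetical helper (not part of oversight-registry): given
/// (field name, value) pairs taken from the integrity report's failure
/// counters, return the name of every counter that is not zero.
/// Plain row counts (manifests, beacons, ...) should not be passed in,
/// because they are expected to be nonzero on a healthy database.
fn deployment_blockers<'a>(failure_counters: &[(&'a str, i64)]) -> Vec<&'a str> {
    failure_counters
        .iter()
        .filter(|(_, value)| *value != 0)
        .map(|(name, _)| *name)
        .collect()
}
```

Called with, for example, `[("malformed_event_extra_json", 1), ("negative_event_tlog_indexes", 0)]`, the helper returns only `malformed_event_extra_json`; an empty result means no counter blocks the cutover.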
docs/ROADMAP.md +4 -3
@@ -242,9 +242,10 @@ Python registry's manifests, beacons, watermarks, events, and corpus rows
into the Rust SQLite schema, with `--migrate-dry-run` for count-only
preflight. As of 2026-05-20, `--validate-db` checks the copied Rust database
for orphan rows, identity mismatches, malformed manifest JSON, invalid
-manifest signatures, and manifest/file ID divergence. Remaining work:
-longer-running deployment tests and a wire-format stability declaration before
-declaring v1.0 ready.
+manifest signatures, and manifest/file ID divergence. As of 2026-05-21, that
+validation also covers event/corpus JSON sidecars and tlog index uniqueness.
+Remaining work: longer-running deployment tests and a wire-format stability
+declaration before declaring v1.0 ready.
---
oversight-rust/oversight-registry/src/db.rs +68 -2
@@ -38,6 +38,10 @@ pub struct RegistryIntegrityReport {
pub beacon_identity_mismatches: i64,
pub watermark_identity_mismatches: i64,
pub event_identity_mismatches: i64,
+ pub malformed_event_extra_json: i64,
+ pub malformed_corpus_metadata_json: i64,
+ pub duplicate_event_tlog_indexes: i64,
+ pub negative_event_tlog_indexes: i64,
pub malformed_manifest_json: i64,
pub invalid_manifest_signatures: i64,
pub mismatched_manifest_file_ids: i64,
@@ -260,6 +264,36 @@ pub async fn validate_registry_integrity(pool: &SqlitePool) -> Result<RegistryIn
"SELECT COUNT(*) FROM events e JOIN manifests m ON e.file_id = m.file_id WHERE (e.recipient_id IS NOT NULL AND e.recipient_id != m.recipient_id) OR (e.issuer_id IS NOT NULL AND e.issuer_id != m.issuer_id)",
)
.await?;
+ let duplicate_event_tlog_indexes = count_query(
+ pool,
+ "SELECT COALESCE(SUM(cnt - 1), 0) FROM (SELECT COUNT(*) AS cnt FROM events WHERE tlog_index IS NOT NULL GROUP BY tlog_index HAVING COUNT(*) > 1)",
+ )
+ .await?;
+ let negative_event_tlog_indexes = count_query(
+ pool,
+ "SELECT COUNT(*) FROM events WHERE tlog_index IS NOT NULL AND tlog_index < 0",
+ )
+ .await?;
+
+ let event_extra_rows: Vec<String> = sqlx::query_scalar(
+ "SELECT extra FROM events WHERE extra IS NOT NULL AND TRIM(extra) != ''",
+ )
+ .fetch_all(pool)
+ .await?;
+ let malformed_event_extra_json = event_extra_rows
+ .iter()
+ .filter(|extra| serde_json::from_str::<serde_json::Value>(extra).is_err())
+ .count() as i64;
+
+ let corpus_metadata_rows: Vec<String> = sqlx::query_scalar(
+ "SELECT metadata FROM corpus WHERE metadata IS NOT NULL AND TRIM(metadata) != ''",
+ )
+ .fetch_all(pool)
+ .await?;
+ let malformed_corpus_metadata_json = corpus_metadata_rows
+ .iter()
+ .filter(|metadata| serde_json::from_str::<serde_json::Value>(metadata).is_err())
+ .count() as i64;
let mut malformed_manifest_json = 0;
let mut invalid_manifest_signatures = 0;
@@ -292,6 +326,10 @@ pub async fn validate_registry_integrity(pool: &SqlitePool) -> Result<RegistryIn
&& beacon_identity_mismatches == 0
&& watermark_identity_mismatches == 0
&& event_identity_mismatches == 0
+ && malformed_event_extra_json == 0
+ && malformed_corpus_metadata_json == 0
+ && duplicate_event_tlog_indexes == 0
+ && negative_event_tlog_indexes == 0
&& malformed_manifest_json == 0
&& invalid_manifest_signatures == 0
&& mismatched_manifest_file_ids == 0;
@@ -306,6 +344,10 @@ pub async fn validate_registry_integrity(pool: &SqlitePool) -> Result<RegistryIn
beacon_identity_mismatches,
watermark_identity_mismatches,
event_identity_mismatches,
+ malformed_event_extra_json,
+ malformed_corpus_metadata_json,
+ duplicate_event_tlog_indexes,
+ negative_event_tlog_indexes,
malformed_manifest_json,
invalid_manifest_signatures,
mismatched_manifest_file_ids,
@@ -905,6 +947,10 @@ mod tests {
assert_eq!(report.counts.beacons, 1);
assert_eq!(report.malformed_manifest_json, 0);
assert_eq!(report.invalid_manifest_signatures, 0);
+ assert_eq!(report.malformed_event_extra_json, 0);
+ assert_eq!(report.malformed_corpus_metadata_json, 0);
+ assert_eq!(report.duplicate_event_tlog_indexes, 0);
+ assert_eq!(report.negative_event_tlog_indexes, 0);
pool.close().await;
let _ = std::fs::remove_dir_all(dir);
@@ -946,10 +992,26 @@ mod tests {
"dns",
None,
None,
- None,
+ Some("{"),
21,
None,
+ Some(-1),
+ )
+ .await
+ .unwrap();
+ insert_event(
+ &pool,
+ "token-1",
+ Some("file-1"),
+ Some("recipient-1"),
+ Some("issuer-1"),
+ "dns",
+ None,
None,
+ Some(r#"{"ok":true}"#),
+ 22,
+ None,
+ Some(7),
)
.await
.unwrap();
@@ -959,7 +1021,7 @@ mod tests {
.bind("missing-file")
.bind("perceptual")
.bind("phash-missing")
- .bind(None::<String>)
+ .bind("{")
.bind(21_i64)
.execute(&pool)
.await
@@ -972,6 +1034,10 @@ mod tests {
assert_eq!(report.orphan_events, 1);
assert_eq!(report.orphan_corpus, 1);
assert_eq!(report.malformed_manifest_json, 1);
+ assert_eq!(report.malformed_event_extra_json, 1);
+ assert_eq!(report.malformed_corpus_metadata_json, 1);
+ assert_eq!(report.duplicate_event_tlog_indexes, 1);
+ assert_eq!(report.negative_event_tlog_indexes, 1);
pool.close().await;
let _ = std::fs::remove_dir_all(dir);
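The two tlog queries added in `db.rs` count different things: `duplicate_event_tlog_indexes` sums the surplus rows in each group of repeated indexes (`cnt - 1` per group), while `negative_event_tlog_indexes` counts every row whose index is below zero. Both ignore `NULL` indexes. The sketch below mirrors that arithmetic in plain Rust over an in-memory column, purely to make the counting rule concrete; `tlog_index_failures` is a hypothetical name and the function is not part of the commit.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory mirror of the two SQL checks.
/// Returns (duplicate_rows, negative_rows) for a column of optional
/// tlog indexes, where `None` stands for a NULL `tlog_index`.
fn tlog_index_failures(indexes: &[Option<i64>]) -> (i64, i64) {
    let mut occurrences: HashMap<i64, i64> = HashMap::new();
    let mut negative: i64 = 0;

    // `flatten` skips the `None` entries, matching `tlog_index IS NOT NULL`.
    for index in indexes.iter().flatten() {
        *occurrences.entry(*index).or_insert(0) += 1;
        if *index < 0 {
            negative += 1;
        }
    }

    // For each index seen more than once, every row after the first is a
    // duplicate, matching `SUM(cnt - 1)` over groups `HAVING COUNT(*) > 1`.
    let duplicates: i64 = occurrences
        .values()
        .filter(|&&count| count > 1)
        .map(|count| count - 1)
        .sum();

    (duplicates, negative)
}
```

Under this rule an index that appears three times contributes two duplicates, and a negative index that is also repeated is reported by both counters.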