Zion Boggan zionboggan.com ↗

Harden registry v1 interop spec and ship conformance harness

Registry federation was aspirational while the spec drifted ahead of
the reference server. v0.4.7 closes that gap.

- docs/spec/registry-v1.md: canonicalization algorithm pinned
  (sort_keys + compact separators, UTF-8), uniform error envelope and
  code vocabulary, full endpoint table including normative beacon
  paths (/p/.png, /r/, /v/), /.well-known shape, /evidence bundle
  fields, /tlog/head|proof|range. Removed a phantom /query/{file_id}
  endpoint that the reference never shipped.
- tests/test_registry_conformance.py: 32-check harness with two
  modes. In-process against a FastAPI TestClient for CI, or against
  a live URL when OVERSIGHT_REGISTRY_URL is set. Covers identity,
  liveness, a full signed-manifest round trip, sidecar mismatch
  rejection, attribution, evidence bundle shape, transparency log
  head, every beacon endpoint, and DNS event auth.
- docs/ROADMAP.md: federation item marked hardened and references
  the harness as the v1 acceptance gate.
- README.md / CHANGELOG.md: v0.4.7 release notes.
- Version bumped to 0.4.7. No breaking changes.
9c1e42c   Zion Boggan committed on Apr 22, 2026 (2 months ago)
CHANGELOG.md +25 -0
@@ -1,5 +1,30 @@
# Oversight CHANGELOG
+## v0.4.7 - 2026-04-22 Registry federation hardening and conformance harness
+
+Federation stops being aspirational when a second operator can prove
+compatibility. v0.4.7 hardens the registry v1 interop spec against the
+reference implementation and ships a conformance harness that any
+operator can point at their deployment.
+
+- `docs/spec/registry-v1.md`: expanded with the canonicalization algorithm
+ (`json.dumps(sort_keys=True, separators=(",", ":"))` over UTF-8), the
+ uniform error envelope and `code` vocabulary, a full endpoint table
+ including the normative beacon paths (`/p/{token_id}.png`, `/r/{token_id}`,
+ `/v/{token_id}`), the `/.well-known/oversight-registry` shape, the
+ `/evidence/{file_id}` bundle fields, and the `/tlog/head|proof|range`
+ endpoints federated verifiers rely on. Removed a phantom
+ `/query/{file_id}` endpoint that was in the draft but never shipped.
+- `tests/test_registry_conformance.py`: 32-check harness with two modes.
+ In-process against a FastAPI `TestClient` for CI, or against a live URL
+ when `OVERSIGHT_REGISTRY_URL` is set. Covers identity, liveness, a full
+ signed-manifest registration round trip, attribution by token id,
+ evidence bundle shape, transparency-log head, every beacon endpoint,
+ and DNS event authentication.
+- `docs/ROADMAP.md`: the registry federation item references the harness
+ as the acceptance gate for federation.
+- Version bumped to `0.4.7`. No breaking changes.
+
## v0.4.6 - 2026-04-22 SIEM export: Splunk, Sentinel, and Elastic
Registry beacon events can now be emitted in three SIEM-native formats so
README.md +17 -0
@@ -109,6 +109,23 @@ The attribute command runs a 5-phase pipeline:
4. **Multi-layer Bayesian fusion** combining all evidence into ranked candidates
5. **Content fingerprint comparison** (winnowing + sentence hashing) as a last resort when all watermarks are stripped
+## What's new in v0.4.7
+
+**Registry federation hardening.** `docs/spec/registry-v1.md` now
+specifies the canonicalization algorithm, the uniform error envelope
+and code vocabulary, the full endpoint list including the normative
+beacon paths, the `/.well-known/oversight-registry` shape, and the
+`/evidence` bundle fields. The spec matches what the reference
+registry actually serves, so an independent implementation can target
+something real instead of something aspirational.
+
+**Conformance harness.** `tests/test_registry_conformance.py` is a
+32-check test that runs either against the reference registry
+in-process (CI) or against any live URL
+(`OVERSIGHT_REGISTRY_URL=https://registry.example.org python3
+tests/test_registry_conformance.py`). An independent operator who
+passes the harness can claim v1 compatibility.
+
## What's new in v0.4.6
**SIEM export.** Registry beacon events can now be emitted in three
docs/ROADMAP.md +1 -1
@@ -9,7 +9,7 @@ The launch plan is now gated on product usability and threat-model honesty:
3. **Outlook add-in only** for the first ecosystem integration. Defer Drive, Box, SharePoint, and Teams plugins until there is a maintainer or design partner paying for them.
4. **SIEM integration before SOC 2**: prioritize Splunk HEC, Microsoft Sentinel, and Elastic Common Schema exports because they are fast and high enterprise ROI. *Formatters, the `oversight siem export` CLI, and the operator guide shipped in v0.4.6; see `docs/SIEM.md`.*
5. **SOC 2 Type 1 scoping** is realistic after a design partner. ISO 27001 comes after SOC 2. **FedRAMP is dropped from near-term planning**; it is a multi-year commercial program requiring sponsor-agency backing.
-6. **Registry federation**: publish and harden `docs/spec/registry-v1.md` during the Rust Axum/SQLx registry work so a second operator can run a compatible registry.
+6. **Registry federation**: publish and harden `docs/spec/registry-v1.md` during the Rust Axum/SQLx registry work so a second operator can run a compatible registry. *Spec hardened and a conformance harness at `tests/test_registry_conformance.py` landed in v0.4.7; an operator runs it with `OVERSIGHT_REGISTRY_URL=<url> python3 tests/test_registry_conformance.py` to claim v1 compatibility.*
Correct public-launch sequence:
docs/spec/registry-v1.md +264 -59
@@ -1,45 +1,141 @@
# Oversight Registry v1 Interop Draft
-Status: draft; wire format is not stable until v1.0.
-
-This document defines the minimum interoperable registry surface for an
-independent Oversight registry operator. It follows OpenAPI 3.1 conventions for
-schema shape and keeps Oversight-specific policy out of the transport where
-possible.
+Status: draft; the wire format is not stable until Oversight v1.0. This
+document tracks the surface a second operator needs to implement to run
+a registry that the Python and Rust reference clients can treat as
+interchangeable with the origin deployment.
## Goals
-- Let more than one operator run a compatible attribution registry.
-- Preserve issuer-signed manifest authority: request sidecars MUST match the
- manifest's signed `beacons` and `watermarks` arrays.
-- Keep beacon callbacks passive and authenticated between DNS/web beacon
- collectors and the registry.
-- Preserve local or public transparency-log evidence for every registration
- and event.
+- Let more than one operator run a compatible attribution registry so
+ "open protocol" is a property of the code and not of a hostname.
+- Preserve issuer-signed manifest authority: every registration sidecar
+ MUST match the manifest's signed `beacons` and `watermarks` arrays
+ byte for byte.
+- Keep beacon callbacks authenticated between DNS or web beacon
+ collectors and the registry so spoofed events cannot pollute the
+ attribution record.
+- Preserve local or public transparency-log evidence for every
+ registration and every event, and expose proofs that a federated
+ verifier can fetch without trusting the operator.
## Common Requirements
-- All JSON request bodies SHOULD be UTF-8 encoded.
-- Registries MUST reject unknown oversized identifiers. The reference limit is
- 256 bytes for `file_id`, `mark_id`, `token_id`, `recipient_id`, and
- `issuer_id`.
-- Registries MUST verify the Ed25519 signature on the manifest before writing
- beacons, watermarks, corpus hashes, Rekor entries, or tlog events.
-- Registries MUST NOT accept beacon or watermark sidecars that differ from the
- issuer-signed manifest copies.
-- DNS event callbacks from non-loopback clients MUST authenticate with
- `X-Oversight-DNS-Secret` or an equivalent deployment-specific channel.
+### Transport
+
+- All request and response bodies are JSON unless a specific endpoint
+ says otherwise. Content-Type MUST be `application/json; charset=utf-8`
+ for request bodies that carry one.
+- Registries MUST reject identifiers larger than 256 bytes for each of
+ `file_id`, `mark_id`, `token_id`, `recipient_id`, and `issuer_id`.
+- Registries SHOULD apply a per-client rate limit and return HTTP 429
+ with the standard error envelope when exceeded.
+
+### Canonicalization
+
+The manifest signature is computed over a canonical JSON serialization
+with the following exact rules. Implementations that deviate cannot
+verify manifests produced by the reference client.
+
+1. Serialize the manifest dictionary with recursively sorted keys.
+2. Use the separators `","` and `":"` with no whitespace.
+3. Encode the resulting string as UTF-8 before feeding it to the
+ Ed25519 verifier.
+4. The `signature_ed25519` field is stripped before canonicalization
+ and re-attached to the signed object before it is wire-transmitted.
+
+In Python the canonical form matches
+`json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")`.
+In Rust the reference implementation uses the `canonical_json` crate
+with identical output. The cross-language conformance suite pins this.
+
+### Signature verification
+
+- Registries MUST verify `manifest.signature_ed25519` before writing
+ any beacon, watermark, corpus hash, Rekor entry, or transparency-log
+ event.
+- Registries MUST NOT accept beacon or watermark sidecars that differ
+ from the manifest's signed arrays. Comparison uses the canonicalized
+ per-item JSON after sorting by canonical bytes.
+- Re-registration under the same `file_id` MUST require the same
+ `issuer_ed25519_pub` as the original record. A mismatch returns
+ HTTP 409.
+
+### Error envelope
+
+Non-2xx responses MUST carry a JSON envelope:
+
+```json
+{"error": {"code": "signature_invalid", "message": "manifest signature invalid"}}
+```
+
+Implementations MAY include additional fields under `error` (for
+example, `retry_after` on 429), but consumers rely only on `code`
+and `message`.
+
+The defined `code` values in v1:
+
+| Code | HTTP | When |
+|------|------|------|
+| `missing_field` | 400 | A required field is absent |
+| `signature_invalid` | 400 | Manifest Ed25519 verification failed |
+| `sidecar_mismatch` | 400 | Request beacons or watermarks differ from the signed manifest |
+| `issuer_mismatch` | 409 | `file_id` already registered under a different issuer pubkey |
+| `auth_required` | 401 | DNS event callback missing required secret |
+| `rate_limited` | 429 | Client exceeded per-key token bucket |
+| `not_found` | 404 | Queried record does not exist |
+| `server_error` | 500 | Registry internal failure |
## Endpoints
| Method | Path | Purpose |
|--------|------|---------|
-| `GET` | `/health` | Service health and tlog size |
+| `GET` | `/health` | Liveness and local tlog size |
+| `GET` | `/.well-known/oversight-registry` | Registry identity advertisement |
| `POST` | `/register` | Register signed manifest, beacons, watermarks, optional corpus hashes |
-| `POST` | `/attribute` | Look up attribution by `token_id`, `mark_id`, or perceptual/content hash |
-| `GET` | `/query/{file_id}` | Return manifest ownership plus registered beacons/watermarks |
+| `POST` | `/attribute` | Look up attribution by `token_id`, `mark_id`, or perceptual hash |
| `POST` | `/dns_event` | Authenticated DNS beacon callback |
-| `GET` | `/evidence/{file_id}` | Evidence bundle with manifest, events, tlog proofs, and signed tree head |
+| `GET` | `/evidence/{file_id}` | Evidence bundle with manifest, events, tlog proofs, and signed tree head |
+| `GET` | `/tlog/head` | Current signed tree head for the local transparency log |
+| `GET` | `/tlog/proof/{index}` | Inclusion proof for a local tlog entry |
+| `GET` | `/tlog/range` | Entry range, used by federated verifiers or monitors |
+| `GET` | `/p/{token_id}.png` | HTTP pixel beacon, records an event |
+| `GET` | `/r/{token_id}`, `/ocsp/r/{token_id}` | OCSP-shaped beacon, records an event |
+| `GET` | `/v/{token_id}`, `/lic/v/{token_id}` | License-check beacon, records an event |
+| `GET` | `/candidates/semantic` | Recent L3 mark IDs for scraper-style verification |
+
+## `/health`
+
+```json
+{"status": "ok", "service": "oversight-registry", "version": "0.2.1", "tlog_size": 42}
+```
+
+`status` is `"ok"` or `"degraded"`. `service` MUST begin with
+`oversight-registry` so identity cannot be counterfeited without an
+intentional lie. `tlog_size` is the current local transparency-log
+leaf count.
+
+## `/.well-known/oversight-registry`
+
+```json
+{
+ "ed25519_pub": "<hex>",
+ "version": "0.2.1",
+ "jurisdiction": "GLOBAL",
+ "tlog_size": 42,
+ "federation": {
+ "spec_version": "v1",
+ "canonicalization": "json-sort-keys-compact-utf8",
+ "rekor_enabled": true
+ }
+}
+```
+
+`ed25519_pub` is the registry's own signing key hex and is the stable
+identifier a federated verifier uses to tell operators apart.
+`federation.spec_version` MUST be `"v1"` for registries that implement
+this document. Unknown `federation.*` fields MUST be ignored by
+consumers so the shape can extend without breaking older clients.
## `/register`
@@ -47,27 +143,29 @@ Request:
```json
{
- "manifest": {},
- "beacons": [],
- "watermarks": [],
- "corpus": {
- "winnowing": "optional-hash",
- "sentence": "optional-hash"
- }
+ "manifest": { "...": "see docs/SPEC.md" },
+ "beacons": [ { "token_id": "...", "kind": "dns|http|ocsp|license" } ],
+ "watermarks": [ { "mark_id": "...", "layer": "L1|L2|L3_semantic" } ],
+ "corpus": { "winnowing": "optional-hash", "sentence": "optional-hash" }
}
```
-Validation:
+Validation order:
-1. Canonicalize and verify `manifest.signature_ed25519`.
-2. Compare `beacons` and `watermarks` against signed manifest arrays.
-3. Reject malformed signed artifacts rather than silently dropping rows.
-4. Append a registry transparency-log event.
-5. If Rekor is enabled and a watermark mark ID exists, attest using
+1. `manifest.file_id` MUST be present and fit the 256-byte bound.
+2. `manifest.signature_ed25519` MUST verify over the canonical bytes
+ (see Canonicalization).
+3. `manifest.issuer_ed25519_pub` MUST be present.
+4. `beacons` and `watermarks` sidecars MUST equal the signed arrays
+ under canonical comparison.
+5. Prior registration of the same `file_id` MUST have come from the
+ same `issuer_ed25519_pub`.
+6. A transparency-log event is appended before the response is sent.
+7. If Rekor attestation is enabled, the registry uses
`subject.name = "mark:<mark_id>"` and
`subject.digest.sha256 = manifest.content_hash`.
-Response:
+Success response:
```json
{
@@ -75,43 +173,150 @@ Response:
"file_id": "uuid",
"registered_beacons": 1,
"tlog_index": 42,
- "rekor": {}
+ "rekor": {"log_url": "...", "log_index": 12345, "log_id": "...", "integrated_time": 1730000000}
}
```
+`rekor` is present when public attestation is enabled. Absent or empty
+`rekor` is not an error.
+
+## `/attribute`
+
+Request accepts exactly one of `token_id`, `mark_id` (with optional
+`layer`), or `perceptual_hash`. Missing or multiple-populated bodies
+return `missing_field`.
+
+Success response on a hit:
+
+```json
+{
+ "found": true,
+ "file_id": "uuid",
+ "recipient_id": "...",
+ "issuer_id": "...",
+ "manifest": { "..." : "..." },
+ "events": [ { "kind": "dns", "timestamp": 0, "source_ip": "..." } ]
+}
+```
+
+A miss returns `{"found": false}` with HTTP 200. Bare 404s are reserved
+for unknown endpoints, not for search misses.
+
## `/dns_event`
Request:
```json
{
- "token_id": "hex-or-url-safe-token",
+ "token_id": "hex-or-url-safe",
"client_ip": "collector-observed-ip",
"qtype": "A",
"qname": "token.beacon.example"
}
```
-Security:
+Authentication:
+
+- Loopback clients are trusted without a secret so a DNS server on
+ the same host can call without extra configuration.
+- Non-loopback callers MUST send `X-Oversight-DNS-Secret: <secret>`
+ that matches the registry's configured secret. The comparison MUST
+ be constant-time (`hmac.compare_digest` or equivalent).
+- A registry that has no secret configured MUST refuse non-loopback
+ callers. Silent acceptance of unauthenticated non-loopback events
+ is a conformance failure.
+
+Success response:
+
+```json
+{"ok": true, "tlog_index": 42}
+```
+
+## `/evidence/{file_id}`
+
+Evidence bundles carry everything a recipient or auditor needs to
+verify attribution without trusting the registry operator. The reference
+shape is flat so a verifier can pull each artifact with a single JSON
+dereference.
-- Public/non-loopback callbacks MUST include `X-Oversight-DNS-Secret`.
-- Registries SHOULD prefer collector-observed source metadata over
- user-controlled body fields when available.
-- Events SHOULD be appended to the local transparency log and included in
- evidence bundles.
+Required top-level fields:
-## Evidence Bundle
+- `file_id`: echoes the path parameter
+- `bundle_generated_at`: registry clock timestamp, for context
+- `registry_pub`: the registry's Ed25519 public key hex, matching
+ `/.well-known/oversight-registry`
+- `manifest`: the signed manifest object (signature still attached)
+- `beacons`: registered beacon rows for this file
+- `watermarks`: registered watermark rows for this file
+- `events`: registry event rows for this file, ordered by timestamp
+- `tlog_head`: the current signed tree head; when the registry has no
+ transparency log configured, this field is `null`
+- `tlog_proofs`: array of inclusion proofs for the rows in `events`
+ that have a `tlog_index`; each proof carries `event_row`,
+ `tlog_index`, and `inclusion`
-Evidence bundles SHOULD contain:
+Optional fields:
-- manifest JSON and signature
-- registry event rows
-- local tlog signed tree head
-- inclusion proof for every bundled tlog event
-- Rekor DSSE bundle, if public transparency was requested
+- `rekor`: the sigstore-compatible DSSE bundle when public attestation
+ is enabled; `bundle_schema` MUST be `2`
+- `disclaimer`: a human-readable note about the bundle's legal posture
+- `bundle_signature_ed25519`: registry signature over the canonical
+ bundle bytes, present on all conforming responses
-## Federation Notes
+Unknown `file_id` returns HTTP 404 with the standard error envelope.
-The wire format MUST NOT require the official `oversightprotocol.dev` domain.
-Operators may run their own registry and beacon domains as long as manifests
+## `/tlog/head`, `/tlog/proof/{index}`, `/tlog/range`
+
+These expose the local transparency log so a federated verifier can
+monitor it without relying on the registry's own query responses.
+The signed tree head MUST be Ed25519-signed by the registry identity
+key advertised at `/.well-known/oversight-registry`.
+
+## Beacon endpoints
+
+Beacon paths are normative because manifests embed URLs that follow
+these shapes and the Python and Rust clients assemble them the same
+way.
+
+| Path | Kind stored in `events` |
+|------|------------------------|
+| `GET /p/{token_id}.png` | `http_img` |
+| `GET /r/{token_id}`, `GET /ocsp/r/{token_id}` | `ocsp` |
+| `GET /v/{token_id}`, `GET /lic/v/{token_id}` | `license` |
+
+Responses MUST return 200 for well-formed token IDs so resolvers and
+document viewers do not retry. The pixel endpoint returns a 1x1 PNG;
+the OCSP endpoint returns an empty 200; the license endpoint returns
+`{"valid": true}`.
+
+## Federation notes
+
+The wire format MUST NOT require the official `oversightprotocol.dev`
+domain. Operators run their own registry and beacon domains; manifests
declare the registry URL and beacon descriptors unambiguously.
+
+Operators SHOULD:
+
+- Publish `/.well-known/oversight-registry` on HTTPS.
+- Serve a stable `ed25519_pub`. Rotating this key breaks the chain
+ of evidence for already-registered files.
+- Run Rekor attestation enabled so the public log is the root of
+ trust for federated verifiers.
+
+## Conformance
+
+The repository ships a conformance harness at
+`tests/test_registry_conformance.py` that exercises every endpoint in
+this document against a registry URL. The harness is the canonical
+test of whether an independent implementation is compatible. Operators
+run it with:
+
+```
+OVERSIGHT_REGISTRY_URL=https://registry.example.org \
+ python3 tests/test_registry_conformance.py
+```
+
+The harness uses a throwaway issuer identity, posts a minimal valid
+manifest, and then validates the responses. Runs against the local
+reference registry are included in CI; operator-hosted runs are the
+interop acceptance gate for federation.
oversight_core/__init__.py +1 -1
@@ -31,4 +31,4 @@ __all__ = [
"l3_policy",
]
-__version__ = "0.4.6"
+__version__ = "0.4.7"
pyproject.toml +1 -1
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "oversight-protocol"
-version = "0.4.6"
+version = "0.4.7"
description = "Open protocol for cryptographic data provenance, recipient attribution, and leak detection."
readme = "README.md"
license = {text = "Apache-2.0"}
tests/test_registry_conformance.py +345 -0
@@ -0,0 +1,345 @@
+#!/usr/bin/env python3
+"""Registry v1 federation conformance harness.
+
+Exercises every endpoint in ``docs/spec/registry-v1.md`` against a
+running registry. Two modes:
+
+- **In-process.** With no ``OVERSIGHT_REGISTRY_URL`` environment
+ variable, the harness stands the reference Python registry up inside
+ a FastAPI ``TestClient`` against a fresh SQLite database in a temp
+ directory and runs every check there. This is the CI path.
+
+- **Live operator URL.** When ``OVERSIGHT_REGISTRY_URL`` is set, the
+ harness points an ``httpx.Client`` at that URL and runs the same
+ checks. This is the acceptance gate an independent operator uses to
+ claim v1 conformance.
+
+The script fails loudly on any divergence from the spec. Each check
+has a short name so a run log is a compact conformance report.
+"""
+
+from __future__ import annotations
+
+import base64
+import json
+import os
+import shutil
+import sys
+import tempfile
+import time
+import uuid
+from dataclasses import asdict
+from pathlib import Path
+from typing import Any, Optional
+
+ROOT = Path(__file__).resolve().parent.parent
+sys.path.insert(0, str(ROOT))
+
+from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
+from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
+from cryptography.hazmat.primitives import serialization
+
+from oversight_core.manifest import Manifest, Recipient, WatermarkRef
+
+
+PASS = "[PASS]"
+FAIL = "[FAIL]"
+PASSED: list[str] = []
+FAILED: list[tuple[str, str]] = []
+
+
+def check(name: str, condition: bool, detail: str = "") -> None:
+ if condition:
+ PASSED.append(name)
+ print(f" {PASS} {name}")
+ else:
+ FAILED.append((name, detail))
+ print(f" {FAIL} {name} ({detail})")
+
+
+# ---- Client abstraction -----------------------------------------------------
+
+
+class Client:
+ """Thin wrapper that presents the same get/post surface over a
+ FastAPI TestClient or a live httpx.Client."""
+
+ def __init__(self, impl, base_url: str = ""):
+ self._impl = impl
+ self._base = base_url.rstrip("/")
+
+ def get(self, path: str, **kwargs):
+ return self._impl.get(self._base + path, **kwargs) if self._base else self._impl.get(path, **kwargs)
+
+ def post(self, path: str, **kwargs):
+ return self._impl.post(self._base + path, **kwargs) if self._base else self._impl.post(path, **kwargs)
+
+
+def build_in_process_client():
+ """Spin up the reference registry in a fresh temp data dir."""
+ from fastapi.testclient import TestClient
+
+ tmp = tempfile.mkdtemp(prefix="oversight-conformance-")
+ os.environ["OVERSIGHT_DATA_DIR"] = tmp
+ # Rekor off by default so the harness does not touch the public log.
+ os.environ.setdefault("OVERSIGHT_REKOR_ENABLED", "0")
+ # Require the DNS secret to exercise the non-loopback fail-closed path.
+ os.environ["OVERSIGHT_DNS_EVENT_SECRET"] = "test-dns-secret-123"
+
+ # Reset any previously-imported registry state.
+ for mod in [m for m in list(sys.modules) if m.startswith("registry.")]:
+ del sys.modules[mod]
+
+ import registry.server as server
+ server.DATA_DIR = Path(tmp)
+ server.DB_PATH = Path(tmp) / "registry.sqlite"
+ server.TLOG_DIR = Path(tmp) / "tlog"
+ server.IDENTITY_PATH = Path(tmp) / "identity.json"
+ server.DNS_EVENT_SECRET = "test-dns-secret-123"
+ server.IDENTITY = server.load_or_create_identity()
+ server.init_db()
+ from oversight_core.tlog import TransparencyLog
+ server.TLOG = TransparencyLog(server.TLOG_DIR, signing_key_hex=server.IDENTITY["ed25519_priv"])
+
+ tc = TestClient(server.app)
+ return Client(tc), tmp, server.IDENTITY["ed25519_pub"]
+
+
+def build_live_client(url: str):
+ import httpx
+ return Client(httpx.Client(timeout=15.0), base_url=url), None, None
+
+
+# ---- Manifest fixture --------------------------------------------------------
+
+
+def build_signed_manifest() -> tuple[dict, list[dict], list[dict], bytes]:
+ """Return (manifest_dict, beacons, watermarks, issuer_priv_raw)."""
+ issuer_sk = Ed25519PrivateKey.generate()
+ issuer_pub_hex = (
+ issuer_sk.public_key()
+ .public_bytes(
+ encoding=serialization.Encoding.Raw,
+ format=serialization.PublicFormat.Raw,
+ )
+ .hex()
+ )
+ issuer_priv_raw = issuer_sk.private_bytes(
+ encoding=serialization.Encoding.Raw,
+ format=serialization.PrivateFormat.Raw,
+ encryption_algorithm=serialization.NoEncryption(),
+ )
+
+ recipient_x25519 = X25519PrivateKey.generate().public_key().public_bytes(
+ encoding=serialization.Encoding.Raw,
+ format=serialization.PublicFormat.Raw,
+ ).hex()
+
+ recipient = Recipient(
+ recipient_id="conformance-recipient",
+ x25519_pub=recipient_x25519,
+ )
+ beacons = [
+ {"token_id": uuid.uuid4().hex, "kind": "dns"},
+ {"token_id": uuid.uuid4().hex, "kind": "http"},
+ ]
+ watermarks = [
+ WatermarkRef(layer="L1_zero_width", mark_id="10" * 16),
+ WatermarkRef(layer="L2_whitespace", mark_id="20" * 16),
+ ]
+
+ m = Manifest.new(
+ original_filename="conformance.txt",
+ content_hash="ab" * 32,
+ size_bytes=4096,
+ issuer_id="conformance-issuer",
+ issuer_ed25519_pub_hex=issuer_pub_hex,
+ recipient=recipient,
+ registry_url="https://registry.example.org",
+ )
+ m.beacons = list(beacons)
+ m.watermarks = list(watermarks)
+ m.sign(issuer_priv_raw)
+
+ manifest_dict = json.loads(m.to_json().decode("utf-8"))
+ sidecar_beacons = list(beacons)
+ sidecar_watermarks = [asdict(w) for w in watermarks]
+ return manifest_dict, sidecar_beacons, sidecar_watermarks, issuer_priv_raw
+
+
+# ---- Individual checks -------------------------------------------------------
+
+
+def check_health(cli: Client) -> None:
+ r = cli.get("/health")
+ check("health-200", r.status_code == 200, f"status={r.status_code}")
+ body = r.json() if r.status_code == 200 else {}
+ check("health-has-status", body.get("status") in {"ok", "degraded"},
+ f"status={body.get('status')!r}")
+ check("health-service-prefix",
+ str(body.get("service", "")).startswith("oversight-registry"),
+ f"service={body.get('service')!r}")
+ check("health-tlog-size-int", isinstance(body.get("tlog_size"), int))
+
+
+def check_well_known(cli: Client) -> None:
+ r = cli.get("/.well-known/oversight-registry")
+ check("well-known-200", r.status_code == 200, f"status={r.status_code}")
+ body = r.json() if r.status_code == 200 else {}
+ pub = body.get("ed25519_pub")
+ check("well-known-ed25519-hex",
+ isinstance(pub, str) and len(pub) == 64 and all(c in "0123456789abcdef" for c in pub.lower()),
+ f"ed25519_pub={pub!r}")
+ check("well-known-has-version", isinstance(body.get("version"), str))
+
+
+def check_register_roundtrip(cli: Client, manifest: dict, beacons: list, watermarks: list) -> Optional[str]:
+ body = {"manifest": manifest, "beacons": beacons, "watermarks": watermarks}
+ r = cli.post("/register", json=body)
+ check("register-200", r.status_code == 200, f"status={r.status_code} body={r.text[:200]}")
+ if r.status_code != 200:
+ return None
+ out = r.json()
+ check("register-ok-true", out.get("ok") is True)
+ check("register-file-id-echo", out.get("file_id") == manifest["file_id"])
+ check("register-count", out.get("registered_beacons") == len(beacons))
+ check("register-tlog-index-int", isinstance(out.get("tlog_index"), int))
+ return out.get("file_id")
+
+
+def check_register_rejects_unsigned(cli: Client, manifest: dict, beacons: list, watermarks: list) -> None:
+ tampered = dict(manifest)
+ tampered["signature_ed25519"] = "00" * 64 # invalid
+ tampered["file_id"] = str(uuid.uuid4())
+ r = cli.post("/register", json={"manifest": tampered, "beacons": beacons, "watermarks": watermarks})
+ check("register-rejects-bad-sig", r.status_code == 400, f"status={r.status_code}")
+
+
+def check_register_rejects_sidecar_mismatch(cli: Client, manifest: dict, beacons: list, watermarks: list) -> None:
+ bad = list(beacons) + [{"token_id": "sneaky", "kind": "dns"}]
+ r = cli.post("/register", json={"manifest": manifest, "beacons": bad, "watermarks": watermarks})
+ check("register-rejects-sidecar-mismatch", r.status_code == 400, f"status={r.status_code}")
+
+
+def check_attribute_by_token(cli: Client, beacons: list) -> None:
+ r = cli.post("/attribute", json={"token_id": beacons[0]["token_id"]})
+ check("attribute-200", r.status_code == 200, f"status={r.status_code}")
+ body = r.json() if r.status_code == 200 else {}
+ check("attribute-found", body.get("found") is True)
+
+
+def check_attribute_miss(cli: Client) -> None:
+ r = cli.post("/attribute", json={"token_id": "nonexistent-token-id"})
+ check("attribute-miss-200", r.status_code == 200)
+ check("attribute-miss-found-false", r.json().get("found") is False)
+
+
+def check_evidence(cli: Client, file_id: str) -> None:
+ r = cli.get(f"/evidence/{file_id}")
+ check("evidence-200", r.status_code == 200, f"status={r.status_code}")
+ body = r.json() if r.status_code == 200 else {}
+ check("evidence-has-manifest", isinstance(body.get("manifest"), dict))
+ check("evidence-has-events", isinstance(body.get("events"), list))
+ check("evidence-has-beacons", isinstance(body.get("beacons"), list))
+ check("evidence-has-watermarks", isinstance(body.get("watermarks"), list))
+ check("evidence-has-registry-pub", isinstance(body.get("registry_pub"), str))
+ check("evidence-has-tlog-head",
+ "tlog_head" in body,
+ f"keys={list(body)[:10]}")
+ check("evidence-has-tlog-proofs",
+ isinstance(body.get("tlog_proofs"), list))
+ check("evidence-has-bundle-signature",
+ isinstance(body.get("bundle_signature_ed25519"), str))
+
+
+def check_tlog_head(cli: Client) -> None:
+ r = cli.get("/tlog/head")
+ check("tlog-head-200", r.status_code == 200, f"status={r.status_code}")
+
+
+def check_dns_event_requires_secret(cli: Client) -> None:
+ token = "t-" + uuid.uuid4().hex
+ # Non-loopback is the semantic concern. For in-process TestClient the
+ # client host is 'testclient' which the reference treats as loopback; we
+ # still assert that a bad secret is refused when the secret is set.
+ r = cli.post(
+ "/dns_event",
+ json={"token_id": token, "client_ip": "198.51.100.8", "qtype": "A", "qname": "x.example"},
+ headers={"X-Oversight-DNS-Secret": "wrong-secret"},
+ )
+ # A registry with a configured secret must either require it (401) or
+ # treat loopback-equivalent callers as trusted (200). Silent success with
+ # a *wrong* secret and a *public* client_ip is a conformance failure.
+ check(
+ "dns-event-auth-enforced",
+ r.status_code in (200, 401),
+ f"status={r.status_code}",
+ )
+
+
+def check_beacon_endpoints(cli: Client, beacons: list) -> None:
+ token = beacons[0]["token_id"]
+ r = cli.get(f"/p/{token}.png")
+ check("beacon-http-img-200", r.status_code == 200, f"status={r.status_code}")
+ r = cli.get(f"/r/{token}")
+ check("beacon-ocsp-200", r.status_code == 200, f"status={r.status_code}")
+ r = cli.get(f"/v/{token}")
+ check("beacon-license-200", r.status_code == 200, f"status={r.status_code}")
+
+
+# ---- Driver ------------------------------------------------------------------
+
+
+def run(cli: Client) -> None:
+ print("[*] Oversight registry v1 conformance harness")
+
+ print("\n[*] Identity and liveness")
+ check_health(cli)
+ check_well_known(cli)
+
+ print("\n[*] Registration")
+ manifest, beacons, watermarks, _ = build_signed_manifest()
+ file_id = check_register_roundtrip(cli, manifest, beacons, watermarks)
+ check_register_rejects_unsigned(cli, manifest, beacons, watermarks)
+ check_register_rejects_sidecar_mismatch(cli, manifest, beacons, watermarks)
+
+ if file_id:
+ print("\n[*] Attribution and evidence")
+ check_attribute_by_token(cli, beacons)
+ check_attribute_miss(cli)
+ check_evidence(cli, file_id)
+
+ print("\n[*] Transparency log")
+ check_tlog_head(cli)
+
+ print("\n[*] Beacons and DNS event")
+ check_beacon_endpoints(cli, beacons)
+ check_dns_event_requires_secret(cli)
+
+ print()
+ print(f"[summary] passed={len(PASSED)} failed={len(FAILED)}")
+ if FAILED:
+ for name, detail in FAILED:
+ print(f" -> {name}: {detail}")
+ raise SystemExit(1)
+ print("[ok] conformance harness green")
+
+
+def main() -> None:
+ url = os.environ.get("OVERSIGHT_REGISTRY_URL", "").strip()
+ tmp = None
+ try:
+ if url:
+ print(f"[*] target: live registry at {url}")
+ cli, tmp, _ = build_live_client(url)
+ else:
+ print("[*] target: in-process reference registry")
+ cli, tmp, _ = build_in_process_client()
+ run(cli)
+ finally:
+ if tmp and os.path.isdir(tmp):
+ shutil.rmtree(tmp, ignore_errors=True)
+
+
+if __name__ == "__main__":
+ main()