An agentic regulatory attestation pipeline

Four questions every compliance team gets asked.

Attestloop ingests a regulator's publication, classifies it, extracts the binding obligations from the text, and maps each obligation to controls in a named framework. Each run writes a self-contained, auditable log to disk — no database, no orchestrator, no web framework. v1 demonstrates the approach against the EU AI Act and the NIST AI Risk Management Framework.

Currently a research artefact built by Simon Newton. Not for production use.

Question 01 / 04

Are we covered?

When a regulator publishes something new, leadership wants to know whether it affects us — and whether our existing controls already handle it. The honest answer is usually "we don't know yet, give us two weeks." Attestloop closes that gap to about fifteen minutes.

Source: Commission Guidelines on prohibited AI practices · Mapped against: NIST AI Risk Management Framework 1.0

71 binding obligations identified · 58 mapped to existing controls · 13 framework gaps surfaced

Run completed 2026-04-30 in 17 minutes 17 seconds · $1.30 total cost

The Classifier agent confirms scope; the Extractor agent identifies binding obligations from the source PDF; the Mapper agent compares each obligation to the active control framework, applying explicit confidence floors.
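The confidence floor is the load-bearing detail: rather than forcing every obligation onto a control, candidate mappings below the floor become surfaced gaps. A minimal sketch of that logic, assuming hypothetical names and the 0.75 floor noted for v3 (this is not Attestloop's actual code):

```python
# Sketch of a mapper confidence floor. CONFIDENCE_FLOOR and the
# (control_id, confidence) tuple shape are illustrative assumptions.
CONFIDENCE_FLOOR = 0.75

def apply_floor(candidates: list[tuple[str, float]]) -> tuple[list[tuple[str, float]], bool]:
    """Keep mappings at or above the floor; flag a gap if none survive."""
    kept = [(cid, conf) for cid, conf in candidates if conf >= CONFIDENCE_FLOOR]
    return kept, not kept

# An obligation with one strong and one weak candidate keeps only the strong mapping.
mappings, is_gap = apply_floor([("MAP-1.1", 0.82), ("GOVERN-2.3", 0.61)])
```

The second return value is what feeds the "framework gaps surfaced" count: an obligation whose every candidate falls below the floor is reported as a gap, not silently slot-filled.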

Question 02 / 04

What do we need to do, and by when?

Once a regulation is in scope, compliance teams need a list of concrete actions, owners, and deadlines — not "review your AI governance posture" but actual tasks tied to actual control IDs.

| ID | Article | Requirement | Scope | Deadline |
|---|---|---|---|---|
| EUAIA-OBL-001 | Article 5(1)(a) | Providers and deployers shall not place on the market, put into service, or use AI systems that deploy subliminal techn… | Providers and deployers of AI systems placed on the EU market, put into service… | 2025-02-02 |
| EUAIA-OBL-008 | Section 2.4, paragraph (16) | Providers must ensure their AI systems meet all relevant requirements before placing them on the market or putting them… | Providers of AI systems placed on the EU market or put into service in the Union | before placing on the market or putting into service |
| EUAIA-OBL-020 | Section 2.9.1, paragraph (53) | Member States must designate their competent market surveillance authorities by 2 August 2025. | Member States | 2025-08-02 |
| EUAIA-OBL-051 | Article 26(10) | The retrospective use of RBI systems for law enforcement purposes is subject to additional conditions and safeguards in… | Deployers using retrospective remote biometric identification systems for law e… | 2026-08-02 |

Excerpt from the v5 run. The full report contains 71 obligations with mapped control IDs and proposed actions. View the full report on GitHub.

Each obligation extracted with source paragraph, regulator-defined scope, deadline where specified, and evidence required. Mapper produces 1–3 control mappings per obligation, each with confidence score and reasoning anchored in specific control text.

Question 03 / 04

Prove it.

At audit or board review time, compliance teams need evidence: which obligations were assessed, when, by whom, against which version of the regulation, with what conclusion. This is what most existing tools do badly because they weren't designed for AI-pace regulatory change.

Provenance footer · v5 run

- Regulation: EU Artificial Intelligence Act (Regulation 2024/1689) (`eu_ai_act`, EU)
- Framework: NIST AI Risk Management Framework 1.0 (`nist_ai_rmf`, 72 controls)
- Classifier model: `claude-haiku-4-5-20251001`
- Extractor model: `claude-sonnet-4-6`
- Mapper model: `claude-sonnet-4-6`
- Classifier prompt SHA-256: `b59962514c4342fc1d6181fb3964dd366c8f6e450218d4e4ff3b02c50038b099`
- Extractor prompt SHA-256: `0828eebb6dd8ad34d769f36773f14888bb048bcdc5ca02e940509fd42701b7ba`
- Mapper prompt SHA-256: `9090c11e1e4b04f07ab617e765a4d0342497ebdccdc2faa88410b8d2424d9cfd`
- Started at: 2026-04-30T11:59:05.511170+00:00
- Total cost: $1.3049
- Total tokens: 174,911 input / 39,245 output

Every LLM call also logs input, output, model, prompt version, cost, and latency to immutable JSON. Each run produces a hashed provenance footer that survives audit.
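The two mechanics above, prompt hashing and append-only call logging, can be sketched with the standard library alone. Function names and the record shape are assumptions, not the shipped code:

```python
import hashlib
import json
from pathlib import Path

def prompt_sha256(prompt_text: str) -> str:
    """Hash a prompt so the provenance footer can pin the exact version used."""
    return hashlib.sha256(prompt_text.encode("utf-8")).hexdigest()

def log_call(log_path: Path, record: dict) -> None:
    """Append one LLM-call record to an append-only JSONL audit log."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
```

An auditor can recompute `prompt_sha256` over the shipped prompt file and compare it to the footer digest; any drift in the prompt, even a single character, changes the hash.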

- Per-call audit trail
- Hashed prompt versions
- Sourced control library
- Reproducible against same source

Provenance is a first-class output, not an afterthought. The system is designed so that every claim in the report links back through a chain of inputs, prompts, model versions, and timestamps that an auditor can verify.

Question 04 / 04

What's coming next?

Boards ask about the regulatory pipeline, not just last week's publication. What's in flight at the regulator that we should be preparing for? This is the strategic question — and the one v1 doesn't yet answer.

Live in v1

Single-document attestation against EU AI Act and NIST AI RMF, on demand.

Architected, not yet shipping

Watcher agent polls regulator sources (EUR-Lex, FCA Handbook, EBA, ICO) on a schedule, deduplicates against history, alerts on new in-scope publications.

v2 backlog

Multi-framework mapping (ISO 42001, SOC 2 AI Trust Criteria), customer-supplied control libraries, mapper batching for cost optimisation, longer-TTL cache for sustained throughput.

Honesty about scope is itself an audit-trail feature. Where v1 doesn't reach, the system says so.

The architecture already supports the watcher and the multi-framework registry; v1 ships with the on-demand path enabled. Adding a new regulation requires only a config entry plus a small set of regulator-specific prompts.
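As an illustration of what "a config plus prompts" might look like, here is a hypothetical registry entry. The `eu_ai_act` id, name, and EU jurisdiction come from the v5 provenance footer; every key and file path below is an assumption, not Attestloop's actual config format:

```python
# Hypothetical shape of a regulation registry; onboarding a new
# regulation would mean one more entry plus its prompt files.
REGULATIONS = {
    "eu_ai_act": {
        "name": "EU Artificial Intelligence Act (Regulation 2024/1689)",
        "jurisdiction": "EU",
        "prompts": {  # regulator-specific prompts, hashed into the footer
            "classifier": "prompts/eu_ai_act/classifier.md",
            "extractor": "prompts/eu_ai_act/extractor.md",
            "mapper": "prompts/eu_ai_act/mapper.md",
        },
    },
}
```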

Pipeline

How the pipeline works

Seven stages, plain Python, no orchestrator. v1 ships three of them as live LLM agents; the remaining four are deterministic code that the same typed schemas pass through. Click any box for the role, prompt, exact v5 input and output, and per-call metrics.
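The no-orchestrator pattern is simple enough to sketch: each stage is a plain function over one typed run state, applied in sequence. The three stand-in bodies below are toys in place of the real LLM calls, and the actual v1 pipeline has seven stages, not three:

```python
from dataclasses import dataclass, field

@dataclass
class RunState:
    source_text: str
    in_scope: bool = False
    obligations: list[str] = field(default_factory=list)
    mappings: list[tuple[str, str]] = field(default_factory=list)

def classify(state: RunState) -> RunState:
    state.in_scope = "AI" in state.source_text  # stand-in for the Classifier agent
    return state

def extract(state: RunState) -> RunState:
    # stand-in for the Extractor agent: pick sentences with binding language
    state.obligations = [s for s in state.source_text.split(". ") if "shall" in s]
    return state

def map_controls(state: RunState) -> RunState:
    # stand-in for the Mapper agent: attach a placeholder control id
    state.mappings = [(o, "GOVERN-1.1") for o in state.obligations]
    return state

STAGES = [classify, extract, map_controls]

state = RunState("Providers shall not deploy subliminal AI techniques. End of excerpt.")
for stage in STAGES:
    state = stage(state)
```

Because every stage takes and returns the same typed state, swapping a deterministic stage for a live agent (or back, for a cached re-run) changes one function, not the pipeline.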

Output

What it produces

v1 through v5 on the same source document. Each iteration changed exactly one thing.

| Version | Approach | Obligations | Mappings | Unmapped | Cost (USD) | Runtime |
|---|---|---|---|---|---|---|
| v1 | Truncated extractor (50 K char cap), mapper unconstrained | 18 | 54 | 0 | $0.62 | 5m 17s |
| v2 | Chunked extractor (12 chunks), mapper unconstrained | 68 | 203 | 0 | $2.61 | 21m 22s |
| v3 | Mapper confidence floor 0.75, no slot-filling | 72 | 164 | 12 | $2.78 | 41m 35s |
| v4 | Anthropic prompt caching on mapper controls list | 69 | 124 | 24 | $1.19 | 14m 51s |
| v5 | Fuzzy dedup, title fallback, null rendering, mapper nudge | 71 | 154 | 13 | $1.31 | 17m 17s |

Five iterations from a 50,000-character truncated baseline to a full-document run with confidence-floored mappings, prompt caching, and fuzzy deduplication. Each step kept the previous quality while changing one variable. The v5 cost shape — $1.31 per run, 17 minutes wall-clock — is what production-grade attestation against a single regulation looks like at this scale.

From v3 to v4, prompt caching delivered a 30× return on the cache-write cost. From v4 to v5, fuzzy dedup removed 12 paraphrased duplicates that substring matching had missed. Both changes are covered in the architecture writeup.
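The v4 to v5 dedup change can be sketched with the standard library's `SequenceMatcher`. The threshold, normalisation, and example sentences below are assumptions, not the shipped implementation; the point is that a plain substring check misses paraphrases because neither string contains the other:

```python
from difflib import SequenceMatcher

def fuzzy_dedup(texts: list[str], threshold: float = 0.9) -> list[str]:
    """Drop near-duplicate strings that exact or substring matching would miss.
    O(n^2) pairwise comparison, which is fine at ~70 obligations per run."""
    kept: list[str] = []
    for t in texts:
        norm = " ".join(t.lower().split())
        if any(
            SequenceMatcher(None, norm, " ".join(k.lower().split())).ratio() >= threshold
            for k in kept
        ):
            continue
        kept.append(t)
    return kept

obs = [
    "Providers must register high-risk AI systems before market placement.",
    "Providers must register high-risk AI systems prior to market placement.",
    "Member States must designate market surveillance authorities.",
]
deduped = fuzzy_dedup(obs)  # the paraphrased second sentence is dropped
```

The first two sentences differ only in "before" versus "prior to", so neither contains the other, yet their similarity ratio clears the threshold and the duplicate is removed.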

Read the writeup →

Reproducibility

Run it yourself

Live re-run capability ships in v2. View the cached v5 run above for the full output — every prompt, every LLM response, every cost line, every obligation, every mapping. The Python source is on GitHub; the pipeline runs end-to-end on a single machine with an Anthropic API key.
