Argus + Garrison Design partner
v0.1.0-dev · 269 lib tests / 0 failed · 9 Rust crates · engineering-complete · 2026·05·23

Automated MCP, A2A, and posture compliance scanning for AI runtimes. Argus runs continuously on Garrison — the hardened operator console it pairs with — emitting Ed25519-signed, SHA-384 hash-chained evidence that auditors verify offline a year later.

Argus + Garrison.

Three scanners run on a fixed cadence — MCP server surface, agent-to-agent flows, and Garrison's signed posture manifest. Findings are Ed25519-signed and SHA-384 hash-chained into a SQLite WAL store. Every period summary chains to the last. Auditors verify the chain offline with public keys alone — no Argus dependency at audit time.

The category thesis AI compliance is the next acquisition category. Existing GRC platforms own SaaS, cloud, and identity compliance — but cannot reach the AI runtime layer. Regulators are codifying continuous-evidence requirements through 2026–2027. Argus is the runtime overlay every CISO will need and every GRC vendor will rationally try to acquire rather than build.
§01 · Pressure

Five regulatory waves converging on the same procurement question. Continuous evidence of AI-runtime compliance — not point-in-time attestation.

The pattern is uniform across jurisdictions: regulators want a mathematically continuous record of what the AI system actually did, not a quarterly questionnaire about what it was supposed to do. Today's GRC platforms can't deliver that — by design.

Wave 01

EU AI Act high-risk enforcement

Annex III high-risk systems: continuous record-keeping (Art 12), human oversight (Art 14), accuracy and cybersecurity (Art 15), post-market monitoring (Art 72).

In force · 2027-08-02
Wave 02

NYDFS 23 NYCRR 500 amendments

72-hour cybersecurity event notification · Tier-3 autonomy controls on AI handling NPI · Class A Company supplemental requirements.

In force · AI guidance 2026
Wave 03

OCC SR 11-7 — AI / ML extensions

Effective challenge for AI-driven decisions. Ongoing monitoring of AI model behaviour. Outcomes analysis as a continuous control.

Continuous · AI guidance 2026
Wave 04

MAS TRM + FEAT (Singapore)

Fairness, Ethics, Accountability, Transparency principles applied to AI / ML in regulated financial services on a continuous basis.

In force · ongoing
Wave 05

FedRAMP + DoD AI baseline overlays

Federal AI procurement gating on FedRAMP-pattern continuous monitoring. DoD AI authorization requirements landing through the period.

Rolling · 2026–2027
§02 · The gap

Six runtime properties your GRC platform cannot reach. And that auditors in 2027 will probe.

Drata, Vanta, OneTrust, ServiceNow GRC, Hyperproof, AuditBoard — all excellent at user access reviews, SOC 2 control attestations, cloud configuration audits, identity governance. None of them reach the AI runtime layer. That's the wedge.

Gap·01

MCP tool integrity

Whether a tool's description was silently edited post-deployment. The "rug-pull" — a tool's surface promises one thing, the live string says another. mcp_description_integrity compares live SHA-384 against the attested manifest.

Gap·02

A2A flow attestation

Whether inter-agent crossings carry valid attestation chains. Config-time policies don't capture what actually traversed. Argus reads Drawbridge's CrossingChain at v0.3; Chronicle event tail at v0.5+.

Gap·03

Posture drift at the AI runtime layer

Whether the FIPS-validated crypto module declared at deployment is still the one in use. Whether key-rotation age is still inside the framework maximum. Whether scan cadence still meets the framework's gap ceiling.

Gap·04

Autonomy ceiling enforcement

Whether AI agents stay inside operator-set authority bounds. Tier-3 ceilings for FedRAMP High, CMMC L3, SR 11-7, EU AI Act high-risk. Verified continuously, not as a one-time attestation.

Gap·05

Period chain attestation

Whether the audit log timeline is mathematically continuous, not just present. The findings exist question is easy. The nothing has been deleted since question requires hash chaining and signature verification — not record-keeping.

Gap·06

Cryptographic survivability

Whether the evidence trail can be verified a year later by a third party who doesn't trust you. Auditors don't run your code. Verification must work offline with only published public keys, any FIPS EdDSA implementation, and tar+gzip.

§03 · Scanners

Three deterministic scanners — MCP, A2A, posture. Zero LLM calls at runtime, by design.

Per ADR-0028, Argus has no Anthropic SDK in the workspace. Scanners are pure regex, SQL, and hashing. Entire categories of agentic risk — prompt injection, cascading hallucination — are eliminated by construction. #![forbid(unsafe_code)] at every crate root.

Scanner 01 · MCP

MCP server surfacetool descriptions, governance fields, integrity hashes

  • 22-pattern poisoning detector — instruction overrides, hidden directives, exfiltration URLs, cross-server shadowing, self-promoting language.
  • SHA-384 integrity — live tool description compared against the attested manifest. The rug-pull detector.
  • Governance fields — verifies operator-declared metadata (data_classification, tool_owner, framework-specific fields).
mcp_description_integrity — oracle@9000 · sha-384 match · pass
Scanner 02 · A2A

Agent-to-agent flowsinter-agent crossings, trust scores, boundary protection

  • v0.3 backend — reads Drawbridge CrossingChain SQLite directly (Path C, ADR-0009).
  • v0.5+ backend — Chronicle event tail. Pure backend cutover. No operator action.
  • Per-peer trust evolution · boundary-protection violations (SC-7, HIPAA § 164.312(e)(1), CMMC SC.L2-3.13).
a2a_trust_score_floor — peer:meridian · 0.91 ≥ 0.90 · pass
Scanner 03 · Posture

Garrison posture manifestsigned declaration of compliance posture

  • FIPS module · SHA-384 · Ed25519 · HKDF-SHA384 (CNSA 2.0) declarations.
  • Key-rotation age tracked against framework maxima. Scan cadence verified against framework gap ceiling.
  • PQ mode — off / co-signature / primary. Framework-specific governance fields (impact_level, clearance_required, chd_handling, feat_assessment_reference, ai_handling).
posture_manifest_field — crypto.fips_module = aws-lc-rs:4796 · pass
§04 · Evaluator

Nine RuleCheck variants. The entire v1.0 evaluator surface.

New check types require coordinated changes across the SDK (core/compliance-rules), Citadel's emitter, and argus-rules's dispatcher. The dispatch surface is small on purpose — small surfaces stay falsifiable.

# RuleCheck variant What it evaluates
01McpDescriptionIntegrityLive SHA-384 of a tool's description compared against the attested manifest entry. The rug-pull detector — fires the instant a tool description drifts from what was attested at deployment.
02McpDescriptionPoisoningScanTwenty-two patterns covering instruction overrides, hidden directives, exfiltration URLs, cross-server shadowing, self-promoting language. Each pattern carries its own severity tier.
03GovernanceFieldPresence { required_fields }Per-tool governance metadata check. Operators declare what each tool's metadata must carry; Argus verifies those fields are present and non-empty.
04AutonomyCeiling { max_level }Agent autonomy level ≤ ceiling. FedRAMP High and CMMC L3 enforce Tier-3 ceilings; EU AI Act high-risk and SR 11-7 enforce per-domain ceilings.
05A2aCrossingPolicy { allowed_event_types, max_rate_per_minute, require_attestation }Per-crossing policy: only declared event types, at-or-below the per-minute rate, with valid attestation chain. Boundary-protection findings fire on policy violation.
06A2aTrustScoreFloor { min_score }Peer trust score must remain ≥ floor. IL5 enforces 0.90; FedRAMP High 0.85; CMMC L3 0.85. Trust scores erode with policy violations and recover with observation.
07PostureManifestField { field_path, expected_value }Declared field equality. Operators declare what their crypto provider, FIPS module, hash algorithm, signature algorithm, KDF should be; Argus verifies the manifest matches.
08KeyRotationAge { max_age_days }Signing-key age check. IL5: 60 days. FedRAMP High: 90 days. FedRAMP Moderate: 365 days. SOC 2: per operator policy.
09ChronicleChainContinuity { max_gap_seconds }Period chain gap ceiling. The mathematical continuity check — no period gap larger than the framework's ceiling (IL5: 12h; FedRAMP High: 24h; commercial: 7d).
§05 · Cryptography

Every artifact signed. Every chain falsifiable.

An auditor's question is rarely "did you scan?" The question is whether the scan you ran on this date was complete, and whether anything has been deleted since. The hash chain is the answer. The two-tier canonicalization discipline is what makes verification reproducible.

Every finding emitted by Argus is signed with the deployment Ed25519 key and references the SHA-384 of the previous finding in its framework partition. Periods chain on top of findings the same way. Genesis is 96 zero hex characters.

Verification walks every entry in sequence: reconstruct the expected prev_hash from the previous entry's full signed bytes; compare against stored — mismatch surfaces as ChainBroken { sequence }. Then rebuild signable bytes (signature blanked symmetrically) and verify Ed25519 — failure surfaces as SignatureInvalid { sequence }. Chain-break and signature-tampering surface as distinct errors.

Two-tier canonicalization came out of a real bug: when an artifact's canonical bytes are computed before the signature is populated, and a derived hash field is computed from those bytes and stored on the artifact, the signable bytes must blank both the signature field and the derived hash field. Otherwise draft-time bytes differ from verify-time bytes and verification fails. Periods, fixtures, and bundles all apply this discipline prospectively.

The auditor needs three things: the operator's deployment public key, the vendor public key for the rules pack, and the Auditor Evidence Bundle. They verify the entire posture offline, a year later, with any FIPS EdDSA, any SHA-384, any tar+gzip extractor. No Argus dependency.

Artifact class Signer Algorithm Canonical bytes
Rules packVendor (Argus release team)Ed25519 / manifest JSONJSON · manifest_signature blanked
FindingOperator (deployment key)Ed25519 / canonical CBORCBOR · signature blanked
Period summaryOperatorEd25519 / canonical CBORCBOR · signature AND canonical_hash blanked
Auditor Evidence BundleOperatorEd25519 / manifest JSONJSON · signature AND bundle_canonical_hash_sha384 blanked
Accuracy measurementOperatorEd25519 / JSONJSON · measurement_signature blanked
Gold-set fixtureFixture release team (separate key)Ed25519 / manifest JSONJSON · manifest signature AND bundle hash blanked
§06 · Frameworks

Seventeen rules-packs at v1.0. 109 controls. Three more frameworks in v1.x preview.

Per-framework scan cadence is set in the posture manifest. Tier-3 autonomy ceilings, 60-day key rotation, 30-second cadence — the DoD floor — is shipped to every customer, regardless of SKU. The Cadillac frame at compact-car price.

01
SOC 2 Type 2AICPA TSC 2017 · 6 rules
60s
commercial
02
HIPAA Security Rule45 CFR § 164.3xx · 8 rules
60s
commercial
03
GDPR Art 32Art 32 + Recital 76 · 6 rules
60s
commercial
04
PCI-DSS v4PCI SSC v4.0 · 6 rules
60s
commercial
05
SR 11-7OCC / Federal Reserve · 5 rules
60s
commercial
06
NIST 800-171 Rev 2CUI controls · 7 rules
45s
federal
07
FedRAMP LowLow baseline · 5 rules
60s
federal
08
FedRAMP ModerateMod baseline · 8 rules
45s
federal
09
FedRAMP HighHigh baseline · 9 rules · Tier-3
60s
federal
10
CMMC L1FAR 52.204-21 + L1 · 3 rules
60s
federal
11
CMMC L2NIST 800-171 parity · 7 rules
45s
federal
12
CMMC L3NIST 800-172 · 6 rules · Tier-3
60s
federal
13
DoD IL4DoD CC SRG IL4 · 7 rules
60s
DoD
14
DoD IL5IL5 · 9 rules · Tier-3 · 12h gap
30s
DoD
15
DISA STIGV-AS-001..006 · 6 rules
30s
DoD
16
FIPS 140-3CMVP · 5 rules
30s
crypto
17
CNSSP-12NSA CNSA 2.0 · 6 rules
30s
sovereign
v1.x preview · design-locked · production-ready @ v1.2 EU AI Act high-risk (Annex III · enforcement 2027-08-02) · NYDFS 23 NYCRR 500 (post-2024 amendments · 72h notification) · MAS TRM + FEAT (Singapore financial services). Geographic coverage: EU + US-NY + APAC-SG.
§07 · Accuracy

A structurally falsifiable accuracy claim. Measured against a signed gold-set fixture.

Argus must never wake an operator at 3am for a finding that turns out to be incorrect. The critical-FP gate is the operator-protection invariant. The methodology is mechanical: SME-authored oracle labels, signed fixture, reproducible measurement, signed accuracy report.

5% · 0% Per-framework false-positive rate at or below 5%. Zero false positives on critical-severity findings. The v1.0 GA acceptance gate.
Step 01 · SME authoring
Framework subject-matter experts author oracle.yaml entries. Per ADR-0028, oracle labels cannot be LLM-generated. Each label signed with the SME team's Ed25519 key.
Step 02 · Fixture release
Fixture release team packages the monthly release. Signed with a fixture-team key, separate from any deployment key. Tarball with manifest + per-file SHA-384.
Step 03 · Operator measurement
Operator runs argus accuracy run --rules-pack <pack> --fixture <tar.gz>. Output: signed measurement JSON. CI gates release on --fail-on-fp-exceeds 0.05 and --fail-on-critical-fp-exceeds 0.
Step 04 · Auditor reproduction
Auditor with fixture + rules pack + operator's accuracy report reproduces the measurement byte-for-byte. Verifies all three signatures offline. The audit chain is mechanical, not narrative.

The gold-set fixture targets ~400 entries across 3 substrates × 17 frameworks. MCP server fixtures (~200 clean + ~150 known-bad + ~50 boundary); A2A flow fixtures (~100 clean + ~80 known-bad + ~30 boundary); posture manifest fixtures (~50 compliant + ~80 non-compliant + ~20 hybrid-mode v1.1 PQ entries).

Per-framework subsets are calibrated for each framework's specific controls — not "one universal fleet measured against all 17." The v1.0 seed scaffold ships with 10 entries spanning 8 frameworks; the GA fixture requires SME-authored oracle labels.

The measurement is the formal definition: for each fixture entry × framework × rule, evaluate the rule against the entry's instantiated InMemoryFacts, then classify against the oracle — true positive, false positive, true negative, false negative. fp_rate = FP / (FP + TN). critical_fp_rate = critical_FP / (critical_FP + critical_TN).

No other compliance product in this category publishes a structurally falsifiable accuracy claim. The methodology is the moat.

§08 · Reports

Seven report types. Each one signed. Each one auditor-verifiable.

The reporting engine is Argus's fourth major subsystem, not a feature. ADR-0019: 80% of compliance evidence value is delivered through reports humans read. All HTML reports are single-file — CSS, JS, images inline as data URIs. Auditors archive one .html. PDF lands at v1.1 via native Rust generation (printpdf / pdf-writer / lopdf) — no Chromium dependency.

R.01

Trial-Mode Scan

Five-minute watermarked posture snapshot for prospect evaluation. "TRIAL — not for audit submission" overlay. Designed to fit inside an evaluation call and surface real findings before procurement.

HTML · MD · watermarked · signed
R.02

Executive Briefing

Cross-framework posture grid for CISOs, CFOs, audit committees, boards. Period coverage, top findings, posture change log. Written for the boardroom, not the engineer.

HTML · MD · PDF @ v1.1 · signed
R.03

Per-Framework Detailed

Per-control verdict breakdown for one framework with anchor navigation. The artifact 3PAOs, DPOs, MRM officers actually read during authorization review.

HTML · MD · PDF @ v1.1 · signed
R.04

Auditor Evidence Bundle

Signed tarball: manifest + vendor pubkey + findings.json + periods.json + OSCAL Assessment Results + rendered reports. Auditor verifies offline against the framework-specific dossier.

.tar.gz · OSCAL 1.1.2 · signed · offline-verifiable
R.05

Drift / Delta

Period-over-period change: new findings, resolved findings, verdict shifts, net posture movement. The diff view for compliance.

HTML · MD · signed
R.06

Period Summary

Chronicle attestation chain renderer with chain-status indicator. The artifact the GRC platform polls on its own cadence.

HTML · MD · JSON · signed
R.07

Finding Deep-Dive

Single-finding forensic detail: detection card, evidence-chain timeline, remediation steps, related findings, HITL disposition log. Pulled during incident response.

HTML · MD · PDF @ v1.1 · signed
§09 · SLOs

Targets are conservative. Measured numbers beat them comfortably.

A typical v1.0 deployment running 10 frameworks at 300-second cadence accumulates ~18 MB of findings per year and ~2 MB of periods. Seven-year retention default: ~140 MB total. Memory: 30–60 MB resident. CPU: single-digit percent during scan cycles, idle otherwise.

Operation p95 target Measured (v0.1.0-dev)
Scan-cycle latency
≤10 MCP servers≤ 5s~1s
≤50 MCP servers (v1.0 GA acceptance · AC-012)≤ 30s~6s
≤200 MCP servers≤ 2minTBD
Report generation
Trial-Mode Scan (cold)≤ 5min~30s
Executive Briefing (AC-015)≤ 10s~2s
Per-Framework Detailed≤ 15s~3s
Auditor Evidence Bundle · 30-day period (AC-016)≤ 30s~8s
External /external/v1/* API latency
GET /health≤ 50msmeasured
GET /posture≤ 200msmeasured
GET /findings (paginated)≤ 400msmeasured
GET /reports/{id}/download (303 redirect)≤ 100msmeasured
§10 · Admission

How Argus gets admitted. Same four legs as every Garrison workload.

Argus passes Garrison's closed-allowlist policy end-to-end. No special case. No soft-fail. No warn-mode. No override. The deterministic Castellan envelope (zero LLM, empty prompts) makes Argus the easiest possible binary to admit.

Leg 01

Charlotte MCP spec

Argus's MCP surface declared up front — tool catalog, knowledge bundles, Herald C2-protected fields.

specs/argus-mcp-server.yaml
Leg 02

Castellan envelope

provider: deterministic · empty prompts, tools, mcp_servers. ASI01 and ASI06 eliminated by construction.

specs/argus-agent.yaml
Leg 03

Aegis pre-deploy

OWASP-Agentic ASI01–ASI10 scorecard · SHIP verdict · 0 FAIL · 2 WARN · 0 critical.

specs/aegis-self-audit.md
Leg 04

KeyServer + cosign

Keyless OIDC release signature + per-customer KeyServer entitlement. No license · no boot.

cosign verify · KeyServer token
§11 · The substrate

Argus rides on Garrison.

Garrison is the hardened operator console Argus runs inside — closed allowlist, single-operator authority, sandboxed execution, KeyServer-gated. Sister workloads include Oracle (signed domain expertise) and Meridian (governed time intelligence). Argus is the fourth.

If you already operate a different AI runtime, Argus can still observe it — provided your runtime emits an attested posture manifest and reachable MCP endpoints. The evidence chain works. The closed-allowlist invariant does not transfer.

Same Argus binary in every Garrison SKU. Same crypto chokepoint. Same supply-chain attestation. The IL5 deployment runs the same engineering artifact as the SOC 2 deployment.

Crypto
aws-lc-rs · FIPS 140-3 #4796
Hardening
Level 6 · static musl · panic=abort · LTO
Banned
ring · openssl · native-tls (deny.toml)
SKUs
Hosted · Stronghold · Enclave · Sovereign
Artifacts
6 per release · same in every tier
PQ
ML-DSA-87 hybrid co-sig @ v1.1 Sovereign
§12 · Category

No direct competitor. AI continuous compliance is a new category.

The closest analog is the original GRC tools — Drata, Vanta, OneTrust — applied to a runtime layer they don't reach. Argus composes with them via the read-only /external/v1/* API. The pitch is additive, not replacement. Per ADR-0030, the platform-neutral positioning is deliberate.

Adjacent player Category Overlap with Argus
Drata · Vanta · OneTrustGRC platformsArgus integrates via /external/v1/* — doesn't replace
Hyperproof · AuditBoard · ServiceNow GRCEnterprise audit automationSame — Argus feeds AI-runtime evidence in
Lakera · Robust Intelligence · CalypsoAIAI security / red-teamingDifferent problem. Argus measures continuous compliance; they measure model attack surface
CredoAI · Holistic AI · Fairly AIAI model governance / fairnessDifferent problem. Argus is runtime evidence; they are model-level governance
Anthropic Trust Center · OpenAI ComplianceFoundation model vendor complianceDifferent problem. Vendor-side compliance attestations
OWASP-LLM project · MITRE ATLASOpen frameworksArgus checks against frameworks; these define them
The build-vs-buy choice for Drata, Vanta, and OneTrust is the same: 18–24 months of engineering for scanners, signed evidence storage, hash chains, and per-framework rules content — losing two years of category ramp — or acquire the category leader. The first one to acquire owns the category permanently. The others rationally bid up the price.
§13 · Pricing

Interim model, subject to design-partner validation. Final lock at v0.5 per PRD DL-007.

Pricing levers under consideration: per-framework licensing vs all-inclusive · per-deployment vs per-component metering · annual vs multi-year discount structure · GSA / SEWP federal price-list eligibility · reseller and system-integrator margins.

Tier Customer profile Annual range
TrialProspects evaluating Argus · watermarked Trial-Mode reports only · no license required$0
Commercial StarterSingle SOC 2 / HIPAA / GDPR deployment$25K – $50K
Commercial StandardUp to 5 commercial frameworks · standard support$75K – $150K
EnterpriseUp to 17 frameworks · priority support · design-partner program$200K – $400K
Federal / FedRAMPFedRAMP Low / Moderate / High deployment$300K – $600K
DoDIL4 / IL5 deployment · Sovereign SKU · air-gap$500K – $1M+

Reach out via the channels in SECURITY.md for design-partner pricing.

Design-partner cycle · 2026·Q2 / Q3

Three forces converge in 2026–2027. Argus is positioned at the intersection. The first eighteen months post-launch are the category-creating window.

Regulators codifying continuous AI evidence requirements. AI tool surface area exploding through MCP servers, A2A flows, autonomous agents. Enterprise procurement requiring AI compliance evidence to close deals — especially in financial services, healthcare, federal, defense.