SPECA

Specification-Driven Auditing

Extracts typed security properties (Invariant / Precondition / Postcondition / Assumption) from natural-language specifications (such as EIPs and consensus specifications) and detects vulnerabilities that can only be expressed at the specification level.

Proof-Attempt Reasoning

For each property organized under a STRIDE + CWE Top 25 threat model, SPECA structurally challenges the system to "prove that this property holds," thereby surfacing gaps between the specification and the implementation.

Full Interpretability

Every audit step is structured as JSON. A 3-gate review loop (Dead Code / Trust Boundary / Scope) decomposes false positives by root cause, delivering audit results that are both auditable and interpretable.

Track Record

Evaluations across the research papers confirm that SPECA recovers a high proportion of known contest vulnerabilities and additionally discovers previously undisclosed bugs independently. On the RepoAudit benchmark it achieves 88.9% precision and 100% recall, while demonstrating cost efficiency at approximately $1.69 per bug.

Sherlock contests: recovered all 15 known H/M/L vulnerabilities with expert assistance; 8 (53%) under full automation
RepoAudit (15 C/C++ projects): detected all 35 known bugs, F1 = 0.94 / Precision 88.9%
Independently discovered 4 previously undisclosed bugs — one of which was a cryptographic-algorithm invariant violation missed by 366 contest auditors
Approximately ~$1.69 per bug / ~$30 per H/M/L bug, enabling efficient continuous monitoring with severity-preserving filters

Research Papers

SPECA is grounded in two research papers. The first is a real-world case study of an Ethereum Fusaka audit, and the second extends the same methodology to multi-implementation distributed protocols in general.

SPECA: Specification-Checklist-Driven Auditing — A Case Study on Ethereum Clients

Kamba, Sannai · arXiv:2602.07513 · 2026

The first SPECA paper. A real-world case study that converts specifications into an "audit checklist" and applies it to the 11 clients participating in Ethereum Fusaka.

Reused the checklist across implementations, attributing 76.5% of valid findings to cross-implementation analysis
Achieved a 31.5% valid-finding acceptance rate in the Fusaka contest (above the 27.6% average)
Detected 2 out of 3 High-severity vulnerabilities in V2 re-evaluation
Agent-driven workflow reduced manual verification to 40 minutes per submission

Read on arXiv PDF

Beyond Code Reasoning — Specification-Anchored Auditing of Multi-Implementation Distributed Protocols

Kamba, Murakami, Sannai · arXiv:2604.26495 · 2026

A follow-up paper that generalizes the methodology of the Fusaka case study to multi-implementation distributed protocols beyond Ethereum.

Recovered 15/15 known H/M/L Sherlock vulnerabilities (with expert assistance); 8/15 in fully automated mode
Independently discovered 4 previously undisclosed bugs, including a cryptographic-algorithm invariant violation missed by 366 contest auditors
RepoAudit (15 C/C++ projects): F1 = 0.94 / Precision 88.9% / Recall 100%
Approximately ~$1.69 per bug, and roughly ~$30 per confirmed H/M/L bug

Read on arXiv PDF

Get Started

Clone the repository and either run the Python orchestrator directly or launch SPECA via npm. Multi-language support spans Go, Rust, Nim, TypeScript, C, and more, with full automation through GitHub Actions.

Run the Python orchestrator directly

uv run python3 scripts/run_phase.py --target 04 --workers 4 --max-concurrent 64

Use speca-cli (npm release forthcoming)

npx speca-cli init
npx speca-cli run --target 04
npx speca-cli browse