Tutorial: LLM-Powered Review
This tutorial sets up LLM-powered review to automatically assess needs_review findings and generate regulatory-quality evidence text.
What LLM Review Does
Without LLM review, 150 Semi findings require manual triage. With it, the LLM examines the code context and determines pass/fail with an evidence paragraph suitable for Module A documentation.
    Without LLM: 55 pass, 40 fail, 57 needs_review
    With LLM:    85 pass, 48 fail, 19 needs_review

The LLM upgrades many needs_review findings to pass or fail with evidence citations.
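As a quick check on those counts, the arithmetic below works out how much of the manual queue the LLM pass clears. It uses only the example figures quoted above; the variable names are illustrative.

```shell
# Example figures from the summary above (needs_review counts).
before=57   # needs_review without LLM
after=19    # needs_review with LLM

resolved=$((before - after))       # findings moved to pass or fail
pct=$((resolved * 100 / before))   # integer percentage, rounded down

echo "$resolved of $before needs_review findings resolved (${pct}%)"
```

With these figures, 38 of the 57 findings leave the manual queue, which is 66% after rounding down.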
Option 1: Local Ollama (Private)
- Install Ollama:

    curl -fsSL https://ollama.com/install.sh | sh

- Pull a model (70B recommended for code review):

    ollama pull llama3.1:70b
    # Or faster: ollama pull llama3.1:8b

- Scan with LLM:

    fleet scan --path . --llm ollama --output pretty
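Before scanning, it can help to confirm that the model was actually pulled. The sketch below assumes that `ollama list` prints one model per row with the model name in the first column. `model_available` is a hypothetical helper, not part of Ollama or fleet.

```shell
# Hypothetical helper: reads `ollama list` output on stdin and succeeds
# only if the given model name appears in the first column.
model_available() {
  awk -v model="$1" '$1 == model { found = 1 } END { exit !found }'
}

# Usage (assumes the Ollama CLI is installed and its server is running):
#   ollama list | model_available llama3.1:70b || ollama pull llama3.1:70b
```

Reading the listing from stdin keeps the helper independent of whether Ollama is installed, so the same check works against a saved listing.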
Option 2: Claude API (Highest Quality)
- Get an Anthropic API key from https://console.anthropic.com

- Set the key:

    export ANTHROPIC_API_KEY=sk-ant-api03-...

- Scan with Claude:

    fleet scan --path . --llm claude --output pretty
Claude produces the highest-quality evidence text: well-structured paragraphs with specific code citations.
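A scan with `--llm claude` depends on the key being exported in the current shell. As a minimal sketch, a guard like the one below can fail early with a clear message. `require_anthropic_key` is a hypothetical helper, and how fleet itself reports a missing key is not covered here.

```shell
# Hypothetical guard: returns non-zero if ANTHROPIC_API_KEY is unset or empty.
require_anthropic_key() {
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is not set; export it before using --llm claude" >&2
    return 1
  fi
}

# Usage:
#   require_anthropic_key && fleet scan --path . --llm claude --output pretty
```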
Compare Evidence Quality
Here’s the same finding reviewed by different backends:
No LLM:

    CRYPTO-01-R1: needs_review (confidence: 0.60)
    Message: "SHA-256 usage detected"

No evidence text; requires manual review.

Ollama:

    CRYPTO-01-R1: pass (confidence: 0.78)
    Evidence: "The code uses SHA-256 for hashing via the sha2 crate.
    This meets the requirement for state-of-the-art cryptography."

Basic evidence: a correct assessment, but light on detail.

Claude:

    CRYPTO-01-R1: pass (confidence: 0.95)
    Evidence: "The product uses SHA-256 (256-bit) for all
    security-relevant hashing operations via the sha2 crate
    (src/auth.rs:42). The implementation uses Sha256::new()
    from the RustCrypto project, a well-maintained cryptographic
    library. No usage of deprecated algorithms (MD5, SHA-1) was
    detected in security contexts. This satisfies CRA Annex I,
    I.3(a) for state-of-the-art cryptographic mechanisms."

Detailed, regulatory-quality evidence with specific citations.
CI Usage Patterns
    # PR checks: fast, no LLM (3 seconds)
    fleet scan --llm off --ci

    # Main branch: thorough, with Claude (30-60 seconds)
    fleet scan --llm claude --ci

    # Nightly: full review with detailed evidence
    fleet scan --llm claude --report weekly-report.md

Evidence Provenance
Every LLM-reviewed finding includes provenance tracking:
    {
      "llm_provenance": {
        "backend": "claude",
        "model": "claude-sonnet-4-6",
        "prompt_version": "v1.0.0",
        "token_usage": { "input": 2340, "output": 856 },
        "confidence": 0.95
      }
    }

This ensures traceability: you can always see which model produced which evidence, at what confidence level, using which prompt version.
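To audit provenance in a script, the fields above can be reduced to a one-line summary. The sketch below assumes a finding is available as a JSON object with a top-level `llm_provenance` key, exactly as in the example. `provenance_summary` and `finding.json` are hypothetical names, and the helper relies on `python3` being on the PATH.

```shell
# Hypothetical helper: reads one finding (JSON) on stdin and prints
# backend, model, prompt version, and confidence on a single line.
provenance_summary() {
  python3 -c '
import json, sys
p = json.load(sys.stdin)["llm_provenance"]
print("{backend} {model} prompt={prompt_version} confidence={confidence}".format(**p))
'
}

# Usage (finding.json is a hypothetical file holding the object shown above):
#   provenance_summary < finding.json
```

For the example object, this prints `claude claude-sonnet-4-6 prompt=v1.0.0 confidence=0.95`.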