LLM Configuration Guide
Overview
The LLM layer reviews Semi findings from static detectors and produces regulatory-quality evidence text. It is optional: scanning works without it, but Semi findings remain as needs_review until triaged manually or by the LLM.
Backends
| Backend | Privacy | Quality | Cost | Setup |
|---|---|---|---|---|
| Ollama (local) | Data never leaves machine | Good (model-dependent) | Free | Install Ollama + pull model |
| Claude (Anthropic) | Data sent to Anthropic API | Excellent | Per-token | Set ANTHROPIC_API_KEY |
| OpenAI (compatible) | Data sent to provider | Very good | Per-token | Set OPENAI_API_KEY |
| Off | N/A | N/A | Free | Default |
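To illustrate how the Setup column maps to a `--llm` value, here is a minimal sketch that picks a backend from the environment. The helper name `pick_backend`, the `ollama_available` flag, and the precedence order (local first, then Claude, then OpenAI) are assumptions made for this example and are not part of Fleet, which stays off unless `--llm` is passed.

```python
import os
from typing import List, Mapping


def pick_backend(env: Mapping[str, str], ollama_available: bool = False) -> str:
    """Return a value for --llm (hypothetical helper, not part of Fleet)."""
    if ollama_available:
        # Assumed preference: keep data on the machine when a local model exists.
        return "ollama"
    if env.get("ANTHROPIC_API_KEY"):
        return "claude"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    # Matches the table's default row: no LLM review.
    return "off"


def scan_command(backend: str) -> List[str]:
    """Build the scan invocation using the flags shown in this guide."""
    return ["fleet", "scan", "--path", ".", "--llm", backend]


if __name__ == "__main__":
    print(" ".join(scan_command(pick_backend(os.environ))))
```

Passing the environment in as a mapping keeps the choice easy to test without touching real credentials.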
Ollama Setup (Local-First)
- Install Ollama: https://ollama.com
- Pull a model (minimum 8B for code review, 70B recommended):
```bash
ollama pull llama3.1:8b      # Fast, adequate
ollama pull llama3.1:70b     # Recommended for thorough review
ollama pull codellama:34b    # Code-specialized alternative
```
- Run scan:
```bash
fleet scan --path . --llm ollama
```
Environment variables:
```bash
export FLEET_LLM_OLLAMA_URL=http://localhost:11434   # Default
export FLEET_LLM_OLLAMA_MODEL=llama3.1:70b
```
Claude Setup
```bash
export ANTHROPIC_API_KEY=sk-ant-api03-...
fleet scan --path . --llm claude
```
Models (via FLEET_LLM_CLAUDE_MODEL):
- claude-sonnet-4-6: Default. Good balance of quality and speed.
- claude-opus-4-6: Deepest analysis. Best for critical assessments.
OpenAI-Compatible Setup
Works with OpenAI, Azure OpenAI, vLLM, Together, LM Studio, or any OpenAI-compatible endpoint.
```bash
export OPENAI_API_KEY=sk-...
export FLEET_LLM_OPENAI_BASE_URL=https://api.openai.com/v1   # Default
export FLEET_LLM_OPENAI_MODEL=gpt-4o
fleet scan --path . --llm openai
```
For Azure OpenAI:
```bash
export OPENAI_API_KEY=<azure-key>
export FLEET_LLM_OPENAI_BASE_URL=https://your-resource.openai.azure.com/openai/deployments/your-deployment
export FLEET_LLM_OPENAI_MODEL=gpt-4o
```
For local vLLM:
```bash
export OPENAI_API_KEY=dummy
export FLEET_LLM_OPENAI_BASE_URL=http://localhost:8000/v1
export FLEET_LLM_OPENAI_MODEL=meta-llama/Llama-3.1-70B-Instruct
```
Prompt Strategy
Each requirement category has a specialized system prompt grounded in CRA Annex I language. Prompts are versioned (v1.0.0) for evidence traceability.
Categories with custom prompts:
- CRYPTO — Algorithm approval, key sizes, modes of operation
- NET — Transport security, credential handling, attack surface
- AUTH — Password storage, session management, JWT validation
- INPUT — Injection prevention, XSS, path traversal
- STOR — Encryption at rest, access controls
- LOG — Event coverage, data protection, structured format
- UPD — Update integrity, signature verification, rollback
- CONFIG — Secure defaults, debug mode
- AI — Model integrity, prompt injection, data exposure
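One way to picture a versioned, per-category prompt set is a small registry keyed by category. The sketch below is an illustration only: the focus areas are copied from the list above, but the `PromptSpec` type, the `build_prompt` helper, and the prompt wording are invented for this example and do not reproduce Fleet's actual prompts.

```python
from dataclasses import dataclass
from typing import Dict, List

PROMPT_VERSION = "v1.0.0"  # version string quoted in this guide

# Focus areas copied from the category list; the real prompt text is not shown here.
CATEGORY_FOCUS: Dict[str, List[str]] = {
    "CRYPTO": ["algorithm approval", "key sizes", "modes of operation"],
    "NET": ["transport security", "credential handling", "attack surface"],
    "AUTH": ["password storage", "session management", "JWT validation"],
    "INPUT": ["injection prevention", "XSS", "path traversal"],
    "STOR": ["encryption at rest", "access controls"],
    "LOG": ["event coverage", "data protection", "structured format"],
    "UPD": ["update integrity", "signature verification", "rollback"],
    "CONFIG": ["secure defaults", "debug mode"],
    "AI": ["model integrity", "prompt injection", "data exposure"],
}


@dataclass(frozen=True)
class PromptSpec:
    category: str
    version: str
    system_prompt: str


def build_prompt(category: str) -> PromptSpec:
    """Return a placeholder prompt for a category (hypothetical wording)."""
    focus = CATEGORY_FOCUS[category]  # raises KeyError for an unknown category
    text = f"Review the finding for category {category}. Focus on: " + "; ".join(focus) + "."
    return PromptSpec(category=category, version=PROMPT_VERSION, system_prompt=text)
```

Carrying the version on every prompt object is what lets each evidence record say which prompt produced it.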
The LLM responds with structured JSON:
```json
{
  "assessment": "pass | fail | inconclusive",
  "confidence": 0.92,
  "evidence_text": "<regulatory-quality paragraph>",
  "citations": [{ "file": "...", "line": 42, "snippet": "..." }],
  "reasoning": "<chain-of-thought>",
  "recommendations": ["<remediation if fail>"]
}
```
CI/CD Usage Patterns
```bash
fleet scan --llm off --ci
fleet scan --llm ollama --ci
fleet scan --llm claude --ci
fleet scan --llm off --ci --api-url https://fleet.example.com
```
Evidence Provenance
Every LLM-generated evidence record includes provenance:
```json
{
  "llm_provenance": {
    "backend": "claude",
    "model": "claude-sonnet-4-6",
    "prompt_version": "v1.0.0",
    "token_usage": { "input": 2340, "output": 856 },
    "confidence": 0.92
  }
}
```
When prompts are updated (new prompt_version), previously generated Semi evidence is considered stale and should be re-reviewed.
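The staleness rule can be expressed as a simple comparison on the provenance block. This is a minimal sketch that assumes evidence records are available as parsed JSON dictionaries shaped like the example above; the `is_stale` helper and the choice to treat records without `llm_provenance` as not stale are assumptions, not Fleet's documented behavior.

```python
from typing import Any, Mapping

CURRENT_PROMPT_VERSION = "v1.0.0"  # assumed to be the version currently in use


def is_stale(record: Mapping[str, Any],
             current_version: str = CURRENT_PROMPT_VERSION) -> bool:
    """Return True when LLM evidence was produced by a different prompt version."""
    provenance = record.get("llm_provenance")
    if provenance is None:
        # Assumption: evidence without LLM provenance is not affected by prompt updates.
        return False
    return provenance.get("prompt_version") != current_version


record = {
    "llm_provenance": {
        "backend": "claude",
        "model": "claude-sonnet-4-6",
        "prompt_version": "v1.0.0",
    }
}
needs_rereview = is_stale(record, current_version="v1.1.0")  # hypothetical newer version
```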
Privacy Considerations
| Concern | Ollama (local) | Claude / OpenAI (cloud) |
|---|---|---|
| Data leaves machine | No | Yes — sent to the provider API |
| Secret redaction before send | Not needed (stays local) | Yes — secrets redacted to named markers |
| Request logging | Local only | Provider’s retention policy |
Before any snippet is sent to a cloud backend, Fleet runs a redaction pass that replaces 17 classes of credential — Anthropic / OpenAI / AWS / GitHub / Stripe / Slack / GCP keys, JWTs, PEM private-key blocks, and credentials embedded in URLs — with named markers such as __REDACTED_aws_access_key_id_3__. The original values are restored only in the response, via a call-scoped reverse map that is never persisted, so the audit trail records that a secret was present without ever storing the secret itself.
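The redact-then-restore flow described above can be sketched as follows. The two regular expressions are illustrative stand-ins (a common AWS access key ID shape and a classic GitHub token shape), not Fleet's actual 17 credential classes, and the function names `redact` and `restore` are hypothetical. The marker format follows the example in the text.

```python
import re
from typing import Dict, Tuple

# Illustrative patterns only; Fleet's real credential classes are not listed in full here.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def redact(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace matched secrets with named markers and return the reverse map."""
    reverse_map: Dict[str, str] = {}
    for name, pattern in PATTERNS.items():
        def _sub(match: re.Match, name: str = name) -> str:
            # Number markers in order of discovery within this call.
            marker = f"__REDACTED_{name}_{len(reverse_map) + 1}__"
            reverse_map[marker] = match.group(0)
            return marker
        text = pattern.sub(_sub, text)
    return text, reverse_map


def restore(text: str, reverse_map: Dict[str, str]) -> str:
    """Put original values back into a response; the map lives only for this call."""
    for marker, secret in reverse_map.items():
        text = text.replace(marker, secret)
    return text


snippet = 'aws_key = "AKIAIOSFODNN7EXAMPLE"'  # AWS's documented example key
safe_snippet, mapping = redact(snippet)
```

Because the reverse map is returned to the caller instead of being written anywhere, it disappears when the call finishes, which is the property the paragraph above relies on.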
For fully air-gapped workflows, the Ollama backend keeps everything on the machine — no code leaves the host at all.
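For setups that must stay on the host, it can help to confirm that the configured Ollama URL actually points at the local machine before scanning. The check below is a sketch under that assumption; `is_loopback_url` is a hypothetical helper, not a Fleet feature, and it deliberately rejects hostnames other than `localhost` because they could resolve anywhere.

```python
import ipaddress
import os
from urllib.parse import urlparse


def is_loopback_url(url: str) -> bool:
    """Return True only for localhost or a literal loopback IP address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; an arbitrary hostname is treated as non-local.
        return False


# Falls back to the default URL given in the Ollama setup section.
ollama_url = os.environ.get("FLEET_LLM_OLLAMA_URL", "http://localhost:11434")
stays_local = is_loopback_url(ollama_url)
```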