Appearance
AI resolution & verification
Two points in the pipeline use an LLM. Both go through the same provider registry, so you can swap the model behind either one with a single environment variable — no code change.
The pipeline has three conceptual stages: (1) resolution turns raw crawl data into candidate scope decisions, (2) the actual scanning/crawl work runs (no LLM — this is the workflow engine), and (3) verification checks results. The two LLM stages are 1 and 3; stage 2 is the non-AI scanning in between. These run around the DAG, not as steps inside it.
The two LLM stages
- Resolution (stage 1) — a multi-round agent that takes raw, messy discovered data and resolves it into clean scope decisions. It applies a program's existing in/out-of-scope rules to decide whether a discovered asset is in bounds and what it maps to — it doesn't invent the scope rules. Configured by
RESOLVER_PROVIDER(+ optionalRESOLVER_MODEL); it's an agent, so it gets a longer per-call ceiling (RESOLVER_TIMEOUT_SECONDS, default 300). - Verification (stage 3) — a lighter checker that verifies candidate results against a rulebook. Configured by
VERIFIER_PROVIDER(+ optionalVERIFIER_MODEL).
The provider registry
Every provider (openai, openrouter, opencode, deepseek, minimax, groq, cerebras, and a rotating meta-provider) implements one interface, IChatCompletionProvider. Each resolves its credentials/base-URL/default-model from matching Llm__<Provider>__* environment keys. Leaving a stage's *_MODEL empty uses the provider's default.
The rotating provider is a nice touch: it walks a priority list of members and, when one returns 429, parks it for a cooldown and falls through to the next — so a rate limit on one provider doesn't stall the pipeline.
Prompt caching (why the prompts are laid out the way they are)
MiniMax / DeepSeek / OpenAI automatically cache the common prefix of a request from token 0. Sonar exploits this: the large, stable instructions (the resolution system prompt, the verification rulebook) are loaded verbatim and kept at the front of every request, so they're byte-identical call to call and land as a cached prefix. The variable per-item data goes in the user message, after the cacheable block. Moving variable content forward would destroy the cache — hence the discipline. Cache hits are visible in telemetry (llm.cached_prompt_tokens).