Skip to content

AI resolution & verification

Two points in the pipeline use an LLM. Both go through the same provider registry, so you can swap the model behind either one with a single environment variable — no code change.

The pipeline has three conceptual stages: (1) resolution turns raw crawl data into candidate scope decisions, (2) the actual scanning/crawl work runs (no LLM — this is the workflow engine), and (3) verification checks results. The two LLM stages are 1 and 3; stage 2 is the non-AI scanning in between. These run around the DAG, not as steps inside it.

The two LLM stages

  • Resolution (stage 1) — a multi-round agent that takes raw, messy discovered data and resolves it into clean scope decisions. It applies a program's existing in/out-of-scope rules to decide whether a discovered asset is in bounds and what it maps to — it doesn't invent the scope rules. Configured by RESOLVER_PROVIDER (+ optional RESOLVER_MODEL); it's an agent, so it gets a longer per-call ceiling (RESOLVER_TIMEOUT_SECONDS, default 300).
  • Verification (stage 3) — a lighter checker that verifies candidate results against a rulebook. Configured by VERIFIER_PROVIDER (+ optional VERIFIER_MODEL).

The provider registry

Every provider (openai, openrouter, opencode, deepseek, minimax, groq, cerebras, and a rotating meta-provider) implements one interface, IChatCompletionProvider. Each resolves its credentials/base-URL/default-model from matching Llm__<Provider>__* environment keys. Leaving a stage's *_MODEL empty uses the provider's default.

The rotating provider is a nice touch: it walks a priority list of members and, when one returns 429, parks it for a cooldown and falls through to the next — so a rate limit on one provider doesn't stall the pipeline.

Prompt caching (why the prompts are laid out the way they are)

MiniMax / DeepSeek / OpenAI automatically cache the common prefix of a request from token 0. Sonar exploits this: the large, stable instructions (the resolution system prompt, the verification rulebook) are loaded verbatim and kept at the front of every request, so they're byte-identical call to call and land as a cached prefix. The variable per-item data goes in the user message, after the cacheable block. Moving variable content forward would destroy the cache — hence the discipline. Cache hits are visible in telemetry (llm.cached_prompt_tokens).