Architecture
Sonar turns a description of a scan into structured knowledge about a target. You author a workflow once; Sonar runs it across many machines, gathers the output, and folds it into a queryable asset database — reliably, and without you babysitting it.
Vocabulary (used throughout)
- Target / Program — the thing you're scanning: a bug-bounty program (e.g. a HackerOne handle). "Target" and "program" mean the same thing here.
- Scope — a sub-target within a program (e.g. *.acme.com) that says what's in-bounds.
- Asset — anything a scan discovers about a target: a domain, IP, port, HTTP path, or detected technology. Assets are the rows in the database. See Asset model.
- Workflow — a reusable recipe: a DAG (directed acyclic graph — steps with "runs-after" edges and no loops) of scanning steps. A scan is one run of it.
- Reconciler — the background loop that every few seconds looks at a running scan and enqueues the next ready work. It's the engine; details on Reliability.
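To make the "DAG of steps" definition concrete, here is a minimal Python sketch of a workflow as a map from each step to the steps it runs after, with a check that the edges contain no loop. The `Workflow` class, its `runs_after` field, and the step names are hypothetical illustrations, not Sonar's actual schema; the check uses the standard-library `graphlib` module.

```python
from dataclasses import dataclass, field
from graphlib import CycleError, TopologicalSorter


@dataclass
class Workflow:
    """A reusable recipe: step name -> names of the steps it runs after."""
    name: str
    runs_after: dict[str, set[str]] = field(default_factory=dict)

    def execution_order(self) -> list[str]:
        """Return one valid step order, or raise ValueError if there is a loop."""
        try:
            return list(TopologicalSorter(self.runs_after).static_order())
        except CycleError as exc:
            # exc.args[1] holds the list of nodes that form the cycle.
            raise ValueError(
                f"workflow {self.name!r} is not a DAG: {exc.args[1]}"
            ) from exc


# Hypothetical steps; the names are illustrative only.
recon = Workflow("recon", {
    "enumerate-subdomains": set(),
    "resolve-dns": {"enumerate-subdomains"},
    "scan-ports": {"resolve-dns"},
    "probe-http": {"scan-ports"},
})
print(recon.execution_order())
```

A workflow whose edges loop back on themselves (for example `a` runs after `b` and `b` runs after `a`) raises `ValueError` here, which is the property the "no loops" part of the definition guarantees.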
The moving parts
    author / operate
          │
    ┌─────▼──────┐        ┌──────────────┐
    │ Sonar API  │◀──────▶│  PostgreSQL  │  programs, scopes, assets,
    │ (backend)  │        │              │  workflows, scans, tasks
    └─────┬──────┘        └──────────────┘
          │ enqueue task
    ┌─────▼──────┐
    │  RabbitMQ  │  message queue (task dispatch)
    └─────┬──────┘
          │ pull
    ┌─────▼──────┐        ┌──────────────┐
    │  Workers   │───────▶│    MinIO     │  raw tool output (files)
    │ (a fleet)  │        │  (S3-like)   │
    └─────┬──────┘        └───────┬──────┘
          │ result webhook        │ read output
    ┌─────▼───────────────────────▼───────┐
    │ Sonar API: parse + bulk-upsert into │
    │ the asset tables (domains, ports…)  │
    └─────────────────────────────────────┘

- Backend (Sonar API) — the brain. A .NET service that stores everything in PostgreSQL, exposes the REST API, and runs the background reconciler (the loop that drives each scan forward one step at a time). It serves the frontend too.
- RabbitMQ — the task queue. The backend enqueues one message per unit of work; workers pull from it. Decoupling dispatch from execution is what lets the fleet scale.
- Workers — disposable machines (VPS, GitHub runners, AWS) that pull a task, run a shell command (a scanning tool), upload the output, and report back. They hold no state.
- MinIO — S3-compatible object storage for the raw tool output (often large files). The DB stores structured facts; MinIO stores the artifacts behind them.
- Observability — OpenTelemetry → Prometheus / Loki / Tempo / Grafana for metrics, logs, and traces across the whole flow.
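The worker's job (pull a task, run a shell command, upload the output, report back) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Sonar's worker code: the `Task` fields, the `handle_task` function, and the result payload are hypothetical, a plain list stands in for RabbitMQ, a dict stands in for MinIO, and a callback stands in for the result webhook.

```python
import subprocess
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    task_id: str      # hypothetical identifier assigned by the backend
    command: str      # shell command that runs the scanning tool
    output_key: str   # object-storage key for the raw output


def handle_task(task: Task, store: dict[str, bytes], report) -> None:
    """Process one task: run the tool, upload its output, report back.

    `store` stands in for MinIO and `report` for the result webhook.
    """
    proc = subprocess.run(task.command, shell=True, capture_output=True)
    store[task.output_key] = proc.stdout      # "upload" the raw output
    report({
        "task_id": task.task_id,
        "exit_code": proc.returncode,
        "output_key": task.output_key,        # where the backend can read it
    })


# In-memory stand-ins so the sketch runs without RabbitMQ or MinIO.
queue = [Task("t1", "echo acme.com", "scans/1/t1.txt")]
object_store: dict[str, bytes] = {}
results: list[dict] = []

while queue:  # a real worker would block on the message queue instead
    handle_task(queue.pop(0), object_store, results.append)
```

Everything the function needs arrives in the task and everything it produces leaves through the store and the report, which is what lets a worker be thrown away between tasks.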
The lifecycle of a scan, in one breath
- You define a workflow (a DAG of steps) and start a scan of it, optionally scoped to one target.
- The backend captures a snapshot of the workflow (for faithful history/display — see the caveat) and marks the scan Running.
- Every few seconds the reconciler looks at the scan, works out which steps are ready, and enqueues tasks to RabbitMQ within the fleet's capacity.
- Workers pull tasks, run the tool, upload output to MinIO, and POST a result back.
- The backend parses the output and bulk-upserts it into the asset tables; that in turn unlocks downstream steps.
- When every step is done, the scan is Completed (or CompletedWithErrors).
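The reconciler's part of this lifecycle can be sketched as a single pass that picks the ready steps, caps them at the fleet's free capacity, and derives the scan status. This is a simplified illustration, not the actual implementation: the `StepState` values, the `reconcile` and `scan_status` functions, and the `free_slots` parameter are hypothetical, and the sketch assumes one task per step. How a failed step affects its downstream steps is not covered here, so in this sketch they simply stay pending.

```python
from enum import Enum


class StepState(Enum):
    PENDING = "pending"
    QUEUED = "queued"
    DONE = "done"
    FAILED = "failed"


def reconcile(runs_after, states, free_slots):
    """One reconciler pass: return the steps to enqueue now.

    runs_after: step -> set of steps it must run after
    states:     step -> StepState (mutated: enqueued steps become QUEUED)
    free_slots: how many more tasks the fleet can take right now
    """
    ready = [
        step for step, deps in runs_after.items()
        if states[step] is StepState.PENDING
        and all(states[dep] is StepState.DONE for dep in deps)
    ]
    to_enqueue = ready[:max(free_slots, 0)]   # stay within fleet capacity
    for step in to_enqueue:
        states[step] = StepState.QUEUED
    return to_enqueue


def scan_status(states):
    """Derive the scan status from the step states."""
    if any(s in (StepState.PENDING, StepState.QUEUED) for s in states.values()):
        return "Running"
    failed = any(s is StepState.FAILED for s in states.values())
    return "CompletedWithErrors" if failed else "Completed"


# Hypothetical four-step workflow: two steps fan out after "dns".
dag = {"subdomains": set(), "dns": {"subdomains"}, "ports": {"dns"}, "tech": {"dns"}}
states = {step: StepState.PENDING for step in dag}

first = reconcile(dag, states, free_slots=4)    # only the root step is ready
states["subdomains"] = StepState.DONE           # a worker reported success
second = reconcile(dag, states, free_slots=4)   # unlocks "dns"
states["dns"] = StepState.DONE
third = reconcile(dag, states, free_slots=1)    # two steps ready, one slot free
```

Because each pass recomputes readiness from the stored step states, running it again after an interruption enqueues only what is still pending; steps already queued or finished are skipped.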
Each of these has a mechanism that makes it safe to interrupt, retry, and re-run — that's the subject of Reliability & consistency. First, the piece you touch most: the workflow engine.