
Output tables

A workflow step turns raw tool output into structured rows by setting saveToDb: true and naming an output table. Sonar parses the step's JSON or JSONL output and bulk-upserts it into that table. This page lists the available tables and shows how a step targets one of them.

The catalog

The tables a step may write to (OutputTableConstants):

programs, scopes, domains, ip_addresses, domain_ip_addresses, ports, http_ports, mobiles, technologies, http_path_technologies, port_technologies, http_paths, verify_http_paths.

The port/path family looks redundant but each is distinct:

  • ports — any open port (e.g. from an nmap sweep); may have no HTTP service.
  • http_ports — ports confirmed to speak HTTP/HTTPS (kept separate so an nmap scan that omits the service field never overwrites these rows).
  • http_paths — specific HTTP paths/URLs found on a host (with status code + length).
  • verify_http_paths — verification results, kept separate so a verify pass never clobbers the discovery rows in http_paths.

How a step writes to one

json
{
  "name": "Transform domains",
  "command": "python3 {INPUT_FILE} --in {INPUT_FILE_UPSTREAM} --out {OUTPUT_FILE}",
  "saveToDb": true,
  "outputTable": "domains",
  "dependsOn": [{ "stepName": "Merge & unique", "dependencyType": "Single" }]
}

The step's output must be a JSON array (or JSONL, one object per line) of objects whose fields match the table's columns. Sonar streams the output in and upserts it. For example, a step writing to domains emits:

json
[{ "value": "api.acme.com" }, { "value": "www.acme.com" }]
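A transform script for a step like "Transform domains" could look roughly like the sketch below. It assumes the upstream file holds one hostname per line, which this page does not state; the function names are hypothetical, and only the --in/--out flags and the {"value": ...} row shape come from the examples above.

```python
import argparse
import json


def to_domain_rows(lines):
    """Turn raw lines into rows shaped like {"value": "<hostname>"}."""
    rows = []
    for line in lines:
        name = line.strip()
        if name:  # skip blank lines
            rows.append({"value": name})
    return rows


def main(argv=None):
    # Flags mirror the step command: --in {INPUT_FILE_UPSTREAM} --out {OUTPUT_FILE}
    parser = argparse.ArgumentParser()
    parser.add_argument("--in", dest="in_path", required=True)
    parser.add_argument("--out", dest="out_path", required=True)
    args = parser.parse_args(argv)

    # Assumption: the upstream file is plain text, one hostname per line.
    with open(args.in_path, encoding="utf-8") as src:
        rows = to_domain_rows(src)

    # Write a single JSON array of row objects.
    with open(args.out_path, "w", encoding="utf-8") as dst:
        json.dump(rows, dst)

# A real script would end by calling main().
```

Keeping the row-building logic in its own function makes it easy to check the output shape without touching the filesystem.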

and a step writing to http_paths emits rows like:

json
[{ "domainValue": "api.acme.com", "port": "443", "value": "/admin", "statusCode": 200, "length": 5123 }]
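Since JSONL is also accepted, the same rows can be written one object per line instead of as a single array. The helper below is a minimal sketch: the name to_jsonl is hypothetical, and the field names are copied from the http_paths example above rather than from the authoritative guide.

```python
import json


def to_jsonl(rows):
    """Serialize rows as JSONL: one compact JSON object per line."""
    return "".join(json.dumps(row, separators=(",", ":")) + "\n" for row in rows)


rows = [
    {"domainValue": "api.acme.com", "port": "443", "value": "/admin",
     "statusCode": 200, "length": 5123},
    {"domainValue": "api.acme.com", "port": "443", "value": "/login",
     "statusCode": 302, "length": 0},
]
jsonl_text = to_jsonl(rows)
```

JSONL suits large outputs because each line can be parsed on its own, without loading the whole file first.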

Get the exact columns for any table

Don't guess the schema. The authoritative per-table field guide (required columns, conflict keys, human descriptions) is served at GET /api/output-tables/guides — fetch it before authoring a saveToDb step.
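Fetching the guide from a script might look like the sketch below. Only the path /api/output-tables/guides comes from this page; the base URL, any authentication, and the shape of the response are assumptions, so the code returns the parsed JSON untouched. The missing_columns helper is hypothetical and expects a list of required column names that you read out of the guide yourself.

```python
import json
import urllib.request

GUIDES_PATH = "/api/output-tables/guides"  # path taken from this page


def guides_url(base_url):
    """Join a Sonar base URL (an assumption, e.g. your own deployment) with the guides path."""
    return base_url.rstrip("/") + GUIDES_PATH


def fetch_guides(base_url, timeout=10):
    """GET the guides and return the parsed JSON as-is (response shape not assumed)."""
    request = urllib.request.Request(
        guides_url(base_url), headers={"Accept": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def missing_columns(row, required):
    """Return the required column names that are absent from a row."""
    return [column for column in required if column not in row]
```

For example, missing_columns({"value": "/admin"}, ["domainValue", "value"]) reports that domainValue is absent, which is cheaper to catch before the step runs than after the upsert fails.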

Why writes are safe to repeat

Each table declares its conflict keys, the columns that make a row unique (e.g. domains: value; scopes: (program_id, name); domain_ip_addresses: domain_id). The upsert uses ON CONFLICT (conflict keys), so re-processing the same output updates existing rows instead of duplicating them. The column names are resolved from the live EF Core model, so the mapping can never drift from the schema. This is the same idempotency that makes the whole pipeline safe to retry — see Reliability.

Every write also stamps updated_at = NOW(), which doubles as the target's recency heartbeat.
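The effect of a conflict-key upsert can be demonstrated with a small stand-in. The sketch below uses an in-memory SQLite database rather than Sonar's actual database or EF Core code, and the two-column domains table is a simplification invented for the demo; only the conflict key (value) and the updated_at stamp come from this page.

```python
import sqlite3


def upsert_domains(conn, rows):
    """Insert rows, or refresh updated_at when the conflict key (value) already exists."""
    conn.executemany(
        """
        INSERT INTO domains (value, updated_at)
        VALUES (:value, CURRENT_TIMESTAMP)
        ON CONFLICT (value) DO UPDATE SET updated_at = CURRENT_TIMESTAMP
        """,
        rows,
    )
    conn.commit()


# Simplified stand-in table: "value" plays the role of the conflict key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE domains (value TEXT PRIMARY KEY, updated_at TEXT NOT NULL)"
)

rows = [{"value": "api.acme.com"}, {"value": "www.acme.com"}]
upsert_domains(conn, rows)
upsert_domains(conn, rows)  # re-processing the same output

count = conn.execute("SELECT COUNT(*) FROM domains").fetchone()[0]
print(count)  # prints 2: the second pass updated rows instead of duplicating them
```

Running the same batch twice leaves two rows, which is the property that makes a retried step harmless.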