Skill: author-scan-workflow

Rendered verbatim from apps/mcp-server/skills/author-scan-workflow/SKILL.md — this is exactly what the agent runs on.

Author a scan workflow

A WorkflowDefinition is a reusable DAG template; a Scan is one run of it. Each step runs exactly one shell command on a worker. You author the whole workflow as one JSON document and submit it with the import_workflow MCP tool.

Export JSON shape (what import_workflow accepts)

json
{
  "name": "my-recon",
  "description": "…",
  "category": "recon",
  "metadata": {},
  "secrets": [
    { "name": "wildcard", "defaultValue": "", "description": "target wildcard",
      "isRequired": false, "isActive": true, "kind": "Parameter" }
  ],
  "steps": [
    {
      "name": "Select domains",
      "description": "pull in-scope domains",
      "command": "python3 {INPUT_FILE} --input-file {INPUT_FILE_SQL} --output-file {OUTPUT_FILE}",
      "executionLocation": "Vps",
      "workerImage": "Standard",
      "targetTags": ["vps.standard"],
      "maxConcurrent": 1,
      "variables": {
        "INPUT_FILE": "/scripts/scan_technologies/to_urls.py",
        "OUTPUT_FILE": "/scan-results/{scanId}/step1/urls.txt",
        "INPUT_FILE_SQL": "/scan-results/{scanId}/step1/input.csv"
      },
      "inputSource": "CustomSql",
      "inputSql": "SELECT DISTINCT d.value FROM domains d JOIN ... WHERE s.updated_at >= {PHASE_STARTED_AT};",
      "saveToDb": false,
      "outputTable": null,
      "dependsOn": []
    }
  ]
}

Dependencies are declared by step name in dependsOn[].stepName (resolved to IDs on import).

Dependency semantics (dependsOn[].dependencyType)

  • All (fan-in): the step runs once, after ALL upstream tasks complete. Use to merge.
  • Single (streaming fan-out): one downstream task per completed upstream task, created as they finish. The upstream step MUST emit OUTPUT or OUTPUT_FILE.
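To make the two dependency types concrete, the sketch below counts how many downstream tasks each type would create for a given number of completed upstream tasks. It is a toy model of the semantics described above, not the scheduler's code; downstream_task_count is a hypothetical helper introduced only for illustration.

```python
def downstream_task_count(dependency_type: str, upstream_tasks: int) -> int:
    """Toy model: number of downstream tasks created for one dependency."""
    if dependency_type == "All":
        # Fan-in: a single task, created once every upstream task has completed.
        return 1
    if dependency_type == "Single":
        # Streaming fan-out: one task per completed upstream task.
        return upstream_tasks
    raise ValueError(f"unknown dependencyType: {dependency_type!r}")


# Ten upstream tasks feeding a merge step versus a per-result transform step.
print(downstream_task_count("All", 10))     # 1
print(downstream_task_count("Single", 10))  # 10
```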

Variable conventions (resolved at dispatch)

  • {OUTPUT} / {OUTPUT_FILE} — this step's stdout / output file. Declare OUTPUT_FILE in variables if downstream consumes a file.
  • {INPUT_UPSTREAM} / {INPUT_FILE_UPSTREAM} — the upstream OUTPUT/OUTPUT_FILE fed into a Single-dep child.
  • {INPUT_FILE_SQL} — CSV/JSON file materialized from inputSql (needs inputSource: "CustomSql").
  • {INPUT_PATH_EXPAND} (one child per file in a MinIO dir) / {INPUT_FILE_EXPAND} (one child per line).
  • {PARAM.x} — a declared Parameter secret (visible, targeting). {SECRETS.x} — a masked Secret (command-only).
  • {scanId} / {taskId} / {random} — run/task identifiers.
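For a rough local preview of what a step's command looks like after substitution, something like the following may help. It is a minimal sketch that assumes plain textual replacement of {NAME} tokens; the real dispatcher may resolve variables differently, and render and preview_command are hypothetical helpers, not part of the platform.

```python
import re

# Matches {NAME} and {PARAM.name}; a bare {} (as used by xargs or find) is left alone.
PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_.]*)\}")


def render(template: str, values: dict[str, str]) -> str:
    """Replace known placeholders; leave unknown ones untouched."""
    return PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), template)


def preview_command(step: dict, run_ids: dict[str, str]) -> str:
    """Approximate the command a worker would receive for one step."""
    # Variable values may themselves contain run identifiers such as {scanId}.
    variables = {k: render(v, run_ids) for k, v in step.get("variables", {}).items()}
    return render(step["command"], {**run_ids, **variables})


step = {
    "command": "sed '/^\\*$/d' {INPUT_FILE} > {OUTPUT_FILE}",
    "variables": {
        "INPUT_FILE": "/scan-results/{scanId}/scan_subdomains/step3/output.txt",
        "OUTPUT_FILE": "/scan-results/{scanId}/scan_subdomains/step4/output.txt",
    },
}
print(preview_command(step, {"scanId": "42"}))
```

Placeholders the sketch does not know about (for example {INPUT_FILE_UPSTREAM}, which only exists at dispatch) are left in place, so they remain visible in the preview.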

Persisting results (the output-table catalog)

Set saveToDb: true and outputTable to one of: programs, scopes, domains, ip_addresses, domain_ip_addresses, ports, http_ports, mobiles, technologies, http_path_technologies, port_technologies, http_paths, verify_http_paths. The step must emit JSON/JSONL matching that table; it is bulk-upserted. (Query the live schema/field guide via the output-table guides endpoint if unsure.)
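A typo in outputTable is easy to catch before importing. The sketch below checks a step against the table names listed above; the set is copied from this section, and check_output_table is a hypothetical helper. Whether the importer performs the same check is not stated here.

```python
# Table names copied from the catalog paragraph above.
OUTPUT_TABLES = {
    "programs", "scopes", "domains", "ip_addresses", "domain_ip_addresses",
    "ports", "http_ports", "mobiles", "technologies", "http_path_technologies",
    "port_technologies", "http_paths", "verify_http_paths",
}


def check_output_table(step: dict) -> list[str]:
    """Return problems with a step's saveToDb/outputTable pairing (empty if fine)."""
    problems = []
    if step.get("saveToDb"):
        table = step.get("outputTable")
        if table not in OUTPUT_TABLES:
            problems.append(f"{step['name']}: outputTable {table!r} is not in the catalog")
    return problems
```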

Validator checklist (import fails if any is violated)

  1. CustomSql input source requires INPUT_FILE_SQL in variables.
  2. A step with CustomSql cannot also have a Single dependency.
  3. At most one Single dependency per step.
  4. A Single-dep upstream step must declare OUTPUT or OUTPUT_FILE.
  5. No self-dependencies, no duplicate dependencies.
  6. Reserved variable names (the ones above) cannot be user-declared in variables.
  7. Every {PARAM.x} used must be a declared Parameter secret.
  8. maxConcurrent >= 1.
  9. The dependency graph must be acyclic.
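Most of the checklist can be approximated locally before calling import_workflow. The following is a minimal sketch, not the server's validator: preflight is a hypothetical helper, it reads "declare" as "present in variables", it treats a missing maxConcurrent as valid, and it skips rule 6 because the exact reserved-name list is not spelled out here. The import tool remains the authority.

```python
import re
from graphlib import CycleError, TopologicalSorter


def preflight(workflow: dict) -> list[str]:
    """Return checklist violations found locally (empty list means none found)."""
    errors = []
    steps = {s["name"]: s for s in workflow.get("steps", [])}
    # Names of declared inputs whose kind is Parameter (rule 7).
    params = {s["name"] for s in workflow.get("secrets", []) if s.get("kind") == "Parameter"}
    graph = {}

    for name, step in steps.items():
        variables = step.get("variables") or {}
        deps = step.get("dependsOn") or []
        dep_names = [d["stepName"] for d in deps]
        singles = [d["stepName"] for d in deps if d.get("dependencyType") == "Single"]
        graph[name] = set(dep_names)

        if step.get("inputSource") == "CustomSql":
            if "INPUT_FILE_SQL" not in variables:
                errors.append(f"{name}: rule 1, CustomSql needs INPUT_FILE_SQL")
            if singles:
                errors.append(f"{name}: rule 2, CustomSql with a Single dependency")
        if len(singles) > 1:
            errors.append(f"{name}: rule 3, more than one Single dependency")
        for upstream in singles:
            up_vars = (steps.get(upstream) or {}).get("variables") or {}
            if not {"OUTPUT", "OUTPUT_FILE"} & set(up_vars):
                errors.append(f"{name}: rule 4, {upstream} declares no OUTPUT/OUTPUT_FILE")
        if name in dep_names or len(dep_names) != len(set(dep_names)):
            errors.append(f"{name}: rule 5, self or duplicate dependency")
        for param in re.findall(r"\{PARAM\.([^}]+)\}", step.get("command", "")):
            if param not in params:
                errors.append(f"{name}: rule 7, undeclared parameter {param}")
        max_concurrent = step.get("maxConcurrent")
        if max_concurrent is not None and max_concurrent < 1:
            errors.append(f"{name}: rule 8, maxConcurrent must be >= 1")

    try:
        # static_order() raises CycleError if the graph has a cycle (rule 9).
        list(TopologicalSorter(graph).static_order())
    except CycleError:
        errors.append("rule 9, dependency graph has a cycle")
    return errors
```

A typical use is to load the JSON document with json.load, call preflight on it, and only submit when the returned list is empty.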

Reference example

Adapt apps/sonar/storage/scripts/workflows/scan-subdomain.json: parallel recon tools (subfinder/assetfinder/chaos) → an All-dep "merge & unique" step → a Single-dep "transform output" step with saveToDb: true, outputTable: "domains".

Publish it

Call import_workflow with the JSON. Fix any validation error it returns and retry. Then run it with the run-scan skill.

Reference workflow (complete)

The skill cites scan-subdomain.json; here it is in full — a real multi-step recon DAG (parallel enumerators → All-dependency merge → Single-dependency transform that saves to the domains table). Adapt it rather than starting from scratch.

json
{
  "name": "Scan Subdomain",
  "description": "",
  "category": "Subdomain",
  "isActive": true,
  "metadata": null,
  "secrets": [
    {
      "name": "CHAOS_API_KEY",
      "defaultValue": null,
      "description": null,
      "isRequired": true,
      "isActive": true
    }
  ],
  "steps": [
    {
      "name": "Puredns",
      "description": null,
      "command": "puredns resolve {INPUT_PATH_EXPAND} --quiet --threads 10 --resolvers {INPUT_FILE_1} --skip-validation --write {OUTPUT_FILE} > /dev/null",
      "executionLocation": "Aws",
      "workerImage": "Standard",
      "variables": {
        "OUTPUT_FILE": "/scan-results/{scanId}/scan_subdomains/step7/{random}.txt",
        "INPUT_FILE_1": "/lists/nameservers.txt",
        "INPUT_PATH_EXPAND": "/scan-results/{scanId}/scan_subdomains/step6/"
      },
      "inputSource": "None",
      "inputSql": null,
      "saveToDb": false,
      "outputTable": null,
      "dependsOn": [
        {
          "stepName": "Splits files",
          "dependencyType": "All"
        }
      ]
    },
    {
      "name": "Subfinder",
      "description": null,
      "command": "subfinder -silent -dL {INPUT_PATH_EXPAND} > {OUTPUT_FILE}",
      "executionLocation": "Aws",
      "workerImage": "Standard",
      "variables": {
        "OUTPUT_FILE": "/scan-results/{scanId}/scan_subdomains/step2/subfinder/{random}.txt",
        "INPUT_PATH_EXPAND": "/scan-results/{scanId}/scan_subdomains/step1/output/"
      },
      "inputSource": "None",
      "inputSql": null,
      "saveToDb": false,
      "outputTable": null,
      "dependsOn": [
        {
          "stepName": "Get inputs & Splits files",
          "dependencyType": "All"
        }
      ]
    },
    {
      "name": "Get inputs & Splits files",
      "description": null,
      "command": "python3 {INPUT_FILE_1} --input-file {INPUT_FILE_SQL} --split 10 --output-path {OUTPUT_PATH}",
      "executionLocation": "Vps",
      "workerImage": "Standard",
      "variables": {
        "OUTPUT_PATH": "/scan-results/{scanId}/scan_subdomains/step1/output/",
        "INPUT_FILE_1": "/scripts/scan_subdomains/split_to_txt.py",
        "INPUT_FILE_SQL": "/scan-results/{scanId}/scan_subdomains/step1/input/input.json"
      },
      "inputSource": "CustomSql",
      "inputSql": "SELECT ws.scope_id, w.value AS value, w.original AS original\nFROM wildcards w\nJOIN wildcard_scopes ws ON ws.wildcard_id = w.id\nWHERE w.updated_at >= {PHASE_STARTED_AT}",
      "saveToDb": false,
      "outputTable": null,
      "dependsOn": []
    },
    {
      "name": "Remove OOS domains",
      "description": null,
      "command": "python3 {INPUT_FILE} --input-file {INPUT_FILE_1} --suffix-file {INPUT_FILE_2} --output-file {OUTPUT_FILE}",
      "executionLocation": "Vps",
      "workerImage": "Standard",
      "variables": {
        "INPUT_FILE": "/scripts/scan_subdomains/filter_domains.py",
        "OUTPUT_FILE": "/scan-results/{scanId}/scan_subdomains/step5/output.txt",
        "INPUT_FILE_1": "/scan-results/{scanId}/scan_subdomains/step4/output.txt",
        "INPUT_FILE_2": "/lists/suffixes.txt"
      },
      "inputSource": "None",
      "inputSql": null,
      "saveToDb": false,
      "outputTable": null,
      "dependsOn": [
        {
          "stepName": "Remove * in files",
          "dependencyType": "All"
        }
      ]
    },
    {
      "name": "Assetfinder",
      "description": null,
      "command": "cat {INPUT_PATH_EXPAND} | xargs -I{} assetfinder --subs-only {} > {OUTPUT_FILE}",
      "executionLocation": "Aws",
      "workerImage": "Standard",
      "variables": {
        "OUTPUT_FILE": "/scan-results/{scanId}/scan_subdomains/step2/assetfinder/{random}.txt",
        "INPUT_PATH_EXPAND": "/scan-results/{scanId}/scan_subdomains/step1/output/"
      },
      "inputSource": "None",
      "inputSql": null,
      "saveToDb": false,
      "outputTable": null,
      "dependsOn": [
        {
          "stepName": "Get inputs & Splits files",
          "dependencyType": "All"
        }
      ]
    },
    {
      "name": "Splits files",
      "description": null,
      "command": "split -l 1000 -a 4 {INPUT_FILE} {OUTPUT_PATH}output_",
      "executionLocation": "Vps",
      "workerImage": "Standard",
      "variables": {
        "INPUT_FILE": "/scan-results/{scanId}/scan_subdomains/step5/output.txt",
        "OUTPUT_PATH": "/scan-results/{scanId}/scan_subdomains/step6/"
      },
      "inputSource": "None",
      "inputSql": null,
      "saveToDb": false,
      "outputTable": null,
      "dependsOn": [
        {
          "stepName": "Remove OOS domains",
          "dependencyType": "All"
        }
      ]
    },
    {
      "name": "Chaos-client",
      "description": null,
      "command": "chaos -key {SECRETS.CHAOS_API_KEY} -silent -dL {INPUT_PATH_EXPAND} > {OUTPUT_FILE}",
      "executionLocation": "Aws",
      "workerImage": "Standard",
      "variables": {
        "OUTPUT_FILE": "/scan-results/{scanId}/scan_subdomains/step2/chaos_client/{random}.txt",
        "INPUT_PATH_EXPAND": "/scan-results/{scanId}/scan_subdomains/step1/output/"
      },
      "inputSource": "None",
      "inputSql": null,
      "saveToDb": false,
      "outputTable": null,
      "dependsOn": [
        {
          "stepName": "Get inputs & Splits files",
          "dependencyType": "All"
        }
      ]
    },
    {
      "name": "Remove * in files",
      "description": null,
      "command": "sed '/^\\*$/d' {INPUT_FILE} > {OUTPUT_FILE}",
      "executionLocation": "Vps",
      "workerImage": "Standard",
      "variables": {
        "INPUT_FILE": "/scan-results/{scanId}/scan_subdomains/step3/output.txt",
        "OUTPUT_FILE": "/scan-results/{scanId}/scan_subdomains/step4/output.txt"
      },
      "inputSource": "None",
      "inputSql": null,
      "saveToDb": false,
      "outputTable": null,
      "dependsOn": [
        {
          "stepName": "Merge files & Make unique",
          "dependencyType": "All"
        }
      ]
    },
    {
      "name": "Merge files & Make unique",
      "description": null,
      "command": "find {INPUT_PATH} {INPUT_PATH_1} {INPUT_PATH_2} -name '*.txt' -exec cat {} + | sort -u > {OUTPUT_FILE}",
      "executionLocation": "Vps",
      "workerImage": "Standard",
      "variables": {
        "INPUT_PATH": "/scan-results/{scanId}/scan_subdomains/step2/assetfinder/",
        "OUTPUT_FILE": "/scan-results/{scanId}/scan_subdomains/step3/output.txt",
        "INPUT_PATH_1": "/scan-results/{scanId}/scan_subdomains/step2/chaos_client/",
        "INPUT_PATH_2": "/scan-results/{scanId}/scan_subdomains/step2/subfinder/"
      },
      "inputSource": "None",
      "inputSql": null,
      "saveToDb": false,
      "outputTable": null,
      "dependsOn": [
        {
          "stepName": "Subfinder",
          "dependencyType": "All"
        },
        {
          "stepName": "Assetfinder",
          "dependencyType": "All"
        },
        {
          "stepName": "Chaos-client",
          "dependencyType": "All"
        }
      ]
    },
    {
      "name": "Transform output",
      "description": null,
      "command": "python3 {INPUT_FILE_1} --initial-input-file {INPUT_FILE_2} --domain-file {INPUT_FILE_UPSTREAM} --output-file {OUTPUT_FILE}",
      "executionLocation": "Vps",
      "workerImage": "Standard",
      "variables": {
        "OUTPUT_FILE": "/scan-results/{scanId}/scan_subdomains/step8/{random}.txt",
        "INPUT_FILE_1": "/scripts/scan_subdomains/transform_output.py",
        "INPUT_FILE_2": "/scan-results/{scanId}/scan_subdomains/step1/input/input.json"
      },
      "inputSource": "None",
      "inputSql": null,
      "saveToDb": true,
      "outputTable": "domains",
      "dependsOn": [
        {
          "stepName": "Puredns",
          "dependencyType": "Single"
        }
      ]
    }
  ]
}
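The steps in the file are not listed in execution order, so it can help to derive the order from dependsOn. The sketch below does that with the standard-library graphlib module; the DEPENDS_ON mapping is transcribed by hand from the JSON above, and dependency_map and execution_stages are hypothetical helpers. It only reflects the dependency graph, not how the scheduler actually batches tasks.

```python
from graphlib import TopologicalSorter

# Step name -> upstream step names, copied from the dependsOn arrays above.
DEPENDS_ON = {
    "Get inputs & Splits files": [],
    "Subfinder": ["Get inputs & Splits files"],
    "Assetfinder": ["Get inputs & Splits files"],
    "Chaos-client": ["Get inputs & Splits files"],
    "Merge files & Make unique": ["Subfinder", "Assetfinder", "Chaos-client"],
    "Remove * in files": ["Merge files & Make unique"],
    "Remove OOS domains": ["Remove * in files"],
    "Splits files": ["Remove OOS domains"],
    "Puredns": ["Splits files"],
    "Transform output": ["Puredns"],
}


def dependency_map(workflow: dict) -> dict[str, list[str]]:
    """Build the same mapping from a loaded workflow document."""
    return {
        step["name"]: [dep["stepName"] for dep in step.get("dependsOn") or []]
        for step in workflow["steps"]
    }


def execution_stages(depends_on: dict[str, list[str]]) -> list[list[str]]:
    """Group steps into stages; every step in a stage has all its upstreams done."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


for number, stage in enumerate(execution_stages(DEPENDS_ON), start=1):
    print(number, ", ".join(stage))
```

The three enumerators land in the same stage, and the resulting eight stages line up with the step1 to step8 directories used in the variables.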

Parameter vs Secret

Each declared input has a kind: Parameter (a visible targeting value, referenced as {PARAM.name}) or Secret (masked, command-only, referenced as {SECRETS.name}). At run time you pass values for these via create_scan's parameters / secretValues — the keys must match the declared names.
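One way to keep the keys aligned with the declared names is to build the two maps from the workflow's secrets array. This is a sketch under stated assumptions: split_scan_inputs is a hypothetical helper, an entry without a kind (as in the reference workflow) is treated as a Secret, and a required entry counts as satisfied if it has a non-empty defaultValue. The exact create_scan argument shape beyond parameters and secretValues is not specified here.

```python
def split_scan_inputs(declared: list[dict], values: dict[str, str]) -> dict:
    """Sort run-time values into parameters and secretValues by declared kind."""
    by_name = {d["name"]: d for d in declared}
    unknown = sorted(set(values) - set(by_name))
    if unknown:
        raise KeyError(f"values for undeclared names: {unknown}")
    missing = sorted(
        d["name"] for d in declared
        if d.get("isRequired") and d["name"] not in values and not d.get("defaultValue")
    )
    if missing:
        raise KeyError(f"missing required values: {missing}")
    parameters, secret_values = {}, {}
    for name, value in values.items():
        # Assumption: an entry without "kind" is a Secret.
        kind = by_name[name].get("kind", "Secret")
        (parameters if kind == "Parameter" else secret_values)[name] = value
    return {"parameters": parameters, "secretValues": secret_values}
```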

Before you write a saveToDb step

Fetch the exact columns for your target table from GET /api/output-tables/guides (a plain authenticated HTTP GET against the backend), and see Output tables for what each table means and an example row.
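A request along these lines should be enough to pull the guides with only the standard library. Treat it as a sketch: the base URL, the bearer-token header, and the JSON response are assumptions about your deployment, and build_guides_request and fetch_guides are hypothetical names. Only the path /api/output-tables/guides comes from this page.

```python
import json
import urllib.request


def build_guides_request(base_url: str, token: str) -> urllib.request.Request:
    """Build the GET request for the output-table guides endpoint."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/output-tables/guides",
        # Assumption: bearer-token auth; substitute whatever your backend expects.
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
        method="GET",
    )


def fetch_guides(base_url: str, token: str):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_guides_request(base_url, token), timeout=30) as resp:
        return json.load(resp)
```

Splitting request construction from sending keeps the URL and headers easy to inspect without touching the network.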


Next: once the workflow imports cleanly, run it with the run-scan skill.