Authoring Parsers
Use a coding agent to scaffold a new nano parser from a sample log, validate it with Vector, and ship it
Authoring Parsers with a Coding Agent
The hosted nano product uses pivt AI to turn a pasted sample log into a working parser.yaml. In the OSS build you do the same thing with your coding agent — it reads the parsers repo for shape and conventions, drafts a parser, and you validate it with the Vector CLI.
This page assumes you've already completed Setup and have ~/nano-workspace/parsers/ cloned locally.
Parser file shape (refresher)
Every nano parser lives in its own directory under parsers/parsers/<name>/parser.yaml. The format:
name: my_log_source
display_name: My Log Source
version: "1.0.0"
description: "Parses ... events from ... covering ..."
match_values:
- my_log_source
- alternate_name
category: endpoint # or network, cloud, identity, etc.
vendor: "Vendor Name"
product: "Product Name"
parser_vrl: |
  # VRL (Vector Remap Language) program goes here.
  # Populate .udm, .ext, and .metadata from the raw event.
  raw_log = string!(.message)
  parsed, err = parse_json(raw_log)
  if err == null {
    .udm.event_type = string!(parsed.action)
    .udm.user = string!(parsed.username)
    # ... map remaining fields
  }
The contract is: read .message (raw event), produce .udm.* (normalized fields per the UDM reference), and spill anything that doesn't fit into .ext.*. Look at parsers/parsers/windows_event/parser.yaml and parsers/parsers/sysmon/parser.yaml for fully worked examples; the agent will use these as templates.
Workflow: from sample log to deployed parser
Step 1: Capture a sample event
Save 5–10 representative events from your log source to a file the agent can read. JSON one-per-line, raw syslog, or CSV — whatever the source actually emits.
cat > /tmp/myapp-sample.jsonl <<'EOF'
{"ts":"2026-05-09T10:00:00Z","action":"login","user":"alice","src":"10.0.0.5","status":"ok"}
{"ts":"2026-05-09T10:00:01Z","action":"login","user":"bob","src":"10.0.0.6","status":"failed","reason":"bad_password"}
{"ts":"2026-05-09T10:00:02Z","action":"logout","user":"alice","src":"10.0.0.5"}
EOF
Step 2: Ask the agent to draft the parser
A good prompt names the source, points at the samples, and points at a reference parser whose shape is similar. Example:
I need a nano parser for "myapp". Sample events are in /tmp/myapp-sample.jsonl. Use ~/nano-workspace/parsers/parsers/okta/parser.yaml as a structural reference (similar JSON-with-action-and-user shape). Map action to udm.event_type, user to udm.user, src to udm.src_ip, status to udm.outcome. Put anything else under .ext. Write the parser to parsers/parsers/myapp/parser.yaml.
The agent will read the reference parser, read the samples, and produce a draft. Review the VRL — agents are good at the structural skeleton but can hallucinate VRL functions that don't exist. Cross-check anything unfamiliar against vector.dev/docs/reference/vrl.
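One way to make that cross-check systematic is to list every function the draft calls and look each name up in the VRL reference. The sketch below is an assumption about tooling, not something shipped with the parsers repo: list_vrl_calls is a hypothetical helper built on grep, and it will also report anything shaped like name( that appears in comments or in the description string.

```shell
# List each distinct function name called in a VRL draft (or in a
# parser.yaml that embeds one), so every name can be checked by hand.
# list_vrl_calls is a hypothetical helper name.
list_vrl_calls() {
  # Match identifiers followed by "(" or "!(" and strip the punctuation.
  grep -oE '[a-z_][a-z0-9_]*!?\(' "$1" | tr -d '!(' | sort -u
}

# Usage (path taken from the example prompt):
#   list_vrl_calls parsers/parsers/myapp/parser.yaml
```

Any name in the output that does not appear in the VRL function reference is a candidate hallucination to hand back to the agent.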
Step 3: Validate with Vector
The parsers repo's contract is: a parser is correct if vector vrl runs the parser_vrl block against a sample event without error and produces the expected .udm shape.
Extract the VRL block to a standalone file (or have the agent do it):
# Pull just the VRL out of the YAML for testing
yq '.parser_vrl' parsers/parsers/myapp/parser.yaml > /tmp/myapp.vrl
Run it against a single event:
echo '{"message":"{\"ts\":\"2026-05-09T10:00:00Z\",\"action\":\"login\",\"user\":\"alice\"}"}' \
| vector vrl --program /tmp/myapp.vrl --input /dev/stdin --print-object
You should see a JSON object with .udm.event_type = "login", .udm.user = "alice", and the timestamp parsed. If the program errors, paste the error back to the agent so it can iterate.
Vector's VRL is strict: to_int(...) is fallible, and the program won't compile until the error is handled, either with ?? (for example ?? null) or by capturing the error and checking it in an if err == null block. Most agent-introduced bugs are unhandled fallibility. The reference parsers show the right pattern.
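As an illustration of the two patterns, the following writes a small VRL program to a scratch file. It is a sketch under assumptions: the attempts and port input fields, the .udm.src_port target, and the /tmp/myapp-fallible.vrl path are hypothetical and chosen only to show the shape; check the UDM reference for the real field names.

```shell
# Write a VRL sketch showing two ways to handle a fallible call.
# Field names and the output path are hypothetical.
cat > /tmp/myapp-fallible.vrl <<'EOF'
raw_log = string!(.message)
parsed, err = parse_json(raw_log)
if err == null {
  # Option 1: error coalescing. Fall back to null if conversion fails.
  .ext.attempts = to_int(parsed.attempts) ?? null

  # Option 2: capture the error and branch on it.
  port, port_err = to_int(parsed.port)
  if port_err == null {
    .udm.src_port = port
  } else {
    # Keep the unconverted value rather than dropping it.
    .ext.raw_port = parsed.port
  }
}
EOF
```

The file can be run the same way as the parser itself, by passing it to vector vrl with --program and a wrapped test event on --input.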
Step 4: Round-trip a few real events
Once a single event parses cleanly, run the full sample file through:
vector vrl --program /tmp/myapp.vrl --input /tmp/myapp-sample.jsonl --print-object
Spot-check that:
- event_type is populated for every event
- Timestamps round-trip correctly
- src_ip is a valid IP, not a string "10.0.0.5" left in .ext
- Nothing important landed in .ext that should have been a normalized UDM field
If something looks off, hand the diff back to the agent: "For event 2, outcome is missing — failed logins should map to outcome: failure."
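The refresher parser reads the raw event from .message, and the Step 3 test wraps its event in a message field. If your parser follows that shape and the sample file holds bare events, each line likely needs the same envelope before it is passed to --input. A minimal sketch, assuming one event per line with no tabs or other control characters; wrap_events and the /tmp/myapp-wrapped.jsonl path are hypothetical:

```shell
# Wrap each raw line of a sample file as {"message":"<escaped line>"}.
# Only backslashes and double quotes are escaped, so lines containing
# tabs or other control characters need a real JSON encoder instead.
wrap_events() {
  sed -e 's/\\/\\\\/g' \
      -e 's/"/\\"/g' \
      -e 's/^/{"message":"/' \
      -e 's/$/"}/' "$1"
}

# Usage:
#   wrap_events /tmp/myapp-sample.jsonl > /tmp/myapp-wrapped.jsonl
#   vector vrl --program /tmp/myapp.vrl --input /tmp/myapp-wrapped.jsonl --print-object
```

If jq is installed, jq -R -c '{message: .}' produces the same envelope and handles every JSON string escape.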
Step 5: Add it to your nano deployment
Once validated, you have two paths:
Option A — contribute upstream: open a PR against nanos-sh/parsers. The repo's contributing guide is in its README.md. This is the right path for vendors and protocols that other users will benefit from.
Option B — keep it private: import the parser directly into your deployment's UI under Ingestion → Log Sources → Repositories, or place the YAML in your own private parser repository and point your nano deployment at it under Ingestion → Repositories → Add Repository.
Tips for getting good output
- Always point the agent at 1–2 reference parsers in the prompt. The codebase has 60+ examples; a generic "write me a parser" prompt produces generic output.
- Be explicit about UDM mappings. The agent doesn't know which of your fields are user-meaningful vs. noise. Spell out the mapping in the prompt.
- Validate every iteration. A parser that looks correct can silently drop fields. vector vrl --print-object is the source of truth.
- Keep .ext for the long tail. If a field appears in 1% of events or is debugging-only, don't promote it to UDM; the agent will sometimes over-eagerly normalize.
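To make validating every iteration cheap, the extract and run commands from Steps 3 and 4 can be bundled into one function. This is a sketch, not a tool from the parsers repo: validate_parser is a hypothetical name, and it assumes yq and vector are on your PATH and that the samples file already holds events in the shape the parser expects.

```shell
# Re-extract the VRL from a parser.yaml and run it over a samples file.
# validate_parser is a hypothetical helper; the yq and vector invocations
# are the ones used in Steps 3 and 4.
validate_parser() {
  parser_yaml=$1   # e.g. parsers/parsers/myapp/parser.yaml
  samples=$2       # file of events to feed to the parser
  vrl_file=$(mktemp) || return 1

  # Pull the VRL program out of the YAML, then run it over the samples.
  yq '.parser_vrl' "$parser_yaml" > "$vrl_file" &&
    vector vrl --program "$vrl_file" --input "$samples" --print-object
  status=$?

  rm -f "$vrl_file"
  return "$status"
}

# Usage:
#   validate_parser parsers/parsers/myapp/parser.yaml /tmp/myapp-sample.jsonl
```

A non-zero exit status means extraction or the VRL run failed, which is the point at which to paste the error back to the agent.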
What's next
- Authoring detections — once your data is parsed, write detections against it
- Detection-as-Code — the underlying nanodac workflow without an agent
- UDM Field Reference — the canonical list of normalized fields