How nano routes events to parsers, and why push and pull ingestion shapes need different routing rules

Push vs Pull Routing

Every event nano ingests has to be tagged with a source_type so the right parser runs. How that tagging happens depends on the ingestion shape. Most user confusion in the Source Configuration UI comes from applying push-style thinking to pull-style sources.

This page explains the two shapes, walks through "I have N log sources via Pub/Sub, what do I do?", and gives a fully-worked routing example for each pull driver.

TL;DR: HTTP-style "one source config, N source_types" only works because the sender labels each event with a header. Pull sources (Pub/Sub, Kafka, SQS) bind to one subscription/topic/queue and the sender cannot set a header on the way in. Routing within a pull source uses broker-native attributes or content sniffing, not a source_type field, because there is no inbound source_type field on a pull event.

The Two Ingestion Shapes

nano supports two structurally different ways for events to arrive. The routing model is different for each.

Push (HTTP, Splunk HEC, Vector native)

Sender labels each event in-band, via the X-Source-Type HTTP header, the HEC sourcetype field, or the .source_type field on a Vector-to-Vector event.
One endpoint, N source types: a single HTTP source configuration can dispatch events to many parsers based on .source_type.
The default routing match field is source_type (push events have it set on arrival), and match_type=exact rules work as expected.
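The in-band labelling described above can be sketched from the sender's side using only Python's standard library. The endpoint URL and event bodies are placeholders, not documented nano values; only the X-Source-Type header name comes from this page. The requests are built but never sent.

```python
import json
import urllib.request

# Hypothetical endpoint: substitute the URL of your own nano HTTP source.
NANO_HTTP_URL = "https://nano.example.com/ingest"


def build_push_request(event: dict, source_type: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP request that labels the event in-band."""
    return urllib.request.Request(
        NANO_HTTP_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Source-Type": source_type,  # the sender picks the parser per event
        },
        method="POST",
    )


# Two events, same endpoint, different source types.
auth_req = build_push_request({"user": "alice", "action": "login"}, "auth_service")
gw_req = build_push_request({"path": "/v1/pay", "status": 200}, "api_gateway")
```

Both requests target the same listener; the only thing that differs is the label the sender attaches, which is exactly what a pull source has no equivalent for.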

Pull (GCP Pub/Sub, Apache Kafka, AWS S3 via SQS)

One source configuration = one binding (subscription / topic / queue). nano, as the consumer, cannot attach a Pub/Sub message attribute or a Kafka header to a message it receives; those have to be set by the publisher, before the message lands in the broker.
No inbound .source_type: a Pub/Sub event arriving at nano has .message, .attributes, and broker metadata, but no .source_type field. A routing rule that matches match_field=source_type, match_type=exact, match_value=foo can never fire on a pull source. It would require the field to exist on the event already, which it doesn't.
Each binding is therefore handled in one of three ways. The first two multiplex several source types within the binding; the third does not:
1. A publisher-set broker attribute (Pub/Sub message attribute, Kafka header, SQS message attribute) that the routing rule reads.
2. Content sniffing of a field inside the decoded message body.
3. Single source type for the whole feed: every event in this binding is the same source type (the mono-vendor case).

The most common mistake: configuring a Pub/Sub or Kafka source with match_field=source_type, match_type=exact. This shape is structurally meaningless for a pull source, because the inbound event has no source_type field for the rule to match against. nano's server-side coalescing treats a stray rule of this shape as a default fallback (so ingestion still works), but the right fix is to use the per-driver attribute path or switch to Single source type mode.
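To see why a push-style rule cannot fire on a pull event, the following sketch models an arriving Pub/Sub event as a nested dictionary. The event shape is illustrative only (the message_id key is a hypothetical stand-in for broker metadata), and read_field is a helper written for this example, not part of nano.

```python
from typing import Any, Optional


def read_field(event: dict, path: str) -> Optional[Any]:
    """Follow a dotted path such as '.attributes.source_type' through nested dicts."""
    current: Any = event
    for segment in path.lstrip(".").split("."):
        if not isinstance(current, dict) or segment not in current:
            return None  # the field does not exist on this event
        current = current[segment]
    return current


# Illustrative shape of a Pub/Sub event on arrival: body, attributes, broker metadata.
pull_event = {
    "message": {"user": "alice", "action": "login"},
    "attributes": {"source_type": "auth_service"},  # set by the publisher
    "message_id": "1234567890",  # hypothetical metadata field name
}

# Push-style rule: match_field=source_type, match_type=exact, match_value=auth_service
push_style_match = read_field(pull_event, "source_type") == "auth_service"

# Pull-style rule: match_field=.attributes.source_type
pull_style_match = read_field(pull_event, ".attributes.source_type") == "auth_service"
```

The push-style lookup finds nothing at the top level, so its comparison is always false; the attribute path finds the publisher-set value and matches.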

Per-Driver Routing Field Summary

The match field a routing rule reads from depends on the driver. The natural broker-native field, the recommended attribute-based field, and the content-sniff path are all listed below.

Driver	Broker-native field	Recommended attribute path	Content-sniff example
GCP Pub/Sub	`subscription` (one per source config, not a routing rule)	`.attributes.source_type`	`.message.routing.event_type`
Apache Kafka	`.topic` (when one consumer subscribes to several topics)	`.headers.source_type`	`.message.event_type`
AWS S3 via SQS	`.key` (S3 object key, e.g. `AWSLogs/.../CloudTrail/`) or `.bucket`	`.attributes.source_type` (SQS message attribute set by producer)	path-based on `.key`
Splunk HEC	`.sourcetype` (always set by sender: push, not pull)	n/a, `.sourcetype` is the canonical field	n/a
HTTP (push)	`.source_type` (from `X-Source-Type` header)	n/a, `.source_type` is the canonical field	n/a
Vector native (push)	`.source_type` (set by upstream Vector)	n/a	n/a

Decision Tree: "I Have N Log Sources via Pub/Sub"

This is the question that usually surfaces the push-vs-pull mismatch. The answer depends on how the publisher exports logs.

N Topics (Most Common)

If your environment has one Pub/Sub topic per log type (nanosiem-audit-logs, nanosiem-vpc-flowlogs, nanosiem-scc-findings), create one Source Configuration per subscription. Each binds to one subscription and tags every event with one source type. There is no UX problem to solve here; per-topic deployments are naturally per-source-config.

One Topic, N Log Types — Publisher Sets Attributes

If you control the publisher and it can set Pub/Sub message attributes, set a source_type attribute (or any agreed key) on each message before publishing. Then in the Source Configuration, switch to Multiple source types mode and add routing rules with match_field = .attributes.source_type.

One Topic, N Log Types — Third-Party Publisher

Some SaaS exporters (LimaCharlie's outbound Pub/Sub, for example) publish all event types to one topic and don't expose a way to set per-message attributes. In that case:

If you need split routing (e.g. you want limacharlie_edr_process and limacharlie_edr_network to use different parsers), use Multiple source types mode and content-sniff a field inside the decoded message body, e.g. match_field = .message.routing.event_type.
If you don't need split routing (e.g. one parser handles every LimaCharlie event), use Single source type mode. Every event in the subscription gets tagged limacharlie_edr and routed to that parser. This is the simplest answer and the right one for most mono-vendor feeds.
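The three branches of this decision tree can be condensed into a small function. This is only a restatement of the guidance above for quick reference; the function name and its parameters are made up for illustration and do not correspond to anything in nano.

```python
def recommend_pubsub_setup(
    one_topic_per_log_type: bool,
    publisher_sets_attributes: bool,
    needs_split_routing: bool,
) -> str:
    """Return the setup this page recommends for a Pub/Sub deployment."""
    if one_topic_per_log_type:
        return "one Source Configuration per subscription"
    if publisher_sets_attributes:
        return "Multiple source types, match_field=.attributes.source_type"
    if needs_split_routing:
        return "Multiple source types, content-sniff a body field"
    return "Single source type for the whole subscription"


# A third-party exporter, one topic, one parser for everything:
choice = recommend_pubsub_setup(
    one_topic_per_log_type=False,
    publisher_sets_attributes=False,
    needs_split_routing=False,
)
```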

Per-Driver Examples

Each example is a concrete, fully-worked configuration for one of the three pull drivers (plus the push HEC case for contrast).

GCP Pub/Sub — Single Source Type (LimaCharlie)

Scenario: LimaCharlie publishes all event types to one Pub/Sub topic. You don't need per-event-type routing; one parser handles every LimaCharlie event.

1. Create the subscription and credential per the GCP Pub/Sub setup guide.
2. Create a Source Configuration of type gcp_pubsub:
   - Project ID: your-gcp-project
   - Subscription: nanosiem-limacharlie-sub
   - Credential: the GCP service account credential
3. In the routing section, choose Single source type:
   - Source type: limacharlie_edr
   - Parser: pre-built limacharlie_edr parser

This produces one default routing rule and the generator emits an unconditional tag:

.source_type = "limacharlie_edr"

GCP Pub/Sub — Multiple Source Types via Message Attributes

Scenario: Your own producer publishes events from multiple internal services to one Pub/Sub topic and sets a source_type message attribute on each message.

Publisher (Python google-cloud-pubsub):

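import json

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("your-gcp-project", "internal-events")

# Example payload; in practice this is the serialized log event.
event_json = json.dumps({"user": "alice", "action": "login"})

# Extra keyword arguments to publish() become Pub/Sub message attributes.
future = publisher.publish(
    topic_path,
    data=event_json.encode("utf-8"),
    source_type="auth_service",      # message attribute read by the routing rule
)
future.result()  # block until the publish completes; returns the message ID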

Source Configuration routing rules (Multiple source types mode):

Match Field	Match Type	Match Value	Target Source Type
`.attributes.source_type`	`exact`	`auth_service`	`auth_service`
`.attributes.source_type`	`exact`	`api_gateway`	`api_gateway`
`.attributes.source_type`	`exact`	`payment_service`	`payment_service`
—	`default`	—	`unknown_internal`

The generator emits:

if .attributes.source_type == "auth_service" {
    .source_type = "auth_service"
} else if .attributes.source_type == "api_gateway" {
    .source_type = "api_gateway"
} else if .attributes.source_type == "payment_service" {
    .source_type = "payment_service"
} else {
    .source_type = "unknown_internal"
}
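As a rough illustration of how a rules table maps onto that output, the sketch below renders exact-match rules plus a default into the same if / else-if shape. It is not nano's generator: the function name is hypothetical, only exact and default rules are handled, and json.dumps is used as a stand-in for string quoting.

```python
import json
from typing import List, Tuple


def render_exact_chain(match_field: str, rules: List[Tuple[str, str]], default: str) -> str:
    """Render exact-match rules plus a default as a VRL-style if / else-if chain."""
    if not rules:
        # No conditional rules: an unconditional tag, as in Single source type mode.
        return f".source_type = {json.dumps(default)}"
    parts = []
    for index, (match_value, target) in enumerate(rules):
        keyword = "if" if index == 0 else "} else if"
        parts.append(f"{keyword} {match_field} == {json.dumps(match_value)} {{")
        parts.append(f"    .source_type = {json.dumps(target)}")
    parts.append("} else {")
    parts.append(f"    .source_type = {json.dumps(default)}")
    parts.append("}")
    return "\n".join(parts)


vrl = render_exact_chain(
    ".attributes.source_type",
    [
        ("auth_service", "auth_service"),
        ("api_gateway", "api_gateway"),
        ("payment_service", "payment_service"),
    ],
    default="unknown_internal",
)
print(vrl)
```

With an empty rule list the same function produces the single unconditional assignment shown in the LimaCharlie example, which is one way to think about Single source type mode: a rules table that contains only the default.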

GCP Pub/Sub — Content Sniffing (Third-Party SaaS, One Topic, N Event Types)

Scenario: A third-party exporter publishes events from multiple event categories to one topic. You can't change the publisher, but the body contains a discriminator field.

Sample event body (decoded .message):

{
  "routing": {
    "event_type": "NEW_PROCESS",
    "hostname": "web-01"
  },
  "event": { "process_name": "powershell.exe", "command_line": "..." }
}

Source Configuration routing rules:

Match Field	Match Type	Match Value	Target Source Type
`.message.routing.event_type`	`exact`	`NEW_PROCESS`	`vendor_edr_process`
`.message.routing.event_type`	`exact`	`NEW_CONNECTION`	`vendor_edr_network`
`.message.routing.event_type`	`prefix`	`DNS_`	`vendor_edr_dns`
—	`default`	—	`vendor_edr_other`
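The effect of these four rules can be simulated in a few lines of Python. This sketch assumes rules are evaluated top to bottom with the first match winning, which is what an if / else-if chain implies; the helper names are hypothetical and the sample events are invented for illustration.

```python
from typing import Any, List, Optional, Tuple

MATCH_FIELD = ".message.routing.event_type"

# (match_type, match_value, target_source_type), evaluated top to bottom.
RULES: List[Tuple[str, Optional[str], str]] = [
    ("exact", "NEW_PROCESS", "vendor_edr_process"),
    ("exact", "NEW_CONNECTION", "vendor_edr_network"),
    ("prefix", "DNS_", "vendor_edr_dns"),
    ("default", None, "vendor_edr_other"),
]


def read_field(event: dict, path: str) -> Any:
    """Follow a dotted path through nested dicts; None if any segment is missing."""
    current: Any = event
    for segment in path.lstrip(".").split("."):
        if not isinstance(current, dict) or segment not in current:
            return None
        current = current[segment]
    return current


def route(event: dict) -> str:
    """Return the target source type for an event, first matching rule wins."""
    value = read_field(event, MATCH_FIELD)
    for match_type, match_value, target in RULES:
        if match_type == "default":
            return target
        if not isinstance(value, str):
            continue  # missing or non-string discriminator: only the default applies
        if match_type == "exact" and value == match_value:
            return target
        if match_type == "prefix" and value.startswith(match_value):
            return target
    raise ValueError("rule table has no default rule")


process_event = {"message": {"routing": {"event_type": "NEW_PROCESS", "hostname": "web-01"}}}
dns_event = {"message": {"routing": {"event_type": "DNS_REQUEST"}}}
no_discriminator = {"message": {"event": {"process_name": "powershell.exe"}}}
```

Events whose body lacks the discriminator field fall through to the default rule rather than failing, which is why keeping a default row in the table matters for content sniffing.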

Content-sniff field paths must be VRL paths: alphanumeric segments (underscores included, as in event_type) separated by dots, with no spaces and no operators. The Source Configuration UI validates this on save and rejects, with a clear error, anything that looks like an injection attempt (e.g. match_field = "X Y'; .source_type = \"hax\"").
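A validation rule of this kind can be approximated with a single regular expression. The pattern below is a guess at the constraint as described, not nano's actual validator; underscores are allowed because the example paths on this page contain them, and the optional leading dot covers both the dotted and undotted spellings used here.

```python
import re

# One or more segments of letters, digits, or underscores, separated by dots,
# with an optional leading dot. An approximation, not nano's actual validator.
PATH_PATTERN = re.compile(r"\.?[A-Za-z0-9_]+(?:\.[A-Za-z0-9_]+)*")


def is_valid_match_field(match_field: str) -> bool:
    """Return True only if the whole string is a plain dotted field path."""
    return PATH_PATTERN.fullmatch(match_field) is not None


accepted = is_valid_match_field(".message.routing.event_type")
rejected = is_valid_match_field("X Y'; .source_type = \"hax\"")
```

Using fullmatch rather than search matters here: a partial match would accept a valid prefix followed by injected code.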

Apache Kafka — One Topic per Log Type

Scenario: You already use a topic-per-log-type convention.

Create one Source Configuration per Kafka topic. Each one uses Single source type mode and tags every event with its source type. The consumer group is shared.

cloudtrail-events topic → Source Configuration aws_cloudtrail
okta-events topic → Source Configuration okta_sso
app-logs topic → Source Configuration my_app

Apache Kafka — One Topic, N Log Types via Headers

Scenario: A shared security-events topic carries multiple log types and the producer sets a source_type Kafka record header.

Producer (Java kafka-clients):

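import static java.nio.charset.StandardCharsets.UTF_8;
import org.apache.kafka.clients.producer.ProducerRecord;
// producer is an already-configured KafkaProducer<String, String>; eventJson is the serialized event.
ProducerRecord<String, String> record =
    new ProducerRecord<>("security-events", null, eventJson);  // null record key
record.headers().add("source_type", "okta_sso".getBytes(UTF_8));  // header the routing rule reads
producer.send(record);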

Source Configuration routing rules (Multiple source types mode):

Match Field	Match Type	Match Value	Target Source Type
`.headers.source_type`	`exact`	`okta_sso`	`okta_sso`
`.headers.source_type`	`exact`	`aws_cloudtrail`	`aws_cloudtrail`
`.headers.source_type`	`prefix`	`app_`	`application_logs`
—	`default`	—	`generic_kafka`

Apache Kafka — Routing on Topic Name

Scenario: A single nano consumer subscribes to multiple topics in one consumer group, and you want to route on .topic (the broker-native field) rather than headers.

Match Field	Match Type	Match Value	Target Source Type
`.topic`	`exact`	`cloudtrail-events`	`aws_cloudtrail`
`.topic`	`exact`	`okta-events`	`okta_sso`
`.topic`	`prefix`	`app-`	`application_logs`
—	`default`	—	`generic_kafka`

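The generator emits a chain of the same shape as the attribute example, keyed on .topic: if .topic == "cloudtrail-events" { .source_type = "aws_cloudtrail" }, followed by one branch per remaining rule.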

AWS S3 via SQS — Routing on Object Key Prefix

Scenario: A single S3 bucket holds multiple log types under different prefixes (AWSLogs/.../CloudTrail/, AWSLogs/.../vpcflowlogs/, WAFLogs/) and one shared SQS queue receives notifications for all of them.

Source Configuration routing rules (Multiple source types mode):

Match Field	Match Type	Match Value	Target Source Type
`.key`	`prefix`	`AWSLogs/123456789012/CloudTrail/`	`aws_cloudtrail`
`.key`	`prefix`	`AWSLogs/123456789012/vpcflowlogs/`	`aws_vpc_flow`
`.key`	`prefix`	`WAFLogs/`	`aws_waf`
—	`default`	—	`aws_s3_other`
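The behaviour of this table on a few object keys can be sketched as follows. The object keys are invented examples, the function name is hypothetical, and top-to-bottom evaluation is assumed.

```python
from typing import List, Tuple

# (key prefix, target source type), checked in table order.
PREFIX_RULES: List[Tuple[str, str]] = [
    ("AWSLogs/123456789012/CloudTrail/", "aws_cloudtrail"),
    ("AWSLogs/123456789012/vpcflowlogs/", "aws_vpc_flow"),
    ("WAFLogs/", "aws_waf"),
]
DEFAULT_SOURCE_TYPE = "aws_s3_other"


def source_type_for_key(key: str) -> str:
    """Return the source type for an S3 object key, falling back to the default."""
    for prefix, source_type in PREFIX_RULES:
        if key.startswith(prefix):
            return source_type
    return DEFAULT_SOURCE_TYPE


cloudtrail = source_type_for_key("AWSLogs/123456789012/CloudTrail/us-east-1/2024/01/01/log.json.gz")
other_account = source_type_for_key("AWSLogs/999999999999/CloudTrail/us-east-1/2024/01/01/log.json.gz")
```

Because the prefixes embed the account ID, CloudTrail objects written by a different account land on the default rule; a bucket that aggregates several accounts needs one prefix rule per account.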

The .key match field reads the S3 object key (in the routing-rule UI it's the "Object key" preset; the default preset is .bucket).

Producer-controlled SQS attribute alternative: if your application code wraps writes through Lambda or another shim, you can set an SQS message attribute on publish and route on .attributes.source_type instead, exactly like the Pub/Sub attribute example.

Per-prefix queues are usually simpler. If you have control over the S3 bucket notification configuration, route each prefix to its own SQS queue and create one Source Configuration per queue (each in Single source type mode). See AWS S3 via SQS: Multiple Log Types from One Bucket for the per-prefix pattern. The shared-queue approach above is for when you can't change the bucket notification topology.

Splunk HEC — Routing on `sourcetype` (Push, for Contrast)

HEC is a push protocol, included here so you can see the contrast. The HEC client sets sourcetype on each event, and nano routes on the canonical .sourcetype field. The match shape looks similar to a pull-source attribute match, but the field is set by the sender per-event, not by a publisher attribute.

Match Field	Match Type	Match Value	Target Source Type
`sourcetype`	`exact`	`WinEventLog:Security`	`windows_event`
`sourcetype`	`exact`	`pan:traffic`	`palo_alto`
`sourcetype`	`prefix`	`myapp:`	`my_app`
—	`default`	—	`unknown_hec`

Because HEC is push-shaped, a single HEC Source Configuration handles N source types natively; there's no "one binding per source type" constraint.
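For comparison with the pull examples, here is a minimal sketch of what a HEC sender puts on the wire. The envelope keys event and sourcetype follow Splunk's HEC JSON event format; the sample events are invented, and whether nano's HEC listener accepts several envelopes in one request body is an assumption.

```python
import json
from typing import Iterable, Tuple


def hec_event(event: dict, sourcetype: str) -> dict:
    """Wrap one event in the HEC JSON envelope, labelled with its sourcetype."""
    return {"event": event, "sourcetype": sourcetype}


def hec_batch_body(events: Iterable[Tuple[dict, str]]) -> str:
    """Serialize several envelopes back to back, one JSON object per line."""
    return "\n".join(json.dumps(hec_event(e, st)) for e, st in events)


# One request body, two source types: the sender labels each event itself.
body = hec_batch_body([
    ({"EventCode": 4624, "user": "alice"}, "WinEventLog:Security"),
    ({"action": "allow", "src": "10.0.0.5"}, "pan:traffic"),
])
```

Each envelope carries its own label, so the routing table above has a field to match on from the moment the event arrives.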

When to Create a New Source Configuration vs Add a Routing Rule

A short rule of thumb:

Situation	Create a new Source Configuration	Add a routing rule
New Pub/Sub subscription / Kafka topic / SQS queue	Yes	No
New credential / different cloud account	Yes	No
Same binding, new log type (publisher sets an attribute or there's a content discriminator)	No	Yes (Multiple source types mode)
Same binding, mono-vendor feed	No	Use Single source type mode (one default rule)
New HTTP source type from an existing forwarder	No	Yes (push: `match_field=source_type`)

A Source Configuration models a binding (one subscription, one topic, one queue, or one HTTP listener). Routing rules model how events inside that binding are split into source types. If you find yourself wanting to split across bindings, that's a new Source Configuration.

Source Configuration UI

Routing rules are configured per Source Configuration in the nano web UI: navigate to Settings → Source Configurations, open the configuration, and use the routing section. Pull-source configurations expose two modes, Single source type (one default rule) and Multiple source types (the routing-rules table), with per-driver match-field defaults.
