Push vs Pull Routing
How nano routes events to parsers — and why push and pull ingestion shapes need different routing rules
Every event nano ingests has to be tagged with a source_type so the right parser runs. How that tagging happens depends on the ingestion shape — and most user confusion in the Source Configuration UI comes from applying push-style thinking to pull-style sources.
This page explains the two shapes, walks through "I have N log sources via Pub/Sub — what do I do?", and gives a fully-worked routing example for each pull driver.
TL;DR: HTTP-style "one source config, N source_types" only works because the sender labels each event with a header. Pull sources (Pub/Sub, Kafka, SQS) bind to one subscription/topic/queue and the sender cannot set a header on the way in. Routing within a pull source uses broker-native attributes or content sniffing, not a source_type field — because there is no inbound source_type field on a pull event.
The Two Ingestion Shapes
nano supports two structurally different ways for events to arrive. The routing model is different for each.
Push (HTTP, Splunk HEC, Vector native)
- Sender labels each event in-band: via the X-Source-Type HTTP header, the HEC sourcetype field, or the .source_type field on a Vector-to-Vector event.
- One endpoint, N source types: a single HTTP source configuration can dispatch events to many parsers based on .source_type.
- The default routing match field is source_type (push events have it set on arrival), and match_type=exact rules work as expected.
Pull (GCP Pub/Sub, Apache Kafka, AWS S3 via SQS)
- One source configuration = one binding (subscription / topic / queue). Nothing on the consumer side can add a Pub/Sub message attribute or a Kafka header; those have to be set by the publisher, before the message lands in the broker.
- No inbound .source_type: a Pub/Sub event arriving at nano has .message, .attributes, and broker metadata, but no .source_type field. A routing rule with match_field=source_type, match_type=exact, match_value=foo can never fire on a pull source, because it requires a field that the event does not carry.
- Handling one binding comes down to one of three options:
  - A publisher-set broker attribute (Pub/Sub message attribute, Kafka header, SQS message attribute) that the routing rule reads, or
  - Content sniffing of a field inside the decoded message body, or
  - Single source type for the whole feed: every event in this binding is the same source type (the mono-vendor case).
The most common mistake: configuring a Pub/Sub or Kafka source with match_field=source_type, match_type=exact. This shape is structurally meaningless for a pull source — the inbound event has no source_type field for the rule to match against. nano's server-side coalescing will treat a stray rule of this shape as a default fallback (so ingestion still works), but the right fix is to use the per-driver attribute path or switch to Single source type mode.
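To make the mismatch concrete, here is a minimal Python sketch of rule evaluation against a pull event. The event shape, the `lookup` and `route` helpers, and the rule dictionaries are illustrative assumptions, not nano's implementation; only the field names come from this page. The sketch also does not model the server-side coalescing described above: a rule that cannot match is simply skipped.

```python
from typing import Any, Optional


def lookup(event: dict, path: str) -> Optional[Any]:
    """Resolve a dotted path such as '.attributes.source_type'; None if absent."""
    node: Any = event
    for segment in path.lstrip(".").split("."):
        if not isinstance(node, dict) or segment not in node:
            return None
        node = node[segment]
    return node


def route(event: dict, rules: list, default: str) -> str:
    """Return the target of the first matching rule, else the default."""
    for rule in rules:
        value = lookup(event, rule["match_field"])
        if not isinstance(value, str):
            continue  # field missing on the event: this rule cannot match
        if rule["match_type"] == "exact" and value == rule["match_value"]:
            return rule["target"]
        if rule["match_type"] == "prefix" and value.startswith(rule["match_value"]):
            return rule["target"]
    return default


# Hypothetical Pub/Sub event: a body and attributes, but no top-level source_type.
pull_event = {
    "message": {"routing": {"event_type": "NEW_PROCESS"}},
    "attributes": {"source_type": "auth_service"},
}

push_style_rule = [{"match_field": "source_type", "match_type": "exact",
                    "match_value": "auth_service", "target": "auth_service"}]
attribute_rule = [{"match_field": ".attributes.source_type", "match_type": "exact",
                   "match_value": "auth_service", "target": "auth_service"}]

missed = route(pull_event, push_style_rule, default="unknown_internal")
matched = route(pull_event, attribute_rule, default="unknown_internal")
```

In this sketch the push-style rule falls through to the default because the event has no top-level source_type, while the attribute-path rule matches the same event.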
Per-Driver Routing Field Summary
The match field a routing rule reads from depends on the driver. The natural broker-native field, the recommended attribute-based field, and the content-sniff path are all listed below.
| Driver | Broker-native field | Recommended attribute path | Content-sniff example |
|---|---|---|---|
| GCP Pub/Sub | subscription (one per source config — not a routing rule) | .attributes.source_type | .message.routing.event_type |
| Apache Kafka | .topic (when fan-out within consumer group) | .headers.source_type | .message.event_type |
| AWS S3 via SQS | object key prefix (e.g. AWSLogs/.../CloudTrail/) | .attributes.source_type (SQS message attribute set by producer) | path-based on the object key |
| Splunk HEC | .sourcetype (always set by sender — push, not pull) | n/a — .sourcetype is the canonical field | n/a |
| HTTP (push) | .source_type (from X-Source-Type header) | n/a — .source_type is the canonical field | n/a |
| Vector native (push) | .source_type (set by upstream Vector) | n/a | n/a |
Decision Tree: "I Have N Log Sources via Pub/Sub"
This is the question that usually surfaces the push-vs-pull mismatch. The answer depends on how the publisher exports logs.
N Topics (Most Common)
If your environment has one Pub/Sub topic per log type — nanosiem-audit-logs, nanosiem-vpc-flowlogs, nanosiem-scc-findings — create one Source Configuration per subscription. Each binds to one subscription and tags every event with one source type. There is no UX problem to solve here; per-topic deployments are naturally per-source-config.
One Topic, N Log Types — Publisher Sets Attributes
If you control the publisher and it can set Pub/Sub message attributes, set a source_type attribute (or any agreed key) on each message before publishing. Then in the Source Configuration, switch to Multiple source types mode and add routing rules with match_field = .attributes.source_type.
One Topic, N Log Types — Third-Party Publisher
Some SaaS exporters (LimaCharlie's outbound Pub/Sub, for example) publish all event types to one topic and don't expose a way to set per-message attributes. In that case:
- You actually need split routing (e.g. you want limacharlie_edr_process and limacharlie_edr_network to use different parsers): use Multiple source types mode and content-sniff a field inside the decoded message body, e.g. match_field = .message.routing.event_type.
- You don't need split routing (e.g. one parser handles every LimaCharlie event): use Single source type mode. Every event in the subscription gets tagged limacharlie_edr and routed to that parser. This is the simplest answer and the right one for most mono-vendor feeds.
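The decision tree above can be summarised as a small Python function. This is only a restatement of the three cases on this page; the function name, its parameters, and the returned strings are hypothetical and do not correspond to any nano API.

```python
def pubsub_routing_plan(topic_per_log_type: bool,
                        publisher_sets_attributes: bool,
                        needs_split_routing: bool) -> str:
    """Pick a routing approach for N log sources arriving via Pub/Sub."""
    if topic_per_log_type:
        # N topics: each subscription is its own binding.
        return "one Source Configuration per subscription, Single source type mode"
    if publisher_sets_attributes:
        # One topic, and the publisher labels each message.
        return "Multiple source types mode, match_field = .attributes.source_type"
    if needs_split_routing:
        # One topic, third-party publisher, different parsers per event type.
        return "Multiple source types mode, content-sniff a field in the message body"
    # One topic, third-party publisher, one parser for everything.
    return "Single source type mode"
```

For a LimaCharlie-style feed where one parser handles every event, `pubsub_routing_plan(False, False, False)` returns the Single source type answer.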
Per-Driver Examples
Each example is a concrete, fully-worked configuration for one of the three pull drivers (plus the push HEC case for contrast).
GCP Pub/Sub — Single Source Type (LimaCharlie)
Scenario: LimaCharlie publishes all event types to one Pub/Sub topic. You don't need per-event-type routing — one parser handles every LimaCharlie event.
- Create the subscription and credential per the GCP Pub/Sub setup guide.
- Create a Source Configuration of type gcp_pubsub:
  - Project ID: your-gcp-project
  - Subscription: nanosiem-limacharlie-sub
  - Credential: the GCP service account credential
- In the routing section, choose Single source type:
  - Source type: limacharlie_edr
  - Parser: pre-built limacharlie_edr parser
This produces one default routing rule, and the generator emits an unconditional tag:

```
.source_type = "limacharlie_edr"
```

GCP Pub/Sub — Multiple Source Types via Message Attributes
Scenario: Your own producer publishes events from multiple internal services to one Pub/Sub topic and sets a source_type message attribute on each message.
Publisher (Python google-cloud-pubsub):
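```python
import json
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("your-gcp-project", "internal-events")
event_json = json.dumps({"user": "alice", "action": "login"})  # placeholder payload

# Extra keyword arguments to publish() become message attributes.
future = publisher.publish(
    topic_path,
    data=event_json.encode("utf-8"),
    source_type="auth_service",  # message attribute
)
future.result()  # wait for the publish to complete
```

Source Configuration routing rules (Multiple source types mode):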
| Match Field | Match Type | Match Value | Target Source Type |
|---|---|---|---|
| .attributes.source_type | exact | auth_service | auth_service |
| .attributes.source_type | exact | api_gateway | api_gateway |
| .attributes.source_type | exact | payment_service | payment_service |
| — | default | — | unknown_internal |
The generator emits:
```
if .attributes.source_type == "auth_service" {
  .source_type = "auth_service"
} else if .attributes.source_type == "api_gateway" {
  .source_type = "api_gateway"
} else if .attributes.source_type == "payment_service" {
  .source_type = "payment_service"
} else {
  .source_type = "unknown_internal"
}
```

GCP Pub/Sub — Content Sniffing (Third-Party SaaS, One Topic, N Event Types)
Scenario: A third-party exporter publishes events from multiple event categories to one topic. You can't change the publisher, but the body contains a discriminator field.
Sample event body (decoded .message):
```json
{
  "routing": {
    "event_type": "NEW_PROCESS",
    "hostname": "web-01"
  },
  "event": { "process_name": "powershell.exe", "command_line": "..." }
}
```

Source Configuration routing rules:
| Match Field | Match Type | Match Value | Target Source Type |
|---|---|---|---|
| .message.routing.event_type | exact | NEW_PROCESS | vendor_edr_process |
| .message.routing.event_type | exact | NEW_CONNECTION | vendor_edr_network |
| .message.routing.event_type | prefix | DNS_ | vendor_edr_dns |
| — | default | — | vendor_edr_other |
Content-sniff field paths must be VRL paths — alphanumeric segments separated by dots, no spaces, no operators. The Source Configuration UI validates this on save and rejects anything that looks like an injection attempt (e.g. match_field = "X Y'; .source_type = \"hax\"" is rejected with a clear error).
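As a rough illustration of the kind of check described above, the following Python sketch accepts only dot-separated path segments. The regular expression and the `is_valid_match_field` name are assumptions for illustration; the page does not show nano's actual validator. The sketch also assumes underscores are allowed in a segment, since the page's own paths (such as .attributes.source_type) contain them.

```python
import re

# Assumption: a segment is letters, digits, or underscores; the leading dot is optional.
_PATH_RE = re.compile(r"\.?[A-Za-z0-9_]+(?:\.[A-Za-z0-9_]+)*")


def is_valid_match_field(path: str) -> bool:
    """Return True only if the whole string is a plain dotted field path."""
    return _PATH_RE.fullmatch(path) is not None
```

Using `fullmatch` rather than `match` matters here: a string that merely starts with a valid path and then continues with spaces, quotes, or operators is rejected as a whole.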
Apache Kafka — One Topic per Log Type
Scenario: You already use a topic-per-log-type convention.
Create one Source Configuration per Kafka topic. Each one uses Single source type mode and tags every event with its source type. The consumer group is shared.
- cloudtrail-events topic → Source Configuration aws_cloudtrail
- okta-events topic → Source Configuration okta_sso
- app-logs topic → Source Configuration my_app
Apache Kafka — One Topic, N Log Types via Headers
Scenario: A shared security-events topic carries multiple log types and the producer sets a source_type Kafka record header.
Producer (Java kafka-clients):
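```java
import static java.nio.charset.StandardCharsets.UTF_8;
import org.apache.kafka.clients.producer.ProducerRecord;

// `producer` is an already-configured KafkaProducer<String, String>; `eventJson` is the serialized event.
ProducerRecord<String, String> record =
    new ProducerRecord<>("security-events", null, eventJson);
record.headers().add("source_type", "okta_sso".getBytes(UTF_8));  // record header read by the routing rule
producer.send(record);
```

Source Configuration routing rules (Multiple source types mode):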
| Match Field | Match Type | Match Value | Target Source Type |
|---|---|---|---|
| .headers.source_type | exact | okta_sso | okta_sso |
| .headers.source_type | exact | aws_cloudtrail | aws_cloudtrail |
| .headers.source_type | prefix | app_ | application_logs |
| — | default | — | generic_kafka |
Apache Kafka — Routing on Topic Name
Scenario: A single nano consumer subscribes to multiple topics in one consumer group, and you want to route on .topic (the broker-native field) rather than headers.
| Match Field | Match Type | Match Value | Target Source Type |
|---|---|---|---|
| .topic | exact | cloudtrail-events | aws_cloudtrail |
| .topic | exact | okta-events | okta_sso |
| .topic | prefix | app- | application_logs |
| — | default | — | generic_kafka |
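The generator emits the same if / else if chain as in the Pub/Sub attribute example, keyed on .topic: if .topic == "cloudtrail-events" { .source_type = "aws_cloudtrail" }, and so on for the remaining rules.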
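To show how a rules table could turn into such a chain, here is a hedged Python sketch of a generator. It is not nano's generator: the `emit_routing` name and the tuple layout are hypothetical, and the form emitted for prefix rules (VRL's `starts_with` wrapped around `string!`) is an assumption, since this page only shows the output for exact rules. String values are quoted with `json.dumps`, on the assumption that they have already been validated.

```python
import json


def emit_routing(rules: list, default: str) -> str:
    """Build an if / else if chain from (match_field, match_type, match_value, target) tuples."""
    lines = []
    for index, (field, match_type, value, target) in enumerate(rules):
        if match_type == "exact":
            condition = f"{field} == {json.dumps(value)}"
        elif match_type == "prefix":
            # Assumed form for prefix rules; not shown in the source page.
            condition = f"starts_with(string!({field}), {json.dumps(value)})"
        else:
            raise ValueError(f"unsupported match_type: {match_type}")
        keyword = "if" if index == 0 else "} else if"
        lines.append(f"{keyword} {condition} {{")
        lines.append(f"    .source_type = {json.dumps(target)}")
    if lines:
        # The default rule becomes the trailing else branch.
        lines.append("} else {")
        lines.append(f"    .source_type = {json.dumps(default)}")
        lines.append("}")
    else:
        # No conditional rules: an unconditional tag, as in Single source type mode.
        lines.append(f".source_type = {json.dumps(default)}")
    return "\n".join(lines)


kafka_rules = [
    (".topic", "exact", "cloudtrail-events", "aws_cloudtrail"),
    (".topic", "exact", "okta-events", "okta_sso"),
    (".topic", "prefix", "app-", "application_logs"),
]
generated = emit_routing(kafka_rules, default="generic_kafka")
```

With an empty rules list the sketch degenerates to a single unconditional assignment, which mirrors the one-default-rule shape of Single source type mode.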
AWS S3 via SQS — Routing on Object Key Prefix
Scenario: A single S3 bucket holds multiple log types under different prefixes (AWSLogs/.../CloudTrail/, AWSLogs/.../vpcflowlogs/, WAFLogs/) and one shared SQS queue receives notifications for all of them.
Source Configuration routing rules (Multiple source types mode):
| Match Field | Match Type | Match Value | Target Source Type |
|---|---|---|---|
| .object.key | prefix | AWSLogs/123456789012/CloudTrail/ | aws_cloudtrail |
| .object.key | prefix | AWSLogs/123456789012/vpcflowlogs/ | aws_vpc_flow |
| .object.key | prefix | WAFLogs/ | aws_waf |
| — | default | — | aws_s3_other |
Producer-controlled SQS attribute alternative — if your application code wraps writes through Lambda or another shim, you can set an SQS message attribute on publish and route on .attributes.source_type instead, exactly like the Pub/Sub attribute example.
Per-prefix queues are usually simpler. If you have control over the S3 bucket notification configuration, route each prefix to its own SQS queue and create one Source Configuration per queue (each in Single source type mode). See AWS S3 via SQS — Multiple Log Types from One Bucket for the per-prefix pattern. The shared-queue approach above is for when you can't change the bucket notification topology.
Splunk HEC — Routing on sourcetype (Push, for Contrast)
HEC is a push protocol — included here so you can see the contrast. The HEC client sets sourcetype on each event, and nano routes on the canonical .sourcetype field. The match shape looks similar to a pull-source attribute match, but the field is set by the sender per-event, not by a publisher attribute.
| Match Field | Match Type | Match Value | Target Source Type |
|---|---|---|---|
| sourcetype | exact | WinEventLog:Security | windows_event |
| sourcetype | exact | pan:traffic | palo_alto |
| sourcetype | prefix | myapp: | my_app |
| — | default | — | unknown_hec |
Because HEC is push-shaped, a single HEC Source Configuration handles N source types natively — there's no "one binding per source type" constraint.
When to Create a New Source Configuration vs Add a Routing Rule
A short rule of thumb:
| Situation | Create a new Source Configuration | Add a routing rule |
|---|---|---|
| New Pub/Sub subscription / Kafka topic / SQS queue | Yes | No |
| New credential / different cloud account | Yes | No |
| Same binding, new log type — publisher sets an attribute or there's a content discriminator | No | Yes (Multiple source types mode) |
| Same binding, mono-vendor feed | No | Use Single source type mode (one default rule) |
| New HTTP source type from an existing forwarder | No | Yes (push: match_field=source_type) |
A Source Configuration models a binding (one subscription, one topic, one queue, or one HTTP listener). Routing rules model how events inside that binding are split into source types. If you find yourself wanting to split across bindings, that's a new Source Configuration.
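One way to picture this split is as a small data model, sketched here in Python. The class and field names are hypothetical and are not nano's schema; the sketch only encodes the two ideas on this page: one configuration per binding, and Single source type mode as a single default rule.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RoutingRule:
    target_source_type: str
    match_type: str = "default"        # "exact", "prefix", or "default"
    match_field: Optional[str] = None  # e.g. ".attributes.source_type"
    match_value: Optional[str] = None


@dataclass
class SourceConfiguration:
    driver: str   # e.g. "gcp_pubsub"
    binding: str  # one subscription, topic, queue, or HTTP listener
    rules: List[RoutingRule] = field(default_factory=list)


def single_source_type(driver: str, binding: str, source_type: str) -> SourceConfiguration:
    """Single source type mode: exactly one default rule for the whole binding."""
    return SourceConfiguration(driver, binding, [RoutingRule(target_source_type=source_type)])


limacharlie = single_source_type("gcp_pubsub", "nanosiem-limacharlie-sub", "limacharlie_edr")
```

A second subscription would be a second `SourceConfiguration` instance, while a new log type inside the same subscription would be one more `RoutingRule` in the existing list.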
Source Configuration UI
Routing rules are configured per Source Configuration in the nano web UI: navigate to Settings → Source Configurations, open the configuration, and use the routing section. Pull-source configurations expose two modes — Single source type (one default rule) and Multiple source types (the routing-rules table) — with per-driver match-field defaults.
See Also
- Data Ingestion Overview — pipeline, source-type requirement, processing stages
- Supported Data Sources — protocol-level reference for every ingestion method
- GCP Pub/Sub — end-to-end setup, sample events, troubleshooting
- Apache Kafka — bootstrap, SASL, multiple topics, routing rules table
- AWS S3 via SQS — bucket notifications, per-prefix vs shared queue
- Splunk HEC — sourcetype routing, common Splunk → nano mappings