Supported Data Sources

nano uses a source_type field to route incoming logs to the appropriate parser. Every log event must have a source_type defined for the system to process it correctly. This page explains which ingestion methods are supported and how to handle sources that don't natively support source type identification.

Source Type Requirement

When logs arrive at nano, the system needs to know what type of log it is (e.g., aws_cloudtrail, palo_alto, okta) to route it to the correct parser. Without a source type, logs cannot be parsed or stored.

Source types can be defined in several ways:

  • HTTP header: X-Source-Type: aws_cloudtrail
  • Event field: .source_type = "palo_alto" (for Vector-to-Vector forwarding)
  • Feed configuration: One feed per log type (for cloud pull sources)
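
Conceptually, the routing described above is a lookup from source_type to parser. The sketch below illustrates the idea only; the parser names and dispatch function are hypothetical, not nano's actual internals:

```python
# Illustrative sketch of source_type-based routing.
# Parser names and the route() helper are hypothetical, not nano internals.
def parse_cloudtrail(event: dict) -> dict:
    return {"vendor": "aws", **event}

def parse_palo_alto(event: dict) -> dict:
    return {"vendor": "palo_alto", **event}

PARSERS = {
    "aws_cloudtrail": parse_cloudtrail,
    "palo_alto": parse_palo_alto,
}

def route(event: dict) -> dict:
    """Dispatch an event to the parser matching its source_type."""
    source_type = event.get("source_type")
    if source_type is None:
        # Without a source type the event cannot be parsed or stored.
        raise ValueError("event has no source_type; it cannot be parsed")
    return PARSERS[source_type](event)
```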

Directly Supported Sources

These ingestion methods support source type identification natively:

Source      | How source_type is defined                 | Best for
HTTP        | X-Source-Type header                       | Applications, webhooks, log shippers (most common)
Vector      | .source_type field in event                | On-prem aggregators forwarding to cloud
AWS S3      | Feed configuration (one feed per log type) | CloudTrail, VPC Flow Logs, ALB logs (setup guide)
GCP Pub/Sub | Feed configuration (one feed per log type) | Cloud Audit Logs, Security Command Center (setup guide)
Kafka       | Feed configuration or topic routing        | High-volume streaming pipelines (setup guide)

HTTP Ingestion

The most common and recommended method. Send logs via HTTP POST with the X-Source-Type header:

curl -X POST https://your-nanosiem.com:8080/ \
  -H "Authorization: Bearer $TOKEN" \
  -H "X-Source-Type: my_app" \
  -H "Content-Type: application/json" \
  -d '{"timestamp": "2024-01-01T12:00:00Z", "message": "User login"}'

Works with any HTTP-capable log shipper:

  • Fluentd/Fluent Bit (HTTP output)
  • Filebeat (HTTP output)
  • Cribl Stream
  • Custom applications
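
From application code, the same request can be built with any HTTP client. A minimal Python sketch using only the standard library; the endpoint and token are placeholders:

```python
import json
import urllib.request

def build_log_request(endpoint: str, token: str, source_type: str,
                      event: dict) -> urllib.request.Request:
    """Build an HTTP POST carrying one log event, tagged via X-Source-Type."""
    return urllib.request.Request(
        endpoint,
        data=json.dumps(event).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "X-Source-Type": source_type,
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_log_request(
    "https://your-nanosiem.com:8080/",  # placeholder endpoint
    "example-token",                    # placeholder token
    "my_app",
    {"timestamp": "2024-01-01T12:00:00Z", "message": "User login"},
)
# urllib.request.urlopen(req) would send it; omitted here.
```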

Vector-to-Vector Forwarding

For on-premises deployments, use a local Vector instance as an aggregator that forwards to nano. The aggregator sets the source_type field before forwarding:

# On-premises Vector aggregator
[sources.firewall_syslog]
type = "syslog"
address = "0.0.0.0:514"

[transforms.tag_source]
type = "remap"
inputs = ["firewall_syslog"]
source = '.source_type = "palo_alto"'

[sinks.cloud_siem]
type = "vector"
inputs = ["tag_source"]
address = "your-nanosiem.com:6000"

nano listens on port 6000 for Vector-to-Vector traffic and processes events with the source_type field already set.

Cloud Pull Sources (S3, Pub/Sub, Kafka)

For cloud-based log sources, the source type is defined at the feed level. Create one feed per log type:

  • AWS CloudTrail Feed → source_type: aws_cloudtrail
  • AWS VPC Flow Logs Feed → source_type: aws_vpc_flow
  • GCP Audit Logs Feed → source_type: gcp_audit

Each feed pulls from a specific queue/topic and knows what log type to expect.
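
Conceptually, a feed is a fixed binding from a queue/topic to a source type, so everything pulled through it is tagged uniformly with no per-event logic. An illustrative sketch; the feed names are hypothetical:

```python
# Illustrative sketch: each feed binds one queue/topic to one source_type.
# Feed names are hypothetical, not real configuration.
FEEDS = {
    "cloudtrail-queue": "aws_cloudtrail",
    "vpc-flow-queue": "aws_vpc_flow",
    "gcp-audit-subscription": "gcp_audit",
}

def tag_from_feed(feed_name: str, raw_event: dict) -> dict:
    """Attach the feed's configured source_type to a pulled event."""
    return {**raw_event, "source_type": FEEDS[feed_name]}
```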

Sources Requiring Vector Aggregator

The following protocols do not support source type identification natively. To ingest data from these sources, you must deploy a Vector aggregator on-premises that receives raw logs, tags them with the appropriate source_type, and forwards them to nano via the Vector protocol. See the Vector Aggregator guide for complete setup instructions.

Syslog

Raw syslog (RFC 3164/5424) has no concept of source type. Different devices send different log formats to the same syslog port.

Solution: Deploy a Vector aggregator that routes based on hostname, appname, or content:

# Vector aggregator for syslog
[sources.syslog_all]
type = "syslog"
address = "0.0.0.0:514"

[transforms.route_by_host]
type = "remap"
inputs = ["syslog_all"]
source = '''
# Route based on hostname pattern
if .hostname =~ /^fw-/ {
    .source_type = "palo_alto"
} else if .hostname =~ /^sw-/ {
    .source_type = "cisco_switch"
} else if .appname == "sshd" {
    .source_type = "linux_auth"
} else if .appname == "nginx" {
    .source_type = "nginx"
} else {
    .source_type = "generic_syslog"
}
'''

[sinks.cloud_siem]
type = "vector"
inputs = ["route_by_host"]
address = "your-nanosiem.com:6000"

OpenTelemetry (OTLP)

The OpenTelemetry Protocol has no source type concept: OTLP carries observability data (traces, metrics, and logs) but doesn't categorize logs by security source type.

Solution: Use a Vector aggregator to receive OTLP and tag with source type:

[sources.otlp]
type = "opentelemetry"
grpc.address = "0.0.0.0:4317"
http.address = "0.0.0.0:4318"

[transforms.tag_otlp]
type = "remap"
inputs = ["otlp"]
source = '''
# Tag based on resource attributes or service name
# (OTLP resource attribute keys contain dots, so the path segment is quoted)
service = to_string(.resources."service.name") ?? "unknown"
if service == "api-gateway" {
    .source_type = "api_gateway"
} else if service == "auth-service" {
    .source_type = "auth_service"
} else {
    .source_type = "otlp_" + service
}
'''

[sinks.cloud_siem]
type = "vector"
inputs = ["tag_otlp"]
address = "your-nanosiem.com:6000"

Fluent Protocol (Fluentd/Fluent Bit)

The Fluent forward protocol carries a tag with each event, but tags don't map directly to security source types.

Solution: Use a Vector aggregator to receive Fluent protocol and map tags to source types:

[sources.fluent]
type = "fluent"
address = "0.0.0.0:24224"

[transforms.tag_fluent]
type = "remap"
inputs = ["fluent"]
source = '''
# Map Fluent tags to source types
tag = to_string(.tag) ?? ""
if starts_with(tag, "kube.") {
    .source_type = "kubernetes"
} else if starts_with(tag, "docker.") {
    .source_type = "docker"
} else if starts_with(tag, "app.auth") {
    .source_type = "auth_service"
} else {
    .source_type = "fluent_" + replace(tag, ".", "_")
}
'''

[sinks.cloud_siem]
type = "vector"
inputs = ["tag_fluent"]
address = "your-nanosiem.com:6000"

File-Based Ingestion

Reading from files requires knowing what type of log each file contains.

Solution: Use a Vector aggregator with file sources configured per log type:

# Each file source knows its type
[sources.nginx_access]
type = "file"
include = ["/var/log/nginx/access.log"]

[sources.auth_log]
type = "file"
include = ["/var/log/auth.log"]

[transforms.tag_nginx]
type = "remap"
inputs = ["nginx_access"]
source = '.source_type = "nginx"'

[transforms.tag_auth]
type = "remap"
inputs = ["auth_log"]
source = '.source_type = "linux_auth"'

[sinks.cloud_siem]
type = "vector"
inputs = ["tag_nginx", "tag_auth"]
address = "your-nanosiem.com:6000"

Aggregator Deployment Pattern

For on-premises environments, the recommended architecture is:

┌─────────────────────────────────────────────────────────────┐
│                     On-Premises Network                      │
│                                                             │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐        │
│  │Firewall │  │ Servers │  │  Apps   │  │ Network │        │
│  │ Syslog  │  │ Syslog  │  │  OTLP   │  │ Devices │        │
│  └────┬────┘  └────┬────┘  └────┬────┘  └────┬────┘        │
│       │            │            │            │              │
│       └────────────┴─────┬──────┴────────────┘              │
│                          │                                  │
│                   ┌──────▼──────┐                           │
│                   │   Vector    │                           │
│                   │ Aggregator  │                           │
│                   │             │                           │
│                   │ - Receives  │                           │
│                   │ - Tags      │                           │
│                   │ - Forwards  │                           │
│                   └──────┬──────┘                           │
│                          │                                  │
└──────────────────────────┼──────────────────────────────────┘
                           │ Vector Protocol (port 6000)
                           │ TLS encrypted

                   ┌──────────────┐
                   │     nano     │
                   │   (Cloud)    │
                   └──────────────┘

Benefits of this pattern:

  • Centralized routing: All source type logic in one place
  • Bandwidth optimization: Aggregator can batch and compress
  • Security: Only one outbound connection to cloud
  • Flexibility: Add new log sources without cloud changes

Source Type Aliases

nano automatically normalizes common source type variations:

Input                   | Normalized to
winlog, windows, winevt | windows_event
apache_access, httpd    | apache
cloudtrail, aws_ct      | aws_cloudtrail
rsyslog, syslog-ng      | syslog
pan, panos              | palo_alto
asa, cisco_firewall     | cisco_asa
fortigate, fgt          | fortinet
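
In code, this normalization amounts to a lookup table. A sketch mirroring the alias table above; case-insensitive matching is an assumption, and unknown values are passed through unchanged:

```python
# Alias table mirroring the normalization rules above.
SOURCE_TYPE_ALIASES = {
    "winlog": "windows_event", "windows": "windows_event", "winevt": "windows_event",
    "apache_access": "apache", "httpd": "apache",
    "cloudtrail": "aws_cloudtrail", "aws_ct": "aws_cloudtrail",
    "rsyslog": "syslog", "syslog-ng": "syslog",
    "pan": "palo_alto", "panos": "palo_alto",
    "asa": "cisco_asa", "cisco_firewall": "cisco_asa",
    "fortigate": "fortinet", "fgt": "fortinet",
}

def normalize_source_type(source_type: str) -> str:
    """Return the canonical source_type; unknown values pass through unchanged.

    Case-insensitive matching is an assumption of this sketch.
    """
    key = source_type.strip().lower()
    return SOURCE_TYPE_ALIASES.get(key, key)
```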

Summary

Want to ingest...            | Use this method
Application logs             | HTTP with X-Source-Type header
AWS CloudTrail/VPC Flow Logs | AWS S3 feed (one feed per type)
GCP Cloud logs               | GCP Pub/Sub feed (one feed per type)
Kafka streams                | Kafka feed (one feed per type)
Network device syslog        | Vector Aggregator on-prem
Server syslog                | Vector Aggregator on-prem
OpenTelemetry logs           | Vector Aggregator on-prem
Fluentd/Fluent Bit           | Vector Aggregator on-prem
Log files                    | Vector Aggregator on-prem

For any source that doesn't support native source type identification, deploy a Vector aggregator to handle tagging and forwarding.
