Vector Aggregator
Deploy an on-premises Vector aggregator to forward syslog, OTLP, file, and other logs to nano
This guide walks through deploying a Vector aggregator on your network to collect logs from sources that can't connect to nano directly — syslog from firewalls and switches, OpenTelemetry from applications, log files on servers, and Fluent protocol from Kubernetes.
The aggregator receives logs locally, tags each event with a source_type, and forwards everything to nano over Vector's native protocol on port 6000.
Why an Aggregator?
nano requires every event to have a source_type field so it can route to the correct parser. Protocols like syslog, OTLP, and Fluent don't include this concept — a syslog port receives logs from dozens of different device types on the same socket.
The Vector aggregator solves this by:
- Receiving logs via any protocol Vector supports (syslog, OTLP, file, Fluent, etc.)
- Tagging each event with the correct source_type using VRL transforms
- Forwarding tagged events to nano, where they enter the normal parsing pipeline
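For example, a syslog event leaving the aggregator might look like this (illustrative field names and values; the exact shape depends on the source):

```json
{
  "timestamp": "2024-05-01T12:00:00Z",
  "hostname": "fw-edge-01",
  "appname": "kernel",
  "message": "denied tcp 203.0.113.7 -> 10.0.0.5:443",
  "source_type": "palo_alto"
}
```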
Prerequisites
- A Linux server (or VM/container) on your network that can receive logs from your devices
- Network connectivity from the aggregator to nano on port 6000
- Root/sudo access (syslog on port 514 requires privileged ports)
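If you'd rather not run Vector as root just for port 514, one common approach (assuming Vector runs under systemd) is to grant the service the capability to bind privileged ports via a drop-in override:

```ini
# /etc/systemd/system/vector.service.d/caps.conf
# Lets a non-root vector process bind ports below 1024
[Service]
AmbientCapabilities=CAP_NET_BIND_SERVICE
```

Run `systemctl daemon-reload` and restart Vector afterwards. Alternatively, listen on an unprivileged port such as 5514 and point your devices there.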
Step 1: Install Vector
Install Vector on your aggregator host:
# Install via the official script
curl --proto '=https' --tlsv1.2 -sSfL https://sh.vector.dev | bash
# Or via package manager (Debian/Ubuntu, after adding the Vector apt repository)
apt-get install vector
# Or via package manager (RHEL/CentOS, after adding the Vector yum repository)
yum install vector

Verify the installation:
vector --version

Step 2: Configure Sources
Create a Vector configuration file at /etc/vector/vector.toml. Start with the sources you need — you can combine any number of these in a single config.
Syslog
Receive syslog from network devices (firewalls, switches, routers, servers):
# Syslog over UDP (most network devices)
[sources.syslog_udp]
type = "syslog"
address = "0.0.0.0:514"
mode = "udp"
# Syslog over TCP (for reliable delivery)
[sources.syslog_tcp]
type = "syslog"
address = "0.0.0.0:514"
mode = "tcp"

OpenTelemetry (OTLP)
Receive logs from applications instrumented with OpenTelemetry:
[sources.otlp]
type = "opentelemetry"
grpc.address = "0.0.0.0:4317"
http.address = "0.0.0.0:4318"

Log Files
Tail log files on the aggregator host (or mounted volumes):
[sources.nginx_logs]
type = "file"
include = ["/var/log/nginx/access.log"]
[sources.auth_logs]
type = "file"
include = ["/var/log/auth.log"]
[sources.app_logs]
type = "file"
include = ["/var/log/myapp/*.log"]

Fluent Protocol (Fluentd / Fluent Bit)
Receive logs from Fluent Bit or Fluentd forwarders (common in Kubernetes):
[sources.fluent]
type = "fluent"
address = "0.0.0.0:24224"

Step 3: Tag with source_type
This is the critical step. Each event must have a .source_type field set before forwarding to nano. Use VRL (Vector Remap Language) transforms to inspect the event and assign the right type.
Syslog Routing
Route syslog events based on hostname, appname, or message content:
[transforms.tag_syslog]
type = "remap"
inputs = ["syslog_udp", "syslog_tcp"]
source = '''
# Route by hostname pattern
hostname = to_string(.hostname) ?? ""
appname = to_string(.appname) ?? ""
if starts_with(hostname, "fw-") || starts_with(hostname, "pa-") {
.source_type = "palo_alto"
} else if starts_with(hostname, "fgt-") || starts_with(hostname, "forti-") {
.source_type = "fortinet"
} else if starts_with(hostname, "asa-") {
.source_type = "cisco_asa"
} else if starts_with(hostname, "sw-") || starts_with(hostname, "cat-") {
.source_type = "cisco_switch"
} else if appname == "sshd" || appname == "sudo" || appname == "PAM" {
.source_type = "linux_auth"
} else if appname == "nginx" {
.source_type = "nginx"
} else if appname == "named" || appname == "bind" {
.source_type = "dns_server"
} else {
.source_type = "generic_syslog"
}
'''

Tailor this to your network. The hostname patterns above are examples — replace them with your actual naming conventions. You can also route based on syslog facility, severity, or message content using VRL string functions like contains(), match(), and regex.
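As a sketch of content-based routing, here is a hypothetical transform that inspects the message body when the hostname isn't conclusive (the patterns are illustrative assumptions, not a vendor-verified list):

```toml
[transforms.tag_by_content]
type = "remap"
inputs = ["syslog_udp"]
source = '''
msg = to_string(.message) ?? ""
if contains(msg, "%ASA-") {
  # Cisco ASA messages embed a %ASA-<severity>-<id> tag
  .source_type = "cisco_asa"
} else if match(msg, r'devname=\S+ devid=FG') {
  # FortiGate logs carry devname=/devid= key-value pairs
  .source_type = "fortinet"
} else {
  .source_type = "generic_syslog"
}
'''
```

In practice you would fold checks like these into the else branch of tag_syslog rather than run a second transform over the same events.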
OTLP Routing
Route OpenTelemetry logs based on service name or resource attributes:
[transforms.tag_otlp]
type = "remap"
inputs = ["otlp"]
source = '''
service = to_string(.resources.service.name) ?? "unknown"
if service == "api-gateway" {
.source_type = "api_gateway"
} else if service == "auth-service" {
.source_type = "auth_service"
} else if service == "payment-service" {
.source_type = "payment_service"
} else {
# Prefix with otlp_ for unrecognized services
.source_type = "otlp_" + replace(service, "-", "_")
}
'''

File Routing
For file sources, you know the type at configuration time:
[transforms.tag_nginx]
type = "remap"
inputs = ["nginx_logs"]
source = '.source_type = "nginx"'
[transforms.tag_auth]
type = "remap"
inputs = ["auth_logs"]
source = '.source_type = "linux_auth"'
[transforms.tag_app]
type = "remap"
inputs = ["app_logs"]
source = '.source_type = "my_app"'

Fluent Routing
Route Fluent protocol events based on the Fluent tag:
[transforms.tag_fluent]
type = "remap"
inputs = ["fluent"]
source = '''
tag = to_string(.tag) ?? ""
if starts_with(tag, "kube.") {
.source_type = "kubernetes"
} else if starts_with(tag, "docker.") {
.source_type = "docker"
} else if starts_with(tag, "app.auth") {
.source_type = "auth_service"
} else {
.source_type = "fluent_" + replace(tag, ".", "_")
}
'''

Step 4: Forward to nano
Add a Vector sink that forwards all tagged events to nano:
[sinks.nanosiem]
type = "vector"
inputs = ["tag_syslog", "tag_otlp", "tag_nginx", "tag_auth", "tag_app", "tag_fluent"]
address = "your-nano-instance.com:6000"
version = "2"
acknowledgements.enabled = true
# mTLS — required for port 6000. Download certs from the nano portal.
[sinks.nanosiem.tls]
crt_file = "/etc/vector/mtls/client.crt"
key_file = "/etc/vector/mtls/client.key"
ca_file = "/etc/vector/mtls/ca.crt"
# Batching for throughput
[sinks.nanosiem.batch]
max_events = 1000
timeout_secs = 5
# Buffer to disk if nano is temporarily unavailable
[sinks.nanosiem.buffer]
type = "disk"
max_size = 1073741824 # 1 GB
when_full = "block"

List all your tag transforms in inputs. Only events that pass through a tagging transform will be forwarded. If you add a new source, remember to add its tag transform to this list.
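Vector also accepts wildcards in a sink's inputs, so if you name every tagging transform with a common prefix, new sources are picked up automatically instead of being silently dropped:

```toml
[sinks.nanosiem]
type = "vector"
# Glob matches tag_syslog, tag_otlp, tag_nginx, and any future tag_* transform
inputs = ["tag_*"]
```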
mTLS Certificates
Port 6000 requires mutual TLS. Download your certificate bundle from the nano portal at Settings > Deployment Credentials > Vector mTLS:
- ca.crt — your deployment's CA certificate
- client.crt — the client certificate
- client.key — the client private key
Place these on your aggregator host (e.g., /etc/vector/mtls/) and ensure the paths in the sink config match. See On-Premise Collection for certificate rotation details.
Step 5: Start the Aggregator
# Validate the config
vector validate /etc/vector/vector.toml
# Start Vector
sudo systemctl enable vector
sudo systemctl start vector
# Check logs
sudo journalctl -u vector -f

Running with Docker
If you prefer to run the aggregator as a container:
docker run -d \
  --name vector-aggregator \
  -v /etc/vector/vector.toml:/etc/vector/vector.toml:ro \
  -v /etc/vector/mtls:/etc/vector/mtls:ro \
  -p 514:514/udp \
  -p 514:514/tcp \
  -p 4317:4317 \
  -p 4318:4318 \
  -p 24224:24224 \
  timberio/vector:latest-alpine

If you use file sources, also mount the log directories into the container as additional read-only volumes.

Step 6: Point Sources at the Aggregator
Network Devices (Syslog)
Configure your devices to send syslog to the aggregator's IP:
# Cisco IOS/IOS-XE
logging host 10.0.1.50
logging facility local0
logging trap informational
# Palo Alto
set deviceconfig system syslog-server Vector server 10.0.1.50 transport UDP port 514 facility LOG_LOCAL0
# Fortinet FortiGate
config log syslogd setting
    set status enable
    set server "10.0.1.50"
    set port 514
end

Applications (OTLP)
Point your OpenTelemetry SDK or Collector at the aggregator:
# Environment variable for OTLP exporters
export OTEL_EXPORTER_OTLP_ENDPOINT="http://10.0.1.50:4317"

Or in an OpenTelemetry Collector config:
exporters:
  otlp:
    endpoint: "10.0.1.50:4317"
    tls:
      insecure: true  # Use TLS in production

Step 7: Create Log Sources in nano
Once events are flowing through the aggregator with source_type set, create a log source in nano for each source type:
- Navigate to Feeds → New Feed
- Select "Sample from existing data" — nano will see events arriving with the source type you assigned
- Select the source type (e.g., palo_alto, linux_auth)
- The AI generates a parser for the log format
- Publish to deploy
Repeat for each distinct source type your aggregator forwards.
Complete Example Configuration
Here's a full /etc/vector/vector.toml for a typical environment with firewalls, Linux servers, and an application:
# =============================================================================
# Sources
# =============================================================================
[sources.syslog_udp]
type = "syslog"
address = "0.0.0.0:514"
mode = "udp"
[sources.syslog_tcp]
type = "syslog"
address = "0.0.0.0:514"
mode = "tcp"
[sources.otlp]
type = "opentelemetry"
grpc.address = "0.0.0.0:4317"
http.address = "0.0.0.0:4318"
# =============================================================================
# Tagging Transforms
# =============================================================================
[transforms.tag_syslog]
type = "remap"
inputs = ["syslog_udp", "syslog_tcp"]
source = '''
hostname = to_string(.hostname) ?? ""
appname = to_string(.appname) ?? ""
if starts_with(hostname, "fw-") {
.source_type = "palo_alto"
} else if starts_with(hostname, "fgt-") {
.source_type = "fortinet"
} else if appname == "sshd" || appname == "sudo" {
.source_type = "linux_auth"
} else {
.source_type = "generic_syslog"
}
'''
[transforms.tag_otlp]
type = "remap"
inputs = ["otlp"]
source = '''
service = to_string(.resources.service.name) ?? "unknown"
.source_type = "otlp_" + replace(service, "-", "_")
'''
# =============================================================================
# Sink — Forward to nano
# =============================================================================
[sinks.nanosiem]
type = "vector"
inputs = ["tag_syslog", "tag_otlp"]
address = "your-nano-instance.com:6000"
version = "2"
acknowledgements.enabled = true
[sinks.nanosiem.tls]
crt_file = "/etc/vector/mtls/client.crt"
key_file = "/etc/vector/mtls/client.key"
ca_file = "/etc/vector/mtls/ca.crt"
[sinks.nanosiem.batch]
max_events = 1000
timeout_secs = 5
[sinks.nanosiem.buffer]
type = "disk"
max_size = 1073741824
when_full = "block"

Multiple Aggregators
For large or distributed environments, deploy multiple aggregators — one per site, datacenter, or network segment. All forward to the same nano instance:
┌─────────────────────┐ ┌─────────────────────┐
│ Site A │ │ Site B │
│ │ │ │
│ Firewalls ─┐ │ │ Firewalls ─┐ │
│ Servers ──┤ Vector │ │ Servers ──┤ Vector │
│ Apps ──┘ ──────┼─────┼─ Apps ──┘ ──────┼─── nano
│ │ │ │ (port 6000)
└─────────────────────┘       └─────────────────────┘

Each aggregator uses its own tagging transforms. You can use the same source_type values across sites — nano merges them into a single searchable stream.
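To tell sites apart in nano, each aggregator can also stamp a site label on every event before the sink. A minimal sketch (the .site field name and value are arbitrary choices, not a nano convention):

```toml
[transforms.add_site]
type = "remap"
inputs = ["tag_syslog", "tag_otlp"]
source = '.site = "site-a"'
```

The sink's inputs would then reference add_site instead of the individual tagging transforms.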
Troubleshooting
No events appearing in nano
- Check the aggregator is receiving logs:

  # Watch Vector's internal logs
  journalctl -u vector -f
  # Check Vector metrics (if enabled)
  curl http://localhost:8686/metrics

- Verify source_type is being set — add a temporary console sink to inspect events:

  [sinks.debug]
  type = "console"
  inputs = ["tag_syslog"]
  encoding.codec = "json"

- Check network connectivity to nano:

  nc -zv your-nano-instance.com 6000

- Check nano Vector logs for incoming connections:

  docker logs nanosiem-vector 2>&1 | grep -i "vector_native\|vector_in\|6000"
Events arriving but not parsed
- Verify the source_type value in your tagging transform matches the log source name in nano
- nano normalizes common aliases automatically (e.g., winlog → windows_event, pan → palo_alto) but custom names must match exactly
- Check System → Ingestion Errors for parse failures
Syslog not arriving at the aggregator
- Verify the device is configured to send to the aggregator's IP and port
- Check firewall rules allow UDP/TCP 514 inbound to the aggregator
- Test with nc -lu 514 or tcpdump -i any port 514 to confirm traffic is reaching the host (stop Vector first if using nc, since only one process can bind the port)
Buffer filling up (nano unreachable)
If nano is temporarily unavailable, Vector buffers events to disk (up to max_size). When the buffer fills:
- when_full = "block" — Vector stops accepting new events (backpressure). Sources will buffer or drop depending on their type.
- when_full = "drop_newest" — Vector drops new events but keeps accepting traffic.
For production, monitor the buffer size and set up alerting for connectivity issues.
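One way to expose buffer metrics for scraping (assuming a Prometheus setup) is Vector's internal_metrics source paired with a prometheus_exporter sink:

```toml
# Collect Vector's own telemetry
[sources.vector_metrics]
type = "internal_metrics"

# Expose it on a Prometheus scrape endpoint
[sinks.prometheus]
type = "prometheus_exporter"
inputs = ["vector_metrics"]
address = "0.0.0.0:9598"
```

Alert when the buffer size gauge (e.g. vector_buffer_byte_size) approaches your configured max_size.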
Performance Considerations
- Syslog UDP can handle 50,000+ events/sec on modest hardware
- Disk buffering adds ~10% overhead but prevents data loss during nano outages
- Batch size of 1000 events balances throughput and latency
- For very high volume (100k+ EPS), consider running multiple aggregator instances with a load balancer in front
Next Steps
- Create detection rules for your on-prem log sources
- Configure enrichment to add GeoIP data to syslog source IPs
- Set up cloud integrations: AWS S3/SQS, GCP Pub/Sub, Kafka