Vector Aggregator
Deploy an on-premises Vector aggregator to forward syslog, OTLP, file, and other logs to nano
This guide walks through deploying a Vector aggregator on your network to collect logs from sources that can't connect to nano directly — syslog from firewalls and switches, OpenTelemetry from applications, log files on servers, and Fluent protocol from Kubernetes.
The aggregator receives logs locally, tags each event with a source_type, and forwards everything to nano over Vector's native protocol on port 6000.
Why an Aggregator?
nano requires every event to have a source_type field so it can route to the correct parser. Protocols like syslog, OTLP, and Fluent don't include this concept — a syslog port receives logs from dozens of different device types on the same socket.
The Vector aggregator solves this by:
- Receiving logs via any protocol Vector supports (syslog, OTLP, file, Fluent, etc.)
- Tagging each event with the correct source_type using VRL transforms
- Forwarding tagged events to nano, where they enter the normal parsing pipeline
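For example, a syslog event leaving the aggregator might look like this (illustrative field names and values; the exact shape depends on the source):

```json
{
  "timestamp": "2024-05-01T12:00:00Z",
  "hostname": "fw-edge-01",
  "appname": "kernel",
  "message": "denied tcp 203.0.113.7 -> 10.0.0.5:443",
  "source_type": "palo_alto"
}
```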
Prerequisites
- A Linux server (or VM/container) on your network that can receive logs from your devices
- Network connectivity from the aggregator to nano on port 6000
- Root/sudo access (syslog on port 514 requires privileged ports)
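If you'd rather not run Vector as root just for port 514, one common approach (assuming Vector runs under systemd) is to grant the service the capability to bind privileged ports via a drop-in override:

```ini
# /etc/systemd/system/vector.service.d/caps.conf
# Lets a non-root vector process bind ports below 1024
[Service]
AmbientCapabilities=CAP_NET_BIND_SERVICE
```

Run `systemctl daemon-reload` and restart Vector afterwards. Alternatively, listen on an unprivileged port such as 5514 and point your devices there.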
Step 1: Install Vector
Install Vector on your aggregator host:
# Install via the official script
curl --proto '=https' --tlsv1.2 -sSfL https://sh.vector.dev | bash
# Or via package manager (Debian/Ubuntu, after adding the Vector apt repository)
apt-get install vector
# Or via package manager (RHEL/CentOS, after adding the Vector yum repository)
yum install vector

Verify the installation:
vector --version

Step 2: Configure Sources
Create a Vector configuration file at /etc/vector/vector.toml. Start with the sources you need — you can combine any number of these in a single config.
Syslog
Receive syslog from network devices (firewalls, switches, routers, servers):
# Syslog over UDP (most network devices)
[sources.syslog_udp]
type = "syslog"
address = "0.0.0.0:514"
mode = "udp"
# Syslog over TCP (for reliable delivery)
[sources.syslog_tcp]
type = "syslog"
address = "0.0.0.0:514"
mode = "tcp"

OpenTelemetry (OTLP)
Receive logs from applications instrumented with OpenTelemetry:
[sources.otlp]
type = "opentelemetry"
grpc.address = "0.0.0.0:4317"
http.address = "0.0.0.0:4318"

Log Files
Tail log files on the aggregator host (or mounted volumes):
[sources.nginx_logs]
type = "file"
include = ["/var/log/nginx/access.log"]
[sources.auth_logs]
type = "file"
include = ["/var/log/auth.log"]
[sources.app_logs]
type = "file"
include = ["/var/log/myapp/*.log"]

Fluent Protocol (Fluentd / Fluent Bit)
Receive logs from Fluent Bit or Fluentd forwarders (common in Kubernetes):
[sources.fluent]
type = "fluent"
address = "0.0.0.0:24224"

Step 3: Tag with source_type
This is the critical step. Each event must have a .source_type field set before forwarding to nano. Use VRL (Vector Remap Language) transforms to inspect the event and assign the right type.
Syslog Routing
Route syslog events based on hostname, appname, or message content:
[transforms.tag_syslog]
type = "remap"
inputs = ["syslog_udp", "syslog_tcp"]
source = '''
# Route by hostname pattern
hostname = to_string(.hostname) ?? ""
appname = to_string(.appname) ?? ""
if starts_with(hostname, "fw-") || starts_with(hostname, "pa-") {
.source_type = "palo_alto"
} else if starts_with(hostname, "fgt-") || starts_with(hostname, "forti-") {
.source_type = "fortinet"
} else if starts_with(hostname, "asa-") {
.source_type = "cisco_asa"
} else if starts_with(hostname, "sw-") || starts_with(hostname, "cat-") {
.source_type = "cisco_switch"
} else if appname == "sshd" || appname == "sudo" || appname == "PAM" {
.source_type = "linux_auth"
} else if appname == "nginx" {
.source_type = "nginx"
} else if appname == "named" || appname == "bind" {
.source_type = "dns_server"
} else {
.source_type = "generic_syslog"
}
'''

Tailor this to your network. The hostname patterns above are examples — replace them with your actual naming conventions. You can also route based on syslog facility, severity, or message content using VRL string functions like contains(), match(), and regex.
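As a sketch of content-based routing, here is a hypothetical transform that inspects the message body when the hostname isn't conclusive (the patterns are illustrative assumptions, not a vendor-verified list):

```toml
[transforms.tag_by_content]
type = "remap"
inputs = ["syslog_udp"]
source = '''
msg = to_string(.message) ?? ""
if contains(msg, "%ASA-") {
  # Cisco ASA messages embed a %ASA-<severity>-<id> tag
  .source_type = "cisco_asa"
} else if match(msg, r'devname=\S+ devid=FG') {
  # FortiGate logs carry devname=/devid= key-value pairs
  .source_type = "fortinet"
} else {
  .source_type = "generic_syslog"
}
'''
```

In practice you would fold checks like these into the else branch of tag_syslog rather than run a second transform over the same events.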
OTLP Routing
Route OpenTelemetry logs based on service name or resource attributes:
[transforms.tag_otlp]
type = "remap"
inputs = ["otlp"]
source = '''
service = to_string(.resources.service.name) ?? "unknown"
if service == "api-gateway" {
.source_type = "api_gateway"
} else if service == "auth-service" {
.source_type = "auth_service"
} else if service == "payment-service" {
.source_type = "payment_service"
} else {
# Prefix with otlp_ for unrecognized services
.source_type = "otlp_" + replace(service, "-", "_")
}
'''

File Routing
For file sources, you know the type at configuration time:
[transforms.tag_nginx]
type = "remap"
inputs = ["nginx_logs"]
source = '.source_type = "nginx"'
[transforms.tag_auth]
type = "remap"
inputs = ["auth_logs"]
source = '.source_type = "linux_auth"'
[transforms.tag_app]
type = "remap"
inputs = ["app_logs"]
source = '.source_type = "my_app"'

Fluent Routing
Route Fluent protocol events based on the Fluent tag:
[transforms.tag_fluent]
type = "remap"
inputs = ["fluent"]
source = '''
tag = to_string(.tag) ?? ""
if starts_with(tag, "kube.") {
.source_type = "kubernetes"
} else if starts_with(tag, "docker.") {
.source_type = "docker"
} else if starts_with(tag, "app.auth") {
.source_type = "auth_service"
} else {
.source_type = "fluent_" + replace(tag, ".", "_")
}
'''

Step 4: Forward to nano
Add a Vector sink that forwards all tagged events to nano:
[sinks.nanosiem]
type = "vector"
inputs = ["tag_syslog", "tag_otlp", "tag_nginx", "tag_auth", "tag_app", "tag_fluent"]
address = "your-nano-instance.com:6000"
version = "2"
acknowledgements.enabled = true
# mTLS — required for port 6000. Download certs from the nano portal.
[sinks.nanosiem.tls]
crt_file = "/etc/vector/mtls/client.crt"
key_file = "/etc/vector/mtls/client.key"
ca_file = "/etc/vector/mtls/ca.crt"
# Batching for throughput
[sinks.nanosiem.batch]
max_events = 1000
timeout_secs = 5
# Buffer to disk if nano is temporarily unavailable
[sinks.nanosiem.buffer]
type = "disk"
max_size = 1073741824 # 1 GB
when_full = "block"

List all your tag transforms in inputs. Only events that pass through a tagging transform will be forwarded. If you add a new source, remember to add its tag transform to this list.
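Vector also accepts wildcards in a sink's inputs, so if you name every tagging transform with a common prefix, new sources are picked up automatically instead of being silently dropped:

```toml
[sinks.nanosiem]
type = "vector"
# Glob matches tag_syslog, tag_otlp, tag_nginx, and any future tag_* transform
inputs = ["tag_*"]
```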
mTLS Certificates
Port 6000 requires mutual TLS. Download your certificate bundle from the nano portal at Settings > Deployment Credentials > Vector mTLS:
- ca.crt — your deployment's CA certificate
- client.crt — the client certificate
- client.key — the client private key
Place these on your aggregator host (e.g., /etc/vector/mtls/) and ensure the paths in the sink config match. See On-Premise Collection for certificate rotation details.
Step 5: Start the Aggregator
# Validate the config
vector validate /etc/vector/vector.toml
# Start Vector
sudo systemctl enable vector
sudo systemctl start vector
# Check logs
sudo journalctl -u vector -f

Running with Docker
If you prefer to run the aggregator as a container:
docker run -d \
  --name vector-aggregator \
  -v /etc/vector/vector.toml:/etc/vector/vector.toml:ro \
  -v /etc/vector/mtls:/etc/vector/mtls:ro \
  -p 514:514/udp \
  -p 514:514/tcp \
  -p 4317:4317 \
  -p 4318:4318 \
  -p 24224:24224 \
  timberio/vector:latest-alpine

If you use file sources, also mount the log directories into the container as additional read-only volumes.

Step 6: Point Sources at the Aggregator
Network Devices (Syslog)
Configure your devices to send syslog to the aggregator's IP:
# Cisco IOS/IOS-XE
logging host 10.0.1.50
logging facility local0
logging trap informational
# Palo Alto
set deviceconfig system syslog-server Vector server 10.0.1.50 transport UDP port 514 facility LOG_LOCAL0
# Fortinet FortiGate
config log syslogd setting
    set status enable
    set server "10.0.1.50"
    set port 514
end

Applications (OTLP)
Point your OpenTelemetry SDK or Collector at the aggregator:
# Environment variable for OTLP exporters
export OTEL_EXPORTER_OTLP_ENDPOINT="http://10.0.1.50:4317"

Or in an OpenTelemetry Collector config:
exporters:
  otlp:
    endpoint: "10.0.1.50:4317"
    tls:
      insecure: true  # Use TLS in production

Step 7: Create Log Sources in nano
Once events are flowing through the aggregator with source_type set, create a log source in nano for each source type:
- Navigate to Feeds → New Feed
- Select "Sample from existing data" — nano will see events arriving with the source type you assigned
- Select the source type (e.g., palo_alto, linux_auth)
- The AI generates a parser for the log format
- Publish to deploy
Repeat for each distinct source type your aggregator forwards.
Complete Example Configuration
Here's a full /etc/vector/vector.toml for a typical environment with firewalls, Linux servers, and an application:
# =============================================================================
# Sources
# =============================================================================
[sources.syslog_udp]
type = "syslog"
address = "0.0.0.0:514"
mode = "udp"
[sources.syslog_tcp]
type = "syslog"
address = "0.0.0.0:514"
mode = "tcp"
[sources.otlp]
type = "opentelemetry"
grpc.address = "0.0.0.0:4317"
http.address = "0.0.0.0:4318"
# =============================================================================
# Tagging Transforms
# =============================================================================
[transforms.tag_syslog]
type = "remap"
inputs = ["syslog_udp", "syslog_tcp"]
source = '''
hostname = to_string(.hostname) ?? ""
appname = to_string(.appname) ?? ""
if starts_with(hostname, "fw-") {
.source_type = "palo_alto"
} else if starts_with(hostname, "fgt-") {
.source_type = "fortinet"
} else if appname == "sshd" || appname == "sudo" {
.source_type = "linux_auth"
} else {
.source_type = "generic_syslog"
}
'''
[transforms.tag_otlp]
type = "remap"
inputs = ["otlp"]
source = '''
service = to_string(.resources.service.name) ?? "unknown"
.source_type = "otlp_" + replace(service, "-", "_")
'''
# =============================================================================
# Sink — Forward to nano
# =============================================================================
[sinks.nanosiem]
type = "vector"
inputs = ["tag_syslog", "tag_otlp"]
address = "your-nano-instance.com:6000"
version = "2"
acknowledgements.enabled = true
[sinks.nanosiem.tls]
crt_file = "/etc/vector/mtls/client.crt"
key_file = "/etc/vector/mtls/client.key"
ca_file = "/etc/vector/mtls/ca.crt"
[sinks.nanosiem.batch]
max_events = 1000
timeout_secs = 5
[sinks.nanosiem.buffer]
type = "disk"
max_size = 1073741824
when_full = "block"

Multiple Aggregators
For large or distributed environments, deploy multiple aggregators — one per site, datacenter, or network segment. All forward to the same nano instance:
┌─────────────────────┐ ┌─────────────────────┐
│ Site A │ │ Site B │
│ │ │ │
│ Firewalls ─┐ │ │ Firewalls ─┐ │
│ Servers ──┤ Vector │ │ Servers ──┤ Vector │
│ Apps ──┘ ──────┼─────┼─ Apps ──┘ ──────┼─── nano
│ │ │ │ (port 6000)
└─────────────────────┘       └─────────────────────┘

Each aggregator uses its own tagging transforms. You can use the same source_type values across sites — nano merges them into a single searchable stream.
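To tell sites apart in nano, each aggregator can also stamp a site label on every event before the sink. A minimal sketch (the .site field name and value are arbitrary choices, not a nano convention):

```toml
[transforms.add_site]
type = "remap"
inputs = ["tag_syslog", "tag_otlp"]
source = '.site = "site-a"'
```

The sink's inputs would then reference add_site instead of the individual tagging transforms.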
Troubleshooting
No events appearing in nano
- Check the aggregator is receiving logs:

  # Watch Vector's internal logs
  journalctl -u vector -f
  # Check Vector metrics (if enabled)
  curl http://localhost:8686/metrics

- Verify source_type is being set — add a temporary console sink to inspect events:

  [sinks.debug]
  type = "console"
  inputs = ["tag_syslog"]
  encoding.codec = "json"

- Check network connectivity to nano:

  nc -zv your-nano-instance.com 6000

- Check nano Vector logs for incoming connections:

  docker logs nanosiem-vector 2>&1 | grep -i "vector_native\|vector_in\|6000"
Events arriving but not parsed
- Verify the source_type value in your tagging transform matches the log source name in nano
- nano normalizes common aliases automatically (e.g., winlog → windows_event, pan → palo_alto) but custom names must match exactly
- Check System → Ingestion Errors for parse failures
Syslog not arriving at the aggregator
- Verify the device is configured to send to the aggregator's IP and port
- Check firewall rules allow UDP/TCP 514 inbound to the aggregator
- Test with nc -lu 514 or tcpdump -i any port 514 to confirm traffic is reaching the host (stop Vector first if using nc, since only one process can bind the port)
Buffer filling up (nano unreachable)
If nano is temporarily unavailable, Vector buffers events to disk (up to max_size). When the buffer fills:
- when_full = "block" — Vector stops accepting new events (backpressure). Sources will buffer or drop depending on their type.
- when_full = "drop_newest" — Vector drops new events but keeps accepting traffic.
For production, monitor the buffer size and set up alerting for connectivity issues.
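One way to expose buffer metrics for scraping (assuming a Prometheus setup) is Vector's internal_metrics source paired with a prometheus_exporter sink:

```toml
# Collect Vector's own telemetry
[sources.vector_metrics]
type = "internal_metrics"

# Expose it on a Prometheus scrape endpoint
[sinks.prometheus]
type = "prometheus_exporter"
inputs = ["vector_metrics"]
address = "0.0.0.0:9598"
```

Alert when the buffer size gauge (e.g. vector_buffer_byte_size) approaches your configured max_size.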
Performance Considerations
- Syslog UDP can handle 50,000+ events/sec on modest hardware
- Disk buffering adds ~10% overhead but prevents data loss during nano outages
- Batch size of 1000 events balances throughput and latency
- For very high volume (100k+ EPS), consider running multiple aggregator instances with a load balancer in front
Next Steps
- Create detection rules for your on-prem log sources
- Configure enrichment to add GeoIP data to syslog source IPs
- Set up cloud integrations: AWS S3/SQS, GCP Pub/Sub, Kafka