On-Premise Log Collection
Deploy Vector agents and aggregators to collect Windows events, syslog, and any on-premise log source and ship them to nano
This guide covers collecting logs from on-premise infrastructure — Windows endpoints, Linux servers, network devices, and applications — and shipping them to nano using Vector.
The architecture uses a two-tier model: lightweight agents on endpoints forward to a central aggregator, which buffers and ships everything to nano. This keeps endpoint footprint small, provides a single egress point for firewall rules, and gives you disk-backed buffering so no events are lost if nano is temporarily unreachable.
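In short:

```
Windows/Linux agents -- 9000/tcp --> Vector aggregator -- 6000/tcp (mTLS) --> nano
 (per-host disk buffer)              (1 GB disk buffer)
```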
Architecture
Why two tiers?
| Concern | Agent-only (direct to SIEM) | Agent + Aggregator |
|---|---|---|
| Firewall rules | Every endpoint needs outbound to SIEM | Only the aggregator needs outbound |
| Buffering | Each endpoint buffers independently | Centralized 1 GB disk buffer |
| Enrichment | Must happen on each endpoint | Can enrich centrally |
| Monitoring | Monitor every endpoint | Monitor one aggregator |
| Network | Many small connections | One persistent connection |
For small environments (1-5 endpoints), agents can ship directly to nano by pointing the sink at your-nano-instance.com:6000 instead of the aggregator. You'll need the mTLS certificates on each agent — see mTLS certificates below. The configs work either way — just change the sink address and add the TLS block.
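For that direct-to-nano variant, the agent's sink would look roughly like the sketch below. It reuses the enrich_host transform from the agent config in Part 1; the address and certificate paths are placeholders for your environment.

```toml
# Agent sink shipping straight to nano (replaces the aggregator sink)
[sinks.nano_direct]
type = "vector"
inputs = ["enrich_host"]
address = "your-nano-instance.com:6000"
acknowledgements.enabled = true

# mTLS is required on port 6000; place the portal cert bundle on each endpoint
[sinks.nano_direct.tls]
crt_file = "C:/ProgramData/Vector/mtls/client.crt"
key_file = "C:/ProgramData/Vector/mtls/client.key"
ca_file = "C:/ProgramData/Vector/mtls/ca.crt"

[sinks.nano_direct.buffer]
type = "disk"
max_size = 536870912 # 512 MB
when_full = "block"
```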
What You Can Collect
Vector has native sources for nearly every on-premise log type. The agent/aggregator model described here works for all of them:
| Source | Vector Source Type | Typical Use |
|---|---|---|
| Windows Event Log | windows_event_log | Security, Sysmon, PowerShell, Defender, Application, System |
| Linux syslog | journald or file | Auth logs, kernel, systemd services |
| Log files | file | Application logs, web server access/error logs, audit logs |
| Syslog (network) | syslog | Firewalls, switches, routers, appliances |
| OTLP | opentelemetry | Application telemetry (traces, metrics, logs) |
| Fluent protocol | fluent | Kubernetes, Docker, Fluent Bit forwarders |
| Docker logs | docker_logs | Container stdout/stderr |
| Kafka | kafka | Existing Kafka-based log pipelines |
| SNMP traps | exec + trap receiver | Network device events |
This guide focuses on the most common on-prem scenario — Windows Event Log collection — then shows how to extend the pattern for other sources.
Part 1: Windows Event Log Agent
The Vector agent runs as a Windows service on each endpoint, collecting events from the Windows Event Log API and forwarding them to the aggregator.
Channels collected
The recommended baseline covers the security-relevant event channels:
| Channel | What it captures |
|---|---|
| Security | Logon/logoff, privilege use, object access, policy changes |
| System | Service start/stop, driver loading, time changes |
| Application | Application crashes, warnings, installer events |
| Sysmon | Process creation, network connections, file creation, registry access, DNS queries |
| PowerShell | Script block logging, module logging, transcription |
| Windows Defender | Detections, quarantine actions, scan results |
Sysmon is strongly recommended. Without Sysmon, you lose visibility into process creation with command lines, network connections per process, and file/registry activity. Install Sysmon with a community config like SwiftOnSecurity/sysmon-config or olafhartong/sysmon-modular before deploying Vector.
Install Vector on Windows
Download and install the Vector MSI:
# Download Vector (use the latest stable release 0.55.0+ for windows_event_log support)
Invoke-WebRequest -Uri "https://packages.timber.io/vector/latest/vector-x64.msi" -OutFile "$env:TEMP\vector.msi"
# Install silently
Start-Process msiexec.exe -Wait -ArgumentList "/i $env:TEMP\vector.msi /quiet /norestart"
# Verify installation
& "C:\Program Files\Vector\bin\vector.exe" --versionVector installs to C:\Program Files\Vector and creates a Windows service called vector.
Agent configuration
Create the configuration file at C:\ProgramData\Vector\vector.toml:
# Vector Agent — Windows Event Log Collection
# Ships to aggregator at <AGGREGATOR_IP>:9000
data_dir = "C:/ProgramData/Vector/data"
# =============================================================================
# Source: Windows Event Log
# =============================================================================
[sources.windows_events]
type = "windows_event_log"
channels = [
"Application",
"System",
"Security",
"Microsoft-Windows-Sysmon/Operational",
"Microsoft-Windows-PowerShell/Operational",
"Microsoft-Windows-Windows Defender/Operational"
]
# Don't backfill existing events on first run (set to true for initial import)
read_existing_events = false
# Rate limit per channel — prevents event storms from overwhelming the agent
events_per_second = 500
# Truncate very long field values (command lines, PowerShell scripts)
max_event_data_length = 2000
# =============================================================================
# Transform: Enrich with host metadata and classify
# =============================================================================
[transforms.enrich_host]
type = "remap"
inputs = ["windows_events"]
source = '''
# Classify — Sysmon events get their own source_type for dedicated parsing
channel = downcase(to_string(.channel) ?? "")
source_type = "windows_event"
if contains(channel, "sysmon") {
source_type = "windows_sysmon"
}
# Flatten the structured event into a JSON message
event_json = encode_json(.)
# Output: flat event with message, source_type, and hostname
. = {}
.message = event_json
.source_type = source_type
.src_host = "YOUR_HOSTNAME"
'''
# =============================================================================
# Sink: Forward to aggregator via Vector protocol
# =============================================================================
[sinks.aggregator]
type = "vector"
inputs = ["enrich_host"]
address = "AGGREGATOR_IP:9000"
acknowledgements.enabled = true
# Disk buffer — survives agent restarts without losing events
[sinks.aggregator.buffer]
type = "disk"
max_size = 536870912 # 512 MB
when_full = "block"Replace AGGREGATOR_IP with your aggregator's IP address and YOUR_HOSTNAME with the machine's hostname (or use a script to auto-detect — see the management section below).
Start the agent
# Validate the config
& "C:\Program Files\Vector\bin\vector.exe" validate "C:\ProgramData\Vector\vector.toml"
# Start the Windows service
Start-Service vector
# Verify it's running
Get-Service vector
# Watch logs (useful for first-time setup)
Get-Content "C:\ProgramData\Vector\data\vector.log" -Tail 50 -WaitTuning
| Parameter | Default | When to change |
|---|---|---|
| events_per_second | 500 | Increase on busy DCs with heavy audit policies. Decrease on workstations to reduce CPU. |
| max_event_data_length | 2000 | Increase if you need full PowerShell script blocks or very long command lines. |
| read_existing_events | false | Set to true on first deployment if you need historical events. Reset to false after initial import. |
| buffer.max_size | 512 MB | Increase if the aggregator may be offline for extended periods. |
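The buffer size translates directly into outage tolerance. A rough way to estimate it, assuming an average serialized event of about 1 KB (a guess — measure your own average event size):

```python
# Back-of-envelope buffer sizing: how long a full-rate event stream can be
# absorbed by a disk buffer before when_full = "block" applies backpressure.
def buffer_outage_minutes(buffer_bytes, events_per_sec, avg_event_bytes=1024):
    """Minutes of outage the buffer can absorb at the given event rate."""
    return buffer_bytes / (events_per_sec * avg_event_bytes) / 60

# Default agent buffer (512 MB) at the default 500 events/s rate limit
print(f"{buffer_outage_minutes(512 * 1024 * 1024, 500):.0f} minutes")  # 17 minutes
```

At the defaults, the agent rides out roughly a quarter-hour outage at full rate; size the aggregator's 1 GB buffer the same way against your aggregate event rate.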
Part 2: Aggregator
The aggregator is a Linux host (VM, bare metal, or container) that receives events from all agents, optionally enriches them, and forwards to nano. It runs Vector in a Docker container for easy management.
Architecture placement
Place the aggregator on the same network segment as your endpoints, or at least where agents can reach it on port 9000. The aggregator only needs one outbound connection — to nano on port 6000.
Prerequisites
- Linux host with Docker and Docker Compose installed
- Network access from agents on port 9000 (inbound)
- Network access to nano on port 6000 (outbound)
- At least 2 GB of disk space for buffering
Aggregator configuration
Create a directory for the aggregator:
sudo mkdir -p /opt/vector-aggregator/config
cd /opt/vector-aggregator
Create the Vector config at /opt/vector-aggregator/config/vector.toml:
# Vector Aggregator — Receives from agents, forwards to nano
# Listens on port 9000 for Vector protocol connections
data_dir = "/var/lib/vector"
[api]
enabled = true
address = "0.0.0.0:8686"
# =============================================================================
# Source: Vector protocol (receives from agents)
# =============================================================================
[sources.agents]
type = "vector"
address = "0.0.0.0:9000"
acknowledgements.enabled = true
# =============================================================================
# Transform: Enrich with aggregator metadata (optional)
# =============================================================================
[transforms.enrich]
type = "remap"
inputs = ["agents"]
source = '''
# Tag events with the aggregator they passed through
.aggregator_host = "aggregator-01"
.aggregator_timestamp = now()
'''
# =============================================================================
# Sink: Forward to nano (Vector-to-Vector protocol)
# =============================================================================
[sinks.siem]
type = "vector"
inputs = ["enrich"]
address = "YOUR_NANO_INSTANCE:6000"
acknowledgements.enabled = true
# mTLS — required for port 6000. Download certs from the nano portal.
[sinks.siem.tls]
crt_file = "/etc/vector/mtls/client.crt"
key_file = "/etc/vector/mtls/client.key"
ca_file = "/etc/vector/mtls/ca.crt"
# Disk buffer — 1 GB, blocks when full to ensure no data loss
[sinks.siem.buffer]
type = "disk"
max_size = 1073741824 # 1 GB
when_full = "block"
# =============================================================================
# Metrics — Prometheus endpoint for monitoring
# =============================================================================
[sources.internal_metrics]
type = "internal_metrics"
scrape_interval_secs = 10
namespace = "nanosiem_aggregator"
[sinks.prometheus_metrics]
type = "prometheus_exporter"
inputs = ["internal_metrics"]
address = "0.0.0.0:9598"
default_namespace = "nanosiem"Replace YOUR_NANO_INSTANCE with your nano instance's hostname or IP.
Docker Compose
Create /opt/vector-aggregator/docker-compose.yml:
version: '3.8'
services:
vector-aggregator:
image: timberio/vector:latest-alpine
container_name: vector-aggregator
security_opt:
- no-new-privileges:true
environment:
VECTOR_DATA_DIR: /var/lib/vector
ports:
# Vector protocol — agents connect here
- "9000:9000"
# Prometheus metrics
- "9598:9598"
# Vector API (health checks, reloading)
- "8686:8686"
volumes:
- ./config/vector.toml:/etc/vector/vector.toml:ro
- vector-aggregator-data:/var/lib/vector
command: ["--config", "/etc/vector/vector.toml"]
healthcheck:
test: ["CMD", "wget", "-q", "--spider", "http://localhost:9598/metrics"]
interval: 10s
timeout: 5s
retries: 3
restart: unless-stopped
volumes:
vector-aggregator-data:
Start the aggregator
cd /opt/vector-aggregator
# Validate the config
docker run --rm -v $(pwd)/config:/etc/vector:ro timberio/vector:latest-alpine validate /etc/vector/vector.toml
# Start
docker compose up -d
# Check logs
docker compose logs -f
# Verify health
curl -s http://localhost:9598/metrics | head -20
Monitoring the aggregator
The aggregator exposes Prometheus metrics on port 9598. Key metrics to watch:
| Metric | What it tells you |
|---|---|
| nanosiem_component_received_events_total{component_id="agents"} | Total events received from agents |
| nanosiem_component_sent_events_total{component_id="siem"} | Total events forwarded to nano |
| nanosiem_buffer_events{component_id="siem"} | Events currently in the disk buffer |
| nanosiem_buffer_byte_size{component_id="siem"} | Buffer disk usage in bytes |
| nanosiem_component_errors_total{component_id="siem"} | Errors sending to nano (connectivity issues) |
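The endpoint serves the plain Prometheus text format, so you can also check values ad hoc without a Prometheus server. A minimal sketch of pulling a single sample out of the scraped text (the sample lines are illustrative, not captured output):

```python
import re

# Minimal reader for Prometheus text exposition format as served at
# :9598/metrics — enough for ad-hoc health checks from a script.
def metric_value(metrics_text, name, label=""):
    """Return the first sample matching the metric name whose label set
    contains `label` (substring match), or None if absent."""
    for line in metrics_text.splitlines():
        if line.startswith("#"):
            continue  # skip HELP/TYPE comment lines
        m = re.match(r'(\w+)(\{[^}]*\})?\s+(\S+)$', line)
        if m and m.group(1) == name and label in (m.group(2) or ""):
            return float(m.group(3))
    return None

# Illustrative sample, not real aggregator output
sample = '''# TYPE nanosiem_buffer_events gauge
nanosiem_buffer_events{component_id="siem"} 42
nanosiem_component_errors_total{component_id="siem"} 0'''
print(metric_value(sample, "nanosiem_buffer_events", 'component_id="siem"'))  # 42.0
```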
If you run Prometheus, add the aggregator as a scrape target:
scrape_configs:
- job_name: 'vector-aggregator'
static_configs:
- targets: ['aggregator-ip:9598']
Firewall rules
| Direction | Port | Protocol | Purpose |
|---|---|---|---|
| Inbound | 9000/tcp | Vector protocol | Agents connect to send events |
| Inbound | 9598/tcp | HTTP | Prometheus metrics scraping |
| Inbound | 8686/tcp | HTTP | Vector API (optional, restrict to admin IPs) |
| Outbound | 6000/tcp | Vector protocol | Forward events to nano |
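As an illustration, the inbound side of these rules could look like the following nftables fragment. This is a sketch: the 10.0.10.0/24 admin subnet is a placeholder for your management network, and your host may already have its own ruleset to merge into.

```
# /etc/nftables.conf fragment: aggregator host firewall (sketch)
table inet aggregator {
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        iif "lo" accept
        tcp dport 22 accept comment "SSH"
        tcp dport 9000 accept comment "Vector agents"
        ip saddr 10.0.10.0/24 tcp dport 9598 accept comment "Prometheus scrape"
        ip saddr 10.0.10.0/24 tcp dport 8686 accept comment "Vector API, admins only"
    }
}
```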
Part 3: How Events Reach nano
Once events leave the aggregator, here's what happens on the nano side:
- Vector Native Source (port 6000) receives events over mTLS — only clients with a valid certificate signed by your deployment's CA can connect
- Auth & Normalization marks events as `vector_forwarded`, normalizes `source_type` to lowercase, and ensures `.message` exists
- Source Type Router sends events to the correct parser based on `source_type` (e.g., `windows_event`, `windows_sysmon`, `apache`)
- Parsers extract structured fields from the raw event into UDM columns (IP addresses, usernames, process info, etc.)
- UDM Normalization maps extracted fields to the unified data model
- Deduplication prevents duplicates from retries
- ClickHouse stores the normalized event for search and detection
nano-side configuration
nano's Vector instance is pre-configured to receive on port 6000 with mTLS enabled. No changes are needed on the nano side — just ensure the port is reachable from your aggregator and your aggregator has valid mTLS certificates (see below).
If nano runs behind a load balancer, expose port 6000 in addition to the standard HTTP ports. The Vector protocol is a persistent TCP connection, so configure the load balancer for TCP passthrough (not HTTP) to preserve the mTLS handshake end-to-end.
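For example, with HAProxy the passthrough is a plain TCP frontend/backend pair. This is a sketch: the names and backend address are placeholders, and any L4 load balancer configured the same way works.

```
# haproxy.cfg (sketch): L4 passthrough so the mTLS handshake terminates
# at nano's Vector listener, not at the load balancer
frontend vector_native
    bind *:6000
    mode tcp
    default_backend nano_vector

backend nano_vector
    mode tcp
    server nano-1 10.0.20.10:6000 check
```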
mTLS certificates
Port 6000 requires mutual TLS — your aggregator must present a valid client certificate to connect. Each nano deployment has its own private CA that signs client certificates.
Download your certificates from the nano portal:
- Go to Settings > Deployment Credentials > Vector mTLS
- Download the certificate bundle — three PEM files:
  - `ca.crt` — your deployment's CA certificate
  - `client.crt` — the client certificate
  - `client.key` — the client private key
- Place these files on your aggregator host (e.g., `/etc/vector/mtls/`)
The aggregator config above already includes the TLS block pointing to these files. Once the certs are in place, the aggregator can connect to nano on port 6000.
HTTP ingestion (port 8080) uses Bearer token auth instead of mTLS. If you prefer token-based auth, you can configure your aggregator to send via HTTP — see the Vector Aggregator guide. The Vector native protocol (port 6000) is recommended for better performance and backpressure handling.
Certificate management
Manage your mTLS certificates from Settings > Deployment Credentials:
| Action | What happens | Vector restart? |
|---|---|---|
| Download Bundle | Downloads ca.crt, client.crt, client.key | No |
| Regenerate Client Cert | Issues a new client cert signed by the same CA. Old certs remain valid until they expire (1-year lifetime). | No |
| Rotate CA | Generates an entirely new CA + client cert. All existing client certs are immediately invalidated. | Yes (rolling restart) |
Rotating the CA invalidates all existing client certificates immediately. Update the certificate files on all your aggregators before or immediately after rotating, or they will lose connectivity to nano.
Part 4: Managing the Deployment
Configuration management with Ansible
For environments with more than a few endpoints, use Ansible (or your preferred configuration management tool) to deploy and manage Vector agents. Here's the pattern:
# inventory.yml
all:
children:
windows_agents:
hosts:
dc01:
ansible_host: 10.0.10.11
srv01:
ansible_host: 10.0.10.12
ws01:
ansible_host: 10.0.10.13
vars:
ansible_connection: winrm
ansible_winrm_transport: ntlm
aggregators:
hosts:
agg01:
ansible_host: 10.0.10.50
# playbook.yml
- name: Deploy Vector agents on Windows
hosts: windows_agents
vars:
vector_aggregator_ip: "10.0.10.50"
vector_aggregator_port: 9000
tasks:
- name: Install Vector MSI
ansible.windows.win_package:
path: "https://packages.timber.io/vector/latest/vector-x64.msi"
state: present
- name: Deploy Vector config
ansible.windows.win_template:
src: vector-agent.toml.j2
dest: C:\ProgramData\Vector\vector.toml
notify: Restart Vector
- name: Ensure Vector service is running
ansible.windows.win_service:
name: vector
state: started
start_mode: auto
handlers:
- name: Restart Vector
ansible.windows.win_service:
name: vector
state: restarted
- name: Deploy Vector aggregator
hosts: aggregators
tasks:
- name: Create aggregator directory
file:
path: /opt/vector-aggregator/config
state: directory
- name: Deploy Vector config
template:
src: vector-aggregator.toml.j2
dest: /opt/vector-aggregator/config/vector.toml
notify: Restart aggregator
- name: Deploy Docker Compose
template:
src: docker-compose.yml.j2
dest: /opt/vector-aggregator/docker-compose.yml
notify: Restart aggregator
- name: Start aggregator
community.docker.docker_compose_v2:
project_src: /opt/vector-aggregator
state: present
handlers:
- name: Restart aggregator
community.docker.docker_compose_v2:
project_src: /opt/vector-aggregator
state: restarted
Adding or removing channels
To collect additional Windows event channels (e.g., DNS Server, DHCP, IIS), add them to the agent config:
channels = [
"Application",
"System",
"Security",
"Microsoft-Windows-Sysmon/Operational",
"Microsoft-Windows-PowerShell/Operational",
"Microsoft-Windows-Windows Defender/Operational",
# Add more channels:
"DNS Server",
"Microsoft-Windows-DHCP-Server/Operational",
"Microsoft-IIS-Logging/Logs"
]
Then restart the Vector service on affected endpoints. With Ansible, this is a single command:
ansible-playbook playbook.yml --limit windows_agents
Upgrading Vector
# On Windows agents
Stop-Service vector
# Install new MSI (overwrites previous)
Start-Process msiexec.exe -Wait -ArgumentList "/i $env:TEMP\vector-new.msi /quiet /norestart"
Start-Service vector
# On the aggregator
cd /opt/vector-aggregator
docker compose pull
docker compose up -d
Health checks
Quick checks to verify the pipeline is healthy:
# Aggregator: are events flowing?
curl -s http://aggregator-ip:9598/metrics | grep component_received_events_total
# Aggregator: is the buffer draining? (should be near 0)
curl -s http://aggregator-ip:9598/metrics | grep buffer_events
# Aggregator: Vector API health
curl -s http://aggregator-ip:8686/health
# nano: check for the source types in search
# Run in nano: source_type=windows_event | stats count by src_host
# Windows agent: is the service running?
Get-Service vector
# Windows agent: check for errors in Vector's own log
Get-Content "C:\ProgramData\Vector\data\vector.log" -Tail 20Part 5: Extending to Other Log Sources
The same aggregator can receive logs from any source — not just Windows agents. Add sources directly on the aggregator to collect from devices that don't run Vector.
Linux servers (journald)
Deploy a Vector agent on Linux servers just like Windows, but use the journald source:
# Linux agent — /etc/vector/vector.toml
data_dir = "/var/lib/vector"
[sources.journald]
type = "journald"
include_units = ["sshd", "sudo", "systemd-logind", "auditd"]
[transforms.enrich]
type = "remap"
inputs = ["journald"]
source = '''
unit = to_string(._SYSTEMD_UNIT) ?? ""
if contains(unit, "sshd") || contains(unit, "sudo") || contains(unit, "logind") {
.source_type = "linux_auth"
} else if contains(unit, "auditd") {
.source_type = "linux_audit"
} else {
.source_type = "linux_syslog"
}
.src_host = get_hostname!()
.message = to_string(.MESSAGE) ?? encode_json(.)
'''
[sinks.aggregator]
type = "vector"
inputs = ["enrich"]
address = "AGGREGATOR_IP:9000"
acknowledgements.enabled = true
[sinks.aggregator.buffer]
type = "disk"
max_size = 268435456 # 256 MB
when_full = "block"Syslog from network devices
Add a syslog source directly on the aggregator to receive from firewalls, switches, and appliances. If the aggregator runs in the Docker setup above, also publish ports 514/udp and 514/tcp in the compose file:
# Add to the aggregator's vector.toml
[sources.syslog_udp]
type = "syslog"
address = "0.0.0.0:514"
mode = "udp"
[sources.syslog_tcp]
type = "syslog"
address = "0.0.0.0:514"
mode = "tcp"
[transforms.tag_syslog]
type = "remap"
inputs = ["syslog_udp", "syslog_tcp"]
source = '''
hostname = downcase(to_string(.hostname) ?? "")
appname = to_string(.appname) ?? ""
if starts_with(hostname, "fw-") || starts_with(hostname, "pa-") {
.source_type = "palo_alto"
} else if starts_with(hostname, "fgt-") {
.source_type = "fortinet"
} else if starts_with(hostname, "asa-") {
.source_type = "cisco_asa"
} else {
.source_type = "generic_syslog"
}
'''
Update the aggregator's sink to include the syslog transform:
[sinks.siem]
inputs = ["enrich", "tag_syslog"]
# ... rest of sink config unchanged
For a more detailed guide on aggregator source configuration (syslog, OTLP, Fluent, log files), see the Vector Aggregator page.
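The hostname-prefix routing in the transform above is worth sanity-checking against your device naming conventions before deploying. The same logic expressed in Python for a quick offline check (the prefixes mirror the VRL, not an official nano mapping):

```python
# Mirror of the VRL hostname-prefix routing in the tag_syslog transform.
PREFIX_MAP = [
    (("fw-", "pa-"), "palo_alto"),
    (("fgt-",), "fortinet"),
    (("asa-",), "cisco_asa"),
]

def classify(hostname):
    """Return the source_type a syslog event would be tagged with."""
    host = hostname.lower()
    for prefixes, source_type in PREFIX_MAP:
        if host.startswith(prefixes):
            return source_type
    return "generic_syslog"

print(classify("FW-edge-01"))   # palo_alto
print(classify("core-sw-01"))  # generic_syslog
```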
Application logs (file tailing)
Tail log files on any server with a Vector agent:
[sources.app_logs]
type = "file"
include = ["/var/log/myapp/*.log"]
[transforms.tag_app]
type = "remap"
inputs = ["app_logs"]
source = '''
.source_type = "my_app"
.src_host = get_hostname!()
'''
[sinks.aggregator]
type = "vector"
inputs = ["tag_app"]
address = "AGGREGATOR_IP:9000"
acknowledgements.enabled = true
Part 6: Creating Parsers in nano
Once events are flowing, create parsers in nano so they're properly normalized:
- Go to Feeds in nano
- Click New Feed
- Select "Sample from existing data" — nano will show the
source_typevalues it's receiving - Select your source type (e.g.,
windows_event,windows_sysmon) - The AI generates a VRL parser that extracts structured fields into UDM columns
- Review the parser, test with sample events, then Publish
Published parsers deploy to all Vector pods within ~40 seconds — no restarts needed.
Troubleshooting
Agent not sending events
- Check the service is running: `Get-Service vector` (Windows) or `systemctl status vector` (Linux)
- Check agent logs: `Get-Content C:\ProgramData\Vector\data\vector.log -Tail 50`
- Verify network connectivity: `Test-NetConnection -ComputerName AGGREGATOR_IP -Port 9000`
- Validate the config: `& "C:\Program Files\Vector\bin\vector.exe" validate "C:\ProgramData\Vector\vector.toml"`
Aggregator not forwarding
- Check container health: `docker compose ps` — status should be "healthy"
- Check logs: `docker compose logs --tail 100`
- Verify events are arriving: check the `component_received_events_total` metric at `:9598/metrics`
- Verify nano connectivity: `nc -zv your-nano-instance.com 6000`
- Check buffer: if `buffer_events` is growing, the aggregator can't reach nano
Events arriving but not parsed
- Verify the `source_type` in your agent config matches what the parser expects
- Check Feeds in nano for the source type — if there's no parser, events go to the generic parser
- Review System > Ingestion Errors for parse failures
High buffer usage
If the disk buffer on the agent or aggregator is filling up:
- Agent buffer growing: The aggregator is unreachable or overloaded. Check aggregator health.
- Aggregator buffer growing: nano is unreachable. Check network connectivity to nano on port 6000.
- Increase `max_size` if the outage will be extended. Events are safe on disk and will drain when connectivity is restored.
Next Steps
- Vector Aggregator — Add syslog, OTLP, Fluent, and file sources to your aggregator
- Detection Rules — Create rules to detect threats in your on-prem logs
- Enrichment — Add GeoIP and threat intel to your log data
- Deployment Architecture — Full production deployment guide