Search Best Practices

Write faster, more efficient queries by understanding how nano's search engine works under the hood

nano's query engine is built on ClickHouse and uses bloom filter indexes to skip large chunks of data. Writing your queries with these indexes in mind can turn a 30-second search into a sub-second one. This guide covers the most impactful patterns.

Use Field Filters Instead of Bare Keywords

The single biggest performance improvement you can make is using field filters instead of free-text keyword searches — especially when your search term contains special characters like dots, slashes, or colons.

Why bare keywords with special characters are slow

When you search for a bare keyword like github.com, the engine can't look up the whole term in the bloom token index, because tokens are split on non-alphanumeric characters and the index only knows the pieces (github and com). Instead, it falls back to a substring scan (iLike '%github.com%') on the message column for every event in your time range.

The engine applies a "bloom guard" optimization — it extracts the longest alphanumeric token (e.g., github) and checks the bloom index first, then verifies with the substring match. This helps, but it's still significantly slower than a field filter.
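
The token-selection step can be illustrated with a short Python sketch. It assumes the tokenizer splits on anything outside ASCII letters and digits and that ties go to the first token; the function names are hypothetical and this is not nano's actual implementation.

```python
import re
from typing import List, Optional


def split_tokens(term: str) -> List[str]:
    """Split a search term on non-alphanumeric characters (assumed tokenizer rule)."""
    return [t for t in re.split(r"[^0-9A-Za-z]+", term) if t]


def bloom_guard_token(term: str) -> Optional[str]:
    """Return the longest alphanumeric token, or None if the term has none."""
    tokens = split_tokens(term)
    return max(tokens, key=len) if tokens else None


def is_pure_token(term: str) -> bool:
    """True when the term is a single token and could hit the token index directly."""
    return split_tokens(term) == [term]


# "github.com" is not a pure token, so only its longest piece can pre-filter.
guard = bloom_guard_token("github.com")  # "github"
```

A term such as 192.168.1.100 yields only short tokens (192, 168, 1, 100), so under this model the guard is likely to filter out far less data than it does for github.com, which is one more reason to prefer src_ip="192.168.1.100".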

The fix: specify the field

| Query | 1-hour scan | Why |
| --- | --- | --- |
| github.com | ~3 seconds | Substring scan on message with bloom guard |
| github | ~400 ms | Pure token, hits bloom index directly |
| src_host="github.com" OR dest_host="github.com" | ~400 ms | Exact match on indexed columns |
| dest_host="github.com" | ~200 ms | Single indexed field, exact match |

# Slow — free-text with special characters
github.com

# Fast — field filter with exact match
dest_host="github.com"

# Fast — if you need both directions
src_host="github.com" OR dest_host="github.com"

This applies to any keyword with special characters:

# Slow — dots, slashes, colons in bare keywords
192.168.1.100
/etc/passwd
C:\Windows\System32

# Fast — use the appropriate field
src_ip="192.168.1.100"
file_path="/etc/passwd"
file_path="C:\Windows\System32"

When you search for a bare keyword with special characters, nano shows a warning:

Searching for [keyword] scans the full message text across all log sources. Use a field filter for faster results.

This is the engine telling you to use a field filter. Follow its advice.

Prefer Trailing Wildcards Over Leading Wildcards

Trailing wildcards (admin*) can use bloom filter indexes. Leading wildcards (*admin) cannot — they force a full substring scan.

# Fast — trailing wildcard uses index
user="admin*"
process_name="powershell*"

# Slow — leading wildcard bypasses index
user="*admin"
filename="*.exe"

# Better alternative for suffix matching
filename ENDSWITH ".exe"

If you need to find a substring in the middle of a value, use CONTAINS — it leverages the ngram bloom filter for better performance than a double-wildcard:

# Good
command_line CONTAINS "encoded"

# Acceptable but slower
command_line="*encoded*"
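
To see why an ngram index helps CONTAINS, here is a minimal Python sketch of the general technique. It assumes 3-character ngrams and case-insensitive matching, and it uses an exact set where a real bloom filter would be probabilistic (a real filter can report false positives, never false negatives). The function names are hypothetical.

```python
from typing import Set


def ngrams(text: str, n: int = 3) -> Set[str]:
    """All overlapping n-character substrings of text, lowercased."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def may_contain(block_ngrams: Set[str], needle: str, n: int = 3) -> bool:
    """Return False only when the block provably cannot contain the needle."""
    grams = ngrams(needle, n)
    if not grams:
        # Needle shorter than n: the index cannot rule anything out.
        return True
    return grams <= block_ngrams


# Index one block of values, then test which searches could skip it.
block = ngrams("powershell -EncodedCommand SQBFAFgA")
skip_encoded = not may_contain(block, "encoded")    # False: block must be read
skip_mimikatz = not may_contain(block, "mimikatz")  # True: block can be skipped
```

Every ngram of the search string must be present for a block to be a candidate, so blocks missing even one ngram are skipped without being read.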

Filter Early, Aggregate Late

Put your most selective filters in the search expression (before the first pipe), not in a where clause after the pipe. The search expression runs in ClickHouse where it can use indexes. Post-pipe filtering runs on already-fetched results.

# Fast — ClickHouse filters with indexes before returning results
source_type="sysmon" event_type="process_create" user="admin"
| stats count() by process_name

# Slower — fetches all sysmon events, then filters in post-processing
source_type="sysmon" event_type="process_create"
| where user="admin"
| stats count() by process_name

Add Limits Before Expensive Operations

Several commands buffer their entire input in memory before producing output. Without a limit, these can consume unbounded memory and either slow down or crash your query.

dedup

dedup loads all events, sorts them, and removes duplicates. On large result sets, this can exhaust memory.

# Bad — unbounded dedup (Error-severity warning)
* | dedup user

# Good — limit input first
* | head 10000 | dedup user

# Better — aggregate instead if you just want unique values
* | stats count() by user

sort

sort buffers all events to order them. Pre-filter or limit first.

# Bad — sorts entire result set (Warning)
* | sort -bytes_out

# Good — limit first
* | head 10000 | sort -bytes_out

# Better — aggregate to reduce the set before sorting
* | stats sum(bytes_out) as total by src_ip | sort -total

eventstats and streamstats

Both compute window functions over their input. Without filters, they process every matching event.

# Bad — window function over all events (Error-severity warning)
* | eventstats avg(bytes) as avg_bytes

# Good — filter the dataset first
source_type="proxy" dest_port=443
| eventstats avg(bytes) as avg_bytes by src_ip

# Good — limit input
* | head 10000 | streamstats count() as running_count

Always add a by clause to eventstats when possible — it partitions the computation and avoids scanning every row with a single window.

Set Bounds on Transactions

The transaction command groups related events into sessions. Without bounds, a single group can accumulate unlimited events in memory.

# Bad — unbounded transaction (Error-severity warning)
* | transaction session_id

# Good — set both maxspan and maxevents
* | transaction session_id maxspan=1h maxevents=1000

Always set both maxspan and maxevents. maxspan caps the time window, maxevents caps the event count per group. Together they prevent any single transaction from consuming excessive memory.
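
The effect of the two bounds can be sketched in Python. This is an illustration of the idea, not nano's implementation: it assumes a transaction is closed when the next event would exceed maxspan measured from the group's first event, or when the group already holds maxevents events. The function name and the event shape are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List

Event = Dict[str, Any]


def transactions(events: List[Event], key: str,
                 maxspan: timedelta, maxevents: int) -> List[List[Event]]:
    """Group events by key, closing a group when either bound is reached."""
    open_groups: Dict[Any, List[Event]] = {}
    closed: List[List[Event]] = []
    for ev in sorted(events, key=lambda e: e["timestamp"]):
        group = open_groups.get(ev[key])
        if group is not None and (
            ev["timestamp"] - group[0]["timestamp"] > maxspan
            or len(group) >= maxevents
        ):
            closed.append(group)  # bound hit: emit the group and start over
            group = None
        if group is None:
            group = []
            open_groups[ev[key]] = group
        group.append(ev)
    closed.extend(open_groups.values())
    return closed


t0 = datetime(2024, 1, 1, 12, 0)
sample = [
    {"session_id": "a", "timestamp": t0},
    {"session_id": "a", "timestamp": t0 + timedelta(minutes=10)},
    {"session_id": "a", "timestamp": t0 + timedelta(hours=2)},
]
groups = transactions(sample, "session_id", timedelta(hours=1), 1000)
```

With these bounds no open group ever holds more than maxevents events or spans more than maxspan, which is what keeps per-group memory predictable.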

Choose the Right Timechart Span

Small timechart spans over large time ranges create an enormous number of buckets, and each bucket consumes memory. For example, span=1s over 90 days creates about 7.8 million buckets (90 × 86,400 = 7,776,000).

# Bad — 1-second buckets over a day = 86,400 buckets
timestamp > now() - INTERVAL 1 DAY
| timechart span=1s count()

# Good — match span to your time range
timestamp > now() - INTERVAL 1 DAY
| timechart span=5m count()

timestamp > now() - INTERVAL 7 DAYS
| timechart span=1h count()

Rule of thumb: Aim for a few hundred buckets at most.

| Time range | Reasonable span | Buckets |
| --- | --- | --- |
| Last 15 minutes | span=10s | ~90 |
| Last hour | span=1m | 60 |
| Last 24 hours | span=5m | 288 |
| Last 7 days | span=1h | 168 |
| Last 30 days | span=6h | 120 |
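
The bucket counts above are just the time range divided by the span. The Python sketch below reproduces that arithmetic and picks the finest span that stays under a budget. The 300-bucket budget and the candidate span list are assumptions chosen to match the rule of thumb, not nano settings.

```python
import math
from datetime import timedelta
from typing import Optional, Sequence

# Assumed candidate spans, taken from the table above plus one day.
CANDIDATE_SPANS = [
    timedelta(seconds=10),
    timedelta(minutes=1),
    timedelta(minutes=5),
    timedelta(hours=1),
    timedelta(hours=6),
    timedelta(days=1),
]


def bucket_count(time_range: timedelta, span: timedelta) -> int:
    """Number of timechart buckets needed to cover time_range at the given span."""
    if span <= timedelta(0):
        raise ValueError("span must be positive")
    return math.ceil(time_range / span)


def suggest_span(time_range: timedelta, max_buckets: int = 300,
                 candidates: Sequence[timedelta] = CANDIDATE_SPANS
                 ) -> Optional[timedelta]:
    """Finest candidate span that keeps the bucket count within max_buckets."""
    for span in sorted(candidates):
        if bucket_count(time_range, span) <= max_buckets:
            return span
    return None


ninety_days_at_1s = bucket_count(timedelta(days=90), timedelta(seconds=1))  # 7776000
```

With a budget of 300, suggest_span returns the same spans as the table for each of the five listed time ranges.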

Be Aware of Subsearch Result Caps

join and append subsearches are capped at 10,000 rows by default. If your subsearch returns more than this, excess rows are silently discarded — your results will be incomplete without any error.

# Risky — if more than 10,000 IPs match, the join silently drops the rest
* | join src_ip [search event_type="malicious" | fields src_ip]

# Better — ensure the subsearch is well-filtered
* | join src_ip [search event_type="malicious" timestamp > now() - INTERVAL 1 HOUR | fields src_ip]

# Adjust the cap if needed (max 100,000)
* | join src_ip maxout=50000 [search event_type="malicious" | fields src_ip]
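
The following Python sketch shows why a silent cap is dangerous. It models the subsearch as a list truncated at maxout and the join as an inner join on one field; both are simplifying assumptions, and the function names are hypothetical. Unlike the real engine, the sketch reports how many rows were dropped so the loss is visible.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]
DEFAULT_MAXOUT = 10_000  # default cap described above


def run_subsearch(rows: List[Row], maxout: int = DEFAULT_MAXOUT) -> Tuple[List[Row], int]:
    """Return the rows kept under the cap and the number silently dropped."""
    kept = rows[:maxout]
    return kept, len(rows) - len(kept)


def join_on(events: List[Row], sub_rows: List[Row], field: str) -> List[Row]:
    """Keep only events whose field value appears in the subsearch rows."""
    keys = {row[field] for row in sub_rows}
    return [e for e in events if e.get(field) in keys]


# 12,000 distinct malicious IPs: 2,000 more than the default cap.
malicious = [{"src_ip": f"10.0.{i // 256}.{i % 256}"} for i in range(12_000)]
events = [{"src_ip": "10.0.0.5"}, {"src_ip": "10.0.46.200"}]

kept, dropped = run_subsearch(malicious)
matches = join_on(events, kept, "src_ip")
```

Here 10.0.46.200 is a known-bad address, but it falls past row 10,000 of the subsearch, so the join returns only the 10.0.0.5 event and nothing signals that a match was lost.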

Use Exact Matches and CONTAINS Over Regex

Regex matching is the slowest string operation. The engine optimizes simple regex patterns into faster operations (prefix match, suffix match, literal alternation), but complex patterns still require a full regex evaluation.

# Slow — regex
user=/admin.*/

# Fast — wildcard (optimized to startsWith)
user="admin*"

# Slow — regex alternation
event_type=/login|logout|failed/

# Fast — IN list
event_type IN ("login", "logout", "failed")

# Slow — regex for substring
message=/.*error.*/

# Fast — CONTAINS
message CONTAINS "error"

When you do need regex, the engine will show an info-level message suggesting alternatives. Consider whether LIKE, CONTAINS, STARTSWITH, or ENDSWITH can achieve the same result.
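
As a rough aid for spotting regexes that have a cheaper equivalent, here is a Python sketch that classifies a few simple pattern shapes. It assumes the regex is matched against the whole field value (as the examples above imply) and treats only letters, digits, underscore, space, and hyphen as literal characters. The function name and the returned labels are hypothetical, and the rules are not a description of nano's optimizer.

```python
import re
from typing import List, Tuple, Union

_LITERAL = re.compile(r"[A-Za-z0-9_ -]+")

Rewrite = Tuple[str, Union[str, List[str]]]


def _is_literal(text: str) -> bool:
    return _LITERAL.fullmatch(text) is not None


def suggest_rewrite(pattern: str) -> Rewrite:
    """Map a simple regex to a cheaper operation, or report that regex is needed."""
    if _is_literal(pattern):
        return ("exact", pattern)
    parts = pattern.split("|")
    if len(parts) > 1 and all(_is_literal(p) for p in parts):
        return ("in", parts)
    if (pattern.startswith(".*") and pattern.endswith(".*")
            and len(pattern) > 4 and _is_literal(pattern[2:-2])):
        return ("contains", pattern[2:-2])
    if pattern.endswith(".*") and _is_literal(pattern[:-2]):
        return ("startswith", pattern[:-2])
    if pattern.startswith(".*") and _is_literal(pattern[2:]):
        return ("endswith", pattern[2:])
    return ("regex", pattern)


# The three slow examples above, classified.
for_admin = suggest_rewrite("admin.*")
for_events = suggest_rewrite("login|logout|failed")
for_error = suggest_rewrite(".*error.*")
```

Anything that falls through to ("regex", ...) is a pattern that genuinely needs regex evaluation, so it is worth narrowing the search with field filters first.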

Quick Reference: Warning Signals

When nano shows you a cost analysis warning, it's telling you the query can be improved. Here's what each warning means and what to do:

| Warning | Severity | What to do |
| --- | --- | --- |
| Wildcard search without field filters | Info | Add field filters or source_type to narrow the scan |
| Unindexed keyword search | Warning | Use a field filter (src_host="...", src_ip="...") instead of bare keywords with special characters |
| Regex pattern matching | Info | Switch to LIKE, CONTAINS, or exact matches |
| Unbounded sort | Warning | Add \| head N before sort |
| Tiny timechart span | Warning | Increase span to match your time range |
| Unbounded dedup | Error | Add \| head N before dedup, or use stats instead |
| Unbounded transaction | Error | Add maxspan and maxevents |
| Unbounded eventstats | Error | Add search filters or \| head N before eventstats |
| Unbounded streamstats | Error | Add search filters or \| head N before streamstats |
| Subsearch result limit | Warning | Filter the subsearch more tightly, or increase maxout |

Error-severity warnings can optionally be configured to block query execution — see Query Safety Limits for details.

Summary

| Practice | Impact |
| --- | --- |
| Use field filters for terms with special characters | 5-10x faster |
| Trailing wildcards over leading wildcards | 2-5x faster |
| Filter in search expression, not where | 2-5x faster |
| Limit before dedup/sort/eventstats/streamstats | Prevents OOM |
| Set maxspan + maxevents on transaction | Prevents OOM |
| Use CONTAINS/IN instead of regex | 2-3x faster |
| Match timechart span to time range | Prevents excessive memory use |