Query Safety Limits
Configure query-level OOM protection with bounded array aggregation, row expansion limits, and post-processing caps
nano includes configurable safety limits that prevent individual queries from exhausting server memory. These limits target specific high-risk query patterns while preserving the flexibility analysts need for threat hunting and incident response.
Why These Limits Exist
Certain nPL query patterns can consume unbounded memory if left unchecked:
- Array aggregation (`values()`, `list()`) collects all matching values into in-memory arrays per group
- Multi-value expansion (`mvexpand`) multiplies rows via `arrayJoin` before any `LIMIT` is applied
- Post-processing aggregation: when stats/top/rare commands run after enrichment, they buffer all rows in a HashMap
- Streaming cache: large result sets buffered for re-display can consume significant memory
These limits complement the per-query ClickHouse resource settings (see Search Concurrency). While ClickHouse enforces max_memory_usage at the database level, these limits operate at the application level to bound specific patterns that can be expensive even within ClickHouse's memory budget.
Settings
All settings are configurable via Settings > Search > Query Safety Limits. Changes take effect within 60 seconds — no restart required.
Max Array Aggregation Size
| Setting | `max_group_array_size` |
|---|---|
| Default | 10,000 |
| Range | 100 – 1,000,000 |
Controls the maximum number of elements in groupArray() and groupUniqArray() calls generated by values() and list() aggregation functions. When the limit is reached, ClickHouse silently truncates the array — no error is thrown, but results may be incomplete.
Affects these nPL patterns:

```
* | stats values(user) by src_ip           # groupUniqArray(10000)(user)
* | stats list(message) by user            # groupArray(10000)(message)
* | streamstats values(dest_ip) by user    # window function variant
* | transaction user                       # groupArray(1000)(message) for _raw_events
```

Sizing guidance:
| Deployment | Recommended | Rationale |
|---|---|---|
| Small (< 50 GB/day) | 10,000 (default) | Sufficient for most values() use cases |
| Medium (50–200 GB/day) | 25,000 | Higher-cardinality environments (more unique users, IPs) |
| Enterprise (> 200 GB/day) | 50,000–100,000 | Large-scale hunting where completeness matters |
For the transaction command specifically, the array limit for _raw_events is capped at the command's maxevents parameter (default: 1,000) regardless of this setting.
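The silent-truncation behavior described above can be modeled in a few lines. This is an illustrative Python sketch of `groupUniqArray(N)` semantics, not the actual ClickHouse implementation; the function and field names are hypothetical.

```python
def group_uniq_array(rows, key, field, max_size=10_000):
    """Collect up to max_size unique `field` values per `key` group.
    Values beyond the cap are silently dropped -- no error is raised,
    mirroring groupUniqArray(max_size)(field)."""
    groups = {}
    for row in rows:
        bucket = groups.setdefault(row[key], [])
        value = row[field]
        if value not in bucket and len(bucket) < max_size:
            bucket.append(value)
    return groups

rows = [{"src_ip": "10.0.0.1", "user": f"u{i}"} for i in range(5)]
print(group_uniq_array(rows, "src_ip", "user", max_size=3))
# {'10.0.0.1': ['u0', 'u1', 'u2']} -- the last two users are dropped silently
```

The key point the sketch illustrates: a `values()` result that looks complete may not be, which is why larger deployments with high-cardinality fields raise this limit.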
Max Mvexpand Rows
| Setting | `max_mvexpand_rows` |
|---|---|
| Default | 100,000 |
| Range | 1,000 – 10,000,000 |
The default row limit applied to mvexpand when the query doesn't specify an explicit limit=N. This is important because arrayJoin() expansion happens before any SQL LIMIT — without a cap, a single field with large arrays can multiply a modest result set into billions of rows.
Example:
```
* | mvexpand dns_answers             # applies LIMIT 100000
* | mvexpand dns_answers limit=500   # uses explicit limit, ignores this setting
```

Sizing guidance:
| Deployment | Recommended | Rationale |
|---|---|---|
| Small | 100,000 (default) | Safe for DNS answer expansion, multi-value fields |
| Medium | 250,000 | Larger datasets with more multi-value fields |
| Enterprise | 500,000–1,000,000 | High-volume environments with complex array fields |
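Why the cap must apply during expansion, not after, is easiest to see in a small model. This hypothetical Python sketch mirrors the behavior (names are illustrative, not nano's internals):

```python
def mvexpand(rows, field, limit=100_000):
    """Expand a multi-value field into one row per element, stopping
    at `limit` output rows. The cap is enforced while expanding --
    a later LIMIT would arrive after the row explosion already happened."""
    out = []
    for row in rows:
        for value in row.get(field) or [None]:
            if len(out) >= limit:
                return out
            out.append({**row, field: value})
    return out

rows = [{"query": "example.com", "dns_answers": ["1.1.1.1", "8.8.8.8", "9.9.9.9"]}]
print(len(mvexpand(rows, "dns_answers", limit=2)))  # 2 -- expansion stops at the cap
```

With 1,000 input rows each carrying a 1,000-element array, uncapped expansion would emit a million rows before any `LIMIT` ran; the in-loop check bounds that multiplication.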
Max Post-Processing Groups
| Setting | `max_post_processing_groups` |
|---|---|
| Default | 1,000,000 |
| Range | 10,000 – 10,000,000 |
Caps the number of groups in Rust-side post-processing for stats, top, and rare commands. This only applies to post-lateral commands that can't run in ClickHouse SQL (e.g., stats after a lookup enrichment or after prevalence filtering). When the cap is reached, additional groups are silently dropped and a warning is logged server-side.
Most stats/top/rare queries execute directly in ClickHouse and are not affected by this setting — ClickHouse handles GROUP BY cardinality with external aggregation (spill-to-disk).
Sizing guidance:
| Deployment | Recommended | Rationale |
|---|---|---|
| All sizes | 1,000,000 (default) | Matches the default SQL result limit; rarely needs adjustment |
Increase only if you see truncation warnings in server logs for legitimate post-enrichment aggregations.
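The capped HashMap aggregation amounts to refusing new keys once the limit is hit. A minimal sketch, assuming a simple count aggregation (illustrative Python; the real Rust implementation differs):

```python
def capped_stats_count(rows, key, max_groups=1_000_000):
    """Count rows per group. Once max_groups distinct keys exist,
    rows belonging to NEW keys are silently dropped; truncated=True
    signals that a server-side warning would be logged."""
    groups, truncated = {}, False
    for row in rows:
        k = row[key]
        if k not in groups:
            if len(groups) >= max_groups:
                truncated = True
                continue
            groups[k] = 0
        groups[k] += 1
    return groups, truncated

rows = [{"user": f"u{i % 4}"} for i in range(8)]
counts, truncated = capped_stats_count(rows, "user", max_groups=2)
print(counts, truncated)  # {'u0': 2, 'u1': 2} True -- u2/u3 were dropped
```

Note that rows for groups already inside the map still count correctly; only previously unseen groups are discarded, which is why truncation can go unnoticed without the logged warning.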
Max Streaming Cache Rows
| Setting | `max_streaming_cache_rows` |
|---|---|
| Default | 50,000 |
| Range | 1,000 – 1,000,000 |
Controls how many rows are buffered for the streaming result cache. When a search streams results via SSE, rows are simultaneously collected in memory for caching (so switching tabs and returning doesn't re-execute the query). If the result set exceeds this limit, caching is skipped — the query still streams correctly to the client, but won't be available for instant re-display.
Sizing guidance:
| Deployment | Recommended | Rationale |
|---|---|---|
| Small (limited RAM) | 25,000 | Reduce memory footprint per concurrent query |
| Medium | 50,000 (default) | Good balance between cache hit rate and memory |
| Enterprise (high RAM) | 100,000–200,000 | More results cached for tab-switching workflows |
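The stream-and-buffer pattern can be sketched as follows (hypothetical Python; `send` stands in for the SSE write path, which is an assumption, not nano's actual API):

```python
def stream_with_cache(rows, send, max_cache_rows=50_000):
    """Stream every row to the client via `send`, while keeping a side
    buffer for instant re-display. If the result set exceeds the cap,
    the buffer is discarded and caching is skipped -- streaming itself
    is never interrupted."""
    cache = []
    for row in rows:
        send(row)                        # always delivered to the client
        if cache is not None:
            if len(cache) < max_cache_rows:
                cache.append(row)
            else:
                cache = None             # limit exceeded: drop the buffer
    return cache                         # None means "not cacheable"

sent = []
cache = stream_with_cache(range(5), sent.append, max_cache_rows=3)
print(len(sent), cache)  # 5 None -- all rows streamed, nothing cached
```

Dropping the whole buffer (rather than keeping a truncated prefix) avoids serving a silently incomplete cached result on re-display.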
Block on Cost Analysis Errors
| Setting | `block_on_cost_errors` |
|---|---|
| Default | Off |
When enabled, queries that trigger Error-severity cost analysis warnings are rejected before execution. The analyst receives a clear error message with the specific issue and a suggestion for how to fix the query.
Error-severity warnings include:
| Warning Code | Trigger |
|---|---|
| `UNBOUNDED_DEDUP` | dedup without prior head or aggregation |
| `UNBOUNDED_TRANSACTION` | transaction without both maxspan and maxevents |
| `UNBOUNDED_EVENTSTATS` | eventstats without prior limit or aggregation |
| `UNBOUNDED_STREAMSTATS` | streamstats without prior limit or aggregation |
When to enable: Recommended for shared environments where untrained analysts might run expensive queries. In single-analyst or dev environments, leaving this off allows more flexibility.
When disabled (default), these warnings are still shown in the search results UI as advisory messages — analysts can see the warning and choose to refine their query.
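The gate itself is a simple severity check before execution; a hypothetical sketch (structure and field names are illustrative):

```python
def should_block(warnings, block_on_cost_errors=False):
    """Reject the query only when blocking is enabled AND at least one
    Error-severity cost warning is present. Otherwise every warning is
    advisory and the query runs."""
    return block_on_cost_errors and any(
        w["severity"] == "Error" for w in warnings
    )

warnings = [{"code": "UNBOUNDED_DEDUP", "severity": "Error"}]
print(should_block(warnings))         # False -- default: advisory only
print(should_block(warnings, True))   # True  -- rejected before execution
```

Warning- and Info-severity findings never block, even with the setting enabled; only the four Error codes listed above can reject a query.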
Cost Analysis Warnings
Beyond the configurable limits above, nano's query cost analyzer generates warnings for patterns that may be expensive. These are always shown in the search UI regardless of the block_on_cost_errors setting:
| Code | Severity | Description |
|---|---|---|
| `WILDCARD_SEARCH` | Info | Query uses `*` without field filters |
| `REGEX_FILTER` | Info | Query uses regex matching (slower than exact/prefix) |
| `SEQUENCE_FUNNEL_NO_FILTER` | Info | sequence/funnel without prior filters |
| `LARGE_LIMIT` | Info | head/tail requesting > 50,000 results |
| `UNBOUNDED_SORT` | Warning | sort without prior limit |
| `UNBOUNDED_MVEXPAND` | Warning | mvexpand using server default limit |
| `TINY_TIMECHART_SPAN` | Warning | timechart with span < 10 seconds |
| `LARGE_BIN_HOP_FANOUT` | Warning | bin hop creating > 100 overlapping windows per event |
| `STATS_VALUES_LIST_UNBOUNDED` | Warning | values()/list() without prior head |
| `UNBOUNDED_DEDUP` | Error | dedup without prior limit or aggregation |
| `UNBOUNDED_TRANSACTION` | Error | transaction without maxspan and maxevents |
| `UNBOUNDED_EVENTSTATS` | Error | eventstats without prior limit or aggregation |
| `UNBOUNDED_STREAMSTATS` | Error | streamstats without prior limit or aggregation |
API
Query safety limits can be managed programmatically:
```shell
# Get current limits
curl -H "Authorization: Bearer $TOKEN" \
  https://your-instance/api/settings/search/query-limits

# Update limits
curl -X PUT -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "max_group_array_size": 25000,
    "max_mvexpand_rows": 250000,
    "max_post_processing_groups": 1000000,
    "max_streaming_cache_rows": 100000,
    "block_on_cost_errors": false
  }' \
  https://your-instance/api/settings/search/query-limits
```

Requires the `settings:system` permission.