
Query Safety Limits

Configure query-level OOM protection with bounded array aggregation, row expansion limits, and post-processing caps


nano includes configurable safety limits that prevent individual queries from exhausting server memory. These limits target specific high-risk query patterns while preserving the flexibility analysts need for threat hunting and incident response.

Why These Limits Exist

Certain nPL query patterns can consume unbounded memory if left unchecked:

  • Array aggregation: values() and list() collect all matching values into in-memory arrays per group
  • Multi-value expansion: mvexpand multiplies rows via arrayJoin before any LIMIT is applied
  • Post-processing aggregation: stats/top/rare commands that run after enrichment buffer all rows in a HashMap
  • Streaming cache: large result sets buffered for re-display can consume significant memory

These limits complement the per-query ClickHouse resource settings (see Search Concurrency). While ClickHouse enforces max_memory_usage at the database level, these limits operate at the application level to bound specific patterns that can be expensive even within ClickHouse's memory budget.

Settings

All settings are configurable via Settings > Search > Query Safety Limits. Changes take effect within 60 seconds — no restart required.

Max Array Aggregation Size

| Setting | Default | Range |
| --- | --- | --- |
| `max_group_array_size` | 10,000 | 100 – 1,000,000 |

Controls the maximum number of elements in groupArray() and groupUniqArray() calls generated by values() and list() aggregation functions. When the limit is reached, ClickHouse silently truncates the array — no error is thrown, but results may be incomplete.

Affects these nPL patterns:

```
* | stats values(user) by src_ip         # groupUniqArray(10000)(user)
* | stats list(message) by user          # groupArray(10000)(message)
* | streamstats values(dest_ip) by user  # window function variant
* | transaction user                     # groupArray(1000)(message) for _raw_events
```
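
The silent truncation described above can be sketched in Python. This is illustrative only — the names `values_agg` and `max_size` are stand-ins, not product APIs; `max_size` plays the role of max_group_array_size:

```python
# Illustrative sketch of groupUniqArray(N)-style truncation: per group,
# at most max_size unique values are kept and overflow is dropped
# silently, which is why values() results can be incomplete with no error.
def values_agg(rows, group_key, value_key, max_size=10_000):
    groups = {}
    for row in rows:
        bucket = groups.setdefault(row[group_key], [])
        value = row[value_key]
        # Enforce uniqueness and the array cap; extra values vanish quietly.
        if value not in bucket and len(bucket) < max_size:
            bucket.append(value)
    return groups

# 12 unique users for one src_ip, but a cap of 10:
rows = [{"src_ip": "10.0.0.1", "user": f"u{i}"} for i in range(12)]
result = values_agg(rows, "src_ip", "user", max_size=10)
# Only 10 of the 12 users survive; no error is reported.
```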

Sizing guidance:

| Deployment | Recommended | Rationale |
| --- | --- | --- |
| Small (< 50 GB/day) | 10,000 (default) | Sufficient for most values() use cases |
| Medium (50–200 GB/day) | 25,000 | Higher-cardinality environments (more unique users, IPs) |
| Enterprise (> 200 GB/day) | 50,000–100,000 | Large-scale hunting where completeness matters |

For the transaction command specifically, the array limit for _raw_events is capped at the command's maxevents parameter (default: 1,000) regardless of this setting.

Max Mvexpand Rows

| Setting | Default | Range |
| --- | --- | --- |
| `max_mvexpand_rows` | 100,000 | 1,000 – 10,000,000 |

The default row limit applied to mvexpand when the query doesn't specify an explicit limit=N. This is important because arrayJoin() expansion happens before any SQL LIMIT — without a cap, a single field with large arrays can multiply a modest result set into billions of rows.

Example:

```
* | mvexpand dns_answers              # applies LIMIT 100000
* | mvexpand dns_answers limit=500    # uses explicit limit, ignores this setting
```
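
The fan-out arithmetic is worth making concrete. A back-of-envelope sketch (the input sizes below are hypothetical, chosen only to show the multiplication):

```python
# arrayJoin multiplies each input row by its array length before any
# LIMIT applies, so the expanded row count is roughly
# input_rows * avg_array_len.
def mvexpand_fanout(input_rows: int, avg_array_len: int) -> int:
    return input_rows * avg_array_len

# A modest 50,000-row result whose dns_answers arrays average
# 200 elements would expand to 10,000,000 rows without a cap.
expanded = mvexpand_fanout(50_000, 200)

# The default max_mvexpand_rows bounds what actually materializes.
capped = min(expanded, 100_000)
```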

Sizing guidance:

| Deployment | Recommended | Rationale |
| --- | --- | --- |
| Small | 100,000 (default) | Safe for DNS answer expansion, multi-value fields |
| Medium | 250,000 | Larger datasets with more multi-value fields |
| Enterprise | 500,000–1,000,000 | High-volume environments with complex array fields |

Max Post-Processing Groups

| Setting | Default | Range |
| --- | --- | --- |
| `max_post_processing_groups` | 1,000,000 | 10,000 – 10,000,000 |

Caps the number of groups in Rust-side post-processing for stats, top, and rare commands. This only applies to post-lateral commands that can't run in ClickHouse SQL (e.g., stats after a lookup enrichment or after prevalence filtering). When the cap is reached, additional groups are silently dropped and a warning is logged server-side.

Most stats/top/rare queries execute directly in ClickHouse and are not affected by this setting — ClickHouse handles GROUP BY cardinality with external aggregation (spill-to-disk).
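
The drop-new-groups behavior can be sketched in Python (an illustrative model only — the real Rust implementation differs in detail, and `capped_count_by` is a hypothetical name):

```python
# Sketch of capped group aggregation: once max_groups distinct keys
# exist, rows that would create a NEW group are dropped silently,
# while rows for existing groups still accumulate.
def capped_count_by(rows, key, max_groups=1_000_000):
    counts = {}
    dropped = 0
    for row in rows:
        k = row[key]
        if k not in counts and len(counts) >= max_groups:
            dropped += 1  # in the product, this surfaces as a server-side warning
            continue
        counts[k] = counts.get(k, 0) + 1
    return counts, dropped

# 5 distinct users cycling through 100 rows, but a cap of 3 groups:
rows = [{"user": f"u{i % 5}"} for i in range(100)]
counts, dropped = capped_count_by(rows, "user", max_groups=3)
```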

Sizing guidance:

| Deployment | Recommended | Rationale |
| --- | --- | --- |
| All sizes | 1,000,000 (default) | Matches the default SQL result limit; rarely needs adjustment |

Increase only if you see truncation warnings in server logs for legitimate post-enrichment aggregations.

Max Streaming Cache Rows

| Setting | Default | Range |
| --- | --- | --- |
| `max_streaming_cache_rows` | 50,000 | 1,000 – 1,000,000 |

Controls how many rows are buffered for the streaming result cache. When a search streams results via SSE, rows are simultaneously collected in memory for caching (so switching tabs and returning doesn't re-execute the query). If the result set exceeds this limit, caching is skipped — the query still streams correctly to the client, but won't be available for instant re-display.
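
A minimal sketch of this cache-or-skip behavior, assuming the semantics described above (rows always reach the client; the cache buffer is abandoned once it would exceed the limit):

```python
# Rows stream to the client regardless of the cache limit. The cache
# buffer is discarded the moment the result set exceeds
# max_cache_rows, so oversized results stream fine but are not
# available for instant re-display.
def stream_with_cache(rows, max_cache_rows=50_000):
    cache = []
    caching = True
    streamed = 0
    for row in rows:
        streamed += 1          # the client always receives the row
        if caching:
            if len(cache) < max_cache_rows:
                cache.append(row)
            else:
                cache = []     # over the limit: skip caching entirely
                caching = False
    return streamed, (cache if caching else None)

# 60,000 rows against the default 50,000 cap: everything streams,
# nothing is cached.
streamed, cached = stream_with_cache(range(60_000), max_cache_rows=50_000)
```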

Sizing guidance:

| Deployment | Recommended | Rationale |
| --- | --- | --- |
| Small (limited RAM) | 25,000 | Reduce memory footprint per concurrent query |
| Medium | 50,000 (default) | Good balance between cache hit rate and memory |
| Enterprise (high RAM) | 100,000–200,000 | More results cached for tab-switching workflows |

Block on Cost Analysis Errors

| Setting | Default |
| --- | --- |
| `block_on_cost_errors` | Off |

When enabled, queries that trigger Error-severity cost analysis warnings are rejected before execution. The analyst receives a clear error message with the specific issue and a suggestion for how to fix the query.

Error-severity warnings include:

| Warning Code | Trigger |
| --- | --- |
| `UNBOUNDED_DEDUP` | dedup without prior head or aggregation |
| `UNBOUNDED_TRANSACTION` | transaction without both maxspan and maxevents |
| `UNBOUNDED_EVENTSTATS` | eventstats without prior limit or aggregation |
| `UNBOUNDED_STREAMSTATS` | streamstats without prior limit or aggregation |

When to enable: Recommended for shared environments where untrained analysts might run expensive queries. In single-analyst or dev environments, leaving this off allows more flexibility.

When disabled (default), these warnings are still shown in the search results UI as advisory messages — analysts can see the warning and choose to refine their query.
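
The gating rule reduces to a simple check. A sketch, assuming warnings arrive as code/severity pairs (the dictionary shape here is illustrative, not the product's internal representation):

```python
# Only Error-severity warnings can block, and only when the
# block_on_cost_errors setting is enabled; Info and Warning
# severities are always advisory.
def should_block(warnings, block_on_cost_errors):
    if not block_on_cost_errors:
        return False
    return any(w["severity"] == "Error" for w in warnings)

warnings = [
    {"code": "UNBOUNDED_SORT", "severity": "Warning"},
    {"code": "UNBOUNDED_DEDUP", "severity": "Error"},
]
blocked_when_on = should_block(warnings, block_on_cost_errors=True)
blocked_when_off = should_block(warnings, block_on_cost_errors=False)
```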

Cost Analysis Warnings

Beyond the configurable limits above, nano's query cost analyzer generates warnings for patterns that may be expensive. These are always shown in the search UI regardless of the block_on_cost_errors setting:

| Code | Severity | Description |
| --- | --- | --- |
| `WILDCARD_SEARCH` | Info | Query uses * without field filters |
| `REGEX_FILTER` | Info | Query uses regex matching (slower than exact/prefix) |
| `SEQUENCE_FUNNEL_NO_FILTER` | Info | sequence/funnel without prior filters |
| `LARGE_LIMIT` | Info | head/tail requesting > 50,000 results |
| `UNBOUNDED_SORT` | Warning | sort without prior limit |
| `UNBOUNDED_MVEXPAND` | Warning | mvexpand using server default limit |
| `TINY_TIMECHART_SPAN` | Warning | timechart with span < 10 seconds |
| `LARGE_BIN_HOP_FANOUT` | Warning | bin hop creating > 100 overlapping windows per event |
| `STATS_VALUES_LIST_UNBOUNDED` | Warning | values()/list() without prior head |
| `UNBOUNDED_DEDUP` | Error | dedup without prior limit or aggregation |
| `UNBOUNDED_TRANSACTION` | Error | transaction without maxspan and maxevents |
| `UNBOUNDED_EVENTSTATS` | Error | eventstats without prior limit or aggregation |
| `UNBOUNDED_STREAMSTATS` | Error | streamstats without prior limit or aggregation |

API

Query safety limits can be managed programmatically:

```bash
# Get current limits
curl -H "Authorization: Bearer $TOKEN" \
  https://your-instance/api/settings/search/query-limits

# Update limits
curl -X PUT -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "max_group_array_size": 25000,
    "max_mvexpand_rows": 250000,
    "max_post_processing_groups": 1000000,
    "max_streaming_cache_rows": 100000,
    "block_on_cost_errors": false
  }' \
  https://your-instance/api/settings/search/query-limits
```

Requires the settings:system permission.
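
Because out-of-range values are involved, a client-side sanity check before the PUT can save a round trip. The helper below is hypothetical (not part of the product API); the ranges are the documented ones from this page:

```python
# Hypothetical pre-flight validation of a limits payload against the
# documented ranges, run before issuing the PUT request.
RANGES = {
    "max_group_array_size": (100, 1_000_000),
    "max_mvexpand_rows": (1_000, 10_000_000),
    "max_post_processing_groups": (10_000, 10_000_000),
    "max_streaming_cache_rows": (1_000, 1_000_000),
}

def validate_limits(payload: dict) -> list:
    """Return a list of human-readable errors; empty means the payload is in range."""
    errors = []
    for key, (lo, hi) in RANGES.items():
        if key in payload and not lo <= payload[key] <= hi:
            errors.append(f"{key}={payload[key]} outside [{lo}, {hi}]")
    return errors

# 25,000 is valid for max_group_array_size; 50 is below the
# minimum for max_mvexpand_rows.
errors = validate_limits({"max_group_array_size": 25_000, "max_mvexpand_rows": 50})
```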
