prevalence

Filter or enrich events based on artifact prevalence. Identify rare or suspicious indicators.

Description

The prevalence command uses nano's prevalence tracking to filter or enrich events based on how common artifacts are across your environment. This is useful for detecting rare file hashes, newly seen domains, or other unusual artifacts.

Prevalence data tracks how many hosts have observed each artifact and when it was first seen.

Syntax

Filter mode:

... | prevalence <field> <operator> <value> [AND <field> <operator> <value> ...] [window=<duration>]

Enrich mode:

... | prevalence enrich=true [window=<duration>]

Filter Fields

Used with filter mode (prevalence <field> <operator> <value>):

Field              Description
-----------------  ---------------------------------------------
hash_prevalence    Number of hosts that have seen this file hash
domain_prevalence  Number of hosts that have seen this domain
hash_first_seen    Timestamp when the hash was first observed
domain_first_seen  Timestamp when the domain was first observed

Enrichment Fields

When using enrich=true, the following fields are added to each event. These can be used in downstream commands like where, sort, stats, and table.

Field                  Type            Description
---------------------  --------------  -----------
host_count             number          Number of unique hosts that have observed this artifact
is_rare                boolean (0/1)   Whether the artifact is below the rarity threshold
prevalence_score       number (0-100)  Prevalence score: 0 = never seen, 100 = seen everywhere. See Prevalence Scoring below
prevalence_type        string          Artifact type: domain, hash, ip, or a comma-separated list if multiple
prevalence_artifact    string          The actual artifact value being tracked
prevalence_first_seen  timestamp       When the artifact was first observed in your environment
prevalence_last_seen   timestamp       When the artifact was most recently observed
first_seen             timestamp       Alias for prevalence_first_seen
last_seen              timestamp       Alias for prevalence_last_seen
total_occurrences      number          Total number of times this artifact has been seen
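
To make the relationship between these fields concrete, here is a minimal Python sketch of how per-artifact prevalence could be tracked. It is an illustration only, not nano's implementation: the ArtifactStats class, its observe and enrich methods, and the storage layout are all hypothetical. Only the output field names and the default threshold of 3 hosts come from this page.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class ArtifactStats:
    """Hypothetical per-artifact record; not nano's actual storage schema."""
    hosts: set = field(default_factory=set)
    first_seen: Optional[datetime] = None
    last_seen: Optional[datetime] = None
    total_occurrences: int = 0

    def observe(self, host: str, ts: datetime) -> None:
        """Record one sighting of the artifact on a host."""
        self.hosts.add(host)
        self.total_occurrences += 1
        if self.first_seen is None or ts < self.first_seen:
            self.first_seen = ts
        if self.last_seen is None or ts > self.last_seen:
            self.last_seen = ts

    def enrich(self, rarity_threshold: int = 3) -> dict:
        """Return a subset of the enrichment fields described above."""
        host_count = len(self.hosts)
        return {
            "host_count": host_count,
            # "below the rarity threshold" is read here as a strict comparison
            "is_rare": 1 if host_count < rarity_threshold else 0,
            "prevalence_first_seen": self.first_seen,
            "prevalence_last_seen": self.last_seen,
            "total_occurrences": self.total_occurrences,
        }


stats = ArtifactStats()
stats.observe("host-a", datetime(2026, 1, 1))
stats.observe("host-b", datetime(2026, 1, 3))
stats.observe("host-a", datetime(2026, 1, 2))
fields = stats.enrich()
```

In this sketch, repeated sightings on the same host raise total_occurrences but not host_count, which is the distinction the two fields are meant to capture.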

Optional Arguments

window
Syntax: window=<duration>
Description: Time window for prevalence calculation
Default: 30d

enrich
Syntax: enrich=true
Description: Add prevalence fields without filtering
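
To illustrate what the window argument changes, the following Python sketch counts the distinct hosts that observed an artifact within a look-back window. It assumes the window is measured backwards from the time of the query, which this page does not state explicitly. The function name and the (host, timestamp) input shape are hypothetical.

```python
from datetime import datetime, timedelta


def host_count_in_window(observations, now, window=timedelta(days=30)):
    """Count distinct hosts with at least one observation inside the window.

    observations: iterable of (host, timestamp) pairs (hypothetical input shape).
    window: look-back duration; 30 days mirrors the documented default of 30d.
    """
    cutoff = now - window
    return len({host for host, ts in observations if ts >= cutoff})


now = datetime(2026, 1, 31)
observations = [
    ("host-a", datetime(2026, 1, 30)),
    ("host-b", datetime(2026, 1, 20)),
    ("host-c", datetime(2026, 1, 5)),
]
# All three hosts fall inside the default 30-day window.
count_30d = host_count_in_window(observations, now)
# Only host-a falls inside a 7-day window.
count_7d = host_count_in_window(observations, now, timedelta(days=7))
```

Under that assumption, shortening the window can only keep or lower the host count, so the same artifact may look rarer with window=7d than with window=30d.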

Examples

Rare file hashes

file_hash=*
| prevalence hash_prevalence < 5

Newly seen domains

* | prevalence domain_first_seen > now() - INTERVAL 24 HOUR

Combined conditions

* | prevalence hash_prevalence < 3 AND hash_first_seen > now() - INTERVAL 7 DAY window=30d

Enrich with prevalence data

* | prevalence enrich=true
  | table file_hash, host_count, prevalence_first_seen

Rare process execution

process_name=*
| prevalence hash_prevalence <= 10 window=7d
| table timestamp, process_name, file_hash, hash_prevalence, src_host

New domain connections

dest_domain=*
| prevalence domain_first_seen > now() - INTERVAL 1 DAY
| stats count() by dest_domain, domain_first_seen

Suspicious downloads

action=file_download
| prevalence hash_prevalence < 5 AND domain_prevalence < 10

Rare and new

* | prevalence hash_prevalence < 3 AND hash_first_seen > now() - INTERVAL 3 DAY
  | where bytes > 1000000

Enrich and filter by host count

* | prevalence enrich=true
  | where host_count < 5

Find rare artifacts

sourcetype=squid_proxy
| prevalence enrich=true
| where is_rare=1
| table timestamp, user, prevalence_artifact, prevalence_type, host_count, prevalence_score

Recently seen artifacts sorted by rarity

* | prevalence enrich=true
  | where prevalence_first_seen > "2026-01-01"
  | sort prevalence_score

Aggregate by artifact

sourcetype=squid_proxy
| prevalence enrich=true
| where host_count < 3
| stats count by prevalence_artifact, prevalence_type, host_count, prevalence_first_seen

Prevalence Scoring

The prevalence_score field is a 0-100 score that reflects how common an artifact is across your environment. It's calculated relative to your rarity threshold setting (default: 3 hosts).

Score   Band        Condition                                Meaning
------  ----------  ---------------------------------------  ----------------------------------------
0       Never seen  host_count = 0                           Artifact has never been observed
1-20    Very rare   host_count < threshold                   Below your rarity threshold
21-50   Rare        threshold ≤ host_count < threshold×2     Around the rarity boundary
51-80   Uncommon    threshold×2 ≤ host_count < threshold×10  Seen on several hosts but not widespread
81-100  Common      host_count ≥ threshold×10                Widespread across your environment

Within each band, the score scales linearly. For example, with a rarity threshold of 5:

host_count  prevalence_score  Band
----------  ----------------  ------------------------
0           0                 Never seen
2           8                 Very rare
5           20                Very rare (at threshold)
8           38                Rare
30          69                Uncommon
100+        ~96-100           Common
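
The band boundaries can be sketched in Python as follows. This follows the Condition column of the band table literally and returns only the band name; the exact interpolation of the score within a band is not specified on this page, so it is left out. Note that the worked example labels host_count equal to the threshold as Very rare, whereas the Condition column places it in Rare, so treat the behavior exactly at the threshold as uncertain. The function name prevalence_band is hypothetical.

```python
def prevalence_band(host_count: int, threshold: int = 3) -> str:
    """Map a host count to a band name using the documented conditions.

    threshold: the rarity threshold (documented default: 3 hosts).
    """
    if host_count <= 0:
        return "Never seen"
    if host_count < threshold:
        return "Very rare"
    if host_count < threshold * 2:
        return "Rare"
    if host_count < threshold * 10:
        return "Uncommon"
    return "Common"


# Band names for some of the host counts in the worked example (threshold of 5).
bands = {n: prevalence_band(n, threshold=5) for n in (0, 2, 8, 30, 100)}
```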

Configuring the rarity threshold

The rarity threshold controls where the scoring bands start. A higher threshold means more artifacts are classified as rare. You can configure it in Settings > Prevalence or via API:

curl -X PUT https://your-instance/api/settings/prevalence \
  -H "Authorization: Bearer $API_KEY" \
  -d '{"rarity_threshold": 5}'

See Prevalence Settings for all configuration options.
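
For scripted configuration, the same PUT request can be built with Python's standard library. This is a sketch under assumptions: the helper name build_threshold_request is hypothetical, and the Content-Type header is an addition that does not appear in the curl command above. The request is constructed but not sent, so the example has no side effects.

```python
import json
import urllib.request


def build_threshold_request(base_url: str, api_key: str, rarity_threshold: int) -> urllib.request.Request:
    """Build (but do not send) the PUT request that sets the rarity threshold."""
    body = json.dumps({"rarity_threshold": rarity_threshold}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/settings/prevalence",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the API accepts JSON bodies declared with this header.
            "Content-Type": "application/json",
        },
    )


req = build_threshold_request("https://your-instance", "example-key", 5)
# To send it: urllib.request.urlopen(req)
```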

Usage Notes

Automatic calculation: Prevalence is calculated automatically from your log data.

Window parameter: Adjusts the time range for prevalence calculation. Shorter windows are more sensitive: fewer observations are counted, so more artifacts appear rare.

Performance: Prevalence queries are optimized but may be slower on very large datasets.

Enrichment mode: Use enrich=true to add prevalence fields without filtering.

Multiple conditions: All conditions must be satisfied (AND logic).
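
The AND semantics of filter mode can be sketched in Python as below. This only models the logic, not how nano executes the query. The matches_all function and the dictionary input are hypothetical, and the ">=" operator is included as an assumption since it does not appear in the examples on this page.

```python
import operator

# Operators used in this page's examples, plus ">=" as an assumption.
OPERATORS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}


def matches_all(prevalence, conditions):
    """Return True only if every (field, operator, value) condition holds."""
    return all(OPERATORS[op](prevalence[name], value) for name, op, value in conditions)


# Mirrors: prevalence hash_prevalence < 5 AND domain_prevalence < 10
conditions = [("hash_prevalence", "<", 5), ("domain_prevalence", "<", 10)]
rare_download = {"hash_prevalence": 2, "domain_prevalence": 4}
common_domain = {"hash_prevalence": 2, "domain_prevalence": 15}

keep_rare_download = matches_all(rare_download, conditions)
keep_common_domain = matches_all(common_domain, conditions)
```

An event with a rare hash but a widely seen domain is dropped here, because one failing condition is enough to exclude it.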

Known Limitations

When using enrich=true, commands after prevalence are applied in post-processing.

Commands that work after prevalence enrich

Command    Status     Example
---------  ---------  ----------------------------------------------------
where      Supported  | where host_count < 5
table      Supported  | table prevalence_artifact, host_count, prevalence_score
fields     Supported  | fields + prevalence_*
head/tail  Supported  | head 100
sort       Supported  | sort prevalence_score
stats      Supported  | stats avg(host_count) by src_host
top/rare   Supported  | top prevalence_artifact
eval       Supported  | eval rarity = if(host_count < 3, "rare", "common")
rex        Supported  | rex field=prevalence_artifact "(?<prefix>.{8})"
dedup      Supported  | dedup prevalence_artifact
rename     Supported  | rename host_count as rarity
fillnull   Supported  | fillnull value=0 host_count
timechart  Supported  | timechart avg(host_count)

Commands that don't work after prevalence enrich

Command      Reason
-----------  --------------------------------------
inputlookup  Requires separate enrichment pipeline
lookup       Requires separate enrichment pipeline
streamstats  Not yet implemented in post-processing

Note: Filter mode (prevalence hash_prevalence < 5) uses optimized SQL JOINs and doesn't have these limitations.

Related Commands

  • where - Additional filtering after prevalence
  • stats - Aggregate prevalence results
  • lookup - Enrich with other data sources