
Disk Pressure & Automatic Eviction

Automatic partition eviction when ClickHouse local disk usage exceeds configurable watermarks

nano continuously monitors ClickHouse disk usage and automatically drops the oldest daily partitions when local storage fills up. This prevents disk-full outages without manual intervention.

Disk pressure monitoring is only active in dual database mode (ClickHouse enabled). It runs on the elected leader node in multi-pod deployments.

How It Works

The disk pressure service runs a check cycle on a configurable interval (default: every 60 seconds):

  1. Queries ClickHouse system.disks for current disk usage
  2. Classifies usage into a pressure level based on watermark thresholds
  3. Takes action based on the level — dropping partitions, emitting notifications, or pausing ingestion
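
Step 1 reports raw byte counts; the arithmetic that turns them into the usage fraction the classifier consumes can be sketched as below (the byte values in the example call are illustrative, not taken from a real deployment):

```shell
# Convert the total/free byte counts reported by system.disks into a
# usage fraction. The byte values in the example call are illustrative.
usage_fraction() {
  awk -v total="$1" -v free="$2" 'BEGIN { printf "%.2f\n", 1 - free / total }'
}

usage_fraction 1000000000000 380000000000   # prints 0.62
```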

Pressure Levels

| Level | Trigger | Behavior |
| --- | --- | --- |
| Normal | Below high watermark | No action. Resolves any active health issue and resumes ingestion. |
| Elevated | Above high watermark (60%) | Drops oldest partitions until usage falls below the low watermark. |
| Critical | Above critical threshold (85%) | Same as elevated, plus emits a disk pressure warning notification. |
| Emergency | Above emergency threshold (90%) | Same as critical, plus optionally pauses log ingestion. |
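
The level boundaries above can be expressed as a small classifier. This is an illustrative sketch using the documented default thresholds, not the service's actual code:

```shell
# Map a usage fraction to a pressure level using the default watermarks
# from the table above (0.60 high, 0.85 critical, 0.90 emergency).
classify_pressure() {
  awk -v u="$1" 'BEGIN {
    if      (u >= 0.90) print "emergency"
    else if (u >= 0.85) print "critical"
    else if (u >= 0.60) print "elevated"
    else                print "normal"
  }'
}

classify_pressure 0.45   # prints normal
classify_pressure 0.72   # prints elevated
classify_pressure 0.91   # prints emergency
```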

Partition Drop Strategy

When pressure is elevated or higher, the service drops daily partitions using a FIFO (oldest-first) strategy:

  • Tables affected: logs, signals, ingestion_errors, identity_observations, nat_candidates
  • Safety limit: Maximum 5 partitions dropped per check cycle
  • Cool-down: 2-second pause between drops to let ClickHouse settle
  • Target: Drops continue until usage falls below the low watermark (default 50%)
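
The loop's stopping rules (low-watermark target plus the five-drop safety limit) can be simulated in isolation. Note the space freed per drop is a made-up constant here; the real service re-measures disk usage between drops:

```shell
# Simulate one check cycle: print the partition dates (oldest first) that
# would be dropped, stopping below the low watermark or at the safety limit.
LOW_WATERMARK=0.50
MAX_DROPS_PER_CYCLE=5

plan_drops() {
  usage="$1"; freed_per_drop="$2"; shift 2
  drops=0
  for day in "$@"; do
    # Stop once usage is back under the target or the per-cycle cap is hit.
    [ "$(awk -v u="$usage" -v w="$LOW_WATERMARK" 'BEGIN { print (u < w) }')" = "1" ] && break
    [ "$drops" -ge "$MAX_DROPS_PER_CYCLE" ] && break
    echo "$day"
    usage=$(awk -v u="$usage" -v f="$freed_per_drop" 'BEGIN { print u - f }')
    drops=$((drops + 1))
  done
}

# At 62% usage and ~5% freed per partition, three drops reach the target.
plan_drops 0.62 0.05 2024-01-01 2024-01-02 2024-01-03 2024-01-04
```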

Dropped partitions cannot be recovered. If you need long-term retention, configure Storage Tiering to move data to S3 before it ages out.

Configuration

All settings are configured via environment variables. The defaults are designed for production use — most deployments won't need to change them.

| Variable | Default | Description |
| --- | --- | --- |
| DISK_PRESSURE_CHECK_INTERVAL_SECS | 60 | Seconds between disk usage checks |
| DISK_PRESSURE_HIGH_WATERMARK | 0.60 | Fraction of disk usage that triggers partition eviction |
| DISK_PRESSURE_LOW_WATERMARK | 0.50 | Target usage fraction — eviction stops when usage drops below this |
| DISK_PRESSURE_CRITICAL_THRESHOLD | 0.85 | Fraction that triggers critical-level warnings |
| DISK_PRESSURE_EMERGENCY_THRESHOLD | 0.90 | Fraction that triggers emergency-level warnings and optional ingestion pause |
| DISK_PRESSURE_PAUSE_INGESTION | false | Whether to pause log ingestion at emergency level |

Watermark values are fractions between 0.0 and 1.0 representing the share of total disk space used. For example, 0.60 means 60%.

Example: Conservative Settings

For environments with large disks and predictable growth:

export DISK_PRESSURE_HIGH_WATERMARK=0.75
export DISK_PRESSURE_LOW_WATERMARK=0.65
export DISK_PRESSURE_CRITICAL_THRESHOLD=0.90
export DISK_PRESSURE_EMERGENCY_THRESHOLD=0.95

Example: Aggressive Settings

For small-disk environments where space is tight:

export DISK_PRESSURE_HIGH_WATERMARK=0.50
export DISK_PRESSURE_LOW_WATERMARK=0.40
export DISK_PRESSURE_CRITICAL_THRESHOLD=0.75
export DISK_PRESSURE_EMERGENCY_THRESHOLD=0.85
export DISK_PRESSURE_PAUSE_INGESTION=true

Automatic Skip: Storage Tiering and ClickHouse Cloud

Disk pressure eviction is automatically skipped when storage tiering to S3-compatible storage is active. When the system detects that tiering is enabled and active, partition drops are bypassed entirely because:

  • ClickHouse TTL rules handle data movement — cold partitions are automatically moved to S3/R2 object storage by ClickHouse itself, freeing local disk space without dropping data
  • Data is preserved — customers using tiering are paying for offsite storage specifically to retain historical data, so dropping partitions would defeat the purpose
  • Local disk pressure is self-correcting — as TTL rules move aging data off the local disk, space is freed naturally

When tiering is active, the disk pressure service logs:

Storage tiering is active — skipping partition drops.
ClickHouse TTL rules will move data to S3/R2 automatically.

The same applies to ClickHouse Cloud deployments where storage is managed by the cloud provider. Since ClickHouse Cloud uses shared object storage under the hood, local disk pressure is not a concern and the eviction system has no partitions to drop.

To configure storage tiering, see Storage Tiering.

Notifications

Disk pressure events generate notifications visible to all admin users in the nano UI.

Disk Pressure Warning

Sent once per pressure episode when critical or emergency level is reached:

  • Title: Disk pressure {severity}: ClickHouse at {percentage}%
  • Link: Redirects to Settings > Retention for admin action

Partition Dropped

Sent after each partition is dropped:

  • Title: Partition {date} dropped due to disk pressure
  • Details: Lists the 5 daily tables affected

Notifications are deduplicated — you won't receive repeated warnings for the same active pressure episode. When usage returns to normal, the health issue is automatically resolved.

Monitoring via API

The disk pressure status is included in the storage overview endpoint:

curl -X GET "http://localhost:3000/api/settings/storage/overview" \
  -H "Authorization: Bearer <token>"

The response includes a disk_pressure object with:

| Field | Description |
| --- | --- |
| usage_fraction | Current disk usage (0.0–1.0) |
| total_bytes / used_bytes / free_bytes | Absolute disk metrics |
| level | Current pressure level (normal, elevated, critical, emergency) |
| estimated_retention_days | Projected days of capacity remaining |
| partitions_dropped | Counter of partitions dropped since service start |
| ingestion_paused | Whether ingestion is currently paused |
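
For quick checks, the pressure level can be pulled straight out of the JSON. The sample below is an invented response (field names follow the table above; the values are illustrative), so in practice pipe the real curl output instead:

```shell
# Invented sample of the disk_pressure object; field names match the docs,
# values are illustrative only.
sample='{"disk_pressure":{"usage_fraction":0.62,"level":"elevated","partitions_dropped":3,"ingestion_paused":false}}'

# Extract the current pressure level without jq.
echo "$sample" | grep -o '"level":"[a-z]*"'
```

With jq installed, `jq .disk_pressure.level` on the curl output is the tidier equivalent.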

Audit Trail

All disk pressure actions are logged to the audit table:

  • partition_dropped — Emitted when a partition is dropped, with partition date and affected tables
  • disk_pressure_critical — Emitted when critical or emergency level is reached

These audit events are searchable via the standard search interface, providing a complete history of automatic storage management actions.
