# Disk Pressure & Automatic Eviction

Automatic partition eviction when ClickHouse local disk usage exceeds configurable watermarks.
nano continuously monitors ClickHouse disk usage and automatically drops the oldest daily partitions when local storage fills up. This prevents disk-full outages without manual intervention.
Disk pressure monitoring is only active in dual database mode (ClickHouse enabled). It runs on the elected leader node in multi-pod deployments.
## How It Works
The disk pressure service runs a check cycle on a configurable interval (default: every 60 seconds):
- Queries ClickHouse `system.disks` for current disk usage
- Classifies usage into a pressure level based on watermark thresholds
- Takes action based on the level — dropping partitions, emitting notifications, or pausing ingestion
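The cycle above can be sketched as a simple loop over pluggable hooks. This is an illustration only, not the service's actual API: `query_usage`, `classify`, and `act` are hypothetical stand-ins for the three steps.

```python
import time

def run_disk_pressure_loop(query_usage, classify, act, interval_secs=60, cycles=None):
    """Hypothetical outer loop: every interval, read disk usage,
    classify it into a pressure level, and take the matching action."""
    n = 0
    while cycles is None or n < cycles:
        usage = query_usage()    # e.g. a SELECT against system.disks
        level = classify(usage)  # "normal" | "elevated" | "critical" | "emergency"
        act(level, usage)        # drop partitions / notify / pause ingestion
        n += 1
        if cycles is None or n < cycles:
            time.sleep(interval_secs)
```

The `cycles` parameter exists only so the sketch can be run for a bounded number of iterations; the real service runs indefinitely.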
### Pressure Levels
| Level | Trigger | Behavior |
|---|---|---|
| Normal | Below high watermark | No action. Resolves any active health issue and resumes ingestion. |
| Elevated | Above high watermark (60%) | Drops oldest partitions until usage falls below the low watermark. |
| Critical | Above critical threshold (85%) | Same as elevated, plus emits a disk pressure warning notification. |
| Emergency | Above emergency threshold (90%) | Same as critical, plus optionally pauses log ingestion. |
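The mapping from usage fraction to level can be expressed as a small threshold function. The defaults below mirror the documented watermarks; the function name and signature are assumptions for illustration.

```python
def classify_pressure(usage, high=0.60, critical=0.85, emergency=0.90):
    """Map a disk usage fraction (0.0-1.0) to a pressure level,
    checking the highest threshold first."""
    if usage >= emergency:
        return "emergency"
    if usage >= critical:
        return "critical"
    if usage >= high:
        return "elevated"
    return "normal"
```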
### Partition Drop Strategy
When pressure is elevated or higher, the service drops daily partitions using a FIFO (oldest-first) strategy:
- Tables affected: `logs`, `signals`, `ingestion_errors`, `identity_observations`, `nat_candidates`
- Safety limit: Maximum 5 partitions dropped per check cycle
- Cool-down: 2-second pause between drops to let ClickHouse settle
- Target: Drops continue until usage falls below the low watermark (default 50%)
Dropped partitions cannot be recovered. If you need long-term retention, configure Storage Tiering to move data to S3 before it ages out.
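The oldest-first eviction loop can be sketched as follows. The hooks `list_partitions`, `drop_partition`, and `current_usage` are hypothetical; daily partitions are assumed to sort chronologically by name (e.g. `2024-01-01`).

```python
def evict_partitions(list_partitions, drop_partition, current_usage,
                     low_watermark=0.50, max_drops=5):
    """FIFO eviction sketch: drop the oldest daily partitions until
    usage falls below the low watermark or the per-cycle safety limit
    of max_drops is reached."""
    dropped = []
    for partition in sorted(list_partitions()):  # oldest first
        if current_usage() < low_watermark or len(dropped) >= max_drops:
            break
        drop_partition(partition)  # e.g. ALTER TABLE ... DROP PARTITION
        dropped.append(partition)
        # the real service pauses ~2s here to let ClickHouse settle
    return dropped
```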
## Configuration
All settings are configured via environment variables. The defaults are designed for production use — most deployments won't need to change them.
| Variable | Default | Description |
|---|---|---|
| `DISK_PRESSURE_CHECK_INTERVAL_SECS` | 60 | Seconds between disk usage checks |
| `DISK_PRESSURE_HIGH_WATERMARK` | 0.60 | Fraction of disk usage that triggers partition eviction |
| `DISK_PRESSURE_LOW_WATERMARK` | 0.50 | Target usage fraction — eviction stops when usage drops below this |
| `DISK_PRESSURE_CRITICAL_THRESHOLD` | 0.85 | Fraction that triggers critical-level warnings |
| `DISK_PRESSURE_EMERGENCY_THRESHOLD` | 0.90 | Fraction that triggers emergency-level warnings and optional ingestion pause |
| `DISK_PRESSURE_PAUSE_INGESTION` | false | Whether to pause log ingestion at emergency level |
Watermark values are fractions from 0.0 to 1.0 representing the share of total disk space used. For example, 0.60 means 60%.
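Because the levels build on one another, the fractions only make sense when ordered low < high < critical < emergency. A small loader that reads the documented variables and checks that ordering might look like this (the function itself is hypothetical; the variable names and defaults come from the table above):

```python
import os

def load_watermarks(env=os.environ):
    """Read watermark fractions from the environment, falling back to
    the documented defaults, and sanity-check their ordering."""
    low = float(env.get("DISK_PRESSURE_LOW_WATERMARK", "0.50"))
    high = float(env.get("DISK_PRESSURE_HIGH_WATERMARK", "0.60"))
    critical = float(env.get("DISK_PRESSURE_CRITICAL_THRESHOLD", "0.85"))
    emergency = float(env.get("DISK_PRESSURE_EMERGENCY_THRESHOLD", "0.90"))
    if not (0.0 < low < high < critical < emergency <= 1.0):
        raise ValueError("watermarks must satisfy 0 < low < high < critical < emergency <= 1")
    return {"low": low, "high": high, "critical": critical, "emergency": emergency}
```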
### Example: Conservative Settings
For environments with large disks and predictable growth:
```sh
export DISK_PRESSURE_HIGH_WATERMARK=0.75
export DISK_PRESSURE_LOW_WATERMARK=0.65
export DISK_PRESSURE_CRITICAL_THRESHOLD=0.90
export DISK_PRESSURE_EMERGENCY_THRESHOLD=0.95
```

### Example: Aggressive Settings
For small-disk environments where space is tight:
```sh
export DISK_PRESSURE_HIGH_WATERMARK=0.50
export DISK_PRESSURE_LOW_WATERMARK=0.40
export DISK_PRESSURE_CRITICAL_THRESHOLD=0.75
export DISK_PRESSURE_EMERGENCY_THRESHOLD=0.85
export DISK_PRESSURE_PAUSE_INGESTION=true
```

## Automatic Skip: Storage Tiering and ClickHouse Cloud
Disk pressure eviction is automatically skipped when storage tiering to S3-compatible storage is active. When the system detects that tiering is enabled and in active status, partition drops are bypassed entirely because:
- ClickHouse TTL rules handle data movement — cold partitions are automatically moved to S3/R2 object storage by ClickHouse itself, freeing local disk space without dropping data
- Data is preserved — customers using tiering are paying for offsite storage specifically to retain historical data, so dropping partitions would defeat the purpose
- Local disk pressure is self-correcting — as TTL moves age data off local disk, space is freed naturally
When tiering is active, the disk pressure service logs:
```
Storage tiering is active — skipping partition drops.
ClickHouse TTL rules will move data to S3/R2 automatically.
```

The same applies to ClickHouse Cloud deployments where storage is managed by the cloud provider. Since ClickHouse Cloud uses shared object storage under the hood, local disk pressure is not a concern and the eviction system has no partitions to drop.
To configure storage tiering, see Storage Tiering.
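The skip decision described above reduces to a simple gate in front of the eviction step. This is a sketch, not the service's actual control flow; `tiering_active`, `clickhouse_cloud`, `evict`, and `log` are hypothetical hooks.

```python
def maybe_evict(tiering_active, clickhouse_cloud, evict, log):
    """Skip partition drops entirely when tiering (or cloud-managed
    storage) is handling data movement; otherwise run eviction."""
    if tiering_active or clickhouse_cloud:
        log("Storage tiering is active — skipping partition drops. "
            "ClickHouse TTL rules will move data to S3/R2 automatically.")
        return []
    return evict()
```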
## Notifications
Disk pressure events generate notifications visible to all admin users in the nano UI.
### Disk Pressure Warning
Sent once per pressure episode when critical or emergency level is reached:
- Title: `Disk pressure {severity}: ClickHouse at {percentage}%`
- Link: Redirects to Settings > Retention for admin action
### Partition Dropped
Sent after each partition is dropped:
- Title: `Partition {date} dropped due to disk pressure`
- Details: Lists the 5 daily tables affected
Notifications are deduplicated — you won't receive repeated warnings for the same active pressure episode. When usage returns to normal, the health issue is automatically resolved.
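One way to picture "once per pressure episode" deduplication: track whether a warning has already been sent, and reset that flag only when the level returns to normal. A minimal sketch (the state layout and function names are assumptions):

```python
def make_warning_gate():
    """Return a callback that records at most one warning per pressure
    episode; returning to normal ends the episode and re-arms it."""
    state = {"warned": False}
    sent = []
    def on_level(level):
        if level in ("critical", "emergency"):
            if not state["warned"]:
                sent.append(f"Disk pressure {level}")
                state["warned"] = True
        elif level == "normal":
            state["warned"] = False  # episode over; health issue resolved
        return sent
    return on_level
```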
## Monitoring via API
The disk pressure status is included in the storage overview endpoint:
```sh
curl -X GET "http://localhost:3000/api/settings/storage/overview" \
  -H "Authorization: Bearer <token>"
```

The response includes a `disk_pressure` object with:
| Field | Description |
|---|---|
| `usage_fraction` | Current disk usage (0.0–1.0) |
| `total_bytes` / `used_bytes` / `free_bytes` | Absolute disk metrics |
| `level` | Current pressure level (`normal`, `elevated`, `critical`, `emergency`) |
| `estimated_retention_days` | Projected days of capacity remaining |
| `partitions_dropped` | Counter of partitions dropped since service start |
| `ingestion_paused` | Whether ingestion is currently paused |
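A minimal polling client for this endpoint, using only Python's standard library, might look like the following. The endpoint path and field names are as documented above; error handling and the base URL/token are left to the caller.

```python
import json
import urllib.request

def fetch_disk_pressure(base_url, token):
    """Fetch the storage overview and return its disk_pressure object."""
    req = urllib.request.Request(
        f"{base_url}/api/settings/storage/overview",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        overview = json.load(resp)
    return overview["disk_pressure"]

def summarize(dp):
    """One-line summary of a disk_pressure object, e.g. for alerting."""
    return (f"{dp['level']}: {dp['usage_fraction']:.0%} used, "
            f"~{dp['estimated_retention_days']} days of capacity left")
```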
## Audit Trail
All disk pressure actions are logged to the audit table:
- `partition_dropped` — Emitted when a partition is dropped, with partition date and affected tables
- `disk_pressure_critical` — Emitted when critical or emergency level is reached
These audit events are searchable via the standard search interface, providing a complete history of automatic storage management actions.
## Related Documentation
- Storage & Retention Settings — TTL configuration, database modes, and storage tiering setup
- Deployment Architecture — Infrastructure planning and disk sizing