Storage & Retention Settings

nano provides flexible storage options for balancing performance, cost, and compliance requirements. This page covers all storage configuration options, including database modes, retention policies, and advanced storage tiering.

Storage Architecture Overview

nano supports two storage architectures:

PostgreSQL Only Mode (Legacy)

All data stored in PostgreSQL with TimescaleDB extension
Best for: Small to medium deployments (< 1TB logs)
Advantages: Simple setup, single database to manage
Limitations: Higher storage costs, slower queries on large datasets

Dual Database Mode (Recommended)

Log telemetry stored in ClickHouse for optimal query performance
Metadata stored in PostgreSQL (rules, alerts, dashboards, settings)
Best for: Medium to large deployments (> 1TB logs)
Advantages: Better performance, lower storage costs, advanced tiering options

Storage Tabs Overview

The tabs shown on the Storage & Retention settings page depend on your configuration:

ClickHouse Tab (Log Storage)

Available when dual database mode is enabled. Manages the primary log storage system.

Storage Statistics

Total Size: Compressed size of all log data, shown together with the compression ratio
Total Logs: Number of log entries stored
Partitions: Number of date-based partitions (typically one per day)
Parts: Number of data files (ClickHouse's storage units)
Date Range: Oldest and newest log timestamps

TTL Retention Configuration

Retention Period: How long to keep logs (1-3650 days)
Auto-deletion: ClickHouse automatically deletes expired data
Force TTL Now: Manually trigger immediate deletion of expired data

TTL Behavior:

Data is partitioned by day for efficient deletion
TTL is applied during background merge operations, so expired data can remain on disk until the next merge runs
"Force TTL Now" immediately processes all partitions
Deleted data cannot be recovered
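
The TTL behavior above maps onto standard ClickHouse statements. The following is a minimal sketch, not nano's actual implementation: it only prints the statements a retention change would correspond to. The table name logs is taken from the Troubleshooting section; the column name timestamp and the helper name ttl_statements are assumptions for illustration.

```shell
# Sketch only: print ClickHouse statements for a retention change.
# Assumptions: table "logs", a hypothetical Date/DateTime column "timestamp".
ttl_statements() {
  local table="$1" column="$2" days="$3"
  # Accept only plain positive integers in the documented 1-3650 range.
  case "$days" in
    ''|*[!0-9]*|0*) echo "retention must be 1-3650 days" >&2; return 1 ;;
  esac
  if [ "$days" -gt 3650 ]; then
    echo "retention must be 1-3650 days" >&2
    return 1
  fi
  # MODIFY TTL changes the rule that background merges enforce.
  printf 'ALTER TABLE %s MODIFY TTL %s + INTERVAL %s DAY;\n' "$table" "$column" "$days"
  # MATERIALIZE TTL asks ClickHouse to re-evaluate TTL on existing parts
  # instead of waiting for the next merge.
  printf 'ALTER TABLE %s MATERIALIZE TTL;\n' "$table"
}

# Example (not executed here): ttl_statements logs timestamp 365 | clickhouse-client --multiquery
```

Whether "Force TTL Now" issues MATERIALIZE TTL or a different statement is not stated on this page; treat the mapping as illustrative.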

PostgreSQL Tab (Metadata/Legacy)

In Dual Database Mode

Shows information about metadata storage:

Detection rules and configurations
Alert definitions and history
Dashboard configurations
User settings and preferences
Parser configurations
AI/ML model settings

In PostgreSQL Only Mode

Provides full storage management:

Storage Statistics:

Total Size: Size of all tables including indexes
Total Logs: Number of log entries
Chunks: TimescaleDB time-based chunks
Compressed Chunks: Number of compressed chunks
Date Range: Data time span

Retention Policy Configuration:

Enable/Disable: Toggle automatic retention
Retention Period: Days to keep data (1-3650)
Manual Execution: Force retention cleanup immediately

TimescaleDB Features:

Automatic chunk compression after 1 day
Background retention job execution
Efficient time-based data deletion
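
For readers who want to see what these features look like at the database level, here is a hedged sketch that prints the corresponding TimescaleDB policy calls. The hypertable name logs and the helper name timescale_policy_sql are assumptions; nano manages these policies itself, so this is for understanding rather than for manual use.

```shell
# Sketch only: print TimescaleDB policy statements for a log hypertable.
# Assumption: the hypertable is named "logs"; nano's real schema may differ.
timescale_policy_sql() {
  local table="$1" retention_days="$2"
  # Accept only plain positive integers in the documented 1-3650 range.
  case "$retention_days" in
    ''|*[!0-9]*|0*) echo "retention must be 1-3650 days" >&2; return 1 ;;
  esac
  if [ "$retention_days" -gt 3650 ]; then
    echo "retention must be 1-3650 days" >&2
    return 1
  fi
  # Compress chunks older than 1 day (compression must already be enabled
  # on the hypertable for this policy to be accepted).
  printf "SELECT add_compression_policy('%s', INTERVAL '1 day');\n" "$table"
  # Drop whole chunks once they are older than the retention period.
  printf "SELECT add_retention_policy('%s', INTERVAL '%s days');\n" "$table" "$retention_days"
}
```

Because retention drops whole chunks rather than deleting individual rows, cleanup stays cheap even on large tables.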

Storage Tiering Tab (Advanced)

Available only in dual database mode with ClickHouse. Provides cost-effective long-term storage.

Storage Tiering Deep Dive

Storage tiering automatically moves data between storage tiers based on age and access patterns.

Tier Architecture

Hot Tier (Local SSD)

Storage: Local NVMe/SSD storage
Performance: Fastest query response times
Use case: Recent data requiring frequent access
Configuration: 1-365 days (recommended: 7-60 days)

Auto-Move Behavior Options:

TTL Only: Move data based purely on age
90% Full: Also move data when disk reaches 90% capacity
80% Full: Also move data when disk reaches 80% capacity

Warm Tier (S3-backed)

Storage: S3-compatible object storage
Performance: Slower queries but still accessible via ClickHouse
Use case: Historical data for investigations and compliance
Configuration: Total retention period in days (data older than the hot tier threshold stays in the warm tier until it reaches this age)

Cold Tier (Archive)

Storage: S3 archive storage classes
Performance: Export-only, not queryable
Use case: Long-term compliance and backup
Configuration: Optional, for data older than warm tier
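
As a rough illustration of the age-based part of this architecture, the sketch below classifies data by age. It assumes that warm_days is the total retention period (as described in the configuration steps) and that the boundaries are exclusive at the upper end; the function name tier_for_age and the label past-warm are hypothetical.

```shell
# Sketch only: which tier would hold data of a given age (in days)?
# Assumptions: hot tier covers ages below hot_days; warm tier covers ages
# from hot_days up to (but not including) warm_days, the total retention.
tier_for_age() {
  local age_days="$1" hot_days="$2" warm_days="$3"
  if [ "$age_days" -lt "$hot_days" ]; then
    echo hot
  elif [ "$age_days" -lt "$warm_days" ]; then
    echo warm
  else
    # Beyond total retention: archived to the cold tier if one is
    # configured, otherwise eligible for deletion.
    echo past-warm
  fi
}

tier_for_age 10 30 365    # prints "hot"
tier_for_age 120 30 365   # prints "warm"
tier_for_age 400 30 365   # prints "past-warm"
```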

S3 Configuration Options

Basic S3 Settings

S3 Bucket: Target bucket for tiered data
Region: AWS region or equivalent for other providers
Custom Endpoint: For MinIO, Cloudflare R2, Backblaze B2
Path-style Access: Required for MinIO and some S3-compatible services

Supported S3 Providers

AWS S3: Native support, all storage classes
MinIO: Self-hosted S3-compatible storage
Cloudflare R2: Cost-effective alternative to S3
Backblaze B2: Budget-friendly cloud storage
Google Cloud Storage: S3-compatible API
Azure Blob Storage: Through an S3-compatible gateway (Azure Blob Storage does not expose a native S3 API)

Credentials Management

Access Key ID: S3 access credentials
Secret Access Key: S3 secret key, stored encrypted at rest
Connection Testing: Verify connectivity before applying
Credential Status: Visual indicator of configuration state

Tiering Configuration Process

1. Enable Storage Tiering

Toggle the main switch to enable S3 storage tiering functionality.

2. Configure S3 Settings

Set bucket name and region
Configure custom endpoint if using non-AWS provider
Enable path-style access for MinIO and other S3-compatible services that require it

3. Set Credentials

Enter S3 access key and secret key
Test connection to verify configuration
Credentials are encrypted and stored securely

4. Configure Tier Thresholds

Hot Tier: Set days for local storage (7, 14, 30, 60 days)
Auto-Move: Choose disk utilization trigger
Warm Tier: Set total retention period (90, 180, 365, 730 days)
Cold Tier: Optional archive tier for compliance

5. Apply Configuration

Save configuration to database
Apply to ClickHouse to activate tiering
Monitor status and storage distribution
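
The same steps can be scripted against the REST API described in the API Configuration section. The sketch below only builds and sanity-checks the JSON body; the field names come from that section, while the helper name tiering_payload and the rule that warm_days must exceed hot_days are assumptions based on warm_days being the total retention period.

```shell
# Sketch only: build the JSON body for PUT /api/settings/tiering.
# Values are inserted without JSON escaping, which is acceptable for
# typical bucket and region names (lowercase letters, digits, hyphens, dots).
tiering_payload() {
  local bucket="$1" region="$2" hot_days="$3" warm_days="$4"
  # Hot tier range documented above: 1-365 days.
  if [ "$hot_days" -lt 1 ] || [ "$hot_days" -gt 365 ]; then
    echo "hot_days must be 1-365" >&2
    return 1
  fi
  # Assumption: total retention must be longer than the hot period.
  if [ "$warm_days" -le "$hot_days" ]; then
    echo "warm_days must exceed hot_days" >&2
    return 1
  fi
  printf '{"enabled": true, "s3_bucket": "%s", "s3_region": "%s", "hot_days": %s, "warm_days": %s}\n' \
    "$bucket" "$region" "$hot_days" "$warm_days"
}

# Example (not executed here):
#   curl -X PUT "http://localhost:8080/api/settings/tiering" \
#     -H "Content-Type: application/json" \
#     -d "$(tiering_payload my-siem-logs us-east-1 30 365)"
```

Validating locally before the PUT request catches obvious mistakes early; the server remains the authority on which values it accepts.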

Storage Distribution Monitoring

When tiering is active, monitor data distribution across tiers:

Hot Tier Metrics

Size: Amount of data on local storage
Row Count: Number of recent log entries
Performance: Query response times

Warm Tier Metrics

Size: Amount of data in S3 storage
Row Count: Number of archived log entries
Cost: S3 storage and request costs

Total Metrics

Combined Size: Total across all tiers
Combined Rows: Total log entries
Cost Savings: Compared to all-local storage
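
The cost savings figure is a comparison between the tiered layout and keeping everything on local storage. The arithmetic can be sketched as follows; the per-GB monthly prices are placeholder inputs, not real provider prices, and the sketch ignores S3 request and transfer costs. The function name tier_cost_summary is hypothetical.

```shell
# Sketch only: compare monthly storage cost of tiered vs. all-local storage.
# Arguments: hot_gb warm_gb hot_price_per_gb warm_price_per_gb
# Prints: tiered_cost all_local_cost savings
tier_cost_summary() {
  awk -v hot_gb="$1" -v warm_gb="$2" -v hot_price="$3" -v warm_price="$4" 'BEGIN {
    # Tiered: each tier is billed at its own rate.
    tiered    = hot_gb * hot_price + warm_gb * warm_price
    # All-local: every gigabyte is billed at the hot (local SSD) rate.
    all_local = (hot_gb + warm_gb) * hot_price
    printf "%.2f %.2f %.2f\n", tiered, all_local, all_local - tiered
  }'
}

# 500 GB hot and 4500 GB warm at placeholder prices of 0.10 and 0.02 per GB:
tier_cost_summary 500 4500 0.10 0.02   # prints "140.00 500.00 360.00"
```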

Performance Considerations

Query Performance by Tier

Tier	Typical Query Speed	Use Case	Relative Cost
Hot (Local SSD)	< 100ms	Real-time analysis	High
Warm (S3)	1-10s	Historical investigation	Medium
Cold (Archive)	Export only	Compliance	Low

Storage Cost Optimization

Hot Tier Sizing

Small environments: 7-14 days hot storage
Medium environments: 14-30 days hot storage
Large environments: 30-60 days hot storage

Warm Tier Strategy

Compliance: Match regulatory requirements (90 days, 1 year, 7 years)
Investigation: Keep 6-12 months for threat hunting
Budget: Balance storage costs with access needs

Auto-Move Tuning

Conservative: TTL only (predictable behavior)
Balanced: 90% full trigger (handles traffic spikes)
Aggressive: 80% full trigger (moves data earlier, leaving more free space on local disks)
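
The three auto-move options differ only in whether disk utilization can trigger a move before the age threshold is reached. A minimal sketch of that decision follows, assuming the modes are named ttl, 90, and 80 (hypothetical labels) and that disk usage is given as a whole-number percentage. It is a simplification: it does not model which data is chosen first when the disk trigger fires.

```shell
# Sketch only: would data become eligible to move from hot to warm?
# Returns 0 (move), 1 (stay), or 2 (unknown mode).
should_move_to_warm() {
  local age_days="$1" hot_days="$2" disk_used_pct="$3" mode="$4"
  # The age-based rule applies in every mode.
  if [ "$age_days" -ge "$hot_days" ]; then
    return 0
  fi
  case "$mode" in
    ttl)   return 1 ;;                              # age is the only trigger
    90|80) [ "$disk_used_pct" -ge "$mode" ] ;;      # disk pressure also triggers
    *)     echo "unknown mode: $mode" >&2; return 2 ;;
  esac
}

# Example: 5-day-old data on a disk that is 85% full
#   should_move_to_warm 5 30 85 90   -> stays (status 1)
#   should_move_to_warm 5 30 85 80   -> moves (status 0)
```

ClickHouse storage policies express a comparable trigger through the move_factor setting (a free-space ratio, 0.1 by default); whether nano maps its options onto that setting is an assumption.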

Retention Best Practices

Initial Configuration

Start conservative: Begin with longer retention periods
Monitor usage: Track query patterns and data access
Adjust gradually: Reduce retention based on actual needs
Test recovery: Verify backup and restore procedures

Compliance Considerations

Regulatory requirements: Match industry standards (SOX, HIPAA, PCI-DSS)
Legal hold: Ability to preserve data for litigation
Data sovereignty: Ensure S3 region compliance
Encryption: Enable S3 encryption for sensitive data

Operational Management

Monitoring: Set up alerts for storage capacity and costs
Backup: Regular backups of configuration and metadata
Testing: Periodic restore tests and disaster recovery drills
Documentation: Maintain records of retention policies and changes

Troubleshooting

Common Issues

ClickHouse TTL Issues

Symptoms: Old data not being deleted

Solutions:

Check TTL configuration: SHOW CREATE TABLE logs
Force TTL: Use "Force TTL Now" button
Verify partitions: Check partition dates and sizes
Monitor merges: TTL applies during background merges
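
To verify partitions from the command line, ClickHouse's system.parts table can be queried directly. The sketch below prints such a query; it assumes the log table is named logs (as in the SHOW CREATE TABLE example above), and the helper name partition_report_sql is hypothetical.

```shell
# Sketch only: print a query that summarizes active parts per partition.
# The disk_name column also shows which disk currently holds each part,
# which helps when checking whether data has moved between tiers.
partition_report_sql() {
  local table="$1"
  cat <<EOF
SELECT partition, disk_name, count() AS parts, sum(rows) AS rows,
       formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE table = '$table' AND active
GROUP BY partition, disk_name
ORDER BY partition
EOF
}

# Example (not executed here): clickhouse-client --query "$(partition_report_sql logs)"
```

Partitions older than the retention period that still appear in this output usually indicate that a merge has not yet processed them.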

Storage Tiering Problems

Symptoms: Data not moving to S3

Solutions:

Verify S3 credentials and permissions
Check bucket accessibility and region
Review ClickHouse logs for S3 errors
Test connection using the test button

Performance Issues

Symptoms: Slow queries on tiered data

Solutions:

Optimize query patterns for time-based filtering
Increase hot tier retention for frequently accessed data
Use appropriate S3 storage class for access patterns
Monitor S3 request costs and optimize query frequency

Error Messages

Error	Cause	Solution
"TTL not applied"	Background merges not running	Force TTL or restart ClickHouse
"S3 access denied"	Invalid credentials or permissions	Update S3 credentials and bucket policy
"Partition not found"	Data already deleted	Check retention settings and restore from backup
"Disk space full"	Hot tier storage exhausted	Reduce hot tier retention or add storage

API Configuration

Programmatic storage management via REST API:

# Get storage overview
curl -X GET "http://localhost:8080/api/settings/storage/overview"

# Update PostgreSQL retention
curl -X PUT "http://localhost:8080/api/settings/retention" \
  -H "Content-Type: application/json" \
  -d '{"enabled": true, "retention_days": 90}'

# Update ClickHouse TTL
curl -X PUT "http://localhost:8080/api/settings/clickhouse/retention" \
  -H "Content-Type: application/json" \
  -d '{"retention_days": 365}'

# Configure storage tiering
curl -X PUT "http://localhost:8080/api/settings/tiering" \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "s3_bucket": "my-siem-logs",
    "s3_region": "us-east-1",
    "hot_days": 30,
    "warm_days": 365
  }'

# Apply tiering configuration
curl -X POST "http://localhost:8080/api/settings/tiering/apply"

Data Management - User guide for data lifecycle
Deployment Architecture - Infrastructure planning and scaling
