Storage & Retention Settings

nano provides flexible storage options that let you balance performance, cost, and compliance requirements. This page covers the storage configuration options: database modes, retention policies, and storage tiering.

Storage Architecture Overview

nano supports two storage architectures:

PostgreSQL Only Mode (Legacy)

  • All data stored in PostgreSQL with TimescaleDB extension
  • Best for: Small to medium deployments (< 1 TB of logs)
  • Advantages: Simple setup, single database to manage
  • Limitations: Higher storage costs, slower queries on large datasets

Dual Database Mode (PostgreSQL + ClickHouse)

  • Log telemetry stored in ClickHouse for optimal query performance
  • Metadata stored in PostgreSQL (rules, alerts, dashboards, settings)
  • Best for: Medium to large deployments (> 1 TB of logs)
  • Advantages: Better performance, lower storage costs, advanced tiering options

Storage Tabs Overview

The Storage & Retention settings page contains multiple tabs based on your configuration:

ClickHouse Tab (Log Storage)

Available when dual database mode is enabled. Manages the primary log storage system.

Storage Statistics

  • Total Size: Compressed size of all log data with compression ratio
  • Total Logs: Number of log entries stored
  • Partitions: Number of date-based partitions (typically one per day)
  • Parts: Number of data files (ClickHouse's storage units)
  • Date Range: Oldest and newest log timestamps

TTL Retention Configuration

  • Retention Period: How long to keep logs (1-3650 days)
  • Auto-deletion: ClickHouse automatically deletes expired data
  • Force TTL: Manually trigger immediate deletion of expired data

TTL Behavior:

  • Data is partitioned by day for efficient deletion
  • TTL is applied during background merge operations
  • "Force TTL Now" immediately processes all partitions
  • Deleted data cannot be recovered
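
For orientation, the retention setting corresponds roughly to a table-level TTL in ClickHouse. The sketch below only prints SQL and does not contact a server. It assumes the log table is named `logs` (as in the Troubleshooting section) and has a DateTime column called `timestamp`, which is a hypothetical name. Whether nano issues exactly these statements is not confirmed by this page.

```shell
# Sketch: print the ClickHouse statements that correspond to a retention change.
# Assumptions: table "logs", DateTime column "timestamp" (hypothetical name).

ttl_sql() {
  # Arg: retention in days. The UI accepts 1-3650; reject anything else.
  case "$1" in
    ''|*[!0-9]*) echo "retention must be a whole number of days" >&2; return 1 ;;
  esac
  if [ "$1" -lt 1 ] || [ "$1" -gt 3650 ]; then
    echo "retention must be between 1 and 3650 days" >&2
    return 1
  fi
  # MODIFY TTL changes the rule; expired rows are removed during later merges.
  echo "ALTER TABLE logs MODIFY TTL timestamp + INTERVAL $1 DAY;"
}

force_ttl_sql() {
  # MATERIALIZE TTL re-evaluates the TTL rule against existing parts.
  echo "ALTER TABLE logs MATERIALIZE TTL;"
}

ttl_sql 90
force_ttl_sql
```

Printing the statements first makes it easy to review them before running anything against a table where deletions cannot be undone.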

PostgreSQL Tab (Metadata/Legacy)

In Dual Database Mode

Shows information about metadata storage:

  • Detection rules and configurations
  • Alert definitions and history
  • Dashboard configurations
  • User settings and preferences
  • Parser configurations
  • AI/ML model settings

In PostgreSQL Only Mode

Provides full storage management:

Storage Statistics:

  • Total Size: Size of all tables including indexes
  • Total Logs: Number of log entries
  • Chunks: TimescaleDB time-based chunks
  • Compressed Chunks: Number of compressed chunks
  • Date Range: Data time span

Retention Policy Configuration:

  • Enable/Disable: Toggle automatic retention
  • Retention Period: Days to keep data (1-3650)
  • Manual Execution: Force retention cleanup immediately

TimescaleDB Features:

  • Automatic chunk compression after 1 day
  • Background retention job execution
  • Efficient time-based data deletion
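
In plain TimescaleDB terms, these features map onto a compression policy and a retention policy. The following sketch prints the equivalent SQL under the assumption that logs live in a hypertable named `logs` with compression already enabled. The hypertable name is hypothetical, and nano may configure its policies differently.

```shell
# Sketch: print TimescaleDB policy statements for a given retention period.
# Assumption: a hypertable named "logs" (hypothetical) with compression enabled.

timescale_policy_sql() {
  # Arg: retention in days (1-3650, matching the range offered in the UI).
  case "$1" in
    ''|*[!0-9]*) echo "retention must be a whole number of days" >&2; return 1 ;;
  esac
  if [ "$1" -lt 1 ] || [ "$1" -gt 3650 ]; then
    echo "retention must be between 1 and 3650 days" >&2
    return 1
  fi
  # Compress chunks once they are older than one day.
  echo "SELECT add_compression_policy('logs', INTERVAL '1 day');"
  # Drop whole chunks once they are older than the retention period.
  echo "SELECT add_retention_policy('logs', INTERVAL '$1 days');"
}

timescale_policy_sql 90
```

Because retention drops entire chunks instead of deleting individual rows, cleanup stays cheap even on large tables.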

Storage Tiering Tab (Advanced)

Available only in dual database mode with ClickHouse. Provides cost-effective long-term storage.

Storage Tiering Deep Dive

Storage tiering automatically moves data between storage tiers based on data age and, optionally, local disk utilization.

Tier Architecture

Hot Tier (Local SSD)

  • Storage: Local NVMe/SSD storage
  • Performance: Fastest query response times
  • Use case: Recent data requiring frequent access
  • Configuration: 1-365 days (recommended: 7-60 days)

Auto-Move Behavior Options:

  • TTL Only: Move data based purely on age
  • 90% Full: Also move data when disk reaches 90% capacity
  • 80% Full: Also move data when disk reaches 80% capacity
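
ClickHouse storage policies express a disk-fullness trigger as `move_factor`, the fraction of free space below which parts start moving to the next volume. Assuming nano maps the options above onto that setting (not confirmed by this page), the conversion is a one-liner. The function name is illustrative.

```shell
# Sketch: convert a "disk N% full" trigger into a free-space fraction,
# in the style of ClickHouse's move_factor. Function name is illustrative.

move_factor_for_threshold() {
  # Arg: fullness threshold as an integer percentage (1-99).
  case "$1" in
    ''|*[!0-9]*) echo "threshold must be an integer percentage" >&2; return 1 ;;
  esac
  if [ "$1" -lt 1 ] || [ "$1" -gt 99 ]; then
    echo "threshold must be between 1 and 99" >&2
    return 1
  fi
  # 90% full means 10% free, i.e. a factor of 0.10.
  awk -v pct="$1" 'BEGIN { printf "%.2f\n", (100 - pct) / 100 }'
}

move_factor_for_threshold 90   # "90% Full" option
move_factor_for_threshold 80   # "80% Full" option
```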

Warm Tier (S3-backed)

  • Storage: S3-compatible object storage
  • Performance: Slower queries but still accessible via ClickHouse
  • Use case: Historical data for investigations and compliance
  • Configuration: Total retention period in days; data stays in the warm tier from the end of the hot window until this limit

Cold Tier (Archive)

  • Storage: S3 archive storage classes
  • Performance: Export-only, not queryable
  • Use case: Long-term compliance and backup
  • Configuration: Optional, for data older than warm tier
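
The age-based part of this layout can be sketched as a small classifier. It assumes that the hot window is inclusive (data exactly `hot_days` old is still hot) and that `warm_days` is the total retention, as described above. The exact boundary behavior in nano is an assumption, and the function name is illustrative.

```shell
# Sketch: decide which tier data of a given age falls into.
# Usage: tier_for_age AGE_DAYS HOT_DAYS WARM_DAYS
# Assumption: boundaries are inclusive; WARM_DAYS is the total retention.

tier_for_age() {
  if [ "$1" -le "$2" ]; then
    echo "hot"              # still inside the local SSD window
  elif [ "$1" -le "$3" ]; then
    echo "warm"             # past the hot window, within total retention
  else
    echo "cold-or-expired"  # archived if a cold tier is configured, else removed
  fi
}

tier_for_age 10 30 365
tier_for_age 120 30 365
tier_for_age 400 30 365
```

With a 30-day hot window and 365 days of total retention, data spends at most 335 days in the warm tier.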

S3 Configuration Options

Basic S3 Settings

  • S3 Bucket: Target bucket for tiered data
  • Region: AWS region or equivalent for other providers
  • Custom Endpoint: Endpoint URL for non-AWS providers such as MinIO, Cloudflare R2, or Backblaze B2
  • Path-style Access: Required for MinIO and some S3-compatible services

Supported S3 Providers

  • AWS S3: Native support, all storage classes
  • MinIO: Self-hosted S3-compatible storage
  • Cloudflare R2: Cost-effective alternative to S3
  • Backblaze B2: Budget-friendly cloud storage
  • Google Cloud Storage: S3-compatible API
  • Azure Blob Storage: Through an S3-compatible gateway (Azure Blob Storage has no native S3 API)

Credentials Management

  • Access Key ID: S3 access credentials
  • Secret Access Key: S3 secret credential, stored encrypted at rest
  • Connection Testing: Verify connectivity before applying
  • Credential Status: Visual indicator of configuration state

Tiering Configuration Process

1. Enable Storage Tiering

Toggle the main switch to enable S3 storage tiering functionality.

2. Configure S3 Settings

  • Set bucket name and region
  • Configure a custom endpoint if using a non-AWS provider
  • Enable path-style access for MinIO

3. Set Credentials

  • Enter S3 access key and secret key
  • Test connection to verify configuration
  • Credentials are encrypted and stored securely

4. Configure Tier Thresholds

  • Hot Tier: Set days for local storage (7, 14, 30, 60 days)
  • Auto-Move: Choose disk utilization trigger
  • Warm Tier: Set total retention period (90, 180, 365, 730 days)
  • Cold Tier: Optional archive tier for compliance

5. Apply Configuration

  • Save configuration to database
  • Apply to ClickHouse to activate tiering
  • Monitor status and storage distribution
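
The save and apply steps can also be scripted against the REST endpoints listed under API Configuration. The sketch below validates the thresholds before sending anything. The helper names and the `NANO_URL` variable are illustrative, and the check that `warm_days` exceeds `hot_days` is an assumption drawn from the tier descriptions, not a documented API rule.

```shell
# Sketch: validate tier thresholds, then save and apply the configuration.
# NANO_URL and both function names are illustrative, not part of nano.
NANO_URL="${NANO_URL:-http://localhost:8080}"

tiering_payload() {
  # Args: bucket region hot_days warm_days
  if [ "$3" -lt 1 ] || [ "$3" -gt 365 ]; then
    echo "hot_days must be between 1 and 365" >&2; return 1
  fi
  if [ "$4" -le "$3" ]; then
    echo "warm_days (total retention) must exceed hot_days" >&2; return 1
  fi
  printf '{"enabled": true, "s3_bucket": "%s", "s3_region": "%s", "hot_days": %d, "warm_days": %d}\n' \
    "$1" "$2" "$3" "$4"
}

configure_tiering() {
  # Build the payload first so invalid input never reaches the server.
  payload="$(tiering_payload "$@")" || return 1
  # Step 1: save the configuration.
  curl -fsS -X PUT "$NANO_URL/api/settings/tiering" \
    -H "Content-Type: application/json" -d "$payload" || return 1
  # Step 2: apply it to ClickHouse.
  curl -fsS -X POST "$NANO_URL/api/settings/tiering/apply"
}

# Print the payload without contacting the server:
tiering_payload my-siem-logs us-east-1 30 365
# To save and apply: configure_tiering my-siem-logs us-east-1 30 365
```

Keeping save and apply as separate calls mirrors the two-step flow in the UI, so a rejected configuration is never applied.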

Storage Distribution Monitoring

When tiering is active, monitor data distribution across tiers:

Hot Tier Metrics

  • Size: Amount of data on local storage
  • Row Count: Number of recent log entries
  • Performance: Query response times

Warm Tier Metrics

  • Size: Amount of data in S3 storage
  • Row Count: Number of archived log entries
  • Cost: S3 storage and request costs

Total Metrics

  • Combined Size: Total across all tiers
  • Combined Rows: Total log entries
  • Cost Savings: Compared to all-local storage
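
As a rough illustration of the Cost Savings figure, the comparison against all-local storage is simple arithmetic over the tier sizes. The per-GB monthly prices below are placeholder inputs, not nano defaults or quoted provider rates, and S3 request costs are ignored.

```shell
# Sketch: estimate monthly savings from tiering versus keeping everything local.
# Usage: tiering_savings HOT_GB WARM_GB LOCAL_PRICE_PER_GB S3_PRICE_PER_GB
# Prices are caller-supplied placeholders; request costs are not modeled.

tiering_savings() {
  awk -v hot="$1" -v warm="$2" -v lp="$3" -v sp="$4" 'BEGIN {
    all_local = (hot + warm) * lp      # everything on local disk
    tiered    = hot * lp + warm * sp   # hot on local disk, warm in S3
    printf "all_local=%.2f tiered=%.2f savings=%.2f\n", all_local, tiered, all_local - tiered
  }'
}

tiering_savings 500 4500 0.10 0.023
```

The savings grow with the share of data in the warm tier, which is why shortening the hot window usually has the largest effect on cost.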

Performance Considerations

Query Performance by Tier

Tier            | Query Speed | Use Case                 | Cost
Hot (Local SSD) | < 100 ms    | Real-time analysis       | High
Warm (S3)       | 1-10 s      | Historical investigation | Medium
Cold (Archive)  | Export only | Compliance               | Low

Storage Cost Optimization

Hot Tier Sizing

  • Small environments: 7-14 days hot storage
  • Medium environments: 14-30 days hot storage
  • Large environments: 30-60 days hot storage

Warm Tier Strategy

  • Compliance: Match regulatory requirements (90 days, 1 year, 7 years)
  • Investigation: Keep 6-12 months for threat hunting
  • Budget: Balance storage costs with access needs

Auto-Move Tuning

  • Conservative: TTL only (predictable behavior)
  • Balanced: 90% full trigger (handles traffic spikes)
  • Aggressive: 80% full trigger (moves data earlier, leaving more free disk headroom)

Retention Best Practices

Initial Configuration

  1. Start conservative: Begin with longer retention periods
  2. Monitor usage: Track query patterns and data access
  3. Adjust gradually: Reduce retention based on actual needs
  4. Test recovery: Verify backup and restore procedures

Compliance Considerations

  • Regulatory requirements: Match industry standards (SOX, HIPAA, PCI-DSS)
  • Legal hold: Ability to preserve data for litigation
  • Data sovereignty: Ensure S3 region compliance
  • Encryption: Enable S3 encryption for sensitive data

Operational Management

  • Monitoring: Set up alerts for storage capacity and costs
  • Backup: Regular backups of configuration and metadata
  • Testing: Periodic restore tests and disaster recovery drills
  • Documentation: Maintain records of retention policies and changes

Troubleshooting

Common Issues

ClickHouse TTL Issues

Symptoms: Old data not being deleted

Solutions:

  • Check TTL configuration: SHOW CREATE TABLE logs
  • Force TTL: Use "Force TTL Now" button
  • Verify partitions: Check partition dates and sizes
  • Monitor merges: TTL applies during background merges

Storage Tiering Problems

Symptoms: Data not moving to S3

Solutions:

  • Verify S3 credentials and permissions
  • Check bucket accessibility and region
  • Review ClickHouse logs for S3 errors
  • Test connection using the test button
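
To see where data actually resides, ClickHouse's `system.parts` table reports the disk behind every active part. The sketch below prints such a query. It assumes the log table is named `logs`; the disk names in the result depend on the storage policy nano configures, which this page does not specify.

```shell
# Sketch: print a query that summarizes active parts per disk.
# Arg: table name (defaults to "logs", the name used elsewhere on this page).

parts_by_disk_sql() {
  cat <<EOF
SELECT disk_name,
       count() AS part_count,
       sum(rows) AS total_rows,
       formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE table = '${1:-logs}' AND active
GROUP BY disk_name
ORDER BY disk_name;
EOF
}

parts_by_disk_sql
# With clickhouse-client installed:
#   clickhouse-client --query "$(parts_by_disk_sql logs)"
```

If every part still sits on the local disk well after the hot window has elapsed, the tiering rule has likely not been applied or the S3 disk is unreachable.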

Performance Issues

Symptoms: Slow queries on tiered data

Solutions:

  • Optimize query patterns for time-based filtering
  • Increase hot tier retention for frequently accessed data
  • Use appropriate S3 storage class for access patterns
  • Monitor S3 request costs and optimize query frequency

Error Messages

Error                 | Cause                              | Solution
"TTL not applied"     | Background merges not running      | Force TTL or restart ClickHouse
"S3 access denied"    | Invalid credentials or permissions | Update S3 credentials and bucket policy
"Partition not found" | Data already deleted               | Check retention settings and restore from backup
"Disk space full"     | Hot tier storage exhausted         | Reduce hot tier retention or add storage

API Configuration

Programmatic storage management via REST API:

# Get storage overview
curl -X GET "http://localhost:8080/api/settings/storage/overview"

# Update PostgreSQL retention
curl -X PUT "http://localhost:8080/api/settings/retention" \
  -H "Content-Type: application/json" \
  -d '{"enabled": true, "retention_days": 90}'

# Update ClickHouse TTL
curl -X PUT "http://localhost:8080/api/settings/clickhouse/retention" \
  -H "Content-Type: application/json" \
  -d '{"retention_days": 365}'

# Configure storage tiering
curl -X PUT "http://localhost:8080/api/settings/tiering" \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "s3_bucket": "my-siem-logs",
    "s3_region": "us-east-1",
    "hot_days": 30,
    "warm_days": 365
  }'

# Apply tiering configuration
curl -X POST "http://localhost:8080/api/settings/tiering/apply"