Storage Architecture
Two-Tier Validator Storage + API Database
Frontier Chain uses different storage systems for validators (consensus) versus the API server (queries):
Validator Storage (Two-Tier)
Validators use a two-tier storage system optimized for consensus performance:
Order Execution (Validators)
│
▼
┌─────────────┐
│ TIER 1 │ ← Memory Cache (Hot Storage)
│ Memory │ • Immediate commits (0.83ms)
│ │ • 100 blocks retention (~10 seconds)
│ │ • 500MB maximum capacity
│ │ • O(1) HashMap lookups
└─────┬───────┘
│
│ Async Persistence (Non-blocking)
▼
┌─────────────┐
│ TIER 2 │ ← RocksDB (Block Storage)
│ RocksDB │ • Persistent block history
│ │ • Background async writes
│ │ • 820K ops/sec throughput
│ │ • Full chain history
└─────────────┘
API Server Storage (Optional)
Separately, the API server maintains PostgreSQL for queryable data.
Design Principle: Validators use memory + disk for fast consensus; the API server separately maintains a queryable database for client applications.
Tier 1: Memory Cache (Hot Storage)
Purpose
Provide immediate commits for consensus without I/O blocking, keeping the most recent blocks in memory for instant access.
Configuration
Memory-first storage is configured with the following parameters:
Hot Retention: 100 blocks (~10 seconds of history) kept in memory
Maximum Memory: 500MB total capacity for hot storage
Backpressure Threshold: 400MB triggers slowdown mechanisms
Emergency Threshold: 450MB triggers aggressive eviction
Persistence Batch Size: 1000 operations written to disk per batch
Data Structures
The memory layer maintains several in-memory maps for fast access:
Block Data:
Hot Blocks: Maps block height to complete block execution data
Hot Orders: Maps order hash to full signed order details
Block Index: Maps block height to list of order hashes in that block
Full Blocks: Complete HotStuff blocks for API synchronization
Tracking Metrics:
Memory Usage: Current and peak memory consumption
Eviction Count: Number of emergency evictions triggered
Height Tracking: Current blockchain height and oldest block retained
Commit Performance
Measured Latency:
Average: 0.83ms for typical block (1,000 operations)
Memory Management
Frontier Chain employs a multi-stage memory management strategy:
Normal Mode (< 400MB)
When memory usage is below 400MB, the system operates at full speed:
Block Commit Process:
Store new block in hot cache (memory maps)
Update block index with order hash list
Track memory usage for the new data
Check retention limit - if beyond 100 blocks, evict oldest block
Send block to async persistence worker (non-blocking)
Performance: Sub-millisecond commits, zero I/O blocking
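The commit path above can be sketched in Rust. This is a minimal illustration, not Frontier Chain's actual implementation: the type and field names (`HotCache`, `hot_blocks`, `block_index`) are assumptions, and the hand-off to the async persistence worker is elided.

```rust
use std::collections::HashMap;

const HOT_RETENTION: u64 = 100; // blocks kept in memory (~10s at 100ms blocks)

/// Illustrative hot-cache state: recent blocks plus an order index.
struct HotCache {
    hot_blocks: HashMap<u64, Vec<u8>>,        // height -> serialized block data
    block_index: HashMap<u64, Vec<[u8; 32]>>, // height -> order hashes
    memory_usage: usize,                      // tracked bytes
    oldest: u64,                              // oldest block still in memory
}

impl HotCache {
    fn new() -> Self {
        HotCache {
            hot_blocks: HashMap::new(),
            block_index: HashMap::new(),
            memory_usage: 0,
            oldest: 0,
        }
    }

    /// Commit a block to memory, then evict the oldest block once the
    /// 100-block retention window is exceeded. In the real system the
    /// block would also be sent to the async persistence worker here.
    fn commit(&mut self, height: u64, data: Vec<u8>, order_hashes: Vec<[u8; 32]>) {
        self.memory_usage += data.len();
        self.hot_blocks.insert(height, data);
        self.block_index.insert(height, order_hashes);
        // Retention check: keep only the most recent HOT_RETENTION blocks.
        while height - self.oldest + 1 > HOT_RETENTION {
            if let Some(evicted) = self.hot_blocks.remove(&self.oldest) {
                self.memory_usage -= evicted.len();
            }
            self.block_index.remove(&self.oldest);
            self.oldest += 1;
        }
    }
}

fn main() {
    let mut cache = HotCache::new();
    for h in 0..150u64 {
        cache.commit(h, vec![0u8; 1024], vec![]);
    }
    // Only the most recent 100 blocks remain in memory.
    assert_eq!(cache.hot_blocks.len(), 100);
    assert_eq!(cache.oldest, 50);
}
```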
Backpressure Mode (400-450MB)
When memory usage exceeds 400MB, the system introduces controlled delays:
Adaptive Slowdown:
Normal mode (<400MB): Full speed consensus (0ms delay)
Backpressure mode (400-450MB): 10ms delay per block to allow persistence to catch up
Emergency mode (>450MB): Immediate eviction, then resume normal speed
Purpose: Give async persistence worker time to write blocks to RocksDB before memory exhaustion
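The three-mode policy above amounts to a small decision function. A sketch under the documented thresholds (the enum and function names are illustrative):

```rust
const BACKPRESSURE_BYTES: usize = 400 * 1024 * 1024; // 400MB threshold
const EMERGENCY_BYTES: usize = 450 * 1024 * 1024;    // 450MB threshold

#[derive(Debug, PartialEq)]
enum MemoryMode {
    Normal,       // full-speed consensus, 0ms delay
    Backpressure, // 10ms delay per block so persistence catches up
    Emergency,    // evict oldest blocks immediately, then resume
}

fn memory_mode(usage_bytes: usize) -> MemoryMode {
    if usage_bytes > EMERGENCY_BYTES {
        MemoryMode::Emergency
    } else if usage_bytes > BACKPRESSURE_BYTES {
        MemoryMode::Backpressure
    } else {
        MemoryMode::Normal
    }
}

/// Delay (in milliseconds) applied before the next block commit.
fn commit_delay_ms(mode: &MemoryMode) -> u64 {
    match mode {
        MemoryMode::Backpressure => 10,
        _ => 0,
    }
}

fn main() {
    assert_eq!(memory_mode(100 * 1024 * 1024), MemoryMode::Normal);
    assert_eq!(memory_mode(420 * 1024 * 1024), MemoryMode::Backpressure);
    assert_eq!(memory_mode(460 * 1024 * 1024), MemoryMode::Emergency);
    assert_eq!(commit_delay_ms(&MemoryMode::Backpressure), 10);
}
```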
Emergency Mode (> 450MB)
When memory exceeds 450MB, aggressive eviction prevents out-of-memory crashes:
Emergency Eviction Process:
Log warning about emergency eviction trigger
Calculate midpoint between oldest and current block
Evict oldest 50% of hot blocks from memory
Update memory usage counters
Increment eviction count metric
Block Eviction:
Remove block data from all memory maps
Decrement memory usage by block size
Update oldest block pointer
Data remains safe in RocksDB (already persisted)
Recovery: Evicted blocks can be retrieved from RocksDB if needed
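The midpoint calculation and 50% eviction can be sketched as follows. Function and parameter names are illustrative; eviction is safe only because, as noted above, these blocks are already persisted in RocksDB.

```rust
use std::collections::HashMap;

/// Illustrative emergency eviction: drop the oldest half of hot blocks.
/// Returns the number of blocks evicted.
fn emergency_evict(
    hot_blocks: &mut HashMap<u64, Vec<u8>>,
    oldest: &mut u64,
    current: u64,
    memory_usage: &mut usize,
) -> u64 {
    // Midpoint between oldest and current block; everything below it goes.
    let midpoint = *oldest + (current - *oldest) / 2;
    let mut evicted = 0;
    for h in *oldest..midpoint {
        if let Some(data) = hot_blocks.remove(&h) {
            *memory_usage -= data.len(); // decrement usage by block size
            evicted += 1;
        }
    }
    *oldest = midpoint; // update oldest-block pointer
    evicted
}

fn main() {
    let mut blocks: HashMap<u64, Vec<u8>> =
        (0..100).map(|h| (h, vec![0u8; 512])).collect();
    let mut oldest = 0u64;
    let mut usage = 100 * 512;
    let evicted = emergency_evict(&mut blocks, &mut oldest, 99, &mut usage);
    assert_eq!(evicted, 49); // oldest half removed
    assert_eq!(oldest, 49);
    assert_eq!(blocks.len(), 51);
}
```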
Memory Statistics
The memory layer tracks comprehensive statistics for monitoring:
Memory Usage Metrics:
Current Usage: Real-time memory consumption in MB
Peak Usage: Maximum memory used since startup
Blocks in Memory: Count of blocks currently cached
Orders in Memory: Count of orders currently cached
Performance Metrics:
Persistence Lag: Number of blocks by which disk persistence trails the memory tier
Evictions: Total number of emergency evictions triggered
These metrics are exposed for monitoring dashboards and alerting systems.
Tier 2: RocksDB (Block Storage)
Purpose
Provide durable, persistent storage for full blockchain history without blocking consensus operations.
Async Persistence Worker
A background worker handles non-blocking persistence to RocksDB:
Persistence Tasks:
PersistBlock: Write block data to disk
Serialize block data using Borsh (deterministic encoding)
Store individual orders by hash
Store raw operations for replay/audit capability
Store order index (block height → order hashes)
Flush: Force immediate disk write (explicit request)
Shutdown: Graceful shutdown with final flush
Process Flow:
Worker receives tasks via async channel
Processes tasks sequentially without blocking consensus
Uses Borsh serialization for deterministic encoding
Organizes data with structured key prefixes
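The task/worker structure can be sketched with a standard-library channel and thread. The real worker would write Borsh-serialized payloads to RocksDB; the task names follow the list above, while the channel layout and counters here are assumptions.

```rust
use std::sync::mpsc;
use std::thread;

/// Tasks accepted by the persistence worker (illustrative payloads).
enum PersistTask {
    PersistBlock { height: u64, data: Vec<u8> },
    Flush,
    Shutdown,
}

fn spawn_worker(rx: mpsc::Receiver<PersistTask>) -> thread::JoinHandle<u64> {
    thread::spawn(move || {
        let mut persisted = 0u64;
        // Tasks are processed sequentially; consensus never waits on this loop.
        for task in rx {
            match task {
                PersistTask::PersistBlock { height: _, data: _ } => {
                    // Real implementation: Borsh-serialize, then write the
                    // block, its orders, raw ops, and order index to RocksDB.
                    persisted += 1;
                }
                PersistTask::Flush => { /* force an immediate disk sync */ }
                PersistTask::Shutdown => break, // graceful shutdown
            }
        }
        persisted
    })
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let worker = spawn_worker(rx);
    for h in 0..5u64 {
        // Non-blocking from the consensus thread's point of view.
        tx.send(PersistTask::PersistBlock { height: h, data: vec![0u8; 64] })
            .unwrap();
    }
    tx.send(PersistTask::Shutdown).unwrap();
    assert_eq!(worker.join().unwrap(), 5);
}
```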
Storage Key Scheme
RocksDB uses structured key prefixes to organize data:
Key Prefixes:
block: → Block execution data (by height)
order: → Individual orders (by hash)
index: → Order hashes per block (for retrieval)
meta: → Metadata like current height
rawops: → Raw operations (for audit/replay)
fullblock: → Complete HotStuff blocks
Key Construction:
Block keys: block: prefix + 8-byte block height
Order keys: order: prefix + 32-byte order hash
Index keys: index: prefix + 8-byte block height
This organization enables efficient range queries and block retrieval.
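A sketch of the key construction, assuming big-endian height encoding (an assumption here, chosen because it makes binary keys sort in numeric order, which is what enables the range queries mentioned above):

```rust
/// Build a block key: "block:" prefix + 8-byte height.
fn block_key(height: u64) -> Vec<u8> {
    let mut key = b"block:".to_vec();
    key.extend_from_slice(&height.to_be_bytes()); // big-endian keeps keys ordered
    key
}

/// Build an order key: "order:" prefix + 32-byte order hash.
fn order_key(hash: &[u8; 32]) -> Vec<u8> {
    let mut key = b"order:".to_vec();
    key.extend_from_slice(hash);
    key
}

/// Build an index key: "index:" prefix + 8-byte height.
fn index_key(height: u64) -> Vec<u8> {
    let mut key = b"index:".to_vec();
    key.extend_from_slice(&height.to_be_bytes());
    key
}

fn main() {
    assert_eq!(block_key(1).len(), 6 + 8);  // "block:" + 8-byte height
    assert_eq!(order_key(&[0u8; 32]).len(), 6 + 32);
    assert_eq!(index_key(7).len(), 6 + 8);
    // Big-endian encoding preserves numeric order lexicographically.
    assert!(block_key(1) < block_key(2));
}
```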
RocksDB Configuration
RocksDB is tuned for high-throughput blockchain workloads:
Performance Tuning:
Background Jobs: 4 concurrent compaction threads
Sync Interval: 1MB between disk syncs
Write Buffer: 64MB per buffer, 3 buffers total
Compression: Lz4 algorithm for speed and compression ratio
Read Optimization:
Block Cache: 256MB for frequently accessed data
Bloom Filters: Enable fast negative lookups
Storage:
Auto-create database if missing
Organized by key prefixes for efficient queries
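The tuning above maps onto the rust `rocksdb` crate roughly as follows. This is an illustrative config fragment, not Frontier Chain's actual code; the option-setter names are version-dependent in that crate, so treat each call as an assumption to check against the crate docs.

```rust
// Illustrative sketch using the rust `rocksdb` crate (API names
// are version-dependent; verify against the crate documentation).
use rocksdb::{BlockBasedOptions, Cache, DBCompressionType, Options, DB};

fn open_tuned_db(path: &str) -> Result<DB, rocksdb::Error> {
    let mut opts = Options::default();
    opts.create_if_missing(true);                      // auto-create database
    opts.set_max_background_jobs(4);                   // 4 compaction threads
    opts.set_bytes_per_sync(1024 * 1024);              // sync every 1MB
    opts.set_write_buffer_size(64 * 1024 * 1024);      // 64MB per write buffer
    opts.set_max_write_buffer_number(3);               // 3 buffers total
    opts.set_compression_type(DBCompressionType::Lz4); // fast compression

    let mut block_opts = BlockBasedOptions::default();
    block_opts.set_block_cache(&Cache::new_lru_cache(256 * 1024 * 1024)); // 256MB cache
    block_opts.set_bloom_filter(10.0, false);          // fast negative lookups
    opts.set_block_based_table_factory(&block_opts);

    DB::open(&opts, path)
}
```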
Performance Characteristics
Write Throughput:
Sustained Throughput: 820,000 operations/second
Comparison to Memory:
100 ops: 7.0x slower than memory
1,000 ops: 4.2x slower
5,000 ops: 3.6x slower
Average: 6.6x slower (but async, so doesn't block consensus)
Data Retrieval
Data retrieval follows a hot/cold fallback strategy for optimal performance:
Block Retrieval Process:
Check Hot Cache: Look for block in memory (sub-microsecond)
Fallback to Cold Storage: If not in memory, query RocksDB
Deserialize: Convert bytes back to block structure
Return: Provide block data to caller
Order Retrieval Process:
Same hot/cold fallback pattern
Memory lookup first (instant)
RocksDB fallback if needed (milliseconds)
Borsh deserialization for data reconstruction
This approach provides instant access to recent data while maintaining full history on disk.
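The fallback read path reduces to a two-step lookup. A minimal sketch, with an in-memory map standing in for the RocksDB tier (a real cold lookup would deserialize Borsh bytes):

```rust
use std::collections::HashMap;

/// Illustrative two-tier storage view; `cold` stands in for RocksDB.
struct Storage {
    hot: HashMap<u64, Vec<u8>>,  // recent blocks, sub-microsecond lookups
    cold: HashMap<u64, Vec<u8>>, // stand-in for the RocksDB tier
}

impl Storage {
    fn get_block(&self, height: u64) -> Option<&Vec<u8>> {
        // 1) Check the hot cache first.
        self.hot
            .get(&height)
            // 2) Fall back to cold storage for evicted heights.
            .or_else(|| self.cold.get(&height))
    }
}

fn main() {
    let mut s = Storage { hot: HashMap::new(), cold: HashMap::new() };
    s.cold.insert(1, vec![1u8]); // old block, only on disk
    s.hot.insert(2, vec![2u8]);  // recent block, in memory
    s.cold.insert(2, vec![2u8]); // recent blocks are persisted too
    assert_eq!(s.get_block(2), Some(&vec![2u8])); // served from memory
    assert_eq!(s.get_block(1), Some(&vec![1u8])); // served from cold tier
    assert_eq!(s.get_block(3), None);             // unknown height
}
```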
PostgreSQL/TimescaleDB (API Server Only)
Purpose
Important: This tier runs on the API server, NOT on validators. Validators only use Memory + RocksDB.
The API server maintains PostgreSQL/TimescaleDB to provide queryable order history, trade records, and market data for REST API endpoints and analytics.
Performance Comparison
Commit Latency by Tier
| Operations per Block | Memory Commit | RocksDB Commit | RocksDB vs Memory |
|---|---|---|---|
| 100 | 0.07ms | 0.49ms | 7.0x |
| 500 | 0.22ms | 0.71ms | 3.2x |
| 1,000 | 0.31ms | 1.30ms | 4.2x |
| 2,000 | 0.96ms | 2.52ms | 2.6x |
| 5,000 | 1.70ms | 6.05ms | 3.6x |
| 10,000 | 3.55ms | 12.20ms | 3.4x |
Average Memory Advantage: 6.6x faster than RocksDB
Throughput Capacity
Validator Storage:
| Tier | Throughput | Latency | Role |
|---|---|---|---|
| Memory | 1.2M ops/sec | 0.83ms | Consensus commits (validators) |
| RocksDB | 820K ops/sec | 1-12ms | Block persistence (validators) |
API Server Storage (separate system):
| Tier | Throughput | Latency | Role |
|---|---|---|---|
| PostgreSQL | 50K inserts/sec | 10-50ms | API query layer (not consensus) |
Query Performance (API Server Only)
PostgreSQL Query Times (typical):
User balance lookup: 2-5ms
Order history (100 orders): 10-20ms
Trade history (1000 trades): 20-50ms
24h market stats: 5-10ms (continuous aggregate)
Order book snapshot: 50-100ms (depends on depth)
Note: These queries run on the API server and do not affect validator consensus performance.
Crash Recovery
Recovery Scenarios
1. Memory Loss (Process Restart)
Situation: Node crashes and loses in-memory cache
Recovery Process:
Load latest block height from RocksDB metadata
Load last 100 blocks into memory cache
Restore current height pointer
Resume consensus from latest height
Recovery Time: <1 second (load 100 blocks from RocksDB)
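The restart recovery window can be sketched as a small range calculation: from the latest height stored in RocksDB metadata, reload at most the last 100 blocks (fewer on a young chain). The function name is illustrative.

```rust
const HOT_RETENTION: u64 = 100; // hot-cache window to rebuild on restart

/// Inclusive range of block heights to reload into memory after a
/// process restart, given the latest height found in RocksDB metadata.
fn recovery_range(latest_height: u64) -> (u64, u64) {
    // saturating_sub handles young chains with fewer than 100 blocks.
    let start = latest_height.saturating_sub(HOT_RETENTION - 1);
    (start, latest_height)
}

fn main() {
    // Chain at height 5,000: reload blocks 4,901..=5,000.
    assert_eq!(recovery_range(5_000), (4_901, 5_000));
    // Young chain at height 40: reload everything from genesis.
    assert_eq!(recovery_range(40), (0, 40));
}
```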
2. RocksDB Corruption
Situation: Disk corruption or database file damage
Recovery Process:
Identify last valid block in RocksDB
Connect to validator peers for missing blocks
Request and persist missing blocks from network
Rebuild memory cache from restored RocksDB data
Recovery Time: 1-10 minutes (depends on extent of corruption)
3. PostgreSQL Failure
Situation: PostgreSQL server offline or data loss
Recovery Process:
Start from genesis block (height 0)
Replay all blocks from RocksDB to PostgreSQL
Re-insert orders, trades, and balances
Progress logging every 1000 blocks
Recovery Time: 1-24 hours (depends on chain height and data volume)
4. Complete Data Loss
Situation: Catastrophic failure - all storage tiers lost
Recovery Process:
Discover and connect to validator network
Request genesis block from peers
Sync full chain history block-by-block
Persist each block to RocksDB
Optionally rebuild PostgreSQL from RocksDB
Recovery Time: Hours to days (full chain sync from network)
Monitoring and Metrics
Storage Health Metrics
The storage system exposes comprehensive metrics for monitoring:
Memory Tier Metrics:
Memory usage (MB and percentage)
Blocks and orders currently cached
Total evictions triggered
RocksDB Tier Metrics:
Persistence lag (blocks behind)
Persistence queue size
Database size on disk (GB)
Write throughput rate
PostgreSQL Tier Metrics:
Active connection count
Query latency (average)
Replication lag if applicable
Overall Metrics:
Total storage consumed across all tiers
Block range available (oldest to newest)
Alert Conditions
Critical Alerts:
Memory usage >90% → Emergency eviction imminent
RocksDB persistence lag >1000 blocks → Risk of memory exhaustion
PostgreSQL offline → Query layer down
Disk space <10GB → Storage exhaustion risk
Warning Alerts:
Memory usage >80% → Approaching backpressure
RocksDB persistence lag >100 blocks → Slower than consensus
PostgreSQL query latency >100ms → Performance degradation
Storage Cost Analysis
Disk Space Requirements
Per Block (average):
Block metadata: 1KB
Order hashes (1000 orders): 32KB
Full orders (1000 orders): 500KB
Execution results: 100KB
Total: ~633KB per block
Yearly Projection (100ms blocks):
Blocks per year: 315,360,000
Storage per year: ~200TB
With compression (Lz4): ~50TB/year
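The projection above is straightforward arithmetic, worked through here (the ~50TB compressed figure implies roughly 4:1 Lz4 compression, which is an inference from the stated numbers, not a measured ratio):

```rust
fn main() {
    // Per-block size from the figures above (in KB).
    let per_block_kb: u64 = 1 + 32 + 500 + 100;
    assert_eq!(per_block_kb, 633);

    // 100ms blocks => 10 blocks/sec.
    let blocks_per_year: u64 = 10 * 60 * 60 * 24 * 365;
    assert_eq!(blocks_per_year, 315_360_000);

    // Uncompressed yearly storage, in TB (1TB taken as 10^9 KB).
    let tb_per_year = blocks_per_year * per_block_kb / 1_000_000_000;
    assert_eq!(tb_per_year, 199); // ~200TB; ~50TB at ~4:1 Lz4 compression
}
```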
Database Size Growth
PostgreSQL/TimescaleDB:
Orders table: ~1KB per order
Trades table: ~500 bytes per trade
With 1M orders/day: ~365GB/year (orders only)
With compression and retention policies: ~100GB/year
Total Storage (1 year):
RocksDB: 50TB (full history)
PostgreSQL: 100GB (queryable data)
Total: ~50.1TB/year
Hardware Recommendations
Validator Node:
NVMe SSD: 2TB minimum (6 months operation)
RAM: 16GB (for memory cache + OS)
Expansion: Plan for 10TB/year growth
Archive Node (full history):
HDD/SSD: 100TB (multi-year storage)
RAM: 8GB (read-only, less cache needed)
Conclusion
Frontier Chain's storage architecture separates concerns: validators use a high-performance two-tier system (Memory + RocksDB) for consensus, while the API server optionally maintains PostgreSQL for queryable analytics.
Validator Storage Advantages:
Performance: 0.83ms memory commits for consensus
Non-Blocking: Async RocksDB persistence never delays consensus
Durability: Persistent blockchain history on disk
Simplicity: No database dependencies for validators
Recovery: Fast crash recovery with network sync fallback
API Server Benefits:
Queryable order/trade history via PostgreSQL
TimescaleDB compression and time-series optimization
REST API endpoints for client applications
Independent from validator operations
This separation enables Frontier Chain validators to achieve 369,000 operations per second while maintaining full data integrity, with optional rich querying capabilities via the API server.