Building WorkingDB: A Toy In-Memory Database with Multi-Protocol Support

working db, toy database, rust database, 4insec

In the world of database systems, in-memory databases have carved out a crucial niche for applications that require blazing-fast data operations. Today, I'd like to share the architecture and implementation details of WorkingDB, a from-scratch in-memory database that speaks multiple protocols (a subset of Redis, Memcached, and a planned SQL interface) and incorporates modern design principles for high performance.

Core Architecture

WorkingDB is built with a modular architecture that separates concerns across distinct components. At its heart is a sharded in-memory storage engine, wrapped with protocol adapters and persistence mechanisms.

In-Memory Storage Engine

The primary storage mechanism is MemTable, which partitions data across multiple shards to optimize for concurrent access:

pub struct MemTable {
    // Sharded hash tables for parallelism
    partitions: Vec<Arc<RwLock<HashMap<Vec<u8>, Entry>>>>,
    
    // Number of partitions (shards)
    partition_count: usize,
}

Each entry in the store contains not just the value, but also metadata for features like time-to-live (TTL):

struct Entry {
    // Actual value bytes
    value: Vec<u8>,
    
    // Optional expiration time
    expires_at: Option<Instant>,
}
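
At write time, a relative TTL can be converted into an absolute Instant so that expiry checks become a simple comparison. Below is a minimal sketch of such a write path; set_with_ttl is a hypothetical helper name, and partition_for is sketched in the next snippet:

use std::time::{Duration, Instant};

impl MemTable {
    // Hypothetical write helper: store a value, converting a relative
    // TTL into an absolute expiry instant at insert time.
    pub fn set_with_ttl(&self, key: Vec<u8>, value: Vec<u8>, ttl: Option<Duration>) {
        let entry = Entry {
            value,
            expires_at: ttl.map(|d| Instant::now() + d),
        };
        let idx = self.partition_for(&key);
        if let Ok(mut guard) = self.partitions[idx].write() {
            guard.insert(key, entry);
        }
    }
}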

The partitioning strategy automatically scales to the system's available CPU cores, ensuring optimal resource utilization without manual configuration. Keys are distributed across partitions using a fast FNV-1a hashing algorithm to minimize collisions while maintaining predictable access patterns.
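
As an illustration, the hash and partition-index computation might look like the following; partition_for is a hypothetical helper name, but the FNV-1a constants are the standard 64-bit ones:

// FNV-1a 64-bit hash: standard offset basis and prime.
fn fnv1a_hash(key: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &byte in key {
        hash ^= byte as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

impl MemTable {
    // Map a key to its home partition (hypothetical helper).
    fn partition_for(&self, key: &[u8]) -> usize {
        (fnv1a_hash(key) % self.partition_count as u64) as usize
    }
}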

Multi-Protocol Support

One of WorkingDB's standout features is its ability to speak multiple database protocols. The system includes implementations for:

  1. Redis Protocol (RESP) - Supporting core commands like GET, SET, DEL
  2. Memcached Protocol - Text-based protocol implementation
  3. Query Interface - Foundations for SQL-like query capabilities (partially implemented)

The networking layer uses automatic protocol detection by examining the first few bytes of incoming connections:

pub async fn detect_protocol(&mut self) -> Result<Protocol, std::io::Error> {
    // Read initial bytes
    let n = self.socket.peek(&mut self.buffer).await?;
    
    if n == 0 {
        return Ok(Protocol::Unknown);
    }
    
    // Check for Redis protocol
    if self.buffer[0] == b'*' || self.buffer[0] == b'$' || 
       self.buffer[0] == b'+' || self.buffer[0] == b'-' || 
       self.buffer[0] == b':' {
        return Ok(Protocol::Redis);
    }
    
    // Check for Memcached protocol (text-based)
    let commands: [&[u8]; 5] = [b"get ", b"set ", b"add ", b"replace ", b"delete "];
    for cmd in &commands {
        if self.buffer.starts_with(cmd) {
            return Ok(Protocol::Memcached);
        }
    }
    
    // More protocol detection...
    
    // Default when nothing matches a known protocol
    Ok(Protocol::Unknown)
}

This approach allows clients using different protocols to connect to the same database instance without any configuration changes.
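
For example, the same logical write arrives as very different bytes depending on the client, which is what makes first-byte sniffing viable. Shown here as Rust byte-string literals:

// "SET foo bar" as a RESP array of three bulk strings (Redis clients).
let redis_set: &[u8] = b"*3\r\n$3\r\nSET\r\n$3\r\nfoo\r\n$3\r\nbar\r\n";

// The same write in the Memcached text protocol:
// set <key> <flags> <exptime> <bytes>\r\n<data block>\r\n
let memcached_set: &[u8] = b"set foo 0 0 3\r\nbar\r\n";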

Persistence Layer

While functioning primarily as an in-memory database, WorkingDB implements durability through its Append-Only File (AOF) mechanism:

pub struct AppendOnlyFile {
    // Path to AOF file
    path: PathBuf,
    // Open file handle
    file: File,
    // Write buffer for batching
    writer: BufWriter<File>,
    // Current file position
    position: u64,
    // Count of records replayed during recovery
    replay_count: usize,
}

Each write operation is logged with a detailed header that includes:

  • CRC64 checksum for data integrity verification
  • Entry size and command type
  • Precise timestamp for the operation
  • Key and value size information
  • Optional TTL information
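
Concretely, a record header along these lines would carry that metadata; the field names and widths below are illustrative rather than WorkingDB's exact on-disk layout:

// Illustrative AOF record header; the actual layout may differ.
struct AofRecordHeader {
    crc64: u64,          // CRC64 checksum over the record body
    command: u8,         // Command type tag (e.g., SET, DEL)
    timestamp_ms: u64,   // Operation time, ms since the Unix epoch
    key_len: u32,        // Key size in bytes
    value_len: u32,      // Value size in bytes
    ttl_ms: Option<u64>, // Optional TTL for expiring entries
}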

During startup, the system automatically replays the AOF to restore the in-memory state, providing crash recovery capabilities:

fn replay_existing_entries(&mut self) -> io::Result<()> {
    // Rewind to beginning
    self.file.seek(SeekFrom::Start(0))?;
    
    // Read and process each entry
    // ...
}
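
Fleshed out, the replay loop might look like this; read_record and apply_to_memtable are hypothetical helpers standing in for the elided parsing and apply logic:

fn replay_existing_entries(&mut self) -> io::Result<()> {
    // Rewind to beginning
    self.file.seek(SeekFrom::Start(0))?;
    
    // Read records until EOF, verifying each checksum before applying.
    while let Some(record) = read_record(&mut self.file)? {
        apply_to_memtable(&record);
        self.replay_count += 1;
    }
    Ok(())
}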

Memory Management

The system includes a garbage collection mechanism to handle expired entries based on their TTL settings:

pub fn gc(&self) -> usize {
    let mut total_removed = 0;
    let now = Instant::now();
    
    // Process each partition
    for partition in &self.partitions {
        if let Ok(mut guard) = partition.write() {
            // Find keys to remove
            let to_remove: Vec<Vec<u8>> = guard
                .iter()
                .filter_map(|(k, v)| {
                    if let Some(expires) = v.expires_at {
                        if now > expires {
                            return Some(k.clone());
                        }
                    }
                    None
                })
                .collect();
            
            // Remove expired entries
            for key in to_remove {
                guard.remove(&key);
                total_removed += 1;
            }
        }
    }
    
    total_removed
}

This approach ensures memory is reclaimed from expired entries without affecting concurrent operations.
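
A collector like this is typically driven from a background task. Here is a sketch using a Tokio interval, assuming mem_table is an Arc<MemTable>; the 10-second period is an arbitrary illustrative choice:

// Periodic background sweep; assumes mem_table: Arc<MemTable>.
let table = mem_table.clone();
tokio::spawn(async move {
    let mut ticker = tokio::time::interval(std::time::Duration::from_secs(10));
    loop {
        ticker.tick().await;
        let removed = table.gc();
        if removed > 0 {
            println!("gc: reclaimed {} expired entries", removed);
        }
    }
});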

Resilience Testing with Chaos Engineering

To help ensure robustness, WorkingDB includes a built-in chaos engine that can deliberately simulate various failure conditions:

pub enum ChaosType {
    // Kill process signals
    ProcessKill,
    
    // Memory pressure (allocation failures)
    MemoryPressure,
    
    // Disk failures (I/O errors)
    DiskFailure,
    
    // Network partitions
    NetworkPartition,
    
    // Clock skew (time jumps)
    ClockSkew,
}
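
Fault injection is consulted at the relevant boundary. The sketch below shows how a disk-failure fault could surface as an I/O error before an AOF write; ChaosEngine and its active field are hypothetical names:

use std::io;

pub struct ChaosEngine {
    active: Option<ChaosType>, // Currently injected fault, if any
}

impl ChaosEngine {
    // Called before disk writes; surfaces an injected I/O error
    // when a DiskFailure fault is active.
    pub fn check_disk(&self) -> io::Result<()> {
        if matches!(self.active, Some(ChaosType::DiskFailure)) {
            return Err(io::Error::new(
                io::ErrorKind::Other,
                "chaos: injected disk failure",
            ));
        }
        Ok(())
    }
}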

This allows testing how the system responds to different types of failures, helping to build confidence in its resilience.

Asynchronous Runtime

WorkingDB leverages Tokio as its asynchronous runtime, enabling efficient handling of concurrent connections without the overhead of thread-per-connection models:

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize system components
    // ...
    
    // Start TCP server
    let server = TcpServer::new(args.host, args.port, state.clone());
    if let Err(e) = server.run().await {
        eprintln!("💥 Fatal error: {}", e);
        exit(1);
    }
    
    Ok(())
}
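
Inside run(), the accept loop hands each connection to a lightweight task rather than a dedicated OS thread. A minimal sketch, where handle_connection is a hypothetical dispatcher:

use tokio::net::TcpListener;

async fn run(addr: &str) -> std::io::Result<()> {
    let listener = TcpListener::bind(addr).await?;
    loop {
        let (socket, peer) = listener.accept().await?;
        // One cheap async task per connection.
        tokio::spawn(async move {
            if let Err(e) = handle_connection(socket).await {
                eprintln!("connection {} error: {}", peer, e);
            }
        });
    }
}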

Advanced Storage Options (Planned)

For high-performance storage needs, WorkingDB includes experimental support for direct device access through the NVMe module:

pub struct NvmeAccess {
    // Device path (e.g., "/dev/nvme0n1")
    path: String,
    
    // Block size for aligned I/O
    block_size: usize,
    
    // File handle (when opened)
    file: Option<File>,
}

This allows bypassing kernel buffer layers for maximum I/O performance when needed.
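
On Linux, the usual way to bypass the page cache is O_DIRECT combined with block-aligned buffers. The sketch below (using the libc crate) shows one way the device might be opened, not necessarily how WorkingDB's module does it:

use std::fs::OpenOptions;
use std::os::unix::fs::OpenOptionsExt;

impl NvmeAccess {
    // Open the device with O_DIRECT so I/O bypasses the kernel
    // page cache; subsequent buffers must be block_size-aligned.
    fn open_direct(&mut self) -> std::io::Result<()> {
        let file = OpenOptions::new()
            .read(true)
            .write(true)
            .custom_flags(libc::O_DIRECT)
            .open(&self.path)?;
        self.file = Some(file);
        Ok(())
    }
}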

Current State and Next Steps

While the core functionality is implemented, several areas have been identified for optimization:

  1. Implementing true lock-free data structures to replace the current RwLock approach
  2. Adding buffer pooling for zero-copy command parsing
  3. Batching AOF operations for improved write throughput
  4. Implementing parallel query execution for complex operations

Visit https://workingdb.4insec.com to read more, and see https://github.com/nassdaq/workingdb for the source code.