Frequently Asked Questions

Honest answers about ALOS DB, its current state, sync modes, concurrency, crash protection, and how it compares to other databases.

General Questions

What is ALOS DB?

Architecture

ALOS DB is a document-oriented database written in Go. It stores data as MessagePack-encoded documents in append-only shard files, using memory-mapped I/O for reads and a single-writer-per-shard pattern for writes. It is designed to be a lightweight, high-performance alternative to MongoDB and SQLite for applications that need document storage without the overhead of a full database server.

Key design goals:

  • Single binary — One executable, no dependencies, no JVM, no Docker required.
  • Document model — Schemaless collections with JSON-like documents stored as binary MessagePack.
  • Fast reads — Memory-mapped files mean the OS handles caching; no deserialization for read-only paths.
  • Simple deployment — Run the binary, point it at a data directory, connect the client.

ALOS DB is not a SQL database. It does not support joins, foreign keys, or complex aggregations. If you need relational features, use PostgreSQL or SQLite.

Is ALOS DB production-ready?

Durability

No. ALOS DB is currently in testing and pre-release. It has not been deployed to production environments handling millions of users. The codebase is actively evolving, APIs may change, and while we perform regular testing, you should not rely on it for critical data without thorough validation in your own environment.

What has been tested:

  • Basic CRUD operations (insert, find, update, delete) across thousands of documents
  • ACID transactions with rollback on failure
  • Client reconnection after network drops
  • Sync and async mode behavior with clean shutdowns
  • Single-node operation only — clustering is experimental

What has not been tested at scale:

  • Multi-terabyte datasets
  • Thousands of concurrent client connections
  • Long-term uptime (weeks/months) without restart
  • Multi-datacenter replication under real-world network partitions

Run your own benchmarks on your hardware before making any deployment decisions. No database is magic; only your measurements matter.

What programming languages are supported?

Architecture

Go is the only officially supported language right now. The server is a single Go binary, and the official client library is also written in Go. The wire protocol uses length-prefixed MessagePack frames over TCP, so in theory any language with TCP and MessagePack support could write a client — but none exist yet.

Planned support:

  • Python — A Python client is in development. No release date yet.

If you want to write a client in another language (Rust, C#, JavaScript, etc.), the protocol is documented in the Go source. Note, however, that the wire protocol carries no stability guarantees yet and may change between pre-release versions.

What license is ALOS DB under?

Architecture

ALOS DB is proprietary software. It is not open source. You may use it according to the terms provided by the author. For commercial licensing questions, contact the developer directly.

Where can I get help or report bugs?

Architecture

Join the community on Discord: https://alos.gg/discord

  • Check this documentation first — most common questions are answered here.
  • Enable client logging with alosdbclient.SetClientLogging(true) to diagnose connection issues.
  • Check the server logs — they are written to stdout by default.
  • Report bugs or feature requests in the Discord or contact the developer directly.

Sync Modes

What is the difference between async and sync mode?

Performance Durability

ALOS DB can run in two modes, set at startup via the --sync-mode flag. This is the single most important configuration decision:

| Aspect | Async Mode | Sync Mode |
|---|---|---|
| Write speed (remote) | Faster — writes buffered in memory | Slower — every write fsync'd to disk |
| fsync behavior | Kernel flushes in background | Explicit fsync before ack |
| Crash data loss | Last few seconds may be lost | Zero loss for acknowledged writes |
| Latency | Lower — ack returned immediately | Higher — bounded by disk flush speed |
| Best for | Logs, telemetry, game state, caches | User accounts, purchases, inventory |
| Realistic remote throughput | ~70K ops/sec InsertOne, ~548K docs/sec InsertMany (16 workers, pool=64) | ~25-35K ops/sec InsertOne (16 workers, pool=64) |

In async mode, when the client sends a write, the server appends it to an in-memory buffer and immediately returns success. The buffer is flushed to disk periodically by a background worker. If the machine loses power before the flush, the buffered data is gone. This is the same trade-off MongoDB makes with w: 1, j: false or Redis with appendfsync everysec.

In sync mode, the server calls file.Sync() (fsync on Unix, FlushFileBuffers on Windows) on the data file before sending the success response. The client only proceeds once the data is physically on the storage medium. This guarantees that any acknowledged write survives a crash, at the cost of latency.

You cannot mix sync and async mode in the same server instance. If you need both behaviors, run two ALOS DB instances on different ports — one with --sync-mode async and one with --sync-mode sync.

Can I switch between async and sync mode at runtime?

Durability

No. The sync mode is set at server startup via --sync-mode and cannot be changed without restarting the server. Switching at runtime would be unsafe: inflight async writes might not be on disk yet, and suddenly demanding fsync for every operation would cause a massive latency spike.

Does async mode mean I always lose data on crash?

Durability

No. "Async" means the server does not wait for fsync, but the operating system still holds the data in its page cache. On a clean shutdown (SIGTERM), the OS flushes everything. Data loss only happens on:

  • Power failure or hard reset
  • Kernel panic
  • OS crash before dirty page writeback fires

If your server is on a UPS (uninterruptible power supply) with battery-backed RAID cache or enterprise NVMe (which have capacitors to flush in-flight writes), the window for data loss is extremely small. For game telemetry or session state where losing the last 1-2 seconds is acceptable, async mode is fine. For financial transactions, use sync mode.

How does the write buffer work in async mode?

Performance

ALOS DB uses a multi-stage write pipeline:

  1. Client batching — The Go client buffers requests for up to 1ms (configurable) or until 100 documents are pending, then sends them as a single TCP frame.
  2. Server work queue — Incoming batches are placed on a buffered channel per shard. Each shard has its own worker goroutine.
  3. Worker coalescing — The shard worker dequeues batches, appends them to the shard's data file, and updates the in-memory index.
  4. Background flush — A separate goroutine periodically calls sync_file_range to hint the kernel to flush dirty pages, without blocking the write path.

The write path is designed to be CPU-efficient: documents are serialized to MessagePack once, then written directly to the file with minimal copying.

What are realistic remote performance numbers?

Performance

Here are actual benchmark numbers from the client library over localhost TCP (AMD Ryzen 7 5700X, NVMe SSD, async mode, 16 workers, pool=64):

  • InsertOne: ~70K ops/sec, ~14.3 µs/op
  • InsertMany (50 docs): ~11K ops/sec, ~548K docs/sec
  • FindOne by _id: ~90K reads/sec, ~11.1 µs/op
  • FindMany (~128 docs): ~2.8K ops/sec, ~357K docs/sec
  • UpdateOne: ~55K ops/sec, ~18.3 µs/op
  • UpdateMany (~128 docs): ~639 ops/sec, ~82K docs/sec
  • DeleteOne: ~49K ops/sec, ~20.4 µs/op
  • DeleteMany (~128 docs): ~1.4K ops/sec, ~177K docs/sec
  • UpsertOne: ~51K ops/sec, ~19.6 µs/op
  • UpsertMany (~128 docs): ~1.2K ops/sec, ~157K docs/sec
  • Count: ~16K ops/sec, ~62.7 µs/op
  • Aggregate ($match + $group): ~35K ops/sec, ~28.9 µs/op

These numbers will vary dramatically based on:

  • Network latency (localhost vs. WAN)
  • Disk speed (SATA SSD vs. NVMe vs. HDD)
  • Document size (larger documents = fewer ops/sec)
  • Number of concurrent clients
  • Whether indexes exist on queried fields

In sync mode, write throughput is roughly 2-3x lower because the server fsyncs every write. However, the exact multiplier depends on your disk's fsync latency.

These are single-machine, single-client numbers from testing. They are not production benchmarks. Always run your own benchmarks with your document sizes and query patterns.

Concurrency & File Access

How does ALOS DB handle concurrent reads and writes to the same file?

Concurrency

ALOS DB splits data across multiple independent shards (default 256). Each shard owns:

  • Its own append-only .db data file
  • Its own set of index files (.sidx, .sgidx, .svidx)
  • Its own worker goroutine for writes
  • Its own in-memory hot cache

Because each shard is isolated, there is no global lock on the database. A write to shard 42 never blocks a read from shard 7. Within a single shard:

  • Writes are serialized through the shard's worker goroutine. The worker dequeues batches from a channel and appends them to the data file. No mutex is needed because there is only one writer.
  • Reads use memory-mapped I/O (mmap). A reader sees the data file as it existed at the moment the read began. Even if the writer appends new data, existing mmap'd pages remain valid for the reader.

The only synchronization is the shard worker channel, which is a buffered Go channel. Writers send; the worker receives. No mutex, no futex, no kernel syscalls for locking.

What is the single-writer pattern and why does it matter?

Concurrency

Most databases use a reader-writer lock (like sync.RWMutex) to protect the B-tree or LSM-tree in memory. When a writer holds the lock, all readers block. Under heavy write load, readers can starve.

ALOS DB avoids this by giving each shard exactly one writer goroutine. All writes for that shard go through that goroutine's channel. Because there is only one writer, there is no need for a lock around the data structure. The writer simply appends to the log and updates the index.

Readers access the mmap'd file directly without any locks. They use the index to find offsets, then read the data. The index is an immutable sorted slice that the worker replaces atomically after a batch. Readers hold a pointer to the old index until they finish; Go's garbage collector cleans it up later.

The result: readers never block writers, and writers never block readers. Only write-write conflicts on the same document require coordination, which is handled by MVCC.

Can multiple goroutines read the same collection safely?

Concurrency

Yes. Reads are fully concurrent and lock-free. You can have hundreds of goroutines reading the same document simultaneously, and they will all see a consistent snapshot without blocking each other or the writer.

Each read gets a reference to the current index snapshot (a pointer to an immutable sorted slice). The document data is read directly from the mmap'd file. Because the file is append-only, old data is never overwritten in place. A reader holding a pointer to offset 1,204,832 will always find valid data there, even if the writer has appended megabytes of new documents after it.

How are transactions isolated from concurrent writes?

Concurrency

ALOS DB uses Multi-Version Concurrency Control (MVCC). When a transaction begins, it captures the current database timestamp. All reads inside the transaction use that timestamp to look up the index snapshot that existed at that moment.

While the transaction is running:

  • Other writers can commit new documents. The transaction does not see them.
  • The transaction's own writes are buffered in memory and only visible to itself until commit.
  • On commit, the transaction's writes are merged into the shard worker's queue with the commit timestamp. Future transactions will see them.

If two transactions try to write the same document, the second one to commit detects the conflict (the document's version timestamp changed since the transaction began) and rolls back automatically. This is called optimistic concurrency control.

What happens under extreme write pressure?

Performance

ALOS DB has backpressure mechanisms to prevent unbounded memory growth:

  • Per-shard write queues are bounded. If a shard's channel fills (default 4096 slots), new writes block until the worker catches up. This propagates backpressure to the client naturally.
  • Client batching limits the number of in-flight requests. The client-side batch buffer has a soft limit; exceeding it triggers an immediate flush.
  • Async write semaphore on the server limits the total number of concurrent async writes across all shards.
  • Compaction throttling — Background compaction runs at low priority and yields to foreground writes.

Under sustained load beyond hardware capacity, latency increases gracefully rather than crashing. The server logs warnings when queues exceed 80% capacity.

Durability & Crash Protection

How does crash protection work in sync mode?

Durability

Sync mode provides the strongest durability guarantee:

  1. Every write is appended to the shard's .db file.
  2. The index entry is updated in memory.
  3. file.Sync() (fsync on Unix, FlushFileBuffers on Windows) is called on the data file. This blocks until the OS confirms the data is on the storage medium.
  4. Only after Sync() returns does the server send the success response to the client.

If the server crashes between steps 1 and 3, the client never receives an acknowledgement and can retry. If it crashes after step 3, the data is safe. On restart, ALOS DB replays the append-only log from the last known good offset and reconstructs the indexes. Any partial write at the end of the file is detected via length validation and discarded.

Additionally, sync mode computes a CRC32C checksum for every document. If a document's checksum fails on read, it is treated as corrupted and skipped during index rebuild.

How does crash recovery work in async mode?

Durability

Async mode recovery follows the same replay logic, but with a smaller safety window:

  1. On startup, ALOS DB scans every .db file from the beginning.
  2. It validates each document's header (length prefix + CRC). Documents with invalid CRCs are skipped.
  3. It rebuilds all in-memory indexes from scratch. The on-disk index files are treated as caches and regenerated if corrupt.
  4. It truncates the file to the last valid document boundary, removing any partial write at the end.

The key difference from sync mode is how much data might be at the trailing edge. In async mode, the kernel might have a few seconds of unflushed dirty pages. If power is lost, those pages evaporate. ALOS DB will recover to the last point the kernel actually wrote to disk, which could be slightly behind the last acknowledged write.

This is identical to how MongoDB (without journal) or Redis (with AOF but appendfsync everysec) behaves. It is a deliberate performance trade-off.

What about partial writes or torn pages?

Durability

Modern storage devices guarantee atomic sector writes (typically 4KB or 512 bytes). ALOS DB structures every document so that its header (length + CRC) fits within a single sector. This means:

  • If power fails mid-write, either the sector is fully written or fully not written. You never get a half-header.
  • The CRC catches any bit-rot or DMA corruption.
  • The recovery scanner stops at the first invalid document and truncates everything after it.

For documents larger than the sector size, the data is written as multiple sectors, but the header is always the last sector written. This ensures that if a large document is torn, the header will either be present (document is valid) or absent (document is ignored). You cannot have a valid header with invalid data.

Can I disable CRC for more speed?

Performance

Yes, with --disable-crc. This removes the CRC computation and validation from the hot path. On small documents, this can improve write throughput by a small percentage.

However, you lose protection against:

  • Kernel page cache corruption (rare but documented)
  • DMA errors from faulty RAM
  • Silent data corruption from SSD firmware bugs

Only disable CRC if you have error-correcting RAM (ECC) and you have benchmarked that CRC is actually your bottleneck. For most users, CRC overhead is negligible compared to network and disk I/O.

How do backups work? Can I copy files while the server is running?

Durability

Yes, because ALOS DB files are append-only and never modified in place. You can safely:

  • Copy the entire data directory with cp or rsync while the server is running.
  • Use filesystem snapshots (LVM, ZFS, btrfs) without stopping the server.
  • Run tar on a live database.

The copied files will be a consistent snapshot of the database at the moment the copy started. Any writes that happen during the copy will not appear in the backup. This is the same principle as PostgreSQL's pg_basebackup.

For portable backups, use the built-in Export and Import APIs, which produce a msgpack stream.

Comparisons with Other Databases

How does ALOS DB compare to MongoDB?

Performance Architecture

| Feature | ALOS DB | MongoDB |
|---|---|---|
| Deployment | Single binary, ~26MB (dev) / ~55MB (Linux prod) | Server + tools, ~500MB |
| Memory footprint | Low — mmap'd files + small cache | High — WiredTiger cache + heap |
| Query language | Go API with Document maps | MongoDB Query Language (MQL) |
| Aggregation | Basic pipeline | Full aggregation framework |
| Transactions | MVCC, single-node | Multi-document (4.0+) |
| Replication | Experimental cluster sync | Mature replica sets |
| Maturity | Pre-release, testing phase | Production-grade for 15+ years |
| Ecosystem | Go client only | Drivers for every language, Compass, Atlas |

Use MongoDB if you need a mature ecosystem, full aggregation pipelines, GUI tools, or multi-document ACID across sharded clusters.

Use ALOS DB if you want a lightweight, single-binary document store with minimal memory usage and simple deployment. Be aware it is pre-release software.

How does ALOS DB compare to PostgreSQL?

Architecture

PostgreSQL is a general-purpose relational database. ALOS DB is a specialized document store. They solve different problems:

  • Schema — PostgreSQL enforces schemas and types. ALOS DB is schemaless.
  • Joins — PostgreSQL has sophisticated query planning for joins. ALOS DB has no joins; embed related data or query separately.
  • JSON — PostgreSQL's JSONB is flexible but slower than ALOS DB for simple key-value or indexed lookups.
  • Write speed — ALOS DB async is faster for unbatched inserts because it avoids WAL fsync and B-tree maintenance per write.
  • Complex queries — PostgreSQL wins for analytics, window functions, CTEs, and arbitrary SQL.

Many architectures use both: PostgreSQL for relational data and ALOS DB for high-velocity event streams.

How does ALOS DB compare to Redis?

Performance

Redis and ALOS DB both prioritize speed, but for different use cases:

  • Redis is an in-memory data structure server. All data must fit in RAM. Persistence is optional (RDB snapshots, AOF). It is unbeatable for caching, pub/sub, rate limiting, and leaderboards.
  • ALOS DB is a disk-backed document database. Data lives on SSD/HDD and is mmap'd into memory on demand. It is designed for datasets larger than RAM.

If your dataset fits in RAM and you can tolerate losing a few seconds of data on crash, Redis is the better cache. If your dataset is larger than RAM or you need durable document storage, ALOS DB is more appropriate.

A common pattern is Redis as the hot cache in front of ALOS DB, with ALOS DB as the durable source of truth.

How does ALOS DB compare to SQLite?

Architecture

SQLite is the gold standard for embedded relational databases. ALOS DB differs in several ways:

  • Concurrency — SQLite locks at the database level: in rollback-journal mode writers block readers, and even in WAL mode only one writer runs at a time. ALOS DB uses shard-level single-writer + lock-free reads, so writers never block readers.
  • Scale — SQLite is single-file and single-writer. ALOS DB shards across 256 files and scales with CPU cores.
  • Network — SQLite is embedded (in-process). ALOS DB is a network server with a TCP client.
  • Data model — SQLite is relational (tables, rows, SQL). ALOS DB is document-oriented (collections, documents, msgpack).

Use SQLite for mobile apps, local tools, and small embedded systems. Use ALOS DB when you need a network-accessible database with high concurrency.

How does ALOS DB compare to ScyllaDB / Cassandra?

Architecture

ScyllaDB and Cassandra are distributed wide-column stores for multi-datacenter replication. ALOS DB is a single-node document store.

  • Consistency — ALOS DB offers strong consistency on a single node. Cassandra offers tunable consistency (eventually consistent by default).
  • Deployment — ALOS DB runs on one machine. Cassandra needs at least 3 nodes for production.
  • Replication — ALOS DB has experimental cluster sync. For automatic failover across datacenters, use Cassandra or CockroachDB.

Performance

What hardware do I need for good performance?

Performance

ALOS DB runs well on modest hardware, but for high throughput:

  • CPU — Any modern multi-core CPU. More cores help because writes are sharded across workers. ALOS DB has been tested on AMD Ryzen 7 5700X (8 cores).
  • RAM — 8GB minimum, 16-32GB recommended. The OS uses free RAM to cache mmap'd files. More RAM = more hot data in memory.
  • Disk — NVMe SSD strongly recommended for sync mode. SATA SSD is acceptable for async mode. HDD works but limits throughput significantly.
  • Network — For remote clients, gigabit Ethernet or better. The protocol is efficient, but large documents or high QPS can saturate 100Mbps.

The bottleneck is usually disk I/O in sync mode, or network latency in remote deployments. CPU is rarely the bottleneck unless you are doing complex aggregations.

Why is ALOS DB faster than some other document databases?

Performance

Several design decisions reduce overhead:

  1. No JSON parsing — Documents are stored as MessagePack, a binary format. No string parsing on every read/write.
  2. Append-only log — Updates do not modify existing data. No in-place B-tree updates, no page splitting.
  3. Zero-copy reads — Memory-mapped files mean the OS handles caching. No user-space buffer copies.
  4. Single-writer shards — No mutex contention. Each shard has one writer goroutine and unlimited lock-free readers.
  5. Assembly-optimized hot paths — Critical loops (hashing, index searching) use Go assembly for AMD64.
  6. Minimal allocations — The write path pre-allocates slices and uses value types where possible.

However, these optimizations are specific to ALOS DB's design. A general-purpose database like PostgreSQL or MongoDB does more work per query (query planning, secondary index maintenance, replication logs) because they support more features. Speed is a trade-off against flexibility.

How does the hot cache work?

Performance

The hot cache is an LRU cache of recently accessed documents, stored in deserialized form. When a read request arrives:

  1. Check the hot cache by document ID (O(1) hashmap lookup).
  2. If miss, check the index for the disk offset.
  3. If found on disk, read via mmap and populate the cache.

The cache is per-shard. Writes invalidate affected cache entries. The default size is small; tune it with --cache-mb if you have RAM to spare.

For read-heavy workloads where the same documents are accessed repeatedly, the hot cache can significantly improve read latency by avoiding disk access entirely.

What is compaction and when does it run?

Performance

Because ALOS DB is append-only, updates and deletes leave stale data behind. Over time, data files grow larger than the live data. Compaction rewrites each shard's data file, keeping only live documents.

Compaction triggers when:

  • The stale ratio exceeds 30% (configurable via --compact-stale-ratio)
  • The file is larger than the minimum size (--compact-min-size)
  • And --compaction is enabled

During compaction, the shard briefly locks to swap the old file for the new one. Reads continue via the old mmap'd file until the swap. Writes are buffered and applied after compaction.

Compaction runs in the background at low priority. On fast NVMe, a 1GB shard compacts in a few seconds.

Architecture

How is data sharded? Can I change the number of shards?

Architecture

Documents are sharded by hashing their _id field. The hash modulo --shards determines which shard owns the document.

You set the shard count at database creation time via --shards. Once data exists, changing the shard count requires a full rebuild (export and re-import) because every document would hash to a different shard.

Guidelines:

  • 256 shards (default) — Good for most workloads.
  • 512-1024 shards — Better for many CPU cores or very large datasets. More shards = less data per shard = faster compaction.
  • Too many shards (>2048) wastes memory because each shard has fixed overhead for its index and channel buffers.

What file formats does ALOS DB use?

Architecture

Each database is a directory containing subdirectories per shard:

```text
data/
  mydb/
    shard_000.db          # Append-only document log
    shard_000.sidx        # Primary index (id -> offset)
    shard_000.sgidx.0     # Group index segment
    shard_000.sgidx.1     # Group index segment
    shard_000.sgidx.2     # Group index segment
    shard_000.sgidx.3     # Group index segment
    shard_000.svidx.0     # Value index segment
    shard_000.svidx.1     # Value index segment
    shard_001.db
    ...
```

  • .db — The append-only document store. Each document is prefixed by a 4-byte length and a 4-byte CRC32C.
  • .sidx — The sorted primary index. Rebuilt on startup from the .db file.
  • .sgidx.N — Group indexes for indexed fields. Documents with the same field value are grouped.
  • .svidx.N — Value indexes for range queries. Stores sorted field values with document offsets.

How does indexing work? Do I need to create indexes manually?

Architecture

ALOS DB indexes are fully automatic and zero-config:

  • Primary index — Always exists. Maps _id to disk offset. Built automatically from the .db file on startup.
  • Secondary indexes — Built automatically on first query for any field. No manual CreateIndex needed. Every field is indexed as soon as it is queried.

Indexes are maintained automatically on all writes. You never create, drop, or manage indexes manually.

Queries on indexed fields are O(log n). Unlike MongoDB, ALOS DB does not support compound indexes (multi-field) yet.

In practice, ALOS DB has been tested on a simulated dataset of 16 million Discord message logs with every field automatically indexed. Queries across the entire dataset — including range lookups by timestamp, regex searches within message content, and filtered aggregations — consistently complete in under 15 ms. The auto-indexing system handles all keys without any manual intervention.

What is cluster mode and how does replication work?

Architecture

Cluster mode enables multi-node replication. When enabled:

  1. The primary node accepts writes.
  2. Secondary nodes connect to the primary via --cluster-port.
  3. The primary streams write operations to secondaries.
  4. Secondaries apply writes to their local shards.

Important limitations:

  • Experimental — Cluster mode is not production-ready. It has been tested with small clusters only.
  • Asynchronous — Secondaries lag behind the primary. Not suitable for strong consistency across nodes.
  • No automatic failover — If the primary dies, you must manually promote a secondary.
  • Single primary — Only the primary accepts writes. Secondaries are read-only.

For most users, a single ALOS DB node with good backups is recommended. Use cluster mode only for experimentation or warm standbys.

Client & Network

Why does my client disconnect randomly and how do I debug it?

Architecture

TCP connections can drop for many reasons: idle timeouts, NAT evictions, server restarts, or network hiccups. The Go client has automatic reconnection:

  • If a network error occurs, the transport closes the old socket and opens a new one.
  • The request is retried up to 3 times with exponential backoff.
  • If all retries fail, the error is returned to the caller.

To debug disconnections, enable client logging:

```go
import "github.com/guno1928/alosdbclient"

func init() {
    alosdbclient.SetClientLogging(true)
}
```

Common log patterns:

  • READ_LEN_FAILED err=EOF — Server closed the connection (idle timeout or restart).
  • READ_LEN_FAILED err=i/o timeout — Network latency spike or server overload.
  • CONNECT_FAILED — Server is down or unreachable.

How does connection pooling work?

Architecture

By default, the client creates a single TCP connection. For higher throughput, increase the pool size:

```go
db, err := alosdbclient.Connect("localhost:6900",
    alosdbclient.WithPoolSize(8),
)
if err != nil {
    panic(err) // handle the connection error appropriately
}
```

The pool uses round-robin: each request cycles through available connections. This prevents head-of-line blocking where one slow request stalls others.

Each connection maintains its own TCP socket and encryption state. If one drops, only requests routed to it are affected. Dropped connections are re-established automatically on next use.

For most workloads, 4-8 connections is optimal. More than 16 rarely helps and can exhaust server file descriptors.

How does the encryption handshake work?

Architecture

When credentials are configured, ALOS DB uses a PSK (Pre-Shared Key) handshake instead of TLS. This avoids X.509 certificate parsing overhead:

  1. Client generates a random 32-byte salt and sends it.
  2. Server generates its own salt, derives keys using HKDF-SHA256 from the PSK (derived from username:password), and sends its salt plus a proof.
  3. Client verifies the server's proof, then sends its own proof.
  4. Both sides now have AES-256-GCM keys for sending and receiving.

Every packet is encrypted with AES-256-GCM and carries a monotonically increasing counter; replayed packets are rejected because the counter must always increase. The handshake takes ~1ms on modern CPUs.
