ALOS DB indexes are fully automatic. There is nothing to create, nothing to drop, and nothing to tune. Every field you query is indexed on first access and stays indexed forever.
ALOS DB is the only document database with fully automatic index management.
There is no `createIndex()`, no `ensureIndex()`, and no `dropIndex()`. The index system is entirely self-managing.
The first time you query a field, ALOS DB automatically builds a B-tree index on that field. Every subsequent query on the same field uses the index instantly. This happens transparently — your code never touches index management:
```go
// Just query. Indexes are created automatically.
doc, _ := users.FindOne(alosdbclient.Document{
	"guild_id":  "123456789",
	"author.id": "987654321",
})
// Both "guild_id" and "author.id" are now indexed.
// Every future query on these fields is instant.
```
In the example dataset, fields like the following are all indexed automatically:

- `guild_id`, `channel_id`, `type`
- `author.id`, `author.username`, `author.discriminator`
- `author.global_name`
- `content`, and nested fields at any depth
No manual index management. ALOS DB watches what you query and builds indexes automatically. You never have to think about `createIndex`, index ordering, compound indexes, or index maintenance. It just works.
| Feature | ALOS DB | MongoDB / PostgreSQL |
|---|---|---|
| Index creation | Automatic | Manual createIndex() |
| Index dropping | Not needed | Manual dropIndex() |
| Compound indexes | Automatic multi-field | Manual compound definition |
| Index tuning | Zero config | Manual explain + tuning |
| Nested field indexes | Automatic | Must define dot-notation index |
| Forgotten indexes | Impossible | Common production issue |
ALOS DB delivers sub-15ms query latency at 60 million documents and sub-25ms at 300 million documents. These numbers hold for complex multi-condition queries with nested fields, `$or`, `$in`, `$regex`, `$exists`, and more.
These benchmarks were measured on a real production dataset — 60 million Discord message documents with 22+ top-level and nested keys per document. Every query was a complex, multi-condition filter. Not synthetic key-value lookups.
Operators exercised include `$and`, `$or`, `$in`, `$exists`, `$ne`, `$regex`, and nested dot-notation fields.
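For illustration, a filter in the same shape as those benchmark queries might look like the sketch below. It reuses the client API from the earlier example; the nested-`Document` operator encoding and the concrete values are assumptions, not the suite's actual queries:

```go
// Hypothetical multi-condition filter in the benchmarked style.
// The operator encoding and all values are illustrative assumptions.
doc, _ := users.FindOne(alosdbclient.Document{
	"guild_id": "123456789",
	"$or": []alosdbclient.Document{
		{"author.username": alosdbclient.Document{"$regex": "^john"}},
		{"type": alosdbclient.Document{"$in": []any{0, 19}}},
	},
	"content": alosdbclient.Document{"$exists": true},
})
_ = doc
```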
ALOS DB ships with a dedicated test suite of 800 complex queries — 400 hit queries (expected matches) and 400 miss queries (expected zero results). Every query must complete in under 15ms. These are real queries against a real dataset, not synthetic benchmarks.
| Test Suite | Total | Pass (≤15ms) | Pass Rate | Heap (in-use) |
|---|---|---|---|---|
| Hit Queries | 400 | 400 | 100% | — |
| Miss Queries | 400 | 400 | 100% | — |
| Combined | 800 | 800 | 100% | — |
The 800 queries cover every operator and pattern that a real application would use:
- Top-level fields (`guild_id`, `channel_id`, `type`)
- Nested fields (`author.id`, `author.username`, `author.discriminator`)
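A hit/miss pair in the spirit of the suite might look like this sketch; the `messages` collection handle and every ID are hypothetical stand-ins:

```go
// Illustrative hit/miss pair. The messages collection and all IDs
// are made-up stand-ins, not values from the real suite.
hit, _ := messages.FindOne(alosdbclient.Document{
	"guild_id":  "123456789",
	"author.id": "987654321", // expected to match a document
})
miss, _ := messages.FindOne(alosdbclient.Document{
	"guild_id":  "123456789",
	"author.id": "no-such-author", // expected to match nothing
})
_, _ = hit, miss
```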
All 800 queries passed in under 15ms across 5 independent iterations (4,000 total runs), with zero failures. Hit queries averaged 2.03ms (p95 4.6ms, max 7.6ms). Miss queries averaged 0.35ms (p95 1.5ms, max 3.6ms). Even the most complex cases (deeply nested `$or` clauses with large `$in` arrays and multiple `$regex` patterns) comfortably cleared the 15ms threshold.
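The per-query gate itself is simple to picture. Here is a minimal sketch of a 15ms threshold check in Go's testing style; it is not the project's actual suite, and `runQuery` is a stand-in:

```go
import (
	"testing"
	"time"
)

// Illustrative shape of the per-query 15ms gate. runQuery stands in
// for executing one of the 800 benchmark filters.
func checkUnder15ms(t *testing.T, runQuery func()) {
	start := time.Now()
	runQuery()
	if elapsed := time.Since(start); elapsed > 15*time.Millisecond {
		t.Fatalf("query exceeded the 15ms threshold: %v", elapsed)
	}
}
```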
At 60 million documents with 22+ keys and sub-keys per document, ALOS DB uses only 1.4 GB of RAM while maintaining sub-15ms query performance. This is orders of magnitude less than what other databases require for the same workload.
- Doc-ID lists for non-unique indexes live in a compact binary backing (`snapshotBacking`) rather than as Go slices of strings. Each doc group is a `(pos, count)` pair pointing into the binary blob, not a heap-allocated list.
- `sync.Pool` is used aggressively for buffers, decoders, and temporary slices.

| Component | Typical Size | Notes |
|---|---|---|
| B-tree index per field | 2-5 MB | Keys + tree metadata |
| Snapshot backing (non-unique) | 1-3 MB per field | Binary-encoded doc ID lists |
| Shard overlays (256 shards) | <1 MB total | Drained on each snapshot |
| Offset tables | ~100 MB at 60M docs | Required for O(1) doc lookup |
| Document payloads | 0 MB | Stored on disk, read on demand |
1.4 GB for 60 million documents means ALOS DB uses roughly 24 bytes per document of RAM. Compare that to MongoDB's typical 100-500 bytes per document for WiredTiger cache overhead alone.
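A minimal sketch of the `(pos, count)` doc-group layout described above; only the `snapshotBacking` name comes from these docs, and the fixed-width 8-byte doc-ID encoding is an assumption for illustration:

```go
import "encoding/binary"

// docGroup points into a shared binary blob instead of owning a
// heap-allocated []string. Only (pos, count) lives in memory per key.
type docGroup struct {
	pos   uint32 // byte offset into the shared binary backing
	count uint32 // number of doc IDs stored at that offset
}

// decodeDocIDs materializes one group's IDs on demand. The 8-byte
// fixed-width ID encoding is assumed purely for this sketch.
func decodeDocIDs(snapshotBacking []byte, g docGroup) []uint64 {
	ids := make([]uint64, 0, g.count)
	off := int(g.pos)
	for i := uint32(0); i < g.count; i++ {
		ids = append(ids, binary.LittleEndian.Uint64(snapshotBacking[off:off+8]))
		off += 8
	}
	return ids
}
```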
Rebuilding the full index for 60 million documents with 22+ keys and sub-keys takes only 8 minutes. This is currently the fastest full index rebuild of any document database at this scale.
A full rebuild re-reads every document from disk, extracts every field value, and reconstructs every B-tree index from scratch. This is the operation that runs when:
- An explicit `RebuildIndex()` call drops and reconstructs all indexes for a collection

Field values are hashed for indexing through `HashFieldValuesForIndex()`, and the critical scan loop is structured to avoid `sync.Pool` overhead.

Queries don't wait for a full rebuild. Phase 1 completes in a fraction of the total time and unlocks all query functionality. Phase 2 runs in the background and only adds optimization layers. Your application is responsive within seconds of a cold start, even with 60 million documents.
Every index is a B-tree with a maximum of 128 keys per node. Leaf nodes form a linked list for efficient range traversal. The tree supports both unique indexes (one doc per key) and non-unique indexes (many docs per key via `docGroup` snapshot references).
```go
// Each index wraps a B-tree with 256 concurrent write shards
type Index struct {
	Field           string          // "author.id", "guild_id", etc.
	Unique          bool            // unique constraint
	sortedIndex     *SortedIndex    // B-tree with linked leaf list
	shards          [256]indexShard // concurrent write buffers
	snapshotBacking []byte          // compact binary doc-ID storage
}

// Write shards prevent lock contention
type indexShard struct {
	mu         sync.RWMutex
	addOverlay map[string][]string            // pending additions
	delOverlay map[string]map[string]struct{} // pending deletions
}
```
Every index is divided into 256 write shards. Each key is assigned to a shard via `FastHash64String(key) & 255`. This means 256 goroutines can write to 256 different keys simultaneously with zero lock contention. Read operations access the B-tree directly with no shard locking required.
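A sketch of that routing step, with FNV-1a standing in for `FastHash64String` (whose real implementation isn't shown here):

```go
// Hypothetical shard routing. FNV-1a is used purely for illustration;
// the real FastHash64String may differ.
func shardFor(key string) int {
	return int(fnv1a64(key) & 255) // 256 shards: low 8 bits of the hash
}

func fnv1a64(s string) uint64 {
	h := uint64(14695981039346656037) // FNV offset basis
	for i := 0; i < len(s); i++ {
		h ^= uint64(s[i])
		h *= 1099511628211 // FNV prime
	}
	return h
}
```

Masking with `& 255` works because 256 is a power of two, so the low 8 bits of a well-mixed hash pick a shard uniformly.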
Indexes are periodically serialized to disk as binary snapshots. On restart, snapshots are loaded directly into memory — no full rebuild needed if the snapshot is valid. Snapshot format v4 stores: header, sorted keys, doc-ID counts, and packed doc-ID lists. This is faster than rebuilding from raw documents by an order of magnitude.
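As a rough picture, the four sections of a v4 snapshot could be modeled like this; only the section list above comes from the format description, and the byte-level encoding is an assumption:

```go
// Hypothetical in-memory view of snapshot format v4, matching the
// sections named above: header, sorted keys, counts, packed ID lists.
type snapshotHeader struct {
	Magic    uint32 // file identification
	Version  uint16 // 4 for this format
	KeyCount uint64 // number of sorted keys that follow
}

type snapshotV4 struct {
	Header snapshotHeader
	Keys   [][]byte // sorted index keys, in B-tree order
	Counts []uint32 // doc-ID count per key, parallel to Keys
	DocIDs []byte   // packed doc-ID lists, sliced using Counts
}
```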
ALOS DB uses a two-phase index loading strategy that prioritizes query availability over complete index optimization. This is what makes cold starts fast.
The first phase builds the minimum set of indexes required for correct query execution.
Phase 1 performs a single-pass scan of each shard. No MVCC pre-scan is needed. Doc count is derived from the scan itself, skipping an expensive `CountDocuments` full-scan. When phase 1 completes, the `indexReady` channel closes and all queries can execute.
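A minimal sketch of that gate, assuming only the `indexReady` name from above; the surrounding types are illustrative:

```go
import "context"

// Queries block until indexReady closes at the end of phase 1.
type collection struct {
	indexReady chan struct{} // closed when phase 1 finishes
}

func (c *collection) waitQueryable(ctx context.Context) error {
	select {
	case <-c.indexReady: // required indexes exist; queries may run
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}
```

Closing a channel is a natural fit here: every waiting goroutine unblocks at once, and later waiters pass through without blocking at all.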
The second phase runs in a background goroutine after `indexReady` closes. It builds optimization layers that make queries faster but are not required for correctness.
Every query path has a nil-safe fallback for missing secondary indexes. If a bloom filter hasn't been built yet, the query falls back to a shard scan. If a field summary is missing, the range query scans all candidates. The result is always correct — secondary indexes only make it faster.
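The pattern is easy to sketch. In the hypothetical snippet below, a `nil` bloom filter simply means "not built yet, scan this shard"; the types and method names are illustrative, not the real API:

```go
// Illustrative stand-in for a phase-2 bloom filter.
type bloomFilter struct{ bits []byte }

func (b *bloomFilter) mightContain(key string) bool {
	// real implementation omitted; a bloom filter can say "definitely
	// absent" or "possibly present", never a false negative
	return true
}

// candidateShards keeps every shard whose filter is missing (nil) or
// says the key might be present. A missing phase-2 filter is never an
// error; it just means that shard must be scanned.
func candidateShards(filters []*bloomFilter, key string) []int {
	out := make([]int, 0, len(filters))
	for i, f := range filters {
		if f == nil || f.mightContain(key) {
			out = append(out, i)
		}
	}
	return out
}
```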
ALOS DB automatically selects the best execution strategy for every query. There is no `EXPLAIN` to run and no query hints to provide. The planner chooses the optimal path in microseconds.
| Operator | Index Used | Execution |
|---|---|---|
| `field: value` (exact) | Yes | Single B-tree lookup |
| `$eq` | Yes | Single B-tree lookup |
| `$in` | Yes | Multi-key lookup, merge results |
| `$gt` / `$gte` / `$lt` / `$lte` | Yes | Linked-leaf range traversal |
| `$regex` | Prefix/Literal | Anchored-prefix B-tree traversal; otherwise cached regex filter |
| `$exists` | Metadata | Field-summary pruning, then filtered scan only when needed |
| `$ne` / `$nin` | Filter Only | Post-filter on the reduced candidate set; full scan only if no other clause narrows work |
`$regex` is index-accelerated when the pattern has a usable literal prefix such as `"^john"` or an exact literal match. Unanchored, suffix, substring, and case-insensitive regex patterns fall back to cached regex evaluation after candidate reduction. `$exists` is not a direct single-key lookup like `$eq`, but it still benefits from shard field summaries that can instantly prove a field is absent everywhere or present everywhere before any document scan starts. `$ne` and `$nin` currently do not select candidate IDs from the B-tree on their own; they become fast when paired with an indexed or metadata-pruned clause that has already made the remaining candidate set small.
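Anchored-prefix extraction is straightforward to picture. The sketch below shows the idea, not ALOS DB's actual parser: an anchored pattern such as `"^john"` yields the literal prefix `john` for B-tree traversal, while anything else reports no usable prefix:

```go
import "strings"

// literalPrefix extracts a usable literal prefix from an anchored
// regex pattern, for illustration only.
func literalPrefix(pattern string) (string, bool) {
	if !strings.HasPrefix(pattern, "^") {
		return "", false // unanchored: no index traversal possible
	}
	var b strings.Builder
	for _, r := range pattern[1:] {
		if strings.ContainsRune(`.\+*?()|[]{}^$`, r) {
			break // stop at the first regex metacharacter
		}
		b.WriteRune(r)
	}
	return b.String(), b.Len() > 0
}
```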
When a query mixes directly indexed operators, metadata-assisted operators, and fallback filters, ALOS DB uses the best indexed clause first, applies shard-summary pruning where available, and only then evaluates the remaining filters against the reduced candidate set:
```go
// Query: {guild_id: "123", author.username: {$regex: "^john"}, profile.bio: {$exists: true}}
//
// Step 1: Use exact index on "guild_id" to cut the search space immediately
// Step 2: Use anchored-prefix regex index traversal on "author.username"
// Step 3: Use field-summary metadata to skip shards that cannot satisfy profile.bio existence
// Step 4: Evaluate any remaining filters on the already-reduced candidate set
//
// "Instant" queries often come from this combination of index traversal + metadata pruning,
// not necessarily from every operator being a standalone exact lookup.
```