Both Rust and C++ give you two big “map families”:
- ordered map (tree-based): keys kept sorted, predictable iteration order
- unordered map (hash-based): keys not sorted, usually faster average lookups
What an ordered map actually is
BTreeMap (Rust) and std::map (C++) are not arrays.
They are balanced trees (B-tree in Rust, red-black tree in C++).
Conceptually they look like this:

```
          50
        /    \
      20      80
     /  \    /  \
   10    30 60    90
```
Ordered maps/sets (tree-based)
- Rust: BTreeMap<K,V>, BTreeSet<K>
- C++: std::map, std::set (typically a red-black tree)
Differences
- Operations: O(log n) lookup/insert/remove
- Keeps keys sorted
- Supports range queries: “all keys between a..b”
- Stable-ish iteration order (sorted)
Search cost in an ordered map: the memory killer
Searching an ordered map is not a simple binary search on a sorted array.
Binary search is fast:
sorted array + random access → O(log n)
But a tree map is not an array.
In an array:

```
[10, 20, 30, 40, 50, 60, 70]
              ^
              jump directly to the middle
```

You can instantly jump to index n/2.
A tree cannot.
A tree lives in scattered memory:
node(50) -> pointer -> node(80) -> pointer -> node(60)
The CPU cannot jump to “middle element”.
It must follow pointers step-by-step.
So even though the keys are ordered, the computer still has to walk the tree.
The real performance killer: memory access
Here is the part many programmers only learn much later:
For modern CPUs, memory access cost dominates complexity.
| Operation | Rough cost |
|---|---|
| integer add | 1 cycle |
| hash computation | ~5–20 cycles |
| RAM access (pointer chasing) | 100–300 cycles |
Tree maps do this repeatedly:
follow pointer → cache miss → wait
follow pointer → cache miss → wait
follow pointer → cache miss → wait
Hash maps instead:
compute hash → jump into a contiguous array → maybe 1–2 comparisons
This is why in practice:
HashMap is often 5–20× faster than std::map/BTreeMap
Even though O(1) vs O(log n) sounds like a small difference mathematically.
Unordered maps/sets (hash-based)
- Rust: HashMap<K,V>, HashSet<K>
- C++: std::unordered_map, std::unordered_set
Differences
- Average O(1) lookup/insert/remove
- No ordering
- Depends on hashing quality + load factor; worst-case can degrade
Choose when
- You mostly do point lookups by key and don’t care about order
- You want best average performance
Why a hash map is O(1)
A hash map does something completely different.
Instead of navigating structure, it computes a memory address.
Step 1 — hashing
We run a hash function:
hash("alice") → 1,874,923,123
Step 2 — bucket index
index = hash % table_size
Example:

```
table_size = 1024
index = 1,874,923,123 % 1024 = 627
```

Now we jump directly to bucket 627 in an array.
The size of the hash map
Do you need to know the table size beforehand?
It looks like you should, but you do not.
You can do:

```rust
let mut map = HashMap::new();
```

and insert millions of elements.
So what’s happening?
The hidden behavior: resizing (rehashing)
A hash map internally stores an array of buckets:
[ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ]
Each key goes to a bucket:
index = hash(key) % bucket_count
At the beginning:
bucket_count = small (like 8)
When it fills up, performance would degrade.
So the map automatically resizes.
What resize does
When load factor becomes too large (~70–80% full):
- allocate bigger table (usually 2×)
- recompute bucket for EVERY key
- move all elements
This is called:
rehashing
This is why hash map insertion is:
amortized O(1)
Because occasionally an insertion costs O(n) (when resize happens).
Then why does with_capacity() exist?
Because resizing is expensive.
If you know approximately:
"I'm going to insert 1 million items"
you should do:
Rust:

```rust
HashMap::with_capacity(1_000_000);
```

C++:

```cpp
map.reserve(1'000'000);
```
Why it matters:
Without it:
8 → 16 → 32 → 64 → 128 → ... → 1,048,576
Each step rehashes everything.
You just paid multiple full O(n) copies.
In performance-critical systems (compilers, databases, zk provers), this matters a lot.
A very practical rule
Use HashMap as the default choice.
Switch to BTreeMap/std::map only if you need:
- sorted iteration
- ranges
- deterministic output
- next/previous key