implementing a b-tree from scratch in go
why b-trees matter
every database you have ever used is built on a B-tree or one of its variants. sqlite, postgres, mysql, mongodb — they all use B-trees for their primary indexes. understanding B-trees is understanding how databases work at the storage layer.
a B-tree is a self-balancing tree where every node can hold multiple keys and have multiple children. it is designed to minimize disk reads by keeping the tree shallow and packing many keys per node.
the structure
const ORDER = 4 // max children per node
type BTreeNode struct {
keys []int
children []*BTreeNode
isLeaf bool
}
type BTree struct {
root *BTreeNode
}
func NewBTree() *BTree {
return &BTree{
root: &BTreeNode{isLeaf: true},
}
}
a node with ORDER=4 holds up to 3 keys and up to 4 children. all leaf nodes are at the same depth — this is the property that keeps the tree balanced.
searching
search is straightforward: at each node, binary search the keys to find the right child pointer to follow:
func (t *BTree) Search(key int) bool {
return searchNode(t.root, key)
}
func searchNode(node *BTreeNode, key int) bool {
i := 0
for i < len(node.keys) && key > node.keys[i] {
i++
}
if i < len(node.keys) && key == node.keys[i] {
return true // found
}
if node.isLeaf {
return false // not found
}
return searchNode(node.children[i], key)
}
insertion and splitting
insertion is where B-trees get interesting. when a node is full, it must be split before inserting:
func (t *BTree) Insert(key int) {
root := t.root
// if root is full, split it and create a new root
if len(root.keys) == ORDER-1 {
newRoot := &BTreeNode{}
newRoot.children = append(newRoot.children, t.root)
splitChild(newRoot, 0)
t.root = newRoot
}
insertNonFull(t.root, key)
}
func splitChild(parent *BTreeNode, i int) {
child := parent.children[i]
mid := (ORDER - 1) / 2
// new node gets the right half of child's keys
newNode := &BTreeNode{
isLeaf: child.isLeaf,
keys: append([]int{}, child.keys[mid+1:]...),
}
if !child.isLeaf {
newNode.children = append([]*BTreeNode{}, child.children[mid+1:]...)
child.children = child.children[:mid+1]
}
// promote middle key to parent
parent.keys = append(parent.keys, 0)
copy(parent.keys[i+1:], parent.keys[i:])
parent.keys[i] = child.keys[mid]
// insert new node into parent's children
parent.children = append(parent.children, nil)
copy(parent.children[i+2:], parent.children[i+1:])
parent.children[i+1] = newNode
// trim child
child.keys = child.keys[:mid]
}
height and performance
for a B-tree of order M with N keys, the height is O(log_M N). with M=1000 (typical for disk-based B-trees with 4KB pages), a tree with 1 billion keys has height 3. that means at most 3 disk reads to find any key in a billion-key database.
this is why B-trees dominate database storage even after 50 years.
b+ trees
real databases use B+ trees, a variant where:
- only leaf nodes store values (internal nodes store only keys for routing)
- leaf nodes are linked in a doubly-linked list for efficient range scans
type BPlusLeaf struct {
keys []int
values [][]byte // actual data or pointers to data
next *BPlusLeaf
prev *BPlusLeaf
}
the linked leaf list makes range queries like WHERE age BETWEEN 20 AND 30 extremely efficient — find the first matching leaf, then walk the list.