Building RDB from scratch - 01. B+ Tree

1. Why B+ tree?

When starting the construction of a lightweight relational database, the B+ tree emerges as a fundamental data structure. Its inherent ability to efficiently index and retrieve data renders it indispensable for managing structured information. Unlike traditional binary search trees, the B+ tree is specifically optimized for disk-based storage, minimizing the number of I/O operations required to access data—a crucial consideration for any relational database where data typically resides on disk, not solely in memory. Thus, the B+ tree is not merely a beneficial option, but rather a cornerstone for any viable small-scale RDB.
Furthermore, examining the implementation details of a B+ tree, such as those found in the open-source collinglass/bptree library, offers valuable insights into practical database design and implementation, which I will explore in this post.

2. Implementations

1) Core Data Structures

type Tree struct {
	Root *Node
}

type Record struct {
	Value []byte
}

type Node struct {
	Pointers []interface{}
	Keys     []int
	Parent   *Node
	IsLeaf   bool
	NumKeys  int
	Next     *Node
}

Tree represents the entire B+ tree structure, with the Root field serving as the entry point to the tree. It is a basic and clear way to represent the whole B+ tree structure.
Record struct separates keys from their byte array values ([]byte), allowing the library to handle diverse data types as long as they can be serialized.
Node represents a single B+ tree node, which holds Pointers (to child nodes or record values), Keys, a Parent pointer, a flag for IsLeaf, NumKeys tracking the number of valid keys, and a Next pointer for linked list of leaf nodes. The interface{} type for Pointers enables handling both nodes and records, and NumKeys ensures performance optimization to determine the number of valid keys.

2) Insertion

func (t *Tree) Insert(key int, value []byte) error {
	// ...
	if t.Root == nil {
		return t.startNewTree(key, pointer)
	}
	// ...
    leaf = t.findLeaf(key, false)
    
	if leaf.NumKeys < order-1 {
		insertIntoLeaf(leaf, key, pointer)
		return nil
	}

	return t.insertIntoLeafAfterSplitting(leaf, key, pointer)
}

The insertion process first checks for a root node, creating a new tree if necessary. It locates the correct leaf node via t.findLeaf and then inserts the key if there’s capacity, using insertIntoLeaf as a helper method. When the leaf is full, t.insertIntoLeafAfterSplitting handles the node split, utilizing temporary arrays and demonstrating a “copy-on-write” approach where a new node is created during splits to keep the B+ tree balanced.

3) Searching

func (t *Tree) Find(key int, verbose bool) (*Record, error) {
	i := 0
	c := t.findLeaf(key, verbose)
	if c == nil {
		return nil, errors.New("key not found")
	}
	for i = 0; i < c.NumKeys; i++ {
		if c.Keys[i] == key {
			break
		}
	}
	if i == c.NumKeys {
		return nil, errors.New("key not found")
	}

	r, _ := c.Pointers[i].(*Record)

	return r, nil
}

The search starts with the private findLeaf function, which recursively navigates the tree to locate the appropriate leaf node. Within the leaf node, a loop iterates through the keys to find a match. Upon finding a matching key, the associated value is returned after a type assertion to the Record type. This search logic demonstrates how each node acts as a decision point, guiding the traversal along the correct search path to find the target key.

4) Deletion

func (t *Tree) Delete(key int) error {
	key_record, err := t.Find(key, false)
	if err != nil {
		return err
	}
	key_leaf := t.findLeaf(key, false)
	if key_record != nil && key_leaf != nil {
		t.deleteEntry(key_leaf, key, key_record)
	}
	return nil
}

The deletion process begins by using t.Find and t.findLeaf to locate the target node and key. Once found, the t.deleteEntry method is invoked, which orchestrates a complex deletion procedure. This involves removeEntryFromNode to remove the key from the node, and then a series of other operations: adjustRoot to handle root node modifications, and either coalesceNodes or redistributeNodes to maintain the B+ tree’s balance through merging or redistributing nodes, respectively.

· database, RDB, B+ tree, data structures, algorithms, implementation, software engineering