=================================
Red-black Trees (rbtree) in Linux
=================================
:Author: Rob Landley <rob@landley.net>
What are red-black trees, and what are they for?
------------------------------------------------
Red-black trees are a type of self-balancing binary search tree, used for
storing sortable key/value data pairs. This differs from radix trees (which
are used to efficiently store sparse arrays and thus use long integer indexes
to insert/access/delete nodes) and hash tables (which are not kept sorted to
be easily traversed in order, and must be tuned for a specific size and
hash function where rbtrees scale gracefully storing arbitrary keys).
Red-black trees are similar to AVL trees, but provide faster real-time bounded
worst case performance for insertion and deletion (at most two rotations and
three rotations, respectively, to balance the tree), with slightly slower
(but still O(log n)) lookup time.
To quote Linux Weekly News:
There are a number of red-black trees in use in the kernel.
The deadline and CFQ I/O schedulers employ rbtrees to
track requests; the packet CD/DVD driver does the same.
The high-resolution timer code uses an rbtree to organize outstanding
timer requests. The ext3 filesystem tracks directory entries in a
red-black tree. Virtual memory areas (VMAs) are tracked with red-black
trees, as are epoll file descriptors, cryptographic keys, and network
packets in the "hierarchical token bucket" scheduler.
This document covers use of the Linux rbtree implementation. For more
information on the nature and implementation of Red Black Trees, see:
Linux Weekly News article on red-black trees
http://lwn.net/Articles/184495/
Wikipedia entry on red-black trees
http://en.wikipedia.org/wiki/Red-black_tree
Linux implementation of red-black trees
---------------------------------------
Linux's rbtree implementation lives in the file "lib/rbtree.c". To use it,
"#include <linux/rbtree.h>".
The Linux rbtree implementation is optimized for speed, and thus has one
less layer of indirection (and better cache locality) than more traditional
tree implementations. Instead of using pointers to separate rb_node and data
structures, each instance of struct rb_node is embedded in the data structure
it organizes. And instead of using a comparison callback function pointer,
users are expected to write their own tree search and insert functions
which call the provided rbtree functions. Locking is also left up to the
Data nodes in an rbtree tree are structures containing a struct rb_node member::