understanding the linux scheduler
what a scheduler actually does
the linux scheduler decides which task (a process or a thread) runs on which CPU at any given moment. on a busy machine it makes that decision thousands of times per second. getting it wrong means laggy UIs, poor throughput, or wasted CPU cycles.
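linux has scheduled ordinary tasks with CFS, the Completely Fair Scheduler, since kernel 2.6.23. kernel 6.6 replaced its task-selection logic with EEVDF, which still tracks per-task vruntime and weights, so the ideas below carry over.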
virtual runtime
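CFS tracks a vruntime counter for every runnable task. while a task runs, its vruntime grows with the CPU time it uses, scaled inversely by its weight, so a heavier task accumulates vruntime more slowly. the scheduler always picks the task with the lowest vruntime, that is, the task that has received the least CPU time relative to its weight.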
struct sched_entity {
    struct load_weight load;   // scheduling weight (from the nice value for a task)
    struct rb_node run_node;   // position in the red-black tree
    u64 vruntime;              // weighted CPU time consumed so far
    // ...
};
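the accounting described above can be sketched in a few lines. this is a simplified model, not the kernel's code: the kernel does the same scaling with precomputed inverse weights and fixed-point math. the names vruntime_delta and NICE_0_WEIGHT are made up for this sketch; the weights 1024, 335 and 3121 are the kernel's table entries for nice 0, nice 5 and nice -5.

```c
#include <stdint.h>

/* weight of a nice-0 task; used as the reference point for scaling.
 * (hypothetical name for this sketch) */
#define NICE_0_WEIGHT 1024u

/* simplified vruntime charge: real runtime scaled by NICE_0_WEIGHT / weight.
 * a nice-0 task is charged exactly the time it ran; a heavier task is
 * charged less, a lighter task is charged more. */
uint64_t vruntime_delta(uint64_t delta_exec_ns, uint32_t weight)
{
    return delta_exec_ns * NICE_0_WEIGHT / weight;
}
```

under this model, 1 ms of real runtime costs a nice-0 task 1,000,000 ns of vruntime, a nice-5 task (weight 335) about 3,056,716 ns, and a nice -5 task (weight 3121) about 328,099 ns.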
the red-black tree
CFS keeps the runnable tasks of each CPU in a red-black tree sorted by vruntime. the leftmost node is always the next task to run, and the kernel caches a pointer to it, so picking the next task is O(1). insertion and removal are O(log n).
this is why linux scheduling has predictable overhead even with thousands of runnable tasks.
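to see how "always run the lowest vruntime" turns into a fair split, here is a toy simulation. it is a sketch under loud assumptions: a linear scan stands in for reading the leftmost node of the red-black tree, time advances in fixed slices, and the names toy_task, pick_next and simulate are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_task {
    uint32_t weight;      /* larger weight = larger CPU share */
    uint64_t vruntime;    /* weighted runtime, in ns */
    uint64_t runtime_ns;  /* real CPU time received, in ns */
};

/* return the index of the task with the lowest vruntime.
 * the linear scan is a stand-in for the tree's leftmost node;
 * ties go to the lower index. */
size_t pick_next(const struct toy_task *tasks, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (tasks[i].vruntime < tasks[best].vruntime)
            best = i;
    return best;
}

/* run `ticks` fixed slices, giving each slice to pick_next()
 * and charging vruntime scaled by 1024 / weight. */
void simulate(struct toy_task *tasks, size_t n, unsigned ticks, uint64_t tick_ns)
{
    for (unsigned t = 0; t < ticks; t++) {
        struct toy_task *cur = &tasks[pick_next(tasks, n)];
        cur->runtime_ns += tick_ns;
        cur->vruntime += tick_ns * 1024u / cur->weight;
    }
}
```

with two tasks of weight 1024 and 2048 and 300 slices of 1 ms, the first task ends up with 100 ms of CPU time and the second with 200 ms: the split follows the weights even though the loop only ever compares vruntime.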
cgroups and scheduling classes
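CFS is one of several scheduling classes in linux. real-time tasks use SCHED_FIFO or SCHED_RR, and a runnable real-time task always runs ahead of CFS tasks on its CPU (by default the kernel still reserves a small slice of CPU time for non-real-time work, so a runaway real-time task can't lock up the machine). cgroups let you assign CPU weight to groups of processes, which is useful for containers.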
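as a sketch of the cgroup side, the helper below writes a weight to a group's cpu.weight file. it assumes cgroup v2 with the cpu controller enabled for that group, and that the caller may write to the directory. set_cpu_weight is a hypothetical helper, and the directory you pass in depends entirely on how your system lays out its cgroups.

```c
#include <stdio.h>

/* write `weight` to <cgroup_dir>/cpu.weight (a cgroup v2 interface file).
 * valid weights are 1..10000 and the default is 100.
 * returns 0 on success, -1 on any failure. */
int set_cpu_weight(const char *cgroup_dir, unsigned weight)
{
    char path[4096];

    if (weight < 1 || weight > 10000)
        return -1;

    int len = snprintf(path, sizeof path, "%s/cpu.weight", cgroup_dir);
    if (len < 0 || (size_t)len >= sizeof path)
        return -1;

    FILE *f = fopen(path, "w");
    if (!f)
        return -1;

    int ok = fprintf(f, "%u\n", weight) > 0;
    if (fclose(f) != 0)   /* write errors often surface only on close */
        ok = 0;
    return ok ? 0 : -1;
}
```

weights are relative to sibling groups: a group at 200 gets roughly twice the CPU of a sibling at the default 100 when both are busy, and the weight has no effect when the CPU is otherwise idle.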
what this means for your code
if you have latency-sensitive work, SCHED_FIFO with a real-time priority will preempt all normal (CFS) tasks and any real-time task of lower priority. switching to it requires privileges (CAP_SYS_NICE or a suitable RLIMIT_RTPRIO). for batch work, nice values adjust CFS weight. understanding the scheduler helps you understand why your process gets CPU time when you expect it, and why it sometimes doesn't.
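both knobs are a single call each. the sketch below wraps the standard sched_setscheduler and setpriority calls; the wrapper names request_fifo and set_nice are made up for this example, and the priority you pick is up to you (SCHED_FIFO priorities on linux run from 1 to 99).

```c
#include <errno.h>
#include <sched.h>
#include <sys/resource.h>

/* ask for SCHED_FIFO at priority `prio` for the caller.
 * returns 0 on success, or the errno value on failure
 * (EPERM is the usual result without CAP_SYS_NICE). */
int request_fifo(int prio)
{
    struct sched_param sp = { .sched_priority = prio };

    if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1)
        return errno;
    return 0;
}

/* set the caller's nice value (-20..19; higher = lower CFS weight).
 * returns 0 on success, or the errno value on failure.
 * raising the nice value never needs privileges; lowering it usually does. */
int set_nice(int nice_value)
{
    if (setpriority(PRIO_PROCESS, 0, nice_value) == -1)
        return errno;
    return 0;
}
```

a batch job might call set_nice(19) at startup, while a latency-sensitive thread would call request_fifo with a modest priority and fall back to normal scheduling if it gets EPERM. a SCHED_FIFO task that never blocks can starve normal tasks on its CPU, so it is worth keeping real-time sections short.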