writing a unix shell from scratch in c
what a shell actually does
a shell is a program that reads commands from the user, parses them, and executes them. that's it. the mystique around shells disappears once you build one.
under the hood, a shell does four things in a loop:
- print a prompt and read a line
- parse the line into tokens
- fork a child process
- exec the command in the child while the parent waits for it to finish
the repl
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>
#define MAX_ARGS 64
#define MAX_INPUT 1024
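// fork a child, exec the command in it, and wait for it to finish
void run_command(char **args) {
    pid_t pid = fork();
    if (pid == 0) {
        // child process: execvp only returns if it failed
        execvp(args[0], args);
        perror(args[0]);
        _exit(127); // _exit avoids flushing stdio buffers copied from the parent
    } else if (pid > 0) {
        // parent waits for child
        int status;
        if (waitpid(pid, &status, 0) == -1) perror("waitpid");
    } else {
        perror("fork failed");
    }
}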
int main(void) {
    char input[MAX_INPUT];
    char *args[MAX_ARGS];
    while (1) {
        printf("mysh> ");
        fflush(stdout);
        if (!fgets(input, MAX_INPUT, stdin)) break; // EOF (Ctrl+D) or read error
        // strip newline
        input[strcspn(input, "\n")] = 0;
        // tokenize on spaces and tabs (no quoting or escaping yet)
        int argc = 0;
        char *token = strtok(input, " \t");
        while (token && argc < MAX_ARGS - 1) {
            args[argc++] = token;
            token = strtok(NULL, " \t");
        }
        args[argc] = NULL;
        if (argc == 0) continue;
        // built-in: exit
        if (strcmp(args[0], "exit") == 0) break;
        // built-in: cd has to run in the shell itself, because a chdir()
        // in a child would not change the shell's working directory
        if (strcmp(args[0], "cd") == 0) {
            const char *dir = args[1] ? args[1] : getenv("HOME");
            if (dir && chdir(dir) != 0) perror("cd");
            continue;
        }
        run_command(args);
    }
    return 0;
}
why fork + exec
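fork() creates a near-exact copy of the calling process: the child gets its own PID but inherits the parent's memory contents, open file descriptors, and environment. the exec() family (execvp() in the code above) replaces the calling process's image with a new program while keeping its PID and open file descriptors. together they let each command run in its own process, so a crash or exit in the command cannot take the shell down with it.
separating fork and exec is what makes the design flexible: between the two calls, the child can redirect file descriptors, set environment variables, and change its process group, all before the new program starts running.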
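to make the window between fork and exec concrete, here is a minimal sketch that redirects the child's stdout to a file before exec'ing. the helper name run_redirected, the MYSH_CHILD variable, and the exit codes are assumptions made for illustration, not part of the shell above.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run args[0] with stdout redirected to `path`.
 * Returns the child's exit status, or -1 on error. */
int run_redirected(char **args, const char *path) {
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: everything here runs after fork and before exec,
         * so it affects only the command, never the shell. */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) {
            perror("open");
            _exit(126);
        }
        if (dup2(fd, STDOUT_FILENO) < 0) {
            perror("dup2");
            _exit(126);
        }
        close(fd);                     /* stdout now refers to the file */
        setenv("MYSH_CHILD", "1", 1);  /* hypothetical variable, child only */
        execvp(args[0], args);
        perror("execvp");              /* reached only if exec failed */
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

calling run_redirected with {"echo", "hello", NULL} and a path should leave hello in that file while the shell's own stdout stays attached to the terminal.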
implementing pipes
pipes connect the stdout of one process to the stdin of the next. the pipe() syscall gives you a pair of file descriptors:
void run_pipeline(char **left_args, char **right_args) {
    int pipefd[2];
    // pipefd[0] = read end, pipefd[1] = write end
    if (pipe(pipefd) == -1) {
        perror("pipe");
        return;
    }
    pid_t left = fork();
    if (left == -1) perror("fork");
    if (left == 0) {
        // left command writes to pipe
        dup2(pipefd[1], STDOUT_FILENO);
        close(pipefd[0]);
        close(pipefd[1]);
        execvp(left_args[0], left_args);
        perror(left_args[0]);
        _exit(127);
    }
    pid_t right = fork();
    if (right == -1) perror("fork");
    if (right == 0) {
        // right command reads from pipe
        dup2(pipefd[0], STDIN_FILENO);
        close(pipefd[0]);
        close(pipefd[1]);
        execvp(right_args[0], right_args);
        perror(right_args[0]);
        _exit(127);
    }
    // parent closes both ends (otherwise the right command never sees EOF)
    // and waits only for children that were actually created
    close(pipefd[0]);
    close(pipefd[1]);
    if (left > 0) waitpid(left, NULL, 0);
    if (right > 0) waitpid(right, NULL, 0);
}
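run_pipeline expects two separate argument arrays, so the repl has to split its token array at the | token first. one possible sketch, assuming the tokenizer emits | as its own token (that is, the user types spaces around it); split_at_pipe is a hypothetical helper name:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find the first "|" token in args, overwrite it with
 * NULL so the left command is terminated in place, and return a pointer to
 * the tokens that follow. Returns NULL when there is no "|" or when nothing
 * follows it. */
char **split_at_pipe(char **args) {
    for (size_t i = 0; args[i] != NULL; i++) {
        if (strcmp(args[i], "|") == 0) {
            args[i] = NULL;  /* left command now ends here */
            return args[i + 1] ? &args[i + 1] : NULL;
        }
    }
    return NULL;
}
```

in the repl, this could run right after tokenizing: if it returns a non-NULL pointer and args[0] is still non-NULL, pass args and the returned pointer to run_pipeline; otherwise fall back to run_command, after checking again that args[0] is non-NULL (a line that starts with | leaves the left side empty).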
signal handling
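without signal handling, pressing Ctrl+C kills your shell along with the running command, because the terminal sends SIGINT to every process in the foreground process group. fix it by handling SIGINT in the shell: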
#include <signal.h>
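void sigint_handler(int sig) {
    (void)sig;
    // the shell survives and just prints a newline. the child received
    // SIGINT directly from the terminal, and exec reset its handler to the
    // default, so the running command still terminates
    write(STDOUT_FILENO, "\n", 1);
}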
// in main, before the repl loop:
signal(SIGINT, sigint_handler);
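signal() behaves slightly differently across systems, so a common alternative is sigaction(), which makes the restart behavior explicit. a minimal sketch under that assumption; the names on_sigint, got_sigint, and install_sigint_handler are hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical flag the repl could check to know a Ctrl+C arrived. */
static volatile sig_atomic_t got_sigint = 0;

static void on_sigint(int sig) {
    (void)sig;
    got_sigint = 1;
    /* write() is async-signal-safe; printf() is not. */
    ssize_t n = write(STDOUT_FILENO, "\n", 1);
    (void)n;
}

/* Install the handler. Returns 0 on success, -1 on failure. */
int install_sigint_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    /* SA_RESTART asks the kernel to restart interrupted read() and
     * waitpid() calls instead of failing them with EINTR. */
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGINT, &sa, NULL);
}
```

calling install_sigint_handler() once before the repl loop would take the place of the signal() call.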
what to build next
- I/O redirection: cmd > file and cmd < file using open() + dup2()
- background jobs: cmd & (don't call waitpid(), track PIDs separately)
- job control: fg, bg, jobs using process groups and SIGTSTP
- history: readline integration or a simple circular buffer
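as a starting point for the background jobs item, here is a rough sketch of tracking PIDs and reaping them later without blocking. the fixed-size table, MAX_JOBS, and the function names spawn_background and reap_background are all assumptions for illustration; a real shell would also print job numbers and handle SIGCHLD.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_JOBS 16  /* hypothetical limit */

static pid_t jobs[MAX_JOBS];  /* 0 marks a free slot */

/* Start args[0] without waiting for it.
 * Returns the child's PID, or -1 on error. */
pid_t spawn_background(char **args) {
    int slot = -1;
    for (int i = 0; i < MAX_JOBS; i++) {
        if (jobs[i] == 0) { slot = i; break; }
    }
    if (slot < 0) {
        fprintf(stderr, "too many background jobs\n");
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execvp(args[0], args);
        perror("execvp");
        _exit(127);
    }
    jobs[slot] = pid;  /* remember the child so it can be reaped later */
    return pid;
}

/* Reap finished jobs without blocking. Returns how many were reaped. */
int reap_background(void) {
    int reaped = 0;
    for (int i = 0; i < MAX_JOBS; i++) {
        if (jobs[i] == 0) continue;
        int status;
        pid_t done = waitpid(jobs[i], &status, WNOHANG);
        if (done == 0) continue;   /* still running */
        if (done > 0) reaped++;    /* finished and collected */
        jobs[i] = 0;               /* free the slot (also on error) */
    }
    return reaped;
}
```

the repl could call reap_background() once per iteration, just before printing the prompt, so finished jobs do not linger as zombies.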
building a shell teaches you more about unix process management than any other project. every mysterious shell behavior suddenly makes sense.