Building a Linux Shell in C

How does it interpret user input?

Looking through the process, we built a rudimentary flowchart for how a simple shell works.

This ended up becoming the skeleton for our shell, and looked something like this:

1. Display the prompt, using PS1
2. Get user input
3. Parse input into individual tokens
4. Check tokens (a. Aliases, b. Built-in commands, c. the PATH)
5. Execute
6. Start all over!

This is a very simple skeleton and it ignores some functionalities that are present in both our shell and the simple shell.

Nevertheless, it will serve nicely to demonstrate how a simple shell works.

Let’s walk through each step.

1. Display the prompt, using PS1

When you run a shell, the first thing you see is a prompt.

In bash, you might see something like:

    vagrant@vagrant-ubuntu-trusty-64:~$

In our shell, we simply show:

    $

The shell populates its prompt with the PS1 built-in shell variable.

PS1 stands for Prompt String 1.

(Note: PS2, PS3, and PS4 also exist. If you’re curious about their values, type echo "$PS2" (or "$PS3", or "$PS4") into your terminal.)
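As a minimal sketch of this step, the code below picks a prompt string and writes it out. It assumes PS1 is read from the environment and that the shell falls back to a plain "$ " when PS1 is unset or empty. The helper names prompt_string and print_prompt are hypothetical, and the escape sequences that bash expands inside PS1 are not handled.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: choose the prompt text.
 * `ps1` is the value of PS1, or NULL when it is not set.
 * Assumption: an unset or empty PS1 falls back to "$ ". */
const char *prompt_string(const char *ps1)
{
    if (ps1 == NULL || ps1[0] == '\0')
        return "$ ";
    return ps1;
}

/* Hypothetical helper: write the prompt to `out`.
 * The prompt has no trailing newline, so flush explicitly in case
 * the stream is line-buffered. */
void print_prompt(FILE *out)
{
    fputs(prompt_string(getenv("PS1")), out);
    fflush(out);
}
```

Calling print_prompt(stdout) at the top of every loop iteration would be enough to show the prompt before each command.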

2. Get user input

After the prompt appears, the shell captures input, up until the user hits return.

In our shell, we use the getline function for this, which in turn uses the syscall read.

That input is then saved as a string.

For demonstration purposes, let’s imagine our user types in:

    $ ls -l

We now have the following string saved: “ls -l\n”.

That new line comes from the return key, and we need to replace it with a terminating null byte before moving on.
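The reading step could be sketched roughly as follows, using getline and then overwriting the trailing newline with a null byte. The helper names read_line and strip_newline are hypothetical, and this is an illustration under those assumptions, not the exact code from our shell.

```c
#define _POSIX_C_SOURCE 200809L /* exposes getline() in <stdio.h> */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Replace a trailing '\n' (left by the return key) with '\0'. */
void strip_newline(char *line)
{
    size_t len = strlen(line);

    if (len > 0 && line[len - 1] == '\n')
        line[len - 1] = '\0';
}

/* Hypothetical helper: read one line from `in`.
 * Returns a malloc'd string the caller must free, or NULL on
 * end-of-file (for example Ctrl-D) or on a read error. */
char *read_line(FILE *in)
{
    char *line = NULL;
    size_t cap = 0;

    if (getline(&line, &cap, in) == -1) {
        free(line); /* getline may allocate even when it fails */
        return NULL;
    }
    strip_newline(line);
    return line;
}
```

Letting getline allocate the buffer means the shell does not need to guess a maximum line length up front.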

3. Parse input into individual tokens

Next, the shell takes our string and breaks it into individual tokens.

Those tokens can be stored in an array and acted on separately from each other, if need be.

Our former string is now separated into 2 strings: “ls” and “-l”.
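One way to sketch the tokenizing step is with strtok, splitting on spaces and tabs and collecting pointers in a NULL-terminated array. The function name split_line is hypothetical, and the sketch assumes no quoting or escaping needs to be handled.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split `line` in place on spaces and tabs.
 * Returns a malloc'd, NULL-terminated array whose entries point into
 * `line`, so `line` must outlive the array. Returns NULL if an
 * allocation fails. Quotes and escapes are not handled. */
char **split_line(char *line)
{
    size_t cap = 8, n = 0;
    char **tokens = malloc(cap * sizeof(*tokens));
    char *tok;

    if (tokens == NULL)
        return NULL;

    for (tok = strtok(line, " \t"); tok != NULL; tok = strtok(NULL, " \t")) {
        if (n + 1 >= cap) { /* keep one slot free for the final NULL */
            char **grown = realloc(tokens, 2 * cap * sizeof(*tokens));

            if (grown == NULL) {
                free(tokens);
                return NULL;
            }
            tokens = grown;
            cap *= 2;
        }
        tokens[n++] = tok;
    }
    tokens[n] = NULL;
    return tokens;
}
```

The NULL terminator at the end is convenient later on, because execve expects its argument array in exactly that shape.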

4. Check Tokens

After we have our arguments stored as tokens, we need to go through each of them, and check them against our aliases, built-in commands, and the PATH, in that order.

Let’s take a deeper look at each of these.

4a. Aliases

An alias is a string that we assign to something else, a bit like a nickname.

When we type something into our terminal, we want to first check to see if it stands for something else, and replace it.

Hypothetically, we could have previously made ls an alias for another command.

If we had done that, this step would replace ls with that command.

For now, let’s assume that our ls is not an alias for something else.

We can then move on to our next steps in the pipeline.
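The alias check can be pictured as a lookup in a small table of name and value pairs. The struct alias type and the resolve_alias function below are hypothetical names for illustration. The sketch assumes each alias maps to a single word; an alias that expands to several words would need to be tokenized again.

```c
#include <stddef.h>
#include <string.h>

/* One alias: `name` is what the user types, `value` replaces it. */
struct alias {
    const char *name;
    const char *value;
};

/* Hypothetical lookup: return the replacement for `word`, or `word`
 * itself when no alias matches. `aliases` holds `count` entries. */
const char *resolve_alias(const struct alias *aliases, size_t count,
                          const char *word)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(aliases[i].name, word) == 0)
            return aliases[i].value;
    }
    return word;
}
```

With a table containing only the pair ("ll", "ls"), the word ll would come back as ls, while ls would pass through unchanged, as in our walkthrough.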

4b. Built-in Commands

After switching out any aliases, we want to next look for built-in commands.

Built-in commands are commands that exist within the shell.

They aren’t outside programs living in other directories.

Theoretically, they can run out of the box.

In the simple shell, some of the built-ins include exit and cd.

In the shell that we made, we also have built-in commands such as setenv (which sets environment variables).

If we find built-in commands within our tokens, we want to run them, and then return to our prompt for more user input.

In our case, ls is not a built-in, so we can continue through our steps.
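A common way to organize built-ins is a table that pairs each name with a function inside the shell. The sketch below follows that idea; the names struct builtin, find_builtin, and builtin_cd are hypothetical, and the cd handler is deliberately simplified (it assumes a directory argument is given and does not fall back to the home directory).

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* A built-in is a name paired with a function inside the shell. */
typedef int (*builtin_fn)(char **argv);

struct builtin {
    const char *name;
    builtin_fn run;
};

/* Simplified cd handler: change directory to argv[1].
 * Assumption: with no argument it reports failure instead of
 * going to the home directory. Returns 0 on success, 1 on failure. */
int builtin_cd(char **argv)
{
    if (argv[1] == NULL)
        return 1;
    return chdir(argv[1]) == 0 ? 0 : 1;
}

/* Hypothetical lookup: return the handler for `name`, or NULL when
 * `name` is not a built-in and the PATH should be searched instead. */
builtin_fn find_builtin(const struct builtin *table, size_t count,
                        const char *name)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0)
            return table[i].run;
    }
    return NULL;
}
```

Note that cd has to be a built-in: a child process that changed its own directory would leave the shell's working directory untouched.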

4c. the PATH

The PATH is an environment variable which specifies directories for executables in an operating system.

Once we’ve determined that the tokens aren’t built-in commands, we want to check the PATH and see if we have any matching executables there.

When someone types ls, for example, our shell checks the PATH and finds that ls is a program existing in the directory /bin/.

If we append ls to the directory that contains it, we have its absolute path: /bin/ls.

(Note: We use the syscall stat for this, which returns 0 if the file is found).

/bin/ls is a program that we can run regardless of our current working directory.
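The PATH search described above might look something like the following sketch, which tries each colon-separated directory in turn and uses stat to test whether the joined path exists. The function name find_in_path is hypothetical. As simplifications, the sketch only checks that the file exists (not that it is executable) and it skips empty PATH entries.

```c
#define _POSIX_C_SOURCE 200809L /* for strdup() and strtok_r() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: look for `name` in each directory of the
 * colon-separated list `path`. Returns a malloc'd absolute path
 * (caller frees) for the first match, or NULL if none is found. */
char *find_in_path(const char *name, const char *path)
{
    char *copy, *dir, *save = NULL;
    struct stat st;

    if (name == NULL || path == NULL)
        return NULL;
    copy = strdup(path); /* strtok_r modifies the string it splits */
    if (copy == NULL)
        return NULL;

    for (dir = strtok_r(copy, ":", &save); dir != NULL;
         dir = strtok_r(NULL, ":", &save)) {
        size_t len = strlen(dir) + 1 + strlen(name) + 1;
        char *full = malloc(len);

        if (full == NULL)
            break;
        snprintf(full, len, "%s/%s", dir, name);
        if (stat(full, &st) == 0) { /* 0 means the file was found */
            free(copy);
            return full;
        }
        free(full);
    }
    free(copy);
    return NULL;
}
```

A caller would typically pass getenv("PATH") as the second argument and treat a NULL result as "command not found".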

5. Execute

Finally, we can run our command.

In our shell, we fork our current process and run the /bin/ls program in our child process, using the execve syscall.

The /bin/ls program is run, with the optional flag, -l, which lists files in long format.

If the program didn’t exist, we would print an error.
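The fork and execve pattern can be sketched as below. The function name run_command is hypothetical. The sketch assumes the child should inherit the shell's environment and that a failed execve exits with status 127, the value shells conventionally use for a command that could not be run.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ; /* the current process environment */

/* Hypothetical helper: run the program at `path` with arguments
 * `argv` (NULL-terminated, argv[0] is the command name).
 * Returns the child's exit status, or -1 if fork or waitpid fails
 * or the child did not exit normally. */
int run_command(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid == -1) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: execve only returns if it failed. */
        execve(path, argv, environ);
        perror(path);
        _exit(127);
    }
    /* Parent: wait for the child to finish before prompting again. */
    if (waitpid(pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For our example, the call would be run_command("/bin/ls", tokens), where tokens is the NULL-terminated array holding "ls" and "-l". Forking first matters because execve replaces the calling process; without the fork, the shell itself would turn into ls and never come back.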

6. Start all over!

Once the shell finishes executing the user’s command, it goes back to the beginning, prints the prompt, and waits for more user input.

This cycle continues until the user runs the exit built-in command, or presses Ctrl-D.
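Putting the cycle together, the outer loop could be sketched like this. The function name shell_loop and the line_handler callback type are hypothetical; the callback stands in for steps 3 to 5 so that the sketch stays short. It assumes a fixed "$ " prompt and treats a line that is exactly "exit" as the exit built-in.

```c
#define _POSIX_C_SOURCE 200809L /* exposes getline() in <stdio.h> */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type: handles one line (steps 3 to 5). */
typedef void (*line_handler)(char *line);

/* Sketch of the outer loop: prompt, read, handle, repeat.
 * Stops on end-of-file (Ctrl-D) or when the line is exactly "exit".
 * `handle` may be NULL, in which case lines are only counted.
 * Returns the number of lines processed before stopping. */
int shell_loop(FILE *in, FILE *out, line_handler handle)
{
    char *line = NULL;
    size_t cap = 0;
    int handled = 0;

    for (;;) {
        fputs("$ ", out);
        fflush(out);
        if (getline(&line, &cap, in) == -1)
            break; /* end-of-file or read error */
        line[strcspn(line, "\n")] = '\0'; /* drop the newline */
        if (strcmp(line, "exit") == 0)
            break;
        if (handle != NULL)
            handle(line);
        handled++;
    }
    free(line); /* getline reuses one buffer across iterations */
    return handled;
}
```

Ctrl-D works as an exit because the terminal reports end-of-file, which makes getline return -1 and ends the loop.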

Final Thoughts

This is how we pulled apart the simple shell and started working on our own.

When you really look deep, a seemingly simple command like ls -l actually has a lot going on under the hood.

You just have to peer inside and tinker around a bit to find it.

Feel free to check out our results here, and give our shell a spin.

Photo by Courtney Hedger on Unsplash.
