Bonuses added for lab 3 are at the bottom of this document, starting [here](#navigable-arrow-keys).
# general code structure
The main objective of this assignment was to tokenize given inputs and process the resulting tokens according to the given grammar. We decided to seperate the two steps of tokenizing and processing almost entirely, which decouples the input parser from the execution very neatly. The one drawback of this method is that each input is, more or less, processed twice - but we considered this a fair trade-off, as the input for a shell can generally be expected to not be incredulously large.
The crux of our approach lies in a slight addition to the `ListNode` struct: we have added a single `Enum Type` attribute which places each token within one of a few categories, roughly mapping to the grammar we are implementing.
The possible values of the `Type` enum are:
Value | Definition | meaning
--- | --- | ---
EXECUTABLE | any non-[operator](#operators) | A command in either `$CWD` or `$PATH`.
OPTION | any non-[operator](#operators) | Any options to pass to an `EXECUTABLE` or `BUILTIN`.
COMPOSITION | '&&', '\|\|', ';', '\n' | Special characters which signal the end of a chain.
BUILTIN | a [builtin](#builtins) | One of a few pre-defined builtin executables.
PIPELINE | '\|' | A pipeline, indicating the preceding and following commands ought to be piped together.
REDIRECT | '<', '>' | Signals `stdin` and/or `stdout` should be handled differently.
BACKGROUND | '&' | Tells the shell to run a given chain as non-blocking (run in the background).
FILENAME | any non[-operator](#operators) | A file (for redirection).
### builtins
A builtin for our shell means any specific command which should be handled by the shell internally, as opposed to calling an external executable.
So far, we have implemented:
Builtin | Result
--- | ---
status | prints the return status of the last run command.
exit | exits the shell.
cd | changes the current working directory.
debug | prints debugging information to `stdout`.
### operators
An operator is a special token which tells the shell to process the preceding (and following) chain differently.
The standard operators are (or will be) implemented by this shell:
Operator | Name | Effect
--- | --- | ---
"&" | Background operator | Tells the shell ro run a chain in the background.
"&&" | Conditional and | Tells the shell to execute the next chain iff the status of the previous chain equals `0`.
"\|\|" | Conditional or | Tells the shell to execute the next chain iff the status of the previous chain does not equal `0`.
";" | Seperator | Tells the shell when a chain ends.
"<" | Redirect in | Tells the shell to take input from a non-`stdin` stream.
">" | Redirect out | Tells the shell to print output to a non-`stdout` stream.
"\|" | Pipe operator | Tells the shell to chain commands together.
## processing
After parsing a complete line and assigning each token a Type, the next step of the program is the actual processing.
To aid in this, we have constructed a few different types - defined in `cmd.h` - which model various parts of a typical shell execution.
The `Command` struct models one executable and its' options, arranged in `char *` and `char **` to easily pass them to an `exec()` system call.
```c
typedef struct Command {
char **arguments;
char *command;
int capacity;
int numArguments;
} Command;
```
The `CommandList` struct models a list of Commands, and will be used when multiple commands are to be executed together (as is the case when piping commands, or using the background operator).
```c
typedef struct CommandList {
Command *commands;
int capacity;
int numCommands;
} CommandList;
```
The `Chain` struct models one complete chain to be executed. It conains a list of commands to execute, and extra information related to I/O - any redirects. These are set to `NULL` if no redirect was given.
```c
typedef struct Chain {
CommandList commands;
bool runInBackground;
char *in;
char *out;
} Chain;
```
These structs all come with some library functions for ease of use, which allow us to build up commands and chains, and execute them appropriately.
## main
The driver code behind the program lives (obviously) in `main()`.
The code is mostly found in the while loop, which runs until the value of `do_loop` is set to false, which only happens when the program reaches the EOF or the BUILTIN command exit is given. `inputLine` is the line that the user enters the shell. We declare `tokenList` to store all of the tokens that are picked up from the input.
`readInputLine()` reads the entire input line. We modified this function so that it returns NULL on EOF. Before we start, `inputLine` is quickly checked to see whether or not its NULL. If it is not, the function `getTokenList()` is run to extract all the tokens from the input and place them into `tokenList`.
Since the `tokenList` will be edited, we create two copies of the list, `cpy` and `og`. Before looking through any tokens, the `parseInputList()` function is utilised to check the validity of `tokenList`. It is also checked whether or not the list is NULL. If these are both valid, we can begin executing the line, else we print an invalid syntax error message.
The execution is done within a while loop that runs until the `cpy` list is entirely consumed and is NULL. Within this while loop, we implemented a switch case which, at this level, should only accept the following enum types:
- EXECUTABLE
- COMPOSITION
- BUILTIN
If the code encounters any other types at this level, it means that the syntax of the input is incorrect. If this happens, an error message is printed out and `cpy` is set to NULL in order to exit the loop.
In the end, we use `free()` and `freeTokenList()` to free the memory.
# bonuses
We have enhanced the functionality of our shell by implementing a few bonuses. These are described in this section.
## cd builtin
Any self-respecting shell needs a way to change the current working directory. Luckily for the end-user, we have implemented exactly that!
Use the builtin `cd` to move around the system. We even added support for moving to the `$HOME` directory - accomplished by witholding any command options (just `cd` will get you home).
Note that this builtin has side-effects, and might therefore edit `status`. If changing directories fails for whatever reason (insufficient permissions, destination doesn't exist, etc), `status` *will* be set to -1.
This builtin provides interactions like these ('>' is input, added in post):
```shell
>realpath .
/home/djairoh/Nextcloud/opsys/assignments/1/src
>cd /home/djairoh/Documents
>realpath .
/home/djairoh/Documents
>cd
>realpath .
/home/djairoh
```
## debug mode
To get an insight into how the shell works under the hood, we've added a custom builtin:
with the `debug` flag, all executed commands will be laid out in glorious JSON format.
This builtin is a toggle, meaning a second use will disable it once more. Provided below is some example input/output with this builtin on. Lines starting with '>' are input (the '>' was manually added post-fact).
```shell
>debug
Toggled debug to 1.
>echo "hello, world!"
{
"commandList": {
"commands": [
{
"arguments": [
"echo",
"hello, world!",
"(null)"
],
"command": "echo",
"capacity": 4,
"numArguments": 3
}],
"capacity": 4,
"numCommands": 1
},
"runInBackground": 0
}
hello, world!
```
## shell prompt (coloured)
Standard in any shell is a prompt. These can range from minimalist (the `sh` prompt only shows the current version of the shell), to things as complicated and customizable as the [Starship](https://starship.rs/) prompt - of which an example is provided here:

While we didn't quite get around to this level of complexity, we have implemented a little prompt consisting of three components:
* the username of the person who started the shell
* the current working directory
* the status/status of the last command, indicated by '>'
The end result looks like this:

## navigable arrow keys
Another standard shell feature, available in everything from the modern fish and zsh, to shells as far back as sh, is that of the arrow key. Because our shell takes input using `getchar()` the user is unable to navigate back in an unfinished prompt. By switching to the GNU readline library, we can make use of the `readline()` function - allowing us to navigate an unfinished prompt in the expected manner.
This is a bit hard to illustrate using text/images, so we recommend you build the shell yourself if you wish to play around with this a little bit.
## tab completion
Another perk of the GNU readline library is it quite neatly allows for tab completion, bash-style. Use tab while entering a command and watch as the shell tries its best to complete it for you.
Note that this only works on filenames - the bash-style tab-completion of variables and executable commands is not implemented. Included below is an example; after entering 'cat' we then pressed tab to showcase the completion, which prints all (matching) files in CWD to the shell.

## background operation
Everyone who has ever used a shell before will be familiar with the background operator, but here's a quick recap anyay: Whenever a command is suffixed with a single ampersand (`&`), it is run in the background and control of the shell is immediately returned to the user. The output streams of the command are still connected to the shell, (ergo, `echo 7 &` will still print '7' to stdout, which might mess with the prompt a little), but the command is run as non-blocking.
Because of the structured approach we took towards chain/command design, it was trivially easy for us to implement this bonus: a simple if statement (checking for the background operator) and we are good to go!
The relevant snippet of code lives in `executeChain` and reads as follows:
```c
// wait on children
int stat;
#if EXT_PROMPT
if (chain.runInBackground) return;
#endif
for (int i=0; i < numCommands; i++) {
waitpid(pids[i], &stat, 0);
if (WIFEXITED(stat)) {
*status = WEXITSTATUS(stat);
}
}
```