Source Code

WARNING: All code was developed using either Ubuntu 18.04 or Windows Subsystem for Linux (WSL). The build/test environments are containerized using Docker. There is no obvious reason why anything should fail with other configurations; however, no other environments have not been tested.

Prerequisites

In order to run the code locally, you will need:

GitHub Repository

Every algorithm/data structure on this site has accompanying C code that demonstrates the functionality. All code is located within a single GitHub Repository. The file layout is as follows:

├ docker                    <- Defines the build/test environment for all the c folders below
│   └ c
│     ├ Dockerfile
│     ├ Makefile            <- Defines all build configurations
│     ├ coverage.sh         <- Creates a unit test coverage report
│     └ validate.sh         <- Runs unit tests for every build configuration
├ [section]                 <- There are many section folders such as sorting or data_structures.
│   └ c                     <- Each c folder behaves like a stand-alone project
│     └ includes            <- Holds header files from shared libraries, populated by running libs.sh
│        └ ...
│     ├ [Algorithm].c       <- Each algorithm/data structure has a dedicated c and header file
│     ├ [Algorithm].h
│     ├ [AlgorithmTest].c   <- Unit tests for each algorithm/data structure
│     ├ algo_timer.c        <- Optional: Records the actual run time required to perform n operations
│     ├ compare_times.py    <- Optional: Arranges data generated by algo_timer.c into charts and tables
│     ├ time_charts.sh      <- Optional: Executes compare_times.py
│     ├ TestRunner.c        <- Test runner for all unit tests within the folder
│     └ ...
├ ci_validate.sh            <- Script executed by the CI server to validate entire project
├ libs.sh                   <- Builds the shared libraries and populates the c/includes folders
├ run_c.sh                  <- Builds and tests indvidual c folders
└ ...

Clone the repository locally with the command below.

git clone https://github.com/dalealleshouse/algorithms.git

Building The Code

WARNING: Docker requires permission to map a volume from inside the container to your local drive. If you are using windows, this requires some extra settings. Please refer to the File Sharing section located here. Other configurations may require different settings.

Builds utilize make and clang which come preconfigured in a Docker image. First obtain the algo_test_runner_c Docker image:

// Either pull from docker hub with this command
docker pull dalealleshouse/algo_test_runner_c

// or build locally by navigating to the docker folder and running this command
docker build -t dalealleshouse/algo_test_runner_c -t dalealleshouse/algo_test_runner_c ./ 

Each c folder acts a stand-alone project; however, they all use the same Docker build image. Files are shared between projects via static libraries which aren’t checked into source control. To build the shared static libraries, navigate to the root directory and execute the libs.sh shell script which will populate an include folder inside each c folder. This only needs to happen once.

./libs.sh

Finally, to build and run included unit tests, navigate to the c code folder you wish to build and run one of the following command:

// execute the pre-made shell script 
../../run_c.sh 

// or run the docker command
directly docker run --privileged --rm -it -v $(pwd):/src dalealleshouse/algo_test_runner_c 

Validating the Code

Validation is achieved by two primary means: unit tests and clang sanitizers.

All units tests are written with cunit. Each algorithm/data structure code file has an associated test file that is suffixed with Test. Each test must be referenced in TestRunner.c in order for the test framework to run it. To execute tests using the default configuration, navigate to the desired c folder execute the run_c.sh shell script located in the root directory.

../../run_c.sh

The following clang Sanitizers are configured in docker/Makefile

Due to the nature of sanitizers, they cannot run concurrently1. Therefore, the code must be built separately for each sanitizer. The docker/validate.sh shell script will run the unit tests against each individual sanitizer. To execute all sanitizers, navigate to the desired c folder and run the following command.

../../run_c.sh ./validate.sh

In order to run validate.sh on every c folder, navigate to the root directory and execute the ci_validate.sh shell script. This is the script run by the Continuous Integration (CI) server on every check in to ensure code quality.

./ci_validate.sh

The docker\coverage.sh shell script generates a code coverage report. To execute, run the command below from within the desired c folder. A subdirectory named output will be created with a full html coverage report.

 ../../run_c.sh ./coverage.sh

As an aside, the following clang utilities are enabled in the docker/Makefile in order to enable added run time protection.

Actual Run Times

Where appropriate, actual run time data is provided in order to demonstrate the temporal impact of algorithm and data structure choices. This is instrumental for comprehension of asymptotic complexity. The data is generated by three code files, which are located within the c directories.

  1. algo_timer.c: Records the run time incurred by performing operations with various input sizes. Returns the median of three separate executions.
  2. compare_times.py: Arranges data generated by algo_timer.c into charts and tables
  3. time_charts.sh: Executes compare_times.py

All run time data was collected using a docker container running on a Surface Book 2 Laptop (Intel Core i7, 16GB RAM). You can regenerate the data on your machine simply by executing the time_charts.sh shell script from within the desired c directory.

As a matter of note, the actual run-time is of little importance because it applies only to the machine on which the test was run. The pertinent detail is the comparison between algorithms. The generated charts will have a similar shape regardless of the machine or even implementation language. This is a key concept in algorithmic analysis.

WARNING: Milliseconds are close to non-consequential for most applications. Remember the words of the great Donald Knuth - “pre-mature optimization is the root of all evil.”

Implementation Quality

Every attempt was made to ensure the quality of the provided C code. For instance, each has associated unit tests. The code is subjected to several clang sanitizers as well as clang tidy and attempts are made to return proper error codes. However, this is NOT production hardened code. Before using C in production scenarios, please do additional research to ensure the safety of your code.

Below are a some known shortcomings; there are likely more.

  • Several of the implementations utilize recursion. Recursion provides an elegant means of demonstrating algorithms; however, it causes stack overflow exceptions when the input is sufficiently large2. For many purposes, this is not an issue; however, use in production scenarios should be thoroughly scrutinized.
  • The C code makes extensive use of the rand() function. For many reasons, this is not ideal. Eternally Confused has a great article about what this is true. Furthermore, rand() is never seeded. This ensures that the actual run time comparisons between algorithms accepting randomly sorted arrays are using the exact same inputs. For the purposes of this project, rand() is sufficient. However, a different solution is required for production scenarios.
  • There is little to no thought to given to C code portability. It is assumed that it will always run in the supplied docker container on a 64-bit machine.

Pull Requests

I will happily accept pull request that meet the following criteria:

  1. The request is made against the dev branch
  2. There are associated unit tests
  3. ci_validate.sh executes without error

If I accept your pull request, I will provide proper attribution on this site.

  1. The Undefined Behavior sanitizer can run with any other sanitizer. However, the Memory, Thread, and Address sanitizers cannot run with each other. 

  2. This is not an issue with tail call optimized languages given that the function is written correctly. Clang does have a tail call optimization feature which could be used to mitigate this issue. However, it would require testing.