Rust implementation of a neural network.
Introduction
For my SAP project, I chose to build a neural network in Rust.
The idea comes from the video "Building a Neural Network from Scratch", which explains the steps to train a neural network "from scratch", i.e. without machine learning libraries like PyTorch or TensorFlow. This implementation is a solution submitted to the Digit Recognizer competition. The goal of this competition is to correctly identify digits from a dataset of tens of thousands of handwritten images (the MNIST dataset).
I used the video and the notebook as a reference implementation. I also used the notebook to generate expected results for the tests.
Multi Layer Perceptron
A Multi Layer Perceptron (MLP) is a type of neural network composed of multiple layers of neurons. There are at least 3 layers: one input layer (input values), one output layer (output values), and one or more hidden layers. The output y of a perceptron is a combination of all the outputs x_{i} of the previous layer, each multiplied by a weight w_{i}, plus a bias b:
y = \sum_{i} x_{i} w_{i} + b
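The formula above can be sketched in a few lines of Rust. This is an illustrative snippet, not the project's actual code (which operates on whole matrices rather than single neurons):

```rust
// Hypothetical sketch of a single perceptron output: y = sum_i(x_i * w_i) + b.
fn perceptron_output(inputs: &[f64], weights: &[f64], bias: f64) -> f64 {
    inputs
        .iter()
        .zip(weights)
        .map(|(x, w)| x * w) // multiply each input by its weight
        .sum::<f64>()
        + bias
}

fn main() {
    // 1.0*0.5 + 2.0*(-0.25) + 0.1 = 0.1
    let y = perceptron_output(&[1.0, 2.0], &[0.5, -0.25], 0.1);
    println!("{y}");
}
```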
Each perceptron output is passed through an activation function before being used by the perceptrons of the next layer. This introduces non-linearity into the model, allowing the network to learn and represent complex patterns in the data. The activation functions used are the Rectified Linear Unit (ReLU) for the input and hidden layers and Softmax for the output layer.
$$ReLU(x) =
\begin{cases}
x &\text{if } x > 0 \\
0 &\text{otherwise}
\end{cases}$$
import numpy as np

def ReLU(X):
    return np.maximum(X, 0)
softmax(x_{i}) = \frac{e^{x_{i}}}{\sum_{j=1}^{K} e^{x_{j}}}
def softmax(X):
    # subtract the max for numerical stability; the result is unchanged
    exp = np.exp(X - np.max(X))
    return exp / exp.sum(axis=0)
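Since the project itself is written in Rust, the two activation functions can also be sketched in Rust. This is a rough slice-based equivalent of the notebook snippets, not the project's actual code:

```rust
// ReLU applied element-wise: negative values become 0.
fn relu(x: &[f64]) -> Vec<f64> {
    x.iter().map(|v| v.max(0.0)).collect()
}

// Softmax: subtracting the max before exponentiating mirrors the numpy
// version and avoids overflow for large inputs.
fn softmax(x: &[f64]) -> Vec<f64> {
    let max = x.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exp: Vec<f64> = x.iter().map(|v| (v - max).exp()).collect();
    let sum: f64 = exp.iter().sum();
    exp.iter().map(|v| v / sum).collect()
}

fn main() {
    println!("{:?}", relu(&[-1.0, 2.0])); // [0.0, 2.0]
    println!("{:?}", softmax(&[1.0, 1.0])); // [0.5, 0.5]
}
```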
Training
Multi-Layer Perceptron training iteratively fine-tunes the model. An iteration is composed of:
- Forward propagation: the input data is fed through the network, and the output of the last layer gives the model prediction. We can then compare the predicted output with the actual value.
- Backward propagation: starting from the output layer, the error is propagated back through the network. The goal is to evaluate, for each weight, how much it contributed to the final error and how it can be adjusted to improve the final prediction. This value is called the gradient.
- Updating each weight with its computed gradient (scaled by the learning rate).
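The final update step above can be sketched as follows. This is a hypothetical illustration, assuming the gradients have already been computed by backward propagation; it is not the project's actual code:

```rust
// Gradient-descent update: move each weight against its gradient,
// scaled by the learning rate.
fn update_weights(weights: &mut [f64], gradients: &[f64], learning_rate: f64) {
    for (w, g) in weights.iter_mut().zip(gradients) {
        *w -= learning_rate * g;
    }
}

fn main() {
    let mut weights = vec![0.5, -0.3];
    let gradients = vec![0.1, -0.2];
    update_weights(&mut weights, &gradients, 0.1);
    // each weight moved a small step against its gradient
    println!("{:?}", weights);
}
```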
Rust implementation
Architecture
─ src
├─ model // multilayer perceptron struct
├─ train // model training functions (forward, backward...)
├─ dataset // dataset struct
├─ arguments // parse and manage script arguments
├─ factory // model / dataset factories (csv / npy parser)
└─ main.rs
─ tests
├─ python_scripts // to generate expected values
└─ tests
Crates
Rust libraries are called crates. This is a non-exhaustive list of the crates I used:
- nalgebra: for data structures such as matrices and vectors, and functions to manipulate these structures
- clap: to parse arguments (e.g. iterations, output destination...)
- egui and eframe: for the demo GUI
- csv: to parse CSV files
- ndarray-npy: to read and write npy files (the NumPy storage format)
Evolution
- parallelize operations: matrix multiplication, weight updates, etc.
- implement convolutional layers (example). A convolutional layer applies filters to the input image to extract features. The filter values are learned during training in the same way as the perceptron weights.
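To make the convolution idea concrete, here is a minimal sketch of a 2D convolution ("valid" mode, stride 1) in plain Rust. It is an illustration of the technique, not a proposed implementation for this project:

```rust
// Naive 2D convolution: each output cell is the sum of an input patch
// multiplied element-wise by the filter. In a convolutional layer the
// filter values would be learned during training.
fn convolve2d(input: &[Vec<f64>], filter: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let (ih, iw) = (input.len(), input[0].len());
    let (fh, fw) = (filter.len(), filter[0].len());
    let mut output = vec![vec![0.0; iw - fw + 1]; ih - fh + 1];
    for i in 0..output.len() {
        for j in 0..output[0].len() {
            for fi in 0..fh {
                for fj in 0..fw {
                    output[i][j] += input[i + fi][j + fj] * filter[fi][fj];
                }
            }
        }
    }
    output
}

fn main() {
    // 3x3 "image", 2x2 averaging filter -> 2x2 feature map
    let image = vec![
        vec![1.0, 2.0, 3.0],
        vec![4.0, 5.0, 6.0],
        vec![7.0, 8.0, 9.0],
    ];
    let filter = vec![vec![0.25; 2]; 2];
    println!("{:?}", convolve2d(&image, &filter)); // [[3.0, 4.0], [6.0, 7.0]]
}
```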
Run the project
Setup
- Install rust
- Extract dataset and pretrained model
unzip ./input/train.zip -d ./input/
unzip ./output/example_model_pretrained.zip -d ./output/example_model_pretrained
Run
Build and run the project (model training + demo):
cargo run
Default project process:
- load the input dataset
- shuffle the dataset data
- split the dataset into a training set and a test set
- train the model on the train set
- evaluate the model on the test set
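The split step above can be sketched as follows. The function name and signature are illustrative, not the project's actual API, and assume the dataset has already been shuffled:

```rust
// Hypothetical train/test split: given a test rate in percent, keep the
// first part of the (shuffled) data for training, the rest for evaluation.
fn split<T>(data: Vec<T>, test_rate: usize) -> (Vec<T>, Vec<T>) {
    let test_len = data.len() * test_rate / 100;
    let mut train = data;
    let test = train.split_off(train.len() - test_len);
    (train, test)
}

fn main() {
    let data: Vec<u32> = (0..10).collect();
    let (train, test) = split(data, 20); // 20% test rate
    println!("{} {}", train.len(), test.len()); // 8 2
}
```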
Alternative use
Possible independent alternative steps:
- load a pretrained model
- skip the training
- skip the evaluation
Examples:
Run only the demo with a pretrained model:
cargo run -- --skip-training --test-rate 0 --input_model ./output/example_model_pretrained
Only evaluate a pretrained model:
cargo run -- --skip-training --test-rate 100 --input_model ./output/example_model_pretrained
Documentation:
cargo run -- --help
Test
Run the test suite:
cargo test

