This project builds a machine learning framework in C++ with minimal dependencies and a modular code structure. The framework is named Apollo and provides the classes and operations needed for neural networks. Its modular design allows users to define custom layer types with custom activations and loss functions. We demonstrate the framework by building a binary classifier that detects spam emails.
This repository contains the standalone code for Neural Network operations. The GUI is available in a separate repository here.
The GUI is built using Qt and provides a user-friendly interface to train and test the model. The GUI is shown below.
Create a new model
Add layers to the model
Train the model
This is the abstract base class for all layers; it cannot be instantiated directly. It provides the members shared by all layers, such as numNeurons, numInputs, dW, and dB. It also declares forward and backward as pure virtual functions, which every derived class must implement.
Layer(int numNeurons, int numInputs, int numOutputs);
Layer(int numNeurons, int *shape);
Layer(Eigen::MatrixXd weights, Eigen::VectorXd biases, int numOutputs);
Layer(int numNeurons, int numInputs, int numOutputs);
Args:
numNeurons
: the number of neurons in the layer.
numInputs
: the number of inputs of the layer.
numOutputs
: the number of outputs of the layer.
Layer(int numNeurons, int *shape);
Args:
numNeurons
: the number of neurons in the layer.
shape
: the shape of the layer.
Layer(Eigen::MatrixXd weights, Eigen::VectorXd biases, int numOutputs);
Args:
weights
: the weights of the layer.
biases
: the biases of the layer.
numOutputs
: the number of outputs of the layer.
int numNeurons;
int numInputs;
int numOutputs;
float learningRate = 0.01;
Eigen::MatrixXd weights;
Eigen::VectorXd biases;
Eigen::MatrixXd inputs;
Eigen::MatrixXd outputs;
Eigen::MatrixXd gradients;
Eigen::MatrixXd weightsGradients;
Eigen::VectorXd biasesGradients;
virtual void update(float learningRate) = 0;
virtual void forward(Eigen::MatrixXd inputs) = 0;
virtual void backward(Eigen::MatrixXd gradients) = 0;
virtual void summary() = 0;
virtual int getTrainableParams() = 0;
void saveWeights(std::string const &path, bool append = false);
void saveBiases(std::string const &path, bool append = false);
void saveGradients(std::string const &path, bool append = false);
void saveLayer(std::string const &path, bool append = false);
virtual void update(float learningRate) = 0;
Args:
learningRate
: the learning rate of the layer.
This method updates the weights and biases of the layer using the gradients calculated in the backward method.
virtual void forward(Eigen::MatrixXd inputs) = 0;
Args:
inputs
: the inputs of the layer.
This method calculates the output of the layer given the input.
virtual void backward(Eigen::MatrixXd gradients) = 0;
Args:
gradients
: the gradients passed back from the next layer.
This method calculates the gradients of the layer given the gradients from the next layer.
virtual void summary() = 0;
This method prints a summary of the layer.
virtual int getTrainableParams() = 0;
This method returns the number of trainable parameters in the layer.
void saveWeights(std::string const &path, bool append = false);
Args:
path
: the path to the file where the weights will be saved.
append
: if true, the weights will be appended to the file, otherwise the file will be overwritten.
This method saves the weights of the layer to a file.
void saveBiases(std::string const &path, bool append = false);
Args:
path
: the path to the file where the biases will be saved.
append
: if true, the biases will be appended to the file, otherwise the file will be overwritten.
This method saves the biases of the layer to a file.
void saveGradients(std::string const &path, bool append = false);
Args:
path
: the path to the file where the gradients will be saved.
append
: if true, the gradients will be appended to the file, otherwise the file will be overwritten.
This method saves the gradients of the layer to a file.
void saveLayer(std::string const &path, bool append = false);
Args:
path
: the path to the file where the layer will be saved.
append
: if true, the layer will be appended to the file, otherwise the file will be overwritten.
This method saves the layer to a file.
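To make the abstract-base pattern described in this section more concrete, here is a minimal sketch of a base class with pure virtual methods and one derived layer. It is not Apollo's code: `SketchLayer` and `Scale` are hypothetical names, and `std::vector<double>` stands in for `Eigen::MatrixXd` so that the example compiles without any dependency.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an abstract layer base class.
struct SketchLayer {
    virtual ~SketchLayer() = default;
    virtual std::vector<double> forward(const std::vector<double> &inputs) = 0;
    virtual std::vector<double> backward(const std::vector<double> &gradients) = 0;
    virtual int getTrainableParams() const = 0;
};

// Toy derived layer that multiplies every input by a fixed factor.
struct Scale : SketchLayer {
    double factor;
    std::size_t lastInputSize = 0;  // recorded by forward() to validate backward()

    explicit Scale(double f) : factor(f) {}

    std::vector<double> forward(const std::vector<double> &inputs) override {
        lastInputSize = inputs.size();
        std::vector<double> outputs(inputs.size());
        for (std::size_t i = 0; i < inputs.size(); ++i) {
            outputs[i] = factor * inputs[i];
        }
        return outputs;
    }

    // d(factor * x)/dx = factor, so incoming gradients are scaled by the same factor.
    std::vector<double> backward(const std::vector<double> &gradients) override {
        assert(gradients.size() == lastInputSize);
        std::vector<double> gradientsOut(gradients.size());
        for (std::size_t i = 0; i < gradients.size(); ++i) {
            gradientsOut[i] = factor * gradients[i];
        }
        return gradientsOut;
    }

    int getTrainableParams() const override { return 0; }  // factor is not trained
};
```

Because `SketchLayer` has pure virtual methods, it cannot be instantiated; code that holds a `SketchLayer &` can still call `forward` and `backward` on any derived layer.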
This is a fully connected layer. It is derived from the Layer class. It provides the forward and backward methods for a fully connected layer.
Dense(int numNeurons, int numInputs, int numOutputs);
Dense(int numNeurons, int *shape);
Dense(Eigen::MatrixXd weights, Eigen::VectorXd biases, int numOutputs);
Dense(int numNeurons, int numInputs, int numOutputs);
Args:
numNeurons
: the number of neurons in the layer.
numInputs
: the number of inputs of the layer.
numOutputs
: the number of outputs of the layer.
Dense(int numNeurons, int *shape);
Args:
numNeurons
: the number of neurons in the layer.
shape
: the shape of the layer.
Dense(Eigen::MatrixXd weights, Eigen::VectorXd biases, int numOutputs);
Args:
weights
: the weights of the layer.
biases
: the biases of the layer.
numOutputs
: the number of outputs of the layer.
void update(float learningRate) override;
void forward(Eigen::MatrixXd inputs) override;
void backward(Eigen::MatrixXd gradients) override;
void summary() override;
int getTrainableParams() override;
void update(float learningRate);
Args:
learningRate
: the learning rate of the layer.
This method updates the weights and biases of the layer using the gradients calculated in the backward method.
void forward(Eigen::MatrixXd inputs);
Args:
inputs
: the inputs of the layer.
This method calculates the output of the layer given the input.
void backward(Eigen::MatrixXd gradients);
Args:
gradients
: the gradients passed back from the next layer.
This method calculates the gradients of the layer given the gradients from the next layer.
void summary();
This method prints a summary of the layer.
int getTrainableParams();
This method returns the number of trainable parameters in the layer.
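The arithmetic behind a fully connected layer can be sketched as follows. This is an illustration under stated assumptions, not Apollo's implementation: `DenseSketch`, `Vec`, and `Mat` are hypothetical names, the layer processes a single sample rather than a batch, and the update is assumed to be plain gradient descent.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // row-major: Mat[row][col]

// Hypothetical single-sample dense layer: outputs = weights * inputs + biases.
struct DenseSketch {
    Mat weights;  // numNeurons x numInputs
    Vec biases;   // numNeurons
    Vec inputs;   // cached by forward() for use in backward()
    Mat weightsGradients;
    Vec biasesGradients;

    DenseSketch(Mat w, Vec b) : weights(std::move(w)), biases(std::move(b)) {
        assert(weights.size() == biases.size());
    }

    Vec forward(const Vec &x) {
        inputs = x;
        Vec outputs(biases);  // start from the biases, then add weights * x
        for (std::size_t n = 0; n < weights.size(); ++n) {
            assert(weights[n].size() == x.size());
            for (std::size_t i = 0; i < x.size(); ++i) outputs[n] += weights[n][i] * x[i];
        }
        return outputs;
    }

    // Takes dLoss/dOutputs, stores dLoss/dWeights and dLoss/dBiases,
    // and returns dLoss/dInputs for the previous layer.
    Vec backward(const Vec &gradients) {
        assert(gradients.size() == weights.size());
        weightsGradients.assign(weights.size(), Vec(inputs.size(), 0.0));
        biasesGradients = gradients;
        Vec gradInputs(inputs.size(), 0.0);
        for (std::size_t n = 0; n < weights.size(); ++n) {
            for (std::size_t i = 0; i < inputs.size(); ++i) {
                weightsGradients[n][i] = gradients[n] * inputs[i];
                gradInputs[i] += weights[n][i] * gradients[n];
            }
        }
        return gradInputs;
    }

    // Plain gradient descent step (assumed update rule).
    void update(double learningRate) {
        for (std::size_t n = 0; n < weights.size(); ++n) {
            biases[n] -= learningRate * biasesGradients[n];
            for (std::size_t i = 0; i < weights[n].size(); ++i) {
                weights[n][i] -= learningRate * weightsGradients[n][i];
            }
        }
    }

    // One weight per (neuron, input) pair plus one bias per neuron.
    int getTrainableParams() const {
        const std::size_t perNeuron = weights.empty() ? 0 : weights[0].size();
        return static_cast<int>(weights.size() * perNeuron + biases.size());
    }
};
```

In this sketch, a layer with 2 neurons and 2 inputs has 2 * 2 + 2 = 6 trainable parameters.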
This is an activation layer. It is derived from the Layer class. It provides the forward and backward methods for an activation layer.
This is a sigmoid activation layer. It is derived from the Activation class. It provides the forward and backward methods for a sigmoid activation layer.
Sigmoid(Eigen::MatrixXd &inputs);
Sigmoid(int numInputs, int numOutputs);
Sigmoid(int *shape);
Sigmoid(Eigen::MatrixXd &inputs);
Args:
inputs
: the inputs of the layer.
Sigmoid(int numInputs, int numOutputs);
Args:
numInputs
: the number of inputs of the layer.
numOutputs
: the number of outputs of the layer.
Sigmoid(int *shape);
Args:
shape
: the shape of the layer.
Eigen::MatrixXd inputs;
Eigen::MatrixXd outputs;
Eigen::MatrixXd gradientsOut;
void forward(Eigen::MatrixXd &inputs);
void backward(Eigen::MatrixXd &gradients);
int *getInputShape();
int *getOutputShape();
void summary();
void forward(Eigen::MatrixXd &inputs);
Args:
inputs
: the inputs of the layer.
This method calculates the output of the layer given the input.
void backward(Eigen::MatrixXd &gradients);
Args:
gradients
: the gradients passed back from the next layer.
This method calculates the gradients of the layer given the gradients from the next layer.
int *getInputShape();
This method returns the shape of the input matrix to the layer.
int *getOutputShape();
This method returns the shape of the output matrix from the layer.
void summary();
This method prints a summary of the layer.
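For reference, the math behind a sigmoid activation's forward and backward passes can be sketched as below. The function names `sigmoidForward` and `sigmoidBackward` are hypothetical, and `std::vector<double>` replaces `Eigen::MatrixXd`; this only illustrates the standard formulas and is not taken from Apollo's source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Element-wise sigmoid: s(x) = 1 / (1 + e^(-x)).
std::vector<double> sigmoidForward(const std::vector<double> &inputs) {
    std::vector<double> outputs(inputs.size());
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        outputs[i] = 1.0 / (1.0 + std::exp(-inputs[i]));
    }
    return outputs;
}

// Chain rule: dLoss/dInput = dLoss/dOutput * s * (1 - s),
// where s is the output computed by the forward pass.
std::vector<double> sigmoidBackward(const std::vector<double> &outputs,
                                    const std::vector<double> &gradients) {
    assert(outputs.size() == gradients.size());
    std::vector<double> gradientsOut(outputs.size());
    for (std::size_t i = 0; i < outputs.size(); ++i) {
        gradientsOut[i] = gradients[i] * outputs[i] * (1.0 - outputs[i]);
    }
    return gradientsOut;
}
```

The backward pass reuses the forward outputs, which is presumably why the layer stores `outputs` as an attribute.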
Loss is a namespace that contains the loss functions and their derivatives.
enum lossFunction{
BCE,
MSE
};
This is the binary cross entropy loss function. It is used for binary classification problems.
namespace Loss {
...
std::tuple<Eigen::MatrixXd, float> BCE(Eigen::MatrixXd &outputs, Eigen::MatrixXd &targets);
float BCEValue(Eigen::MatrixXd &loss);
...
}
std::tuple<Eigen::MatrixXd, float> BCE(Eigen::MatrixXd &outputs, Eigen::MatrixXd &targets);
Args:
outputs
: output matrix from the network.
targets
: target matrix.
This function calculates the binary cross entropy loss and its derivative.
Returns:
std::tuple<Eigen::MatrixXd, float>
: gradients and loss value.
float BCEValue(Eigen::MatrixXd &loss);
Args:
loss
: loss matrix.
This function calculates the binary cross entropy loss value.
Returns:
float
: loss value.
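As a rough sketch of what a binary cross entropy function computes, the example below returns the gradients and the loss value in the same order as the `BCE` signature above. Several details are assumptions rather than facts about Apollo: the name `bceSketch`, the averaging of the loss over all samples, and the clipping constant `eps` used to avoid `log(0)`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <tuple>
#include <vector>

// Returns (gradients, loss) for predictions `outputs` in (0, 1) and 0/1 `targets`.
std::tuple<std::vector<double>, double> bceSketch(const std::vector<double> &outputs,
                                                  const std::vector<double> &targets) {
    assert(outputs.size() == targets.size() && !outputs.empty());
    const double eps = 1e-12;  // assumed clipping constant to avoid log(0)
    const double n = static_cast<double>(outputs.size());
    std::vector<double> gradients(outputs.size());
    double loss = 0.0;
    for (std::size_t i = 0; i < outputs.size(); ++i) {
        const double o = std::min(std::max(outputs[i], eps), 1.0 - eps);
        const double t = targets[i];
        // Per-sample loss: -(t * log(o) + (1 - t) * log(1 - o)).
        loss += -(t * std::log(o) + (1.0 - t) * std::log(1.0 - o));
        // Derivative of the mean loss with respect to output i.
        gradients[i] = (o - t) / (o * (1.0 - o)) / n;
    }
    return std::make_tuple(gradients, loss / n);
}
```

With outputs `{0.5, 0.5}` and targets `{1, 0}`, every sample contributes `log(2)`, so the mean loss is `log(2)`.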
Dataloader is a class that loads a dataset, either from a CSV file or from data and label matrices already in memory, and exposes it as training and validation sets.
Dataloader(Eigen::MatrixXd data, Eigen::MatrixXd labels, int batchSize);
Dataloader(Eigen::MatrixXd data, Eigen::MatrixXd labels);
Dataloader(std::string const &path);
Dataloader(std::string const &path, float trainSplit);
// NOTE: Make sure the file contains trainLabels in the first column
Dataloader(Eigen::MatrixXd data, Eigen::MatrixXd labels, int batchSize);
Args:
data
: the data matrix.
labels
: the labels matrix.
batchSize
: the batch size.
Dataloader(Eigen::MatrixXd data, Eigen::MatrixXd labels);
Args:
data
: the data matrix.
labels
: the labels matrix.
Dataloader(std::string const &path);
Args:
path
: the path to the csv file.
Dataloader(std::string const &path, float trainSplit);
Args:
path
: the path to the csv file.
trainSplit
: the split ratio between the training and validation data.
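One plausible reading of `trainSplit` is the fraction of rows kept for training, with the remainder used for validation. The sketch below illustrates that reading only; `splitRows` and `Rows` are hypothetical names, and the real Dataloader may shuffle or split differently.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Rows = std::vector<std::vector<double>>;

// Splits rows into (train, validation). ASSUMPTION: trainSplit is the
// fraction of rows kept for training, taken from the start of the data.
std::pair<Rows, Rows> splitRows(const Rows &rows, float trainSplit) {
    assert(trainSplit >= 0.0f && trainSplit <= 1.0f);
    const auto numTrain = static_cast<std::ptrdiff_t>(rows.size() * trainSplit);
    Rows train(rows.begin(), rows.begin() + numTrain);
    Rows val(rows.begin() + numTrain, rows.end());
    return {train, val};
}
```

For example, four rows with `trainSplit = 0.75f` give three training rows and one validation row.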
Eigen::MatrixXd trainData;
Eigen::MatrixXd trainLabels;
Eigen::MatrixXd valData;
Eigen::MatrixXd valLabels;
int batchSize;
int numBatches;
Eigen::MatrixXd &getTrainLabels();
Eigen::MatrixXd &getTrainData();
Eigen::MatrixXd &getValData();
Eigen::MatrixXd &getValLabels();
void head(int n);
int* getTrainDataShape();
int* getTrainLabelsShape();
int * getValDataShape();
int* getValLabelsShape();
void head(int n);
Args:
n
: the number of rows to print.
This method prints the first n rows of the first n columns of the training data and labels.
Model is a class that is used to create, train, and evaluate a neural network.
Model();
Model(int *inputShape, bool verb, float learningRate = 0.001, int numClasses = 1);
Model(int *inputShape, bool verb, float learningRate = 0.001, int numClasses = 1);
Args:
inputShape
: the shape of the input matrix.
verb
: the verbosity of the model.
learningRate
: the learning rate of the model.
numClasses
: the number of classes in the dataset.
Some attributes are:
vector<variant<Dense, Sigmoid>> layers;
float loss;
void addLayer(MultiType *layer);
void compile();
void fit(Eigen::MatrixXd &inputs, Eigen::MatrixXd &labels, int epochs, enum lossFunction, bool verb);
void
fit(Eigen::MatrixXd &trainX, Eigen::MatrixXd &trainY, Eigen::MatrixXd &valX, Eigen::MatrixXd &valY, int epochs,
enum lossFunction, bool verb);
void
fit(Eigen::MatrixXd &trainX, Eigen::MatrixXd &trainY, Eigen::MatrixXd &valX, Eigen::MatrixXd &valY, int epochs,
enum lossFunction, bool verb, bool saveEpoch, string filename);
void
fit(Eigen::MatrixXd &trainX, Eigen::MatrixXd &trainY, Eigen::MatrixXd &valX, Eigen::MatrixXd &valY, int epochs,
enum lossFunction, bool verb, bool saveEpoch, string filename, bool earlyStopping, int threshold);
Eigen::MatrixXd predict(Eigen::MatrixXd inputs);
void evaluate(Eigen::MatrixXd inputs, Eigen::MatrixXd labels, enum lossFunction lossType);
int *getLastLayerOutputShape();
int *getLastLayerInputShape();
void summary();
void saveModel(const std::string &path);
void loadModel(const std::string &path);
using MultiType = variant<Dense, Sigmoid>;
std::variant is a type-safe union introduced in C++17. It is used to store different types of layers in a single vector.
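The following sketch shows how a `std::variant` can hold different layer types in one vector and how `std::visit` dispatches on the stored type. `DenseStub`, `SigmoidStub`, `LayerVariant`, and `TrainableParams` are hypothetical stand-ins, not Apollo's classes; the example requires C++17.

```cpp
#include <cassert>
#include <variant>
#include <vector>

// Hypothetical minimal layer types standing in for Dense and Sigmoid.
struct DenseStub {
    int numNeurons;
    int numInputs;
};
struct SigmoidStub {
    int numInputs;
};

using LayerVariant = std::variant<DenseStub, SigmoidStub>;

// Visitor with one overload per alternative; std::visit picks the right one.
struct TrainableParams {
    int operator()(const DenseStub &d) const {
        assert(d.numNeurons >= 0 && d.numInputs >= 0);
        return d.numNeurons * d.numInputs + d.numNeurons;  // weights + biases
    }
    int operator()(const SigmoidStub &) const { return 0; }  // nothing to train
};

int totalTrainableParams(const std::vector<LayerVariant> &layers) {
    int total = 0;
    for (const LayerVariant &layer : layers) {
        total += std::visit(TrainableParams{}, layer);
    }
    return total;
}
```

Unlike a vector of base-class pointers, the variant stores the layers by value and the compiler rejects a visitor that does not handle every alternative.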
void addLayer(MultiType *layer);
Args:
layer
: a pointer to the layer to be added.
This method adds a layer to the model.
void compile();
This method asserts that the input and output shapes of consecutive layers are compatible and throws an error if they are not.
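A compatibility check of this kind could look like the sketch below. It assumes that "compatible" means each layer's output count equals the next layer's input count; `LayerShape` and `checkShapes` are hypothetical names, and the exception type is an assumption as well.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical per-layer shape record.
struct LayerShape {
    int numInputs;
    int numOutputs;
};

// Throws std::invalid_argument if any layer's output count differs
// from the input count of the layer that follows it.
void checkShapes(const std::vector<LayerShape> &layers) {
    for (std::size_t i = 0; i + 1 < layers.size(); ++i) {
        assert(layers[i].numInputs > 0 && layers[i].numOutputs > 0);
        if (layers[i].numOutputs != layers[i + 1].numInputs) {
            throw std::invalid_argument("shape mismatch between layers " +
                                        std::to_string(i) + " and " +
                                        std::to_string(i + 1));
        }
    }
}
```

Running such a check once before training turns a shape error into a clear message instead of a failure deep inside a matrix product.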
void fit(Eigen::MatrixXd &inputs, Eigen::MatrixXd &labels, int epochs, enum lossFunction, bool verb);
Args:
inputs
: the input matrix.
labels
: the labels matrix.
epochs
: the number of epochs.
lossFunction
: the loss function to be used.
verb
: the verbosity of the model.
This method trains the model.
enum lossFunction{
BCE,
MSE
};
void fit(Eigen::MatrixXd &trainX, Eigen::MatrixXd &trainY, Eigen::MatrixXd &valX, Eigen::MatrixXd &valY, int epochs, enum lossFunction, bool verb);
Args:
trainX
: the training input matrix.
trainY
: the training labels matrix.
valX
: the validation input matrix.
valY
: the validation labels matrix.
epochs
: the number of epochs.
lossFunction
: the loss function to be used.
verb
: the verbosity of the model.
This method trains the model on the training data and evaluates it on the validation data.
void fit(Eigen::MatrixXd &trainX, Eigen::MatrixXd &trainY, Eigen::MatrixXd &valX, Eigen::MatrixXd &valY, int epochs, enum lossFunction, bool verb, bool saveEpoch, string filename);
Args:
trainX
: the training input matrix.
trainY
: the training labels matrix.
valX
: the validation input matrix.
valY
: the validation labels matrix.
epochs
: the number of epochs.
lossFunction
: the loss function to be used.
verb
: the verbosity of the model.
saveEpoch
: whether to save the model after every epoch.
filename
: the name of the file to save the model.
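This method trains the model on the training data and evaluates it on the validation data. If saveEpoch is true, the model is also saved to filename after every epoch.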
void fit(Eigen::MatrixXd &trainX, Eigen::MatrixXd &trainY, Eigen::MatrixXd &valX, Eigen::MatrixXd &valY, int epochs, enum lossFunction, bool verb, bool saveEpoch, string filename, bool earlyStopping, int threshold);
Args:
trainX
: the training input matrix.
trainY
: the training labels matrix.
valX
: the validation input matrix.
valY
: the validation labels matrix.
epochs
: the number of epochs.
lossFunction
: the loss function to be used.
verb
: the verbosity of the model.
saveEpoch
: whether to save the model after every epoch.
filename
: the name of the file to save the model.
earlyStopping
: whether to use early stopping.
threshold
: the threshold for early stopping.
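This method trains the model on the training data and evaluates it on the validation data. If earlyStopping is true, training may stop before all epochs have run, as controlled by threshold.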
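The documentation does not spell out how `threshold` is interpreted. A common early-stopping scheme, shown here purely as an illustration, stops once the validation loss has failed to improve for `threshold` consecutive epochs. `earlyStopEpoch` is a hypothetical helper, and Apollo's actual criterion may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Returns the 0-based epoch at which training would stop, given the
// validation loss recorded after each epoch.
// ASSUMPTION: `threshold` counts consecutive epochs without improvement.
std::size_t earlyStopEpoch(const std::vector<double> &valLosses, int threshold) {
    assert(threshold > 0);
    double best = std::numeric_limits<double>::infinity();
    int epochsWithoutImprovement = 0;
    for (std::size_t epoch = 0; epoch < valLosses.size(); ++epoch) {
        if (valLosses[epoch] < best) {
            best = valLosses[epoch];
            epochsWithoutImprovement = 0;
        } else if (++epochsWithoutImprovement >= threshold) {
            return epoch;  // stop early at this epoch
        }
    }
    // No early stop: training ran through the last epoch.
    return valLosses.empty() ? 0 : valLosses.size() - 1;
}
```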
Eigen::MatrixXd predict(Eigen::MatrixXd inputs);
Args:
inputs
: the input matrix.
Returns:
outputs
: the output matrix.
This method predicts the output for the given input.
void evaluate(Eigen::MatrixXd inputs, Eigen::MatrixXd labels, enum lossFunction lossType);
Args:
inputs
: the input matrix.
labels
: the labels matrix.
lossType
: the loss function to be used.
This method evaluates the model.
int *getLastLayerOutputShape();
This method returns the output shape of the last layer.
int *getLastLayerInputShape();
This method returns the input shape of the last layer.
void summary();
This method prints a summary of the model, including the details of each layer (its neurons and input/output shape) and the total number of trainable parameters.
void saveModel(const std::string &path);
Args:
path
: the path to save the model.
This method saves the model to the given path.
void loadModel(const std::string &path);
Args:
path
: the path to load the model from.
This method loads the model from the specified path.
Preprocessing is a namespace that contains functions to preprocess the data before training or evaluating the model.
namespace Preprocessing {
Eigen::MatrixXd normalize(Eigen::MatrixXd matrix);
Eigen::MatrixXd standardize(Eigen::MatrixXd matrix);
Eigen::MatrixXd spamPreprocessingFile(const std::string &path);
Eigen::MatrixXd spamPreprocessing(const std::string &email);
}
Eigen::MatrixXd normalize(Eigen::MatrixXd matrix);
Args:
matrix
: the matrix to be normalized.
This function normalizes the input matrix by dividing each element by the maximum value in the matrix.
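The max-based normalization described above can be sketched in a few lines. `normalizeSketch` is a hypothetical name and operates on a flat `std::vector<double>` instead of an `Eigen::MatrixXd`; how the real function handles a zero maximum is not documented, so the sketch simply asserts against it.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Divides every element by the largest element.
std::vector<double> normalizeSketch(std::vector<double> values) {
    assert(!values.empty());
    const double maxValue = *std::max_element(values.begin(), values.end());
    assert(maxValue != 0.0);  // a zero maximum would mean dividing by zero
    for (double &v : values) v /= maxValue;
    return values;
}
```

For non-negative data this maps all values into the range [0, 1].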
Eigen::MatrixXd standardize(Eigen::MatrixXd matrix);
Args:
matrix
: the matrix to be standardized.
This function standardizes the input matrix by subtracting the mean and dividing by the standard deviation.
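A minimal sketch of standardization follows. It assumes the mean and standard deviation are taken over all elements and that the population formula (dividing by n) is used; neither detail is stated in the documentation, and `standardizeSketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Rescales values to zero mean and unit standard deviation.
std::vector<double> standardizeSketch(std::vector<double> values) {
    assert(!values.empty());
    const double n = static_cast<double>(values.size());
    double mean = 0.0;
    for (double v : values) mean += v;
    mean /= n;
    double variance = 0.0;
    for (double v : values) variance += (v - mean) * (v - mean);
    variance /= n;  // population variance (an assumption)
    const double stddev = std::sqrt(variance);
    assert(stddev > 0.0);  // constant input cannot be standardized
    for (double &v : values) v = (v - mean) / stddev;
    return values;
}
```

For the values `{2, 4, 4, 4, 5, 5, 7, 9}` the mean is 5 and the standard deviation is 2, so 2 maps to -1.5 and 9 maps to 2.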
Eigen::MatrixXd spamPreprocessingFile(const std::string &path);
Args:
path
: the path to the file to be preprocessed.
This function preprocesses the data from the file and returns the input matrix containing the frequency of 3000 predefined words in the email in a specific order.
This function is exclusively for the spam email classification problem.
Eigen::MatrixXd spamPreprocessing(const std::string &email);
Args:
email
: the email to be preprocessed.
This function preprocesses the data from the email string and returns the input matrix containing the frequency of 3000 predefined words in the email in a specific order.
This function is exclusively for the spam email classification problem.
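The idea of turning an email into a vector of word frequencies can be sketched as follows. The tokenization rules here (lowercasing, splitting on non-letters) and the name `wordFrequencies` are assumptions made for illustration; the real functions use a predefined list of 3000 words, whereas the sketch accepts any vocabulary.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Counts how often each vocabulary word occurs in the email.
// The result keeps the order of `vocabulary`.
std::vector<double> wordFrequencies(const std::string &email,
                                    const std::vector<std::string> &vocabulary) {
    assert(!vocabulary.empty());
    std::unordered_map<std::string, std::size_t> indexOf;
    for (std::size_t i = 0; i < vocabulary.size(); ++i) indexOf[vocabulary[i]] = i;

    std::vector<double> counts(vocabulary.size(), 0.0);
    std::string word;
    // The appended space flushes the final word through the same code path.
    for (char c : email + ' ') {
        if (std::isalpha(static_cast<unsigned char>(c))) {
            word += static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        } else if (!word.empty()) {
            auto it = indexOf.find(word);
            if (it != indexOf.end()) counts[it->second] += 1.0;
            word.clear();
        }
    }
    return counts;
}
```

Keeping the counts in vocabulary order matters because the model's input columns must line up with the columns it saw during training.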
The dataset used for the spam email classification problem can be found here. Check for more details.
The header of the dataset must be copied to the Preprocessing/files directory.
linalg is a namespace which provides some linear algebra functions used in the training process.
namespace linalg {
Eigen::MatrixXd broadcast(Eigen::MatrixXd matrix, int size, int axis);
Eigen::MatrixXd broadcast(Eigen::MatrixXd matrix, Eigen::MatrixXd shape, int axis);
}
Eigen::MatrixXd broadcast(Eigen::MatrixXd matrix, int size, int axis);
Args:
matrix
: the matrix to be broadcast.
size
: the size to broadcast the matrix to.
axis
: the axis along which the matrix is to be broadcast.
This function broadcasts the input matrix to the given size along the given axis.
Eigen::MatrixXd broadcast(Eigen::MatrixXd matrix, Eigen::MatrixXd shape, int axis);
Args:
matrix
: the matrix to be broadcast.
shape
: the shape to broadcast the matrix to.
axis
: the axis along which the matrix is to be broadcast.
This function broadcasts the input matrix to the given shape along the given axis.
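The exact semantics of `broadcast` are not documented here. One plausible interpretation, sketched below, repeats a single row `size` times along axis 0 or a single column `size` times along axis 1, for example to add a bias vector to every sample in a batch. `broadcastSketch` and `Mat` are hypothetical names, and the axis convention is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Mat = std::vector<std::vector<double>>;  // row-major: Mat[row][col]

// ASSUMED semantics: axis 0 repeats a 1 x n row into size x n;
// axis 1 repeats an m x 1 column into m x size.
Mat broadcastSketch(const Mat &matrix, int size, int axis) {
    assert(size > 0 && (axis == 0 || axis == 1));
    assert(!matrix.empty());
    if (axis == 0) {
        assert(matrix.size() == 1);  // expects a single row
        return Mat(static_cast<std::size_t>(size), matrix[0]);
    }
    Mat out;
    for (const auto &row : matrix) {
        assert(row.size() == 1);  // expects a single column
        out.emplace_back(static_cast<std::size_t>(size), row[0]);
    }
    return out;
}
```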