L2 • 🤖

A multidimensional array and deep learning library implemented from scratch in C++

Made by Bilal Khan • https://bilal.software

What is L2?
Quick start
Design choices
Documentation
Contributing
Authors
License
Acknowledgements

What is L2?

L2 is named after the L2 (Euclidean) distance, a popular measure in deep learning.

L2 is a deep learning library written in C++17 using only the standard library. It contains a multidimensional array class, Tensor, with support for strided arrays, numpy-style array slicing, broadcasting, and most major math operations (including matrix multiplication!). On top of this, the Parameter, Layer, Sequential, Loss, Optimizer, and Trainer classes allow for running high-level machine learning experiments without worrying about the low-level implementations.

Quick start

L2::Tensor<double> x = L2::Tensor<double>({100, 10}).normal(0, 1);

L2::Tensor<double> w = L2::Tensor<double>({10, 1}).normal(0, 1);
L2::Tensor<double> b = L2::Tensor<double>({1}).normal(0, 1);

L2::Tensor<double> y = L2::matmul(x, w) + b;

L2::nn::loss::MSE<double> *criterion = new L2::nn::loss::MSE<double>();
L2::nn::optimizer::SGD<double> *optimizer = new L2::nn::optimizer::SGD<double>(0.05);

L2::nn::Sequential<double> *sequential = new L2::nn::Sequential<double>({
    new L2::nn::Linear<double>(10, 1) //
});

L2::trainer::Trainer<double> trainer = L2::trainer::Trainer<double>(sequential, criterion, optimizer);

trainer.fit(x, y, 10, 10);

L2::Tensor<double> y_hat = trainer.predict(x);

y_hat.print();

Design choices

L2 only supports a CPU backend at the moment since I'm not familiar enough with C++ to start working with CUDA and cuDNN. Version 1 of the library primarily uses pass-by-value to reduce complexity at the cost of reduced efficiency. Version 2 of the library will focus on making the Tensor class more efficient. Currently, only the Linear and Sigmoid layers, the MSE loss, and the SGD optimizer have been implemented, but V2 will add more layers and modules.

Documentation

L2

L2::Tensor<double>({3, 3})

Create a tensor

// Create a tensor of zeros with a shape of 3x3
L2::Tensor<double> x = L2::Tensor<double>({3, 3});

// Create a tensor from a vector with a shape of 3x3
std::vector<double> vector{1, 2, 3, 4, 5, 6, 7, 8, 9};
x = L2::Tensor<double>(vector, {3, 3});

Numpy style array slicing

// Get the first row from Tensor x
L2::Tensor<double> y = x({{0, 1}}); // slices [0, 1) and [0, -1), where an end index of -1 means the end of the dimension

// Get the first column from Tensor x
L2::Tensor<double> y = x({{0, -1}, {0, 1}}); // slices [0, -1) and [0, 1)

// Get the first two columns and first two rows from Tensor x
L2::Tensor<double> z = x({{0, 2}, {0, 2}}); // slices [0, 2) and [0, 2)
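
The slicing examples above can be read as half-open index ranges per dimension. The following is a minimal, self-contained sketch of that idea on a flat row-major buffer. `slice2d` is a hypothetical helper written for illustration and is not part of L2; treating an end index of -1 as "through the last element" is an assumption inferred from the examples above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of L2: copies rows [r0, r1) and columns
// [c0, c1) out of a row-major matrix with `rows` rows and `cols` columns.
// Assumption: an end index of -1 means "through the last element".
std::vector<double> slice2d(const std::vector<double>& data, int rows, int cols,
                            int r0, int r1, int c0, int c1) {
    if (r1 == -1) r1 = rows;
    if (c1 == -1) c1 = cols;
    assert(static_cast<int>(data.size()) == rows * cols);
    assert(0 <= r0 && r0 <= r1 && r1 <= rows);
    assert(0 <= c0 && c0 <= c1 && c1 <= cols);

    std::vector<double> out;
    for (int r = r0; r < r1; ++r) {
        for (int c = c0; c < c1; ++c) {
            out.push_back(data[r * cols + c]);
        }
    }
    return out;
}
```

With the 3x3 matrix 1..9, `slice2d(m, 3, 3, 0, 1, 0, -1)` yields the first row and `slice2d(m, 3, 3, 0, -1, 0, 1)` yields the first column.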

Change dimensions of a Tensor

// Change the shape of a tensor (Strided arrays let you change the user-visible shape without changing the order of the data elements)
L2::Tensor<double> y = x.view({9}); // shape: (9)

// Reshape to -1
L2::Tensor<double> y = x.view({-1}); // shape: (9)

// Add a dimension to a Tensor
// shape: (3, 3)
L2::Tensor<double> y = x.unsqueeze(0); // shape: (1, 3, 3)

// Transpose a Tensor
// shape: (4, 3)
L2::Tensor<double> y = x.transpose(); // shape: (3, 4)
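
To illustrate why strides let the visible shape change without moving data, here is a small sketch of a strided 2-D view. `StridedView`, `row_major_strides`, and `transpose` are hypothetical names used only for this example; L2's Tensor class may be organized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a strided 2-D array; not L2's actual Tensor class.
struct StridedView {
    std::vector<double> data;  // flat storage, never reordered
    std::vector<int> shape;    // user-visible shape
    std::vector<int> strides;  // elements to skip per step along each dimension

    // Element (i, j) lives at offset i * strides[0] + j * strides[1].
    double at(int i, int j) const {
        assert(shape.size() == 2 && strides.size() == 2);
        return data[static_cast<std::size_t>(i * strides[0] + j * strides[1])];
    }
};

// Row-major strides: the last dimension is contiguous.
std::vector<int> row_major_strides(const std::vector<int>& shape) {
    std::vector<int> strides(shape.size(), 1);
    for (int d = static_cast<int>(shape.size()) - 2; d >= 0; --d) {
        strides[d] = strides[d + 1] * shape[d + 1];
    }
    return strides;
}

// Transposing a 2-D view swaps shape and strides; the data is untouched.
StridedView transpose(StridedView v) {
    assert(v.shape.size() == 2 && v.strides.size() == 2);
    std::swap(v.shape[0], v.shape[1]);
    std::swap(v.strides[0], v.strides[1]);
    return v;
}
```

For a shape of (3, 3) this gives strides of (3, 1), matching the output of x.print() shown in this README.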

Get information about a Tensor

// Print info about the Tensor to std::cout
>>> x.print();
data:

0, 0, 0, 0, 0, 0, 0, 0, 0,

size:

3, 3,

strides:

3, 1,

dtype:

double
>>>

// Get the shape
std::vector<int> shape = x.get_shape(); // [3, 3]

// Get the data
std::vector<double> data = x.get_data(); // [1, 2, 3, 4, 5, 6, 7, 8, 9]

// Get the number of elements in the Tensor
int length = x.length(); // 9

// Get the type of the Tensor
std::string type = x.type(); // double

Operations on tensors (with broadcasting!)

// Concatenate Tensors
L2::Tensor<double> x = L2::Tensor<double>({3, 3}).zeros();
L2::Tensor<double> y = L2::Tensor<double>({4, 3}).zeros();

L2::Tensor<double> z = L2::cat({x, y}, 0); // shape: (7, 3)

// Add values to all elements in a Tensor
L2::Tensor<double> y = x + 1;

// Inplace operations
x += 2.0;

// exp(), log(), sqrt()
L2::Tensor<double> y = x.log();

// inplace version
x.log_();

// Add a tensor to a tensor
L2::Tensor<double> x = L2::Tensor<double>({3, 3}).normal(0, 1);
L2::Tensor<double> y = L2::Tensor<double>({3}).normal(0, 1);

L2::Tensor<double> z = x + y; // y (shape 3) is broadcast across x and added to each row

// Sum up all values in a Tensor
L2::Tensor<double> y = x.sum(); // y has a shape of 1

// Sum up all values along a dimension
L2::Tensor<double> y = x.sum(0); // y has a shape of 3
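
As a rough illustration of what broadcasting and a sum along a dimension compute, the sketch below works on flat row-major buffers. `add_broadcast` and `sum_dim0` are hypothetical helpers, not L2 functions, and the example assumes NumPy's rule that a vector of shape (cols) lines up with the last dimension of a (rows, cols) matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers, not L2's implementation. Matrices are row-major.

// Adds a length-`cols` vector to every row of a (rows x cols) matrix, which
// is how NumPy broadcasts shapes (rows, cols) + (cols).
std::vector<double> add_broadcast(const std::vector<double>& x, std::size_t rows,
                                  std::size_t cols, const std::vector<double>& y) {
    assert(x.size() == rows * cols && y.size() == cols);
    std::vector<double> out(x.size());
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            out[r * cols + c] = x[r * cols + c] + y[c];
        }
    }
    return out;
}

// Sums along dimension 0, collapsing the rows and leaving `cols` values.
std::vector<double> sum_dim0(const std::vector<double>& x, std::size_t rows,
                             std::size_t cols) {
    assert(x.size() == rows * cols);
    std::vector<double> out(cols, 0.0);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            out[c] += x[r * cols + c];
        }
    }
    return out;
}
```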

Initialize a tensor

// fill with zeros
L2::Tensor<double> x = L2::Tensor<double>({3, 3}).zeros();

// fill from a normal distribution with a specified mean and stddev
L2::Tensor<double> x = L2::Tensor<double>({3, 3}).normal(0, 1); // mean of 0, stddev of 1

// fill from a uniform distribution with specified limits
L2::Tensor<double> x = L2::Tensor<double>({3, 3}).uniform(-1, 1); // lower bound of -1, upper bound of 1

Linear algebra functions

// matrix multiplication
L2::Tensor<double> x = L2::Tensor<double>({2, 4}).zeros();
L2::Tensor<double> y = L2::Tensor<double>({4, 5}).zeros();

L2::Tensor<double> z = L2::matmul(x, y); // shape: (2, 5)
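
For readers who want to see the shape rule spelled out, here is a naive triple-loop matrix multiplication on flat row-major buffers. `naive_matmul` is a hypothetical helper for illustration only; it says nothing about how L2::matmul is implemented internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not L2's implementation: multiplies a row-major
// (m x k) matrix by a row-major (k x n) matrix, giving an (m x n) result.
std::vector<double> naive_matmul(const std::vector<double>& a,
                                 const std::vector<double>& b,
                                 std::size_t m, std::size_t k, std::size_t n) {
    assert(a.size() == m * k && b.size() == k * n);
    std::vector<double> out(m * n, 0.0);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t p = 0; p < k; ++p) {
            const double a_ip = a[i * k + p];
            for (std::size_t j = 0; j < n; ++j) {
                out[i * n + j] += a_ip * b[p * n + j];
            }
        }
    }
    return out;
}
```

A (2, 4) input multiplied by a (4, 5) input produces 2 * 5 = 10 output elements, matching the shape in the example above.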

L2::Parameter<double>(tensor)

// The Parameter class is used to store a Tensor and its gradient
L2::Parameter<double> y = L2::Parameter<double>(x);

nn

Layer

Creating your own layers

All layers subclass the virtual Layer class and have access to Layer<T>::build_param and to the Layer<T>::cached Tensor, which stores values needed for the backward pass. Each layer overrides three methods:

Tensor<T> forward(Tensor<T> tensor)
- The implementation of the forward pass for the layer, taking a Tensor as an input and returning a Tensor
Tensor<T> backward(Tensor<T> derivative)
- The implementation of the backward pass for the layer, taking the derivative of the loss with respect to the previous layer and returning the derivative of the loss with respect to the current layer
void update(L2::nn::optimizer::Optimizer<T> *optimizer)
- The implementation of how parameters in the layer get updated by the optimizer

Steps:

1: Subclass Layer and create private Parameters for each parameter that will be updated through gradient descent
2: In the constructor, create and initialize Tensors for each parameter. Call Layer<T>::build_param(tensor) to initialize a parameter from the Tensor and save it to your instance parameters
3: In forward(), define the computations necessary for the forward pass of the layer. Use Layer<T>::cached to store any intermediate values needed for the backward pass
4: In backward(), compute the derivatives with respect to each parameter and with respect to the inputs of the layer. Add the derivatives with respect to the parameters to the grad instance variable of each parameter, and update Layer<T>::parameters with the current states of the parameters. Return the derivatives with respect to the inputs of the layer.
5: In update(), call Layer<T>::update(optimizer) and update the parameter instance variables with the current state of Layer<T>::parameters
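
The steps above can be sketched with a tiny elementwise scaling layer. Everything in this block (`Vec`, `Param`, `Sgd`, `LayerBase`, `Scale`) is a hypothetical stand-in so that the example compiles on its own; L2's real Tensor, Parameter, Layer, and Optimizer classes have different signatures, and resetting the gradient inside the optimizer step is an assumption made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; L2's real classes differ.
using Vec = std::vector<double>;

struct Param {
    Vec value;
    Vec grad;
};

struct Sgd {
    double lr;
    void step(Param& p) const {
        for (std::size_t i = 0; i < p.value.size(); ++i) {
            p.value[i] -= lr * p.grad[i];
            p.grad[i] = 0.0;  // assumed here: clear the gradient once applied
        }
    }
};

struct LayerBase {
    virtual ~LayerBase() = default;
    virtual Vec forward(const Vec& input) = 0;
    virtual Vec backward(const Vec& derivative) = 0;
    virtual void update(const Sgd& optimizer) = 0;
};

// Elementwise scaling layer: out[i] = weight[i] * in[i].
class Scale : public LayerBase {
public:
    // Steps 1-2: one trainable weight per element, initialized to 1.
    explicit Scale(std::size_t n) : weight_{Vec(n, 1.0), Vec(n, 0.0)} {}

    // Step 3: compute the output and cache the input for the backward pass.
    Vec forward(const Vec& input) override {
        assert(input.size() == weight_.value.size());
        cached_ = input;
        Vec out(input.size());
        for (std::size_t i = 0; i < input.size(); ++i) {
            out[i] = weight_.value[i] * input[i];
        }
        return out;
    }

    // Step 4: accumulate dL/dweight and return dL/dinput.
    Vec backward(const Vec& derivative) override {
        assert(derivative.size() == cached_.size());
        Vec d_input(derivative.size());
        for (std::size_t i = 0; i < derivative.size(); ++i) {
            weight_.grad[i] += derivative[i] * cached_[i];
            d_input[i] = derivative[i] * weight_.value[i];
        }
        return d_input;
    }

    // Step 5: let the optimizer apply the accumulated gradient.
    void update(const Sgd& optimizer) override { optimizer.step(weight_); }

    const Param& weight() const { return weight_; }

private:
    Param weight_;
    Vec cached_;
};
```

The layer caches its input in forward() because the gradient with respect to the weight is the incoming derivative multiplied by that input.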

L2::nn::Linear<double>(c_in=32, c_out=64)

A linear feed-forward layer. Uses Kaiming uniform initialization for the weights and zero initialization for the bias

Arguments:

c_in (int): The number of input channels
c_out (int): The number of output channels
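
As a rough sketch of Kaiming uniform initialization, the block below draws weights from a uniform distribution on [-bound, bound) with bound = gain * sqrt(3 / fan_in). The helper names are hypothetical, and the gain of sqrt(2) (giving bound = sqrt(6 / c_in)) is an assumption; L2 may use a different gain or fan mode.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helpers, not L2's code.
// Assumption: gain = sqrt(2), so bound = sqrt(2) * sqrt(3 / fan_in).
double kaiming_uniform_bound(int fan_in) {
    assert(fan_in > 0);
    return std::sqrt(6.0 / static_cast<double>(fan_in));
}

// Returns c_in * c_out weights drawn uniformly from [-bound, bound).
std::vector<double> kaiming_uniform(int c_in, int c_out, std::mt19937& rng) {
    assert(c_in > 0 && c_out > 0);
    const double bound = kaiming_uniform_bound(c_in);
    std::uniform_real_distribution<double> dist(-bound, bound);
    std::vector<double> weights(static_cast<std::size_t>(c_in) *
                                static_cast<std::size_t>(c_out));
    for (double& w : weights) {
        w = dist(rng);
    }
    return weights;
}
```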

L2::nn::Sigmoid<double>()

The sigmoid activation, squashes all values between 0 and 1
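
For reference, a minimal sketch of the sigmoid forward and backward computations on plain vectors. `sigmoid_forward` and `sigmoid_backward` are hypothetical helpers, not L2's Sigmoid layer; the backward pass uses the identity that the derivative of the sigmoid equals s * (1 - s), where s is the forward output.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helpers, not L2's Sigmoid class.

// Forward pass: s(x) = 1 / (1 + exp(-x)), applied elementwise.
std::vector<double> sigmoid_forward(const std::vector<double>& input) {
    std::vector<double> out(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        out[i] = 1.0 / (1.0 + std::exp(-input[i]));
    }
    return out;
}

// Backward pass: multiplies the incoming derivative by s * (1 - s),
// where `output` is the cached result of the forward pass.
std::vector<double> sigmoid_backward(const std::vector<double>& output,
                                     const std::vector<double>& derivative) {
    assert(output.size() == derivative.size());
    std::vector<double> d_input(output.size());
    for (std::size_t i = 0; i < output.size(); ++i) {
        d_input[i] = derivative[i] * output[i] * (1.0 - output[i]);
    }
    return d_input;
}
```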

L2::nn::Sequential<double>(layers)

Takes as input a vector of layers and automatically handles calling forward(), backward(), and update() on each

Arguments:

layers (std::vector<Layer *>): A vector of pointers to layers

Loss

Methods:

Tensor<T> forward(Tensor<T> pred, Tensor<T> label)
- Calculates the loss value for the predicted inputs
- Arguments
  - pred (Tensor): The output of the last layer of the model
  - label (Tensor): The ground truth label for the predictions
- Returns
  - (Tensor<T>): The loss for the inputs
Tensor<T> backward()
- Calculates and returns the gradient of the loss
- Returns
  - (Tensor): The gradient of the loss

L2::nn::loss::MSE<double>()

The MSE (Mean Squared Error) loss.
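
A minimal sketch of what a mean squared error loss computes in its forward and backward passes. `mse_forward` and `mse_backward` are hypothetical helpers on plain vectors; averaging over every element and keeping the factor of 2 in the gradient are assumptions, and L2's MSE class may scale things differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers, not L2's MSE class.

// Forward pass: mean of the squared differences.
double mse_forward(const std::vector<double>& pred, const std::vector<double>& label) {
    assert(pred.size() == label.size() && !pred.empty());
    double total = 0.0;
    for (std::size_t i = 0; i < pred.size(); ++i) {
        const double diff = pred[i] - label[i];
        total += diff * diff;
    }
    return total / static_cast<double>(pred.size());
}

// Backward pass: d(loss)/d(pred[i]) = 2 * (pred[i] - label[i]) / n.
std::vector<double> mse_backward(const std::vector<double>& pred,
                                 const std::vector<double>& label) {
    assert(pred.size() == label.size() && !pred.empty());
    const double n = static_cast<double>(pred.size());
    std::vector<double> grad(pred.size());
    for (std::size_t i = 0; i < pred.size(); ++i) {
        grad[i] = 2.0 * (pred[i] - label[i]) / n;
    }
    return grad;
}
```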

Optimizer

Arguments:

lr (double): The learning rate

Methods:

Parameter<T> update(Parameter<T> param)
- Takes a parameter and returns the updated version
- Arguments
  - param (Parameter): The parameter to update
- Returns
  - (Parameter): The updated parameter

L2::nn::optimizer::SGD<double>(lr=0.1)

The SGD (Stochastic Gradient Descent) optimizer
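
The update rule behind plain SGD is value = value - lr * grad. The sketch below mirrors the pass-by-value style of the update method described above, taking a parameter and returning the updated copy. `ParamSketch` and `sgd_update` are hypothetical names, not L2's Parameter or SGD classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a parameter: a value and its gradient.
struct ParamSketch {
    std::vector<double> value;
    std::vector<double> grad;
};

// Takes a parameter by value and returns the updated copy:
// value <- value - lr * grad. The caller's original is left unchanged.
ParamSketch sgd_update(ParamSketch param, double lr) {
    assert(param.value.size() == param.grad.size());
    for (std::size_t i = 0; i < param.value.size(); ++i) {
        param.value[i] -= lr * param.grad[i];
    }
    return param;
}
```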

L2::trainer::Trainer<double>(model, criterion, optimizer)

Handles training a neural network with a given loss function and optimizer

Arguments:

model (*L2::nn::Sequential): A pointer to a Sequential object
criterion (*L2::nn::loss::Loss): A pointer to a Loss object
optimizer (*L2::nn::optimizer::Optimizer): A pointer to an Optimizer object

Methods:

void fit(Tensor<T> x, Tensor<T> y, int epochs, int batch_size)
- Trains a network on x and y for the given number of epochs with a batch size of batch_size
- Arguments
  - x (Tensor): The training data
  - y (Tensor): The ground truth labels for the data
  - epochs (int): The number of epochs to train for
  - batch_size (int): The number of samples to send through the network at a time
Tensor<T> predict(Tensor<T> x)
- Predicts on a dataset using the trained model
- Arguments
  - x (Tensor): The data to use to make a prediction
- Returns
  - (Tensor): The predicted labels for the input
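
To show how epochs and batch_size interact in a training loop, here is a self-contained sketch that fits a single slope w in y = w * x using mini-batch SGD on the MSE loss. `fit_slope` is a hypothetical function and is not how L2's Trainer is implemented; letting the final batch be smaller than batch_size is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch, not L2's Trainer: fits y = w * x with mini-batch SGD.
double fit_slope(const std::vector<double>& x, const std::vector<double>& y,
                 int epochs, std::size_t batch_size, double lr) {
    assert(x.size() == y.size() && batch_size > 0);
    double w = 0.0;
    for (int epoch = 0; epoch < epochs; ++epoch) {
        // Walk over the data in consecutive batches [start, end).
        for (std::size_t start = 0; start < x.size(); start += batch_size) {
            const std::size_t end = std::min(start + batch_size, x.size());
            // Gradient of the batch MSE with respect to w.
            double grad = 0.0;
            for (std::size_t i = start; i < end; ++i) {
                grad += 2.0 * (w * x[i] - y[i]) * x[i];
            }
            grad /= static_cast<double>(end - start);
            w -= lr * grad;  // one SGD step per batch
        }
    }
    return w;
}
```

With 4 samples and a batch size of 2, each epoch performs two parameter updates.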

Contributing

This repository is still a work in progress, so if you find a bug, think there is something missing, or have any suggestions for new features, feel free to open an issue or a pull request. Feel free to use the library or code from it in your own projects, and if you feel that some code used in this project hasn't been properly credited, please open an issue.

Authors

Bilal Khan - Initial work

License

This project is licensed under the MIT License. See the license file for details.

Acknowledgements

The fast.ai deep learning from the foundations course (https://course.fast.ai/part2) teaches a lot about how to make your own deep learning library

Some of the blog posts I used when writing this library include:

Other deep learning libraries from scratch include:

This README is based on:

I used carbon.now.sh with the "Shades of Purple" theme for the screenshot at the beginning of this README

This project contains ~3300 lines of code
