
L2 • 🤖

A multidimensional array and deep learning library implemented from scratch in C++


Installation • Documentation • Contributing • Authors • License • Acknowledgements

Made by Bilal Khan (https://bilal.software)

What is L2?

L2 is named after the L2 or Euclidean distance, a popular measure in deep learning

L2 is a deep learning library written in C++17 using only the standard library. It contains a multidimensional array class, Tensor, with support for strided arrays, numpy-style array slicing, broadcasting, and most major math operations (including matrix multiplication!). On top of this, the Parameter, Layer, Sequential, Loss, Optimizer, and Trainer classes allow for running high-level machine learning experiments without worrying about the low-level implementations.

Quick start

L2::Tensor<double> x = L2::Tensor<double>({100, 10}).normal(0, 1);

L2::Tensor<double> w = L2::Tensor<double>({10, 1}).normal(0, 1);
L2::Tensor<double> b = L2::Tensor<double>({1}).normal(0, 1);

L2::Tensor<double> y = L2::matmul(x, w) + b;

L2::nn::loss::MSE<double> *criterion = new L2::nn::loss::MSE<double>();
L2::nn::optimizer::SGD<double> *optimizer = new L2::nn::optimizer::SGD<double>(0.05); // learning rate of 0.05

L2::nn::Sequential<double> *sequential = new L2::nn::Sequential<double>({
    new L2::nn::Linear<double>(10, 1) //
});

L2::trainer::Trainer<double> trainer = L2::trainer::Trainer<double>(sequential, criterion, optimizer);

trainer.fit(x, y, 10, 10); // train for 10 epochs with a batch size of 10

L2::Tensor<double> y_hat = trainer.predict(x);

y_hat.print();

Design choices

L2 only supports a CPU backend at the moment since I'm not familiar enough with C++ to start working with CUDA and cuDNN. Version 1 of the library primarily uses pass-by-value to reduce complexity at the cost of reduced efficiency. Version 2 of the library will focus on making the Tensor class more efficient. Currently, only the Linear and Sigmoid layers, the MSE loss, and the SGD optimizer have been implemented, but V2 will add more layers and modules.
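
As a small illustration of what pass-by-value means for user code (a sketch of the intended behavior, not a benchmark):

L2::Tensor<double> a = L2::Tensor<double>({3, 3}).normal(0, 1);

// Assignments and operations work on copies, so the original Tensor is untouched
L2::Tensor<double> b = a;
b += 1.0;                       // modifies b only

L2::Tensor<double> c = a + 1.0; // returns a new Tensor; a is still unchanged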

Documentation

L2

Create a tensor
// Create a tensor of zeros with a shape of 3x3
L2::Tensor<double> x = L2::Tensor<double>({3, 3});

// Create a tensor from a vector with a shape of 3x3
std::vector<double> vector{1, 2, 3, 4, 5, 6, 7, 8, 9};
x = L2::Tensor<double>(vector, {3, 3});
Numpy style array slicing
// Get the first row from Tensor x
L2::Tensor<double> y = x({{0, 1}}); // like x[0:1] in numpy; unspecified dimensions default to their full range

// Get the first column from Tensor x
L2::Tensor<double> y = x({{0, -1}, {0, 1}}); // like x[:, 0:1] in numpy; -1 selects up to the end of the dimension

// Get the first two rows and first two columns from Tensor x
L2::Tensor<double> z = x({{0, 2}, {0, 2}}); // like x[0:2, 0:2] in numpy
Change dimensions of a Tensor
// Change the shape of a tensor (Strided arrays let you change the user-visible shape without changing the order of the data elements)
L2::Tensor<double> y = x.view({9}); // shape: (9)

// Reshape to -1
L2::Tensor<double> y = x.view({-1}); // shape: (9)

// Add a dimension to a Tensor
// shape: (3, 3)
L2::Tensor<double> y = x.unsqueeze(0); // shape: (1, 3, 3)

// Transpose a Tensor (here x is assumed to have shape (4, 3))
L2::Tensor<double> y = x.transpose(); // shape: (3, 4)
Get information about a Tensor
// Print info about the Tensor to std::cout
>>> x.print();
data:

0, 0, 0, 0, 0, 0, 0, 0, 0,

size:

3, 3,

strides:

3, 1,

dtype:

double
>>>

// Get the shape
std::vector<int> shape = x.get_shape(); // [3, 3]

// Get the data
std::vector<double> data = x.get_data(); // [1, 2, 3, 4, 5, 6, 7, 8, 9]

// Get the number of elements in the Tensor
int length = x.length(); // 9

// Get the type of the Tensor
std::string type = x.type(); // double
Operations on tensors (with broadcasting!)
// Concatenate Tensors
L2::Tensor<double> x = L2::Tensor<double>({3, 3}).zeros();
L2::Tensor<double> y = L2::Tensor<double>({4, 3}).zeros();

L2::Tensor<double> z = L2::cat({x, y}, 0); // shape: (7, 3)

// Add values to all elements in a Tensor
L2::Tensor<double> y = x + 1;

// Inplace operations
x += 2.0;

// exp(), log(), sqrt()
L2::Tensor<double> y = x.log();

// inplace version
x.log_();

// Add a tensor to a tensor
L2::Tensor<double> x = L2::Tensor<double>({3, 3}).normal(0, 1);
L2::Tensor<double> y = L2::Tensor<double>({3}).normal(0, 1);

L2::Tensor<double> z = x + y; // y is added to each column of x

// Sum up all values in a Tensor
L2::Tensor<double> y = x.sum(); // y has a shape of 1

// Sum up all values along a dimension
L2::Tensor<double> y = x.sum(0); // y has a shape of 3
Initialize a tensor
// fill with zeros
L2::Tensor<double> x = L2::Tensor<double>({3, 3}).zeros();

// fill from a normal distribution with a specified mean and stddev
L2::Tensor<double> x = L2::Tensor<double>({3, 3}).normal(0, 1); // mean of 0, stddev of 1

// fill from a uniform distribution with specified limits
L2::Tensor<double> x = L2::Tensor<double>({3, 3}).uniform(-1, 1); // lower bound of -1, upper bound of 1
Linear algebra functions
// matrix multiplication
L2::Tensor<double> x = L2::Tensor<double>({2, 4}).zeros();
L2::Tensor<double> y = L2::Tensor<double>({4, 5}).zeros();

L2::Tensor<double> z = L2::matmul(x, y); // shape: (2, 5)
Parameter
// The Parameter class is used to store a Tensor and its gradient
L2::Parameter<double> y = L2::Parameter<double>(x);

nn

Layer
Creating your own layers

All layers subclass the Layer virtual class and override three methods. Each layer also has access to the build_param function and the cached Tensor from Layer<T> for storing values needed for the backward pass:

  • Tensor<T> forward(Tensor<T> tensor)

    • The implementation of the forward pass for the layer, taking a Tensor as an input and returning a Tensor
  • Tensor<T> backward(Tensor<T> derivative)

    • The implementation of the backwards pass for the layer, taking the derivative of the loss with respect to the previous layer and returning the derivative of the loss with respect to the current layer
  • void update(L2::nn::optimizer::Optimizer<T> *optimizer)

    • The implementation of how parameters in the layer get updated by the optimizer

Steps:

  • 1: Subclass Layer and create private Parameters for each parameter that will be updated through gradient descent
  • 2: In the constructor, create and initialize Tensors for each parameter. Call Layer<T>::build_param(tensor) to initialize a parameter from the Tensor and save it to your instance parameters
  • 3: In forward(), define the computations necessary for the forward pass of the layer. Use Layer<T>::cached to store any intermediate values needed for the backward pass
  • 4: In backward(), compute the derivatives with respect to each parameter and with respect to the inputs of the layer. Add the derivatives with respect to the parameters to the grad instance variable of each parameter, and update Layer<T>::parameters with the current states of the parameters. Return the derivatives with respect to the inputs of the layer.
  • 5: In update(), call Layer<T>::update(optimizer) and update the parameter instance variables with the current state of Layer<T>::parameters (a minimal sketch of a custom layer follows this list)
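
As a worked example of these steps, here is a minimal sketch of a custom layer that adds a learnable bias. It assumes the Layer<T> members described above (build_param, cached, parameters, update) and assumes Parameter<T> exposes its values as tensor and grad; those member names and exact signatures are assumptions, so treat this as an outline rather than code guaranteed to compile against L2 as-is.

template <class T>
class Bias : public L2::nn::Layer<T>
{
private:
    L2::Parameter<T> b; // Step 1: one Parameter per value updated by gradient descent

public:
    Bias(int c_out)
    {
        // Step 2: create and initialize a Tensor, then build a Parameter from it
        L2::Tensor<T> bias = L2::Tensor<T>({c_out}).zeros();
        b = L2::nn::Layer<T>::build_param(bias);
    }

    L2::Tensor<T> forward(L2::Tensor<T> tensor)
    {
        // Step 3: cache anything the backward pass will need
        L2::nn::Layer<T>::cached = tensor;

        // broadcasting adds the bias to every row of the input
        return tensor + b.tensor;
    }

    L2::Tensor<T> backward(L2::Tensor<T> derivative)
    {
        // Step 4: d(loss)/d(b) is the incoming derivative summed over the batch dimension
        b.grad += derivative.sum(0);
        L2::nn::Layer<T>::parameters = {b};

        // for an additive bias, the derivative with respect to the input is unchanged
        return derivative;
    }

    void update(L2::nn::optimizer::Optimizer<T> *optimizer)
    {
        // Step 5: let the base class apply the optimizer, then sync the local copy
        L2::nn::Layer<T>::update(optimizer);
        b = L2::nn::Layer<T>::parameters[0];
    }
};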

Linear

A linear feed-forward layer. Uses Kaiming uniform initialization for the weights and zero initialization for the bias (a rough sketch of this initialization follows the argument list below)

Arguments:

  • c_in (int): The number of input channels
  • c_out (int): The number of output channels
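
As a rough sketch, the Kaiming uniform scheme can be expressed with the uniform() and zeros() initializers documented above; the exact bound Linear uses internally is an assumption here.

#include <cmath> // for std::sqrt

int c_in = 10;
int c_out = 1;

// Kaiming uniform draws weights from U(-bound, bound) with a bound based on the fan-in
double bound = std::sqrt(6.0 / c_in);

L2::Tensor<double> weight = L2::Tensor<double>({c_in, c_out}).uniform(-bound, bound);
L2::Tensor<double> bias = L2::Tensor<double>({c_out}).zeros();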

Sigmoid

The sigmoid activation; squashes all values to between 0 and 1

Sequential

Takes as input a vector of layers and automatically handles calling forward(), backward(), and update() on each (a short usage sketch follows the argument list below)

Arguments:

  • layers (std::vector<Layer *>): A vector of pointers to layers
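
For example, a small two-layer model built from the implemented layers might look like this (constructing Sigmoid with no arguments is an assumption):

L2::nn::Sequential<double> *model = new L2::nn::Sequential<double>({
    new L2::nn::Linear<double>(10, 5),
    new L2::nn::Sigmoid<double>(),
    new L2::nn::Linear<double>(5, 1)
});

The resulting model can then be passed to a Trainer together with a loss and an optimizer, as in the Quick start above.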

Loss

Methods:

  • Tensor<T> forward(Tensor<T> pred, Tensor<T> label)

    • Calculates the loss value for the predicted inputs
    • Arguments
      • pred (Tensor): The output of the last layer of the model
      • label (Tensor): The ground truth label for the predictions
    • Returns
      • (Tensor<T>): The loss for the inputs
  • Tensor<T> backward()

    • Calculates and returns the gradient of the loss
    • Returns
      • (Tensor): The gradient of the loss

MSE

The MSE (Mean Squared Error) loss.
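
A minimal sketch of the quantity MSE computes, written with the Tensor operations documented above (elementwise * and multiplication by a scalar are assumed to exist); this outlines the math rather than L2's internal implementation.

int n = 100;
L2::Tensor<double> pred = L2::Tensor<double>({n, 1}).normal(0, 1);
L2::Tensor<double> label = L2::Tensor<double>({n, 1}).normal(0, 1);

L2::Tensor<double> diff = pred - label;

// forward(pred, label): the mean of the squared errors
L2::Tensor<double> loss = (diff * diff).sum() * (1.0 / n);

// backward(): the gradient of the loss with respect to the predictions
L2::Tensor<double> grad = diff * (2.0 / n);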

Optimizer

Arguments:

  • lr (double): The learning rate

Methods:

  • Parameter<T> update(Parameter<T> param)

    • Takes a parameter and returns the updated version
    • Arguments
      • param (Parameter): The parameter to update
    • Returns
      • (Parameter): The updated parameter

SGD

The SGD (Stochastic Gradient Descent) optimizer
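
A rough sketch of the update SGD performs inside update(param); the tensor and grad member names on Parameter are assumptions, not L2's confirmed API.

template <class T>
L2::Parameter<T> sgd_update(L2::Parameter<T> param, double lr)
{
    // w <- w - lr * dL/dw
    param.tensor -= param.grad * lr;

    // clear the accumulated gradient before the next batch
    param.grad = param.grad * 0.0;

    return param;
}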

Trainer

Handles training a neural network with a given loss function and optimizer

Arguments:

  • model (L2::nn::Sequential<T> *): A pointer to a Sequential object
  • loss (L2::nn::loss::Loss<T> *): A pointer to a Loss object
  • optimizer (L2::nn::optimizer::Optimizer<T> *): A pointer to an Optimizer object

Methods:

  • void fit(Tensor<T> x, Tensor<T> y, int epochs, int batch_size)

    • Trains a network on x and y for epochs epochs with a batch size of batch_size
    • Arguments
      • x (Tensor): The data to use to make a prediction
      • y (Tensor): The ground truth labels for the data
      • epochs (int): The number of epochs to train for
      • batch_size (int): The number of samples to send through the network at a time
  • Tensor<T> predict(Tensor<T> x)

    • Predicts on a dataset using the trained model
    • Arguments
      • x (Tensor): The data to use to make a prediction
    • Returns
      • (Tensor): The predicted labels for the input

Contributing

This repository is still a work in progress, so if you find a bug, think there is something missing, or have any suggestions for new features, feel free to open an issue or a pull request. Feel free to use the library or code from it in your own projects, and if you feel that some code used in this project hasn't been properly accredited, please open an issue.

Authors

  • Bilal Khan - Initial work

License

This project is licensed under the MIT License - see the license file for details

Acknowledgements

The fast.ai deep learning from the foundations course (https://course.fast.ai/part2) teaches a lot about how to make your own deep learning library

Some of the blog posts I used when writing this library include:

Other deep learning libraries from scratch include:

This README is based on:

I used carbon.now.sh with the "Shades of Purple" theme for the screenshot at the beginning of this README

This project contains ~3300 lines of code
