benny-nottonson / voodoo
A working machine learning framework in pure Mojo 🔥
License: Apache License 2.0
Is your feature request related to a problem? Please describe.
The code uses a lot of chained if/else in generic functions; this can be removed once dictionary types are supported in Mojo.
Describe the solution you'd like
A very simple hash map linking, for example, the string representation of a loss function to its actual generic struct.
Describe alternatives you've considered
Using the @parameter tag on chained if statements leads to a bug that I do not know how to fix, where the wrong value is returned.
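Since Mojo did not yet support dictionary types, here is a minimal Python sketch of the intended pattern, replacing a chained if/else dispatch with a hash map from a loss function's string name to its implementation. The function names and table are illustrative placeholders, not Voodoo's API.

```python
# Illustrative sketch: dispatch losses through a hash map instead of if/else.
def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def mae(pred, target):
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

# One lookup replaces the `if name == "mse": ... elif name == "mae": ...` chain.
LOSSES = {"mse": mse, "mae": mae}

def get_loss(name):
    return LOSSES[name]  # raises KeyError on an unknown name
```

For example, `get_loss("mse")([1.0, 2.0], [1.0, 1.5])` evaluates to 0.125.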
Is your feature request related to a problem? Please describe.
Dropout should also have a noise shape option that allows a mask to be used on top of the noise. This likely means it will need to be moved to the conv and maxpool kernel file, since it will rely on another tensor.
Describe the solution you'd like
Dropout should be rewritten to allow for the optional use of a noise mask.
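A hedged NumPy sketch of what the noise-shape option could look like (the actual Voodoo signature may differ): the keep mask is drawn with the reduced noise_shape and broadcast over the input, so, for example, a shape of (batch, 1) drops whole rows at once.

```python
import numpy as np

def dropout(x, rate, noise_shape=None, rng=np.random.default_rng()):
    """Inverted dropout with an optional broadcastable noise shape."""
    if noise_shape is None:
        noise_shape = x.shape
    # The mask is drawn at noise_shape and broadcast over x, so entire
    # rows/channels can be dropped together instead of single elements.
    keep = rng.random(noise_shape) >= rate
    return x * keep / (1.0 - rate)
```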
Is your feature request related to a problem? Please describe.
Adding documentation will allow the removal of comments from the code, as the information can be stored in the documentation instead.
Is your feature request related to a problem? Please describe.
Dense layers and other related layers should have the option to add constraints
Describe the solution you'd like
These should be implemented similarly to activation functions; more research is needed to see how this relates to the overall function of a node.
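For reference, a constraint in the Keras sense is a projection applied to the weights after each update, rather than a penalty added to the loss. A minimal Python sketch of a max-norm constraint (the axis and epsilon choices are assumptions):

```python
import numpy as np

def max_norm(weights, max_value=2.0, axis=0):
    """Rescale weights so each column's L2 norm is at most max_value."""
    norms = np.sqrt((weights ** 2).sum(axis=axis, keepdims=True))
    scale = np.clip(norms, 0.0, max_value) / (norms + 1e-7)
    return weights * scale
```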
It does not work
Enhance error checking, long term
Is your feature request related to a problem? Please describe.
There are currently only 3 optimizers, but they should in theory all use the same generic structure, since they only modify the learning rate.
Describe the solution you'd like
The optimizers should be encapsulated into one generic structure, like what was done with Activations.
Describe alternatives you've considered
The current implementation is fine but adds unneeded complexity.
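A hedged Python sketch of that generic structure (names are illustrative, not Voodoo's API): a single optimizer type parameterized by its update rule, mirroring what was done with Activations.

```python
# Illustrative: one generic optimizer parameterized by an update rule.
def sgd_step(param, grad, state, lr):
    return param - lr * grad

def momentum_step(param, grad, state, lr, beta=0.9):
    state["v"] = beta * state.get("v", 0.0) + grad
    return param - lr * state["v"]

class GenericOptimizer:
    def __init__(self, step_fn, lr=0.01):
        self.step_fn, self.lr, self.state = step_fn, lr, {}

    def update(self, param, grad):
        return self.step_fn(param, grad, self.state, self.lr)
```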
Is your feature request related to a problem? Please describe.
Tensors / Nodes are currently initialized for weights and biases using code held within Tensor and Node rather than a unary operation; this should be changed.
Describe the solution you'd like
The initialization kernels should be held within a separate kernel file or added to an existing one, and they should use vectorized SIMD operations.
Describe alternatives you've considered
The current implementation works but is significantly slower than other related functions and makes the structure of the code confusing.
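A sketch of the target layout, with NumPy's vectorized operations standing in for Mojo's SIMD loops; the he_normal and glorot_uniform formulas are the standard ones, while the file placement is an assumption.

```python
import numpy as np

# Standalone initialization kernels, vectorized over the whole buffer
# rather than element-by-element code inside Tensor/Node.
def he_normal(shape, fan_in, rng=np.random.default_rng()):
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=shape)

def glorot_uniform(shape, fan_in, fan_out, rng=np.random.default_rng()):
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=shape)
```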
Describe the solution you'd like
Hello, quick request for the README.md. Would you mind putting in the training times for PyTorch / TensorFlow? It would provide a good baseline for how efficient these Mojo libraries are.
Thanks!
Layer weight regularizers
L1 class
L2 class
L1L2 class
OrthogonalRegularizer class
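The penalties these classes would compute, sketched in Python with the standard Keras-style formulas (the default coefficients are assumptions):

```python
import numpy as np

def l1_penalty(w, l1=0.01):
    return l1 * np.abs(w).sum()

def l2_penalty(w, l2=0.01):
    return l2 * (w ** 2).sum()

def l1l2_penalty(w, l1=0.01, l2=0.01):
    return l1_penalty(w, l1) + l2_penalty(w, l2)

def orthogonal_penalty(w, factor=0.01):
    # Penalize off-diagonal entries of W^T W, pushing columns toward
    # orthogonality (the same idea as Keras's OrthogonalRegularizer).
    gram = w.T @ w
    off_diag = gram - np.diag(np.diag(gram))
    return factor * np.abs(off_diag).sum()
```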
Is your feature request related to a problem? Please describe.
No immediate changes are needed, but the code quality could be improved by further combining generic functions to minimize overhead.
Describe the solution you'd like
Functions like generic_loss are a good example of what needs to be done to other functions.
Describe alternatives you've considered
The current code works fine; it is just harder to debug.
Describe the bug
The backwards gradient for dropout does not work since the behavior is technically random; find a way around this.
Expected behavior
The backward gradient of the Dropout function should be trainable; research is needed on how to achieve this.
Huge issue, required a ton of work and time
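The standard way around the randomness (a sketch of the usual technique, not necessarily the fix that was eventually used) is to cache the mask drawn in the forward pass and reuse it in the backward pass, so the gradient is deterministic given the forward call:

```python
import numpy as np

class Dropout:
    def __init__(self, rate, rng=np.random.default_rng()):
        self.rate, self.rng, self.mask = rate, rng, None

    def forward(self, x):
        # Cache the exact (already scaled) mask used on this pass.
        self.mask = (self.rng.random(x.shape) >= self.rate) / (1.0 - self.rate)
        return x * self.mask

    def backward(self, grad_out):
        # Reuse the cached mask: gradient flows only through kept units.
        return grad_out * self.mask
```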
This issue will be split into multiple issues, one per layer, but it holds the full list of layers and related functions.
Is your feature request related to a problem? Please describe.
Dense layers and other related layers should have the option to add regularizers, such as L1 and L2.
Describe the solution you'd like
These should be implemented similarly to activation functions; more research is needed to see how this relates to the overall function of a node.
Is your feature request related to a problem? Please describe.
There are currently only 3 optimizers, and I believe only one of them to be production ready; more should be added.
Describe the solution you'd like
There should be a wide range of optimizers; see Keras and PyTorch for example functions that could be implemented.
Describe alternatives you've considered
The current implementation is a good start but lacks versatility.
Is your feature request related to a problem? Please describe.
Layers and functions that rely on a kernel and a main tensor are currently not supported; these need to be added for Conv and Pool to work.
Describe the solution you'd like
These should be similar in structure to operation functions, since they take in 3 nodes and store data in the first, but more research needs to be done to optimize them, potentially using tiling or vectorized operations in parallel.
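For reference, a minimal unoptimized Python sketch of the three-node pattern: a direct 2D convolution that fills a preallocated output tensor, where the inner reduction is the part that tiling or vectorization would later target.

```python
import numpy as np

def conv2d(out, image, kernel):
    """Valid 2D cross-correlation; out is the preallocated first node."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    for i in range(oh):
        for j in range(ow):
            # Each output element is a windowed dot product; a tiled or
            # SIMD version would vectorize this reduction.
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out
```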
Is your feature request related to a problem? Please describe.
The Operation CPU kernel file currently relies on multiple structs that each augment the data in some way. This one will be harder to achieve since the operations are so different, but they should somehow be encapsulated into generic structs.
Describe the solution you'd like
The functions should be put into generic structures like what is done with the Activation kernels.
Describe alternatives you've considered
This may be complicated to implement since they rely on somewhat different logic, but SIMD operations could likely be used for functions like Copy and Transpose.
Is your feature request related to a problem? Please describe.
All of the kernels in CPU kernels could benefit from being rewritten with lambda functions, but this is not currently supported in Mojo.
Describe the solution you'd like
The functions could all be rewritten using simple lambda functions once support is added; essentially any function tagged with @always_inline would likely benefit from this.
Describe alternatives you've considered
The current implementation is fine, but it creates unneeded code and likely performance overhead (at least during compilation).
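What the rewrite could look like once lambdas land, sketched in Python: a single higher-order elementwise kernel takes the scalar function as an argument, replacing one hand-written @always_inline kernel per operation. The names are illustrative.

```python
# Illustrative: one generic elementwise kernel instead of a function per op.
def elementwise(fn):
    def kernel(values):
        return [fn(v) for v in values]
    return kernel

relu_kernel = elementwise(lambda x: max(0.0, x))
negate_kernel = elementwise(lambda x: -x)
```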
Is your feature request related to a problem? Please describe.
I believe this issue is impossible to resolve without hashmaps, but I could be wrong. Currently, Kernels are instantiated for every graph that is made, though they are really just a pseudo hashmap that could be a compile-time constant.
Describe the solution you'd like
Rewrite graph and node to use a hash map version of kernels once Mojo supports it.
Describe alternatives you've considered
The current implementation works but likely comes with some performance overhead.
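A sketch of the intended end state, assuming hash-map support arrives: the kernel table becomes a module-level constant built once and shared by every graph, instead of being re-instantiated per graph. The op codes here are placeholders.

```python
# Built once at module load (a compile-time constant in Mojo), shared
# by every Graph instead of being instantiated per graph.
KERNELS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}

class Graph:
    def run_op(self, op_code, a, b):
        return KERNELS[op_code](a, b)  # a lookup, not a per-graph copy
```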
Use the defined layers, or some other means of estimating a correctly sized memory pool, to prevent segfault issues with higher-complexity models.
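A hedged sketch of one such estimate: sum the per-layer output buffers (times a factor for gradients) over the declared layers to get a safe upper bound. The factor and the layout are assumptions.

```python
def estimate_pool_bytes(layer_widths, batch_size, dtype_bytes=4, factor=2):
    """Upper-bound pool size: activation + gradient buffers per layer."""
    return sum(batch_size * w * dtype_bytes * factor for w in layer_widths)

# e.g. a 784-128-10 MLP at batch size 32:
# estimate_pool_bytes([784, 128, 10], 32) -> 236_032 bytes
```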
Is your feature request related to a problem? Please describe.
Layers are currently declared one by one, and then called in the forward pass one by one; it would be a better developer experience to have a struct that encapsulates a variable number of layers and chains the forward and backward operations.
Describe the solution you'd like
A structure to hold arbitrary layers.
Describe alternatives you've considered
The current implementation works fine; it just isn't ideal.
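A minimal Python sketch of such a container (essentially Keras's Sequential); the layer method names are assumptions:

```python
class Sequential:
    """Holds an arbitrary list of layers and chains their passes."""
    def __init__(self, *layers):
        self.layers = list(layers)

    def forward(self, x):
        for layer in self.layers:
            x = layer.forward(x)
        return x

    def backward(self, grad):
        # Walk the layers in reverse to propagate the gradient back.
        for layer in reversed(self.layers):
            grad = layer.backward(grad)
        return grad
```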
Layer activations
relu function
sigmoid function
softmax function
softplus function
softsign function
tanh function
selu function
elu function
exponential function
leaky_relu function
relu6 function
silu function
gelu function
hard_sigmoid function
linear function
mish function
log_softmax function
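Several issues above point to the Activation kernels as the pattern to copy, so here is a hedged Python sketch of that pattern: each entry pairs a forward function with its derivative inside one generic struct. The formulas are standard; the structure is illustrative.

```python
import numpy as np

class Activation:
    """Generic activation: forward function plus its derivative."""
    def __init__(self, fn, dfn):
        self.fn, self.dfn = fn, dfn

relu = Activation(lambda x: np.maximum(0.0, x),
                  lambda x: (x > 0).astype(float))
tanh = Activation(np.tanh,
                  lambda x: 1.0 - np.tanh(x) ** 2)
sigmoid = Activation(lambda x: 1.0 / (1.0 + np.exp(-x)),
                     lambda x: (s := 1.0 / (1.0 + np.exp(-x))) * (1.0 - s))
```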
Add a layer to be used as reshape; it takes newShape as an input, performs a check, then continues.
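The check is just that the element counts agree; a small sketch:

```python
import numpy as np

def reshape_layer(x, new_shape):
    # The check: the total element count must be unchanged.
    assert np.prod(x.shape) == np.prod(new_shape), "reshape size mismatch"
    return x.reshape(new_shape)
```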
Is your feature request related to a problem? Please describe.
As described in #15, the initialization functions are currently all separate. Although they rely on similar logic, they could likely be grouped into a generic struct like what was done with activation functions, and more parameters need to be added to some.
Describe the solution you'd like
The initialization functions should be rewritten to use a generic structure that other code calls into, like what is done in cpu_kernels/activations.
Describe alternatives you've considered
The current implementation works but adds unneeded complexity and makes it harder to debug / trace code
Additional context
This may be simpler to tackle after #15, though the inverse may also be true.
Check why values are NaN or not approaching a minimum loss; this is likely an issue with the gradients.
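Two standard diagnostics for this, sketched in Python: scan tensors for NaN/Inf as they are produced, and compare the analytic gradient against a finite-difference estimate to localize a bad backward kernel.

```python
import numpy as np

def assert_finite(name, tensor):
    if not np.isfinite(tensor).all():
        raise FloatingPointError(f"{name} contains NaN or Inf")

def numeric_grad(f, x, eps=1e-5):
    """Central-difference gradient of scalar f at x (x: float ndarray)."""
    g = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        x[idx] += eps; hi = f(x)
        x[idx] -= 2 * eps; lo = f(x)
        x[idx] += eps  # restore the original value
        g[idx] = (hi - lo) / (2 * eps)
    return g
```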