avishakeadhikary / neural-networks-from-scratch

Implementing neural networks from scratch for a deeper understanding of concepts, featuring a Jupyter notebook with derivative-based implementations.

License: MIT License

Languages: Jupyter Notebook 99.58%, Python 0.42%
Topics: artificial-intelligence, machine-learning, neural-network, python, gpt, mlp

Neural Networks from Scratch

This repository contains a sequence of Jupyter notebooks, each representing one step in understanding and implementing neural networks from scratch. The primary focus is on understanding the underlying concepts and implementing them without relying on external libraries. The lecture series moves from basic concepts to advanced architectures, showing how traditional programming approaches give way to modern neural network techniques.

Lecture Series Overview

  1. Neural Networks with Derivatives: In this introductory notebook, we build a complete neural network from scratch, focusing on understanding derivatives and their role in neural network operations.
  2. NameWeave: Beginning with a single-layer bigram model, we gradually transition from traditional machine learning to neural network approaches for name generation.
  3. NameWeave - Multi Layer Perceptron: We expand the NameWeave model into a multi-layer perceptron (MLP), increasing the complexity of the model.
  4. NameWeave (MLP) - Activations, Gradients & Batch Normalization: Continuing from the previous notebook, we enhance the MLP model with activation functions, gradient handling, and batch normalization techniques.
  5. NameWeave - Manual Back Propagation: This notebook breaks down the MLP model into atomic pieces of code, emphasizing the importance of understanding backpropagation.
  6. NameWeave - WaveNet: Inspired by WaveNet architecture, we modify the previous model to resemble convolutional neural networks (CNNs).
  7. GPT from Scratch: Building on the concepts from the earlier notebooks, we introduce self-attention and a decoder-only architecture to generate text, demonstrating the capability of modern architectures such as Transformers. The notebook can generate an unbounded stream of Harry Potter-like text based on the provided dataset.
  8. GPT Tokenizer Notebook: A new addition to the repository, this notebook focuses on building a tokenizer for preprocessing text data, specifically designed for use with the GPT model.
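
As a small illustration of the derivative-based approach in the first notebook, here is a minimal sketch (not the notebook's own code) of a single tanh neuron. It computes gradients with the chain rule and checks them against a central-difference estimate. The function names and input values are hypothetical and chosen only for illustration.

```python
import math

def neuron(x1, x2, w1, w2, b):
    """Forward pass of one tanh neuron: out = tanh(x1*w1 + x2*w2 + b)."""
    return math.tanh(x1 * w1 + x2 * w2 + b)

def analytic_grads(x1, x2, w1, w2, b):
    """Gradients of the output with respect to w1, w2 and b via the chain rule."""
    out = neuron(x1, x2, w1, w2, b)
    dout_dz = 1.0 - out ** 2  # derivative of tanh(z) is 1 - tanh(z)^2
    return dout_dz * x1, dout_dz * x2, dout_dz

def numeric_grad(f, args, i, h=1e-6):
    """Central-difference estimate of the derivative of f with respect to args[i]."""
    hi = list(args)
    lo = list(args)
    hi[i] += h
    lo[i] -= h
    return (f(*hi) - f(*lo)) / (2 * h)

# x1, x2, w1, w2, b: arbitrary illustrative values (an assumption, not from the notebooks)
args = (2.0, 0.0, -3.0, 1.0, 6.8)
dw1, dw2, db = analytic_grads(*args)

# Pair each chain-rule gradient with its numerical estimate.
checks = [
    (dw1, numeric_grad(neuron, args, 2)),
    (dw2, numeric_grad(neuron, args, 3)),
    (db, numeric_grad(neuron, args, 4)),
]
for analytic, numeric in checks:
    print(f"analytic={analytic:.6f} numeric={numeric:.6f}")
```

If the two columns agree to several decimal places, the hand-derived gradients are likely correct; the same kind of check can be applied to larger hand-written networks.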

Introduction

In this repository, I explore the implementation of neural networks from scratch. The primary goal is to deepen the understanding of neural network concepts and learn how to implement them without relying on external libraries.

Files

  • GPT from Scratch.ipynb: Jupyter notebook where a GPT model is built from scratch, generating text based on the Harry Potter dataset.
  • GPT Tokenizer Notebook.ipynb: New notebook focusing on building a tokenizer for preprocessing text data.
  • NameWeave (MLP) - Activations, Gradients & Batch Normalization.ipynb: Jupyter notebook enhancing the NameWeave model with activations, gradients, and batch normalization.
  • NameWeave - Manual Back Propagation.ipynb: Jupyter notebook demonstrating manual backpropagation through the NameWeave model.
  • NameWeave - Multi Layer Perceptron.ipynb: Jupyter notebook expanding the NameWeave model into a multi-layer perceptron (MLP).
  • NameWeave - WaveNet.ipynb: Jupyter notebook implementing a WaveNet-like architecture for name generation.
  • Neural Network with Derivatives.ipynb: Jupyter notebook containing the implementation of a neural network with derivatives.
  • NNFS-GitHub Banner.gif: Banner image for the repository.
  • README.md: This file, providing an overview of the repository.
  • LICENSE: The license information for this repository.
  • Datasets/: Directory containing datasets used in the notebooks.
    • Harry_Potter_Books.txt: Dataset used in the GPT from Scratch.ipynb notebook.
    • Indian_Names.txt: Dataset used for all other notebooks.
    • Tokenizer/: Directory for tokenization related files.
      • tokenizer_train.txt: Dataset used for training the tokenizer.
  • ExplanationMedia/Images/: Directory containing images used for explaining the notebooks.
  • ExplanationMedia/Videos/: Directory containing videos used for explaining the notebooks.
  • GPT Scripts/: Directory containing raw Python scripts created for building the GPT model in GPT from Scratch.ipynb.
    • Tokenizer/: Directory for tokenizer related scripts.
      • tokenizer_train.py: Script used for training the tokenizer.
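
To give a feel for how a names dataset feeds the first NameWeave notebook, the following is a minimal count-based bigram sketch. It is not the repository's implementation: the `names` list is a hypothetical stand-in for Indian_Names.txt, and the helper `sample_name` is invented for this example.

```python
import random
from collections import defaultdict

# Hypothetical stand-in for the contents of Indian_Names.txt.
names = ["asha", "anil", "nisha", "sunil"]

# Count how often each character follows another; "." marks the start and end of a name.
counts = defaultdict(lambda: defaultdict(int))
for name in names:
    chars = ["."] + list(name) + ["."]
    for prev, nxt in zip(chars, chars[1:]):
        counts[prev][nxt] += 1

def sample_name(rng, max_len=20):
    """Sample one name by walking the bigram table until the end marker appears."""
    out, prev = [], "."
    while len(out) < max_len:
        followers = counts[prev]
        nxt = rng.choices(list(followers), weights=list(followers.values()))[0]
        if nxt == ".":
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)

rng = random.Random(0)  # fixed seed so repeated runs give the same samples
samples = [sample_name(rng) for _ in range(5)]
print(samples)
```

A neural version of the same idea would replace the count table with learned weights, which is roughly the transition the NameWeave notebooks describe.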

Usage

To explore the content of the lecture series, simply open the respective Jupyter notebook files using a compatible environment.

Install Required Dependencies

You can use Google Colab to view and run these files on the cloud.

OR

To view and run these files locally, install Jupyter Notebook from PyPI using pip:

Install Jupyter Notebook:

pip install notebook

Run Jupyter Notebook:

jupyter notebook

Future Updates

Stay tuned for additional features, improvements, and possibly new lecture series exploring more advanced topics in neural networks and machine learning.

License

This project is licensed under the MIT License. See the LICENSE file for details.
