
dl-resourced's Introduction

DL-resources

Collection of useful DL resources found online

DL courses online

  1. Practical ML

Model Architectures

Transformer architectures

Resnet architectures and variants

  • Simple google search

  • efficientnet vs resnet

  • ![Resnet vs Resnext](images/Resnet_vs_ Resnext.jpg)

  • Squeeze and excitation

  • ![Efficient net - part1](images/EfficientNet_ part1.jpg)

  • ![Efficient net - part2](images/EfficientNet _part2.jpg)

LSTM

CV strategy

Learning rate finder

LR schedulers

  • https://www.jeremyjordan.me/nn-learning-rate/

  • Cosine Annealing - Papers with code

  • Medium post on SGDR

  • With higher learning rates, learning is faster, which helps until training starts to diverge.

  • As we train longer, we tend to approach a minimum, so we need to reduce the lr. Annealing is the phase where we train with a lower lr to settle in a stable region (envisioned as a plateau, where small changes in the weights don't lead to much change in the loss function).

  • Cyclic learning rates - start from one end of the spectrum and increase / decrease the lr using a linear / exp / cosine-like function

  • Warm restarts - periodically resetting the lr to lr_max helps us escape overfitting regions and saddle points. "Warm" refers to the fact that we continue from the weights obtained after training for some time, rather than starting again from some pre-defined initialization (random, zero etc.)

  • lr_max is found using the lr_range test proposed by Leslie Smith.

  • Two major options

    • Cosine annealing with warm restarts (cosine is a more aggressive annealing strategy)
    • One-cycle lr (a single cycle across the entire training run, with linear / cosine annealing)
    • In general, one-cycle lr tends to overfit less than cosine annealing with warm restarts
  • A smaller batch size has a regularizing effect (lower generalization error); CON - higher training time

  • SGD converges a little more slowly; Adam is faster but overfits a little

  • SWA (stochastic weight averaging) averages the model weights as training approaches a minimum; SGD often settles around a local minimum but takes longer to find it

  • Transformers work well - see the "Attention Is All You Need" paper
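The warm-restart schedule described in the notes above can be sketched in a few lines. This is a minimal illustration, not the exact code of any library: the function name `cosine_warm_restart_lr` and the parameters `cycle_len` and `cycle_mult` are hypothetical, and the formula assumed is the usual SGDR form, lr = lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * t_cur / t_i)).

```python
import math


def cosine_warm_restart_lr(step, lr_max, lr_min=0.0, cycle_len=10, cycle_mult=1):
    """Learning rate at `step` for cosine annealing with warm restarts.

    `cycle_len` is the length of the first cycle in steps and `cycle_mult`
    (an integer >= 1) stretches each following cycle. Both names are
    assumptions made for this sketch.
    """
    # Locate the cycle that contains `step` and the position inside it.
    t_cur, t_i = step, cycle_len
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= cycle_mult
    # Cosine decay from lr_max at the start of a cycle towards lr_min at its end.
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t_cur / t_i))


# Twelve steps with a 5-step cycle: the lr jumps back to lr_max at steps 5 and 10.
schedule = [
    cosine_warm_restart_lr(s, lr_max=0.1, lr_min=0.001, cycle_len=5)
    for s in range(12)
]
```

Only the learning rate is reset at each restart; the model weights carry over, which is what makes the restart "warm". Setting `cycle_len` to the total number of training steps gives a single cosine decay with no restarts.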

Image Augmentation techniques

Loss functions

  • Cross Entropy ???
  • label smoothing CE ???
  • Bitempered logistic loss function ???
  • Focal loss function ???
  • Taylor loss function ???
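As a rough reference for three of the losses listed above, here is a minimal pure-Python sketch for a single example. It assumes `probs` is an already-normalized probability vector and `target` is the index of the true class. The label-smoothing variant shown spreads `eps / k` over all `k` classes, which is one common convention among several, and the focal loss omits the optional class-weighting term. The function names are illustrative only.

```python
import math


def cross_entropy(probs, target):
    """Plain cross entropy: negative log-probability of the true class."""
    return -math.log(probs[target])


def label_smoothing_ce(probs, target, eps=0.1):
    """Cross entropy against a smoothed target distribution."""
    k = len(probs)
    # Smoothed target: (1 - eps) on the true class plus eps / k on every class.
    return -sum(
        ((1 - eps) * (i == target) + eps / k) * math.log(p)
        for i, p in enumerate(probs)
    )


def focal_loss(probs, target, gamma=2.0):
    """Focal loss: cross entropy scaled by (1 - p_t) ** gamma."""
    p_t = probs[target]
    # Confident, correct predictions (p_t near 1) are down-weighted.
    return -((1 - p_t) ** gamma) * math.log(p_t)


probs = [0.7, 0.2, 0.1]  # hypothetical model output for a 3-class problem
ce = cross_entropy(probs, 0)
ls = label_smoothing_ce(probs, 0, eps=0.1)
fl = focal_loss(probs, 0, gamma=2.0)
```

With `eps=0` or `gamma=0` both variants reduce to plain cross entropy, which is a quick sanity check when implementing them.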

PyTorch Lightning

Optimizers

  • Comparison of different Optimizers and lr_schedulers
  • SGD with momentum and weight decay, combined with warm restarts, is generally SOTA but takes longer to converge than Adam
  • SGD with stochastic weight averaging (SWA) gives better results
  • Adam converges faster but overfits slightly; different variants are available -
  • AdamW ???
  • Ranger ???
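The weight averaging behind SWA can be illustrated with a running mean over weight snapshots. This is a minimal sketch using plain lists of floats in place of real model parameters; `update_swa` and the `snapshots` values are hypothetical and only show the arithmetic, assuming an equal-weight average over the snapshots.

```python
def update_swa(swa_weights, new_weights, n_averaged):
    """Fold one more weight snapshot into the running equal-weight average.

    `n_averaged` is the number of snapshots already included in `swa_weights`.
    """
    return [
        (avg * n_averaged + w) / (n_averaged + 1)
        for avg, w in zip(swa_weights, new_weights)
    ]


# Hypothetical weight snapshots taken at the end of three late epochs.
snapshots = [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]

swa = snapshots[0]
for n, snap in enumerate(snapshots[1:], start=1):
    swa = update_swa(swa, snap, n)
```

In PyTorch, `torch.optim.swa_utils.AveragedModel` maintains this kind of average over real model parameters, so a hand-written version like the one above is mainly useful for understanding the update.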

Other topics

Youtube channels

dl-resourced's People

Contributors

suryajayaraman
