Coder Social home page Coder Social logo

hmunachi / nanodl Goto Github PK

View Code? Open in Web Editor NEW
267.0 9.0 11.0 45.44 MB

A Jax-based library for designing and training transformer models from scratch.

License: MIT License

Python 100.00%
attention-mechanism deep-learning distributed-training gpt jax llama machine-learning nlp transformer attention

nanodl's People

Contributors

afrendeiro avatar hmunachi avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

nanodl's Issues

Jax version

With the latest jax=='0.4.25' and jaxlib=='0.4.25' I get:

import nanodl
# AttributeError: module 'jax.random' has no attribute 'KeyArray'

Deleted package detected

I'm a Cyber Security researcher and developer of PackjGuard [1] to address open-source software supply chain attacks.

Issue

During my research, I detected a deleted package in this repository.

Details

Specifically, the package nanodl mentioned in file README at line 43 does not exist on the public PyPI registry. A bad actor can hijack this package to propagate malicious code.

Impact

Not only your apps/services using https://github.com/HMUNACHI/nanodl repo code are vulnerable to this attack, but the users of your open-source Github repo could also fall victim.

You could read more about such attacks here: https://medium.com/@alex.birsan/dependency-confusion-4a5d60fec610

Remediation

Please highlight this in file README and register a placeholder package for nanodl on public PyPI soon to remediate.

To automatically fix such issues in future, please install PackjGuard Github app [1].

Thanks!

  1. PackjGuard is a Github app that monitors your repos 24x7, detects vulnerable/malicious/risky open-source dependencies, and creates pull requests for auto remediation: https://github.com/marketplace/packjguard

Gradient synchronization in data-parallel trainers

Hey, great job with nanodl!

I was just looking through the code and noticed that when in Lambda's Trainer the gradients are not being averaged across devices here:

loss, grads = jax.value_and_grad(loss_fn)(state.params)
state = state.apply_gradients(grads=grads)

Not sure if this is happening elsewhere but usually to keep the weights in sync you apply a jax.lax.pmean over the gradients before passing them to apply_gradients, e.g.

grads = jax.lax.pmean(grads, axis_name='devices')

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.