
Some things I have written or participated in:

Aria

https://github.com/EleutherAI/aria Wanna make some music?

Thing

https://github.com/honglu2875/thing Catch your tensors quietly in your running code and send them to your Python console for inspection.

Mistral model in JAX

https://github.com/honglu2875/mistral_jax You know what it is if you are familiar with OSS LLM ;)

YaRN, a context length extension of RoPE

Together with Bowen, Jeff and Enrico, I posted a preprint on how to extend the context window of models that use RoPE embeddings (such as the Llama family). Enrico trained a few amazing models, such as this Llama-2 with a 128k context. I fed it the whole of Pride and Prejudice and did manual Q&A on the novel. It did great! Bowen tried Sherlock Holmes. It wasn't perfect, but it was definitely working!

Our research (i.e., hacky) implementation: https://github.com/jquesnelle/yarn.
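
To give a feel for the idea, here is a minimal sketch of the per-dimension frequency rescaling ("NTK-by-parts") and the attention scaling factor as I would summarize them from the preprint. The function names and the defaults `alpha=1`, `beta=32` are illustrative assumptions, and this is not the code in the repo above.

```python
import math

def yarn_inv_freq(dim, scale, orig_ctx, base=10000.0, alpha=1.0, beta=32.0):
    """Rescaled RoPE frequencies, one per pair of hidden dimensions.

    dim: head dimension, scale: context extension factor s,
    orig_ctx: context length the model was originally trained with.
    alpha/beta are assumed ramp thresholds (hypothetical defaults).
    """
    freqs = []
    for i in range(0, dim, 2):
        theta = base ** (-i / dim)                    # original RoPE frequency
        rotations = orig_ctx * theta / (2 * math.pi)  # full turns inside orig_ctx
        # Ramp: 0 means interpolate fully (divide by s), 1 means keep theta.
        gamma = min(1.0, max(0.0, (rotations - alpha) / (beta - alpha)))
        freqs.append((1 - gamma) * theta / scale + gamma * theta)
    return freqs

def yarn_attention_scale(scale):
    """Scaling factor sqrt(1/t) = 0.1 * ln(s) + 1 applied alongside the new frequencies."""
    return 0.1 * math.log(scale) + 1.0

freqs = yarn_inv_freq(dim=128, scale=16, orig_ctx=4096)
```

High-frequency dimensions (many rotations within the original context) are left alone, while low-frequency ones are interpolated by the full factor, with a linear blend in between.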

An IR that generates PyTorch and JAX code (WIP)

Tired of jumping back and forth between PyTorch and JAX? I'm taking an amateur shot at it here, starting from an Intermediate Representation expressed as a graph and performing codegen for PyTorch and JAX. There are still a lot of things to sort out, but a basic version is starting to work... Still not usable (will remove this once I'm happy with it myself)
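
A toy illustration of the idea, not the actual project: a tiny graph IR whose nodes are rendered to source text through per-backend templates. The `Node` class, the `TEMPLATES` table and the `codegen` function are all hypothetical names made up for this sketch; only the emitted calls (`torch.matmul`, `jnp.matmul`, `jax.nn.relu`, etc.) are real library functions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    op: str            # "input", "matmul", "add" or "relu"
    args: tuple = ()   # names of the upstream nodes
    name: str = ""     # variable name bound to this node's result

# Hypothetical per-backend templates for each op.
TEMPLATES = {
    "torch": {"matmul": "torch.matmul({0}, {1})",
              "add": "torch.add({0}, {1})",
              "relu": "torch.relu({0})"},
    "jax": {"matmul": "jnp.matmul({0}, {1})",
            "add": "jnp.add({0}, {1})",
            "relu": "jax.nn.relu({0})"},
}
HEADERS = {"torch": "import torch",
           "jax": "import jax\nimport jax.numpy as jnp"}

def codegen(graph, backend, fn_name="forward"):
    """Emit Python source for a topologically ordered list of nodes."""
    inputs = [n.name for n in graph if n.op == "input"]
    lines = [HEADERS[backend], "", f"def {fn_name}({', '.join(inputs)}):"]
    for n in graph:
        if n.op != "input":
            expr = TEMPLATES[backend][n.op].format(*n.args)
            lines.append(f"    {n.name} = {expr}")
    lines.append(f"    return {graph[-1].name}")
    return "\n".join(lines)

# y = relu(x @ w + b), described once and rendered twice.
graph = [
    Node("input", name="x"), Node("input", name="w"), Node("input", name="b"),
    Node("matmul", ("x", "w"), "h"),
    Node("add", ("h", "b"), "z"),
    Node("relu", ("z",), "y"),
]
torch_src = codegen(graph, "torch")
jax_src = codegen(graph, "jax")
```

Printing `torch_src` and `jax_src` gives two small modules with the same `forward(x, w, b)` function, one per framework.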

LLM foundation model at Multi Tech Inc.

(models and code are not available for now)

Trained multiple 3B-13B foundation models on various datasets totalling 1.5 trillion tokens. Focused on code synthesis and financial document generation/retrieval. Long context length (8192). FSDP on 640 A100 GPUs.

OpenELM (Evolution through Large Models)

https://github.com/CarperAI/OpenELM

This is a work in progress with Carper AI to replicate the paper Evolution through Large Models in the open-source domain.

ELM makes use of an LLM's capacity for code generation and mutation to drive evolutionary methods such as MAP-Elites. In turn, it uses the generated data and RL finetuning to further align the LLM with the given task. We have implemented the evolution and LM pipelines, but not yet the RL finetuning components. We use models finetuned on GitHub commits, and we will gradually release them as well.
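
For readers who have not met MAP-Elites, here is a bare-bones sketch of the loop. It is not the OpenELM code: the random `toy_mutate` stands in for the LLM-driven mutation, and every name below (`map_elites`, `toy_mutate`, `toy_fitness`, `toy_descriptor`) is a hypothetical placeholder.

```python
import random

def map_elites(init, mutate, fitness, descriptor, steps=2000, seed=0):
    """Keep the best solution found so far in each niche of behavior space."""
    rng = random.Random(seed)
    archive = {}  # niche -> (fitness, solution)

    def consider(sol):
        niche, fit = descriptor(sol), fitness(sol)
        if niche not in archive or fit > archive[niche][0]:
            archive[niche] = (fit, sol)

    consider(init)
    for _ in range(steps):
        # Pick a random elite as the parent and try to place its mutated child.
        _, parent = archive[rng.choice(sorted(archive))]
        consider(mutate(parent, rng))
    return archive

# Toy problem: a "program" is a tuple of 4 digits.
def toy_mutate(sol, rng):
    # Stand-in for an LLM proposing an edit: nudge one entry by +-1.
    i = rng.randrange(len(sol))
    new = list(sol)
    new[i] = min(9, max(0, new[i] + rng.choice([-1, 1])))
    return tuple(new)

def toy_fitness(sol):
    return -abs(sum(sol) - 12)   # best when the digits sum to 12

def toy_descriptor(sol):
    return max(sol)              # niche = largest digit

archive = map_elites((0, 0, 0, 0), toy_mutate, toy_fitness, toy_descriptor)
```

In ELM the mutation operator is a language model proposing code changes, and the archive of diverse, high-fitness programs becomes training data for the later finetuning stages.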

Architext

https://github.com/CarperAI/ArchitextRL

Helping with the integration of Architext with ELM.

Architext is a Carper AI project that uses language models to generate presentations of architecture designs.

Reinforcement learning of Hironaka's polyhedra game

https://github.com/honglu2875/hironaka

Human intuition favors spaces that are locally modelled by products of coordinate lines (locally $\mathbb R^n$). They are called smooth spaces, manifolds, locally Euclidean spaces, etc., depending on your math background. But there are many other spaces that cannot be described like that, and we say they have singularities. A common way to handle them is to convert singular points back into smooth points: resolution of singularities. The existence of a resolution of singularities in characteristic $0$ earned Hironaka a Fields Medal, and the process has been deeply woven into algebraic geometry and has influenced other branches of geometry.

An old but overlooked angle on this: resolving singularities can be framed as a Markov Decision Process. With the rise of modern deep reinforcement learning, we present a repo that implements multiple deep RL methods (gym + stable-baselines3; DQN with PyTorch DDP + MAP-Elites; AlphaZero using JAX) applied to resolution of singularities.
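
To make the MDP framing concrete, here is a small sketch of one common formulation of the polyhedra game; the repo's exact rules, state encoding and rewards may differ, and the function names are made up for illustration. The state is a finite set of lattice points. The host picks a subset of coordinates of size at least 2, the agent picks one index `i` in it, and every point has its `i`-th coordinate replaced by the sum of its coordinates over the subset. The host wins once a single point dominates all others.

```python
def dominates(p, q):
    """True if p <= q in every coordinate."""
    return all(a <= b for a, b in zip(p, q))

def prune(points):
    """Drop points dominated by another point; they cannot affect the outcome."""
    pts = set(points)
    return {q for q in pts if not any(p != q and dominates(p, q) for p in pts)}

def step(points, subset, i):
    """One round: host chose `subset`, agent chose coordinate `i` in it."""
    assert len(subset) >= 2 and i in subset
    moved = set()
    for p in points:
        q = list(p)
        q[i] = sum(p[j] for j in subset)  # x_i <- sum of x_j over the subset
        moved.add(tuple(q))
    return prune(moved)

def is_terminal(points):
    """The game ends when one point is left after pruning."""
    return len(points) == 1

# Exponents of the cusp x^2 = y^3, as a two-point state.
s0 = {(2, 0), (0, 3)}
s1 = step(s0, (0, 1), 1)   # agent picks y: the game continues
s2 = step(s1, (0, 1), 1)   # agent picks y again: one point is left
```

From the agent's side, an observation is the current point set together with the host's subset, and an action is the chosen index, which is what lets standard RL algorithms be plugged in.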

Some side projects

https://github.com/honglu2875/fmlang_env (planned to do "RL with interpreter feedback")

https://github.com/honglu2875/Bookit-proof-of-concept.git (was learning Kotlin with a hands-on project)

Honglu Fan's Projects

elm

Evolution Through Large Models Implementation

flask-awscognito

Extension for Flask that adds support for AWSCognito into your application

go

Implement RL (MCTS) on Go.

gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.

hironaka

A utility package for Hironaka game of local resolution of singularities

hironaka_v2

This is a clean redo using only JAX and we reconstruct a simpler design.

jag

Just Another deep learninG framework

jaxformer

Minimal library to train LLMs on TPU in JAX with pjit().

jaxlm

A study playground for components of LLM in JAX
