
llama2.jl's Introduction

Llama2.jl


This is a port of Andrej Karpathy's llama2.c to Julia.

Important

This package was developed as part of the JuliaML course at Technical University Berlin. The repository is therefore not expected to be maintained long-term.

Features

  • Read tokenizer and model weights from .bin files in the format used by llama2.c
  • Inference, generation loop and chat loop
  • argmax, multinomial and top-p sampling
  • Tokenizer for encoding text to LLM input and decoding LLM output to text
  • Multi-threading in transformer forward function (#37)
  • Compatibility tested with all of Andrej Karpathy's models
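
To illustrate the top-p (nucleus) sampling mentioned in the feature list, here is a minimal sketch. It is not the package's implementation: the name `sample_topp` and its signature are assumptions made for this example, and the random number `coin` is passed in explicitly so the function is deterministic.

```julia
# Nucleus (top-p) sampling sketch. `coin` is a uniform random number in [0, 1).
# Hypothetical helper for illustration; not the package's actual API.
function sample_topp(probs::AbstractVector{<:Real}, topp::Real, coin::Real)
    # Token indices sorted by descending probability.
    order = sortperm(probs; rev=true)

    # Find the smallest prefix whose cumulative probability exceeds `topp`.
    cumulative = zero(eltype(probs))
    cutoff = length(order)
    for (i, idx) in enumerate(order)
        cumulative += probs[idx]
        if cumulative > topp
            cutoff = i
            break
        end
    end

    # Sample inside the truncated prefix, rescaling `coin` to its total mass.
    r = coin * cumulative
    cdf = zero(eltype(probs))
    for i in 1:cutoff
        cdf += probs[order[i]]
        r < cdf && return order[i]
    end
    return order[cutoff]  # guard against rounding errors
end
```

In practice `coin` would come from `rand(rng)`. With `topp` close to 1 this behaves like plain multinomial sampling, and with a very small `topp` it approaches argmax.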

Getting started

Add the package to your local environment in Julia's Pkg REPL (press ] at the julia> prompt) by running

add https://github.com/kleincode/Llama2.jl

To get started, check out the docs.

llama2.jl's People

Contributors

kleincode, janik072, thomasfischer11, johanni2000, github-actions[bot]


llama2.jl's Issues

Use in-place operations in forward function

  • Replace matrix multiplications with mul!(out, in1, in2)
  • Add a @view where applicable, maybe even an @views in front of the whole forward function? But make sure to copy token_embedding_table into x in line 144!
  • Replace rmsnorm, softmax, swiglu with in-place versions
  • Remove mutable from mutable struct RunState
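
The items above could look roughly like the following sketch. All function names, the argument order, and the column-per-token layout of `token_embedding_table` are assumptions for illustration; the actual forward function in the package may be organized differently.

```julia
using LinearAlgebra: mul!

# In-place RMS normalization: writes weight .* x ./ rms(x) into `out`.
function rmsnorm!(out::AbstractVector, x::AbstractVector, weight::AbstractVector; eps=1f-5)
    scale = inv(sqrt(sum(abs2, x) / length(x) + eps))
    out .= weight .* (scale .* x)
    return out
end

# In-place, numerically stable softmax.
function softmax!(x::AbstractVector)
    x .= exp.(x .- maximum(x))
    x ./= sum(x)
    return x
end

# Hypothetical miniature of one forward step that reuses preallocated buffers.
# @views turns every slice in the body into a view instead of a copy.
@views function forward_step!(out, xb, x, token_embedding_table, token, rms_weight, w)
    # A bare view would alias the table, so copy the embedding column into x.
    x .= token_embedding_table[:, token]
    rmsnorm!(xb, x, rms_weight)
    mul!(out, w, xb)  # out = w * xb without allocating a result vector
    return out
end
```

Because every function writes into existing arrays, a struct that holds these buffers does not need to be `mutable`: Julia allows the contents of arrays stored in an immutable struct to be modified in place.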

Feedback before Code Review 2

Hi all,

here's some quick feedback before the second code review session next week.
I'm opening this as a single issue to not spam your repository, which already looks really good!

Documentation

  • Please describe in the README what the repo is about and that it is part of course work at TU Berlin
  • Your docstrings still contain TODOs and notes to yourselves (e.g. https://kleincode.github.io/Llama2.jl/dev/#Llama2.open_file-Tuple{String}). Try to write your documentation for potential users, not for yourselves!
  • Please add a second doc page with a small "Getting started" guide for the second code review session. Even just demonstrating your tokenizer is enough.

Code

  • It's a bit odd to keep the Tokenizer struct and Tokenizer constructor in separate places. Use a single docstring for both and document the way you expect users to use your struct.
    struct Tokenizer

    function Tokenizer(tokenizer_path::String, vocab_size::Int)
  • You could use DocStringExtensions.jl to generate parts of your documentation (see example here)
  • Most of your structs hardcode element types to be Float32. I understand that this makes sense for a direct port of the C code base. However, in general, a parametric type Sampler{T} would make sense here.

    Llama2.jl/src/sampler.jl

    Lines 20 to 23 in a4c3241

    struct Sampler
    temperature::Float32
    topp::Float32
    rng_state::MersenneTwister
  • Your type annotations are a bit too strong in general:
    function softmax(x::Vector{Float32})::Vector{Float32}
    • The code looks perfectly valid for any AbstractArray, not just Vector{Float32}
    • Return types generally don't need to be annotated, since they force Julia to cast outputs, which can hide bugs and slow down code
    • During development, I sometimes also like to use very restrictive types to catch initial bugs, but types have to be relaxed after a while. To give you an example why: your softmax function isn't compatible with ForwardDiff.jl's forward mode AD, since that uses a custom dual number type, which you prohibit.
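
A minimal sketch of the suggestions above, assuming the field names from the quoted `Sampler` snippet. The convenience constructor with a `seed` keyword is hypothetical and only serves to show one docstring covering both the struct and its constructor.

```julia
using Random: MersenneTwister

"""
    Sampler{T}(temperature, topp, rng_state)
    Sampler(temperature, topp; seed=0)

Sampling parameters with element type `T`. The second form promotes both
parameters to a common floating-point type and seeds a new `MersenneTwister`.
"""
struct Sampler{T<:AbstractFloat}
    temperature::T
    topp::T
    rng_state::MersenneTwister
end

# Outer convenience constructor (hypothetical signature).
function Sampler(temperature::Real, topp::Real; seed::Integer=0)
    t, p = float.(promote(temperature, topp))
    return Sampler{typeof(t)}(t, p, MersenneTwister(seed))
end

# Relaxed softmax: accepts any real-valued array and has no return annotation,
# so the output element type follows the input.
function softmax(x::AbstractArray{<:Real})
    e = exp.(x .- maximum(x))
    return e ./ sum(e)
end
```

With these signatures, `Sampler(0.8f0, 0.9f0)` yields a `Sampler{Float32}`, and `softmax` works on `Float64` vectors, views, and matrices alike. Since ForwardDiff.jl's dual numbers are subtypes of `Real`, the relaxed method signature no longer excludes them.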

Tests

  • Your test suite looks great!

Documentation for functions

A parameter list for each function would be helpful.
For example:
rmsnorm
x [point in space]
weight [weight for multiplication]

This is a trivial example; parameter lists would help most for more complex functions whose math is not obvious.
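
One way such a parameter list could look in a Julia docstring is sketched below. The signature, the `eps` keyword, and the argument descriptions are assumptions for this example, not the package's actual documentation.

```julia
"""
    rmsnorm(x, weight; eps=1f-5)

Scale `x` by the inverse of its root mean square, then multiply it
elementwise by `weight`.

# Arguments
- `x::AbstractVector`: activations to normalize.
- `weight::AbstractVector`: per-element scale, same length as `x`.
- `eps`: small constant added to the mean square to avoid division by zero.

# Returns
A new vector with the same length as `x`.
"""
function rmsnorm(x::AbstractVector, weight::AbstractVector; eps=1f-5)
    scale = inv(sqrt(sum(abs2, x) / length(x) + eps))
    return weight .* (scale .* x)
end
```

Documenter.jl renders the `# Arguments` and `# Returns` headings as sections, so the parameter list shows up in the generated docs as well as in the REPL help.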

weights download

Document that the weights have to be downloaded before use, or better yet, download them automatically when the package is installed.
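
A download-on-first-use helper is one possible approach, sketched below. The function name `ensure_weights` and its keywords are hypothetical, and the default URL is an assumption based on where the llama2.c README hosts its small checkpoint; verify it before relying on it.

```julia
using Downloads

# Assumed location of llama2.c's "stories15M" checkpoint; check before use.
const DEFAULT_WEIGHTS_URL =
    "https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.bin"

# Download the weights to `path` unless the file already exists.
# `downloader` is injectable so the function can be tested without network access.
function ensure_weights(path::AbstractString;
                        url::AbstractString=DEFAULT_WEIGHTS_URL,
                        downloader=Downloads.download)
    if !isfile(path)
        mkpath(dirname(abspath(path)))
        tmp = path * ".part"
        downloader(url, tmp)       # write to a temporary file first
        mv(tmp, path; force=true)  # then move it into place
    end
    return path
end
```

Writing to a temporary file first avoids leaving a truncated checkpoint behind if the download is interrupted. Julia's artifact system would be an alternative that ties the download to package installation.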

Math Methods - Testing

Consider testing the math functions on vectors with very small values, especially where multiplication is involved.
It may also be worth trying to provoke bad behaviour by brute force.
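
Such tests might look like the sketch below, shown for softmax. The `softmax` defined here is a stand-in so the example is self-contained; the package's own function would be tested instead.

```julia
using Test

# Stand-in softmax for illustration; substitute the package's implementation.
softmax(x) = (e = exp.(x .- maximum(x)); e ./ sum(e))

@testset "softmax edge cases" begin
    # Tiny inputs: the result should be close to uniform, without NaN or Inf.
    tiny = fill(1f-30, 4)
    @test softmax(tiny) ≈ fill(0.25f0, 4)

    # Large inputs overflow exp() unless the maximum is subtracted first.
    large = Float32[1000, 1000, 1000]
    @test all(isfinite, softmax(large))
    @test sum(softmax(large)) ≈ 1

    # Brute force: random vectors over many magnitudes stay valid distributions.
    for scale in (1f-20, 1f-3, 1f0, 1f3), _ in 1:100
        p = softmax(scale .* randn(Float32, 16))
        @test all(isfinite, p) && all(>=(0), p)
        @test sum(p) ≈ 1
    end
end
```

The same pattern of tiny values, huge values, and randomized sweeps can be applied to the other math helpers.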

Setup structs

  • lines 19-75 in llama2.c
  • lines 77-96, 111-140 should be implemented as outer constructors
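
As a sketch of the outer-constructor idea, the example below parses a configuration header from a binary stream. The field names follow the `Config` struct in llama2.c; the Julia struct layout and the two constructors are assumptions for illustration, not the package's code.

```julia
# Field names follow llama2.c's Config struct; the Julia layout is an assumption.
struct Config
    dim::Int32
    hidden_dim::Int32
    n_layers::Int32
    n_heads::Int32
    n_kv_heads::Int32
    vocab_size::Int32
    seq_len::Int32
end

# Outer constructor: read the seven header fields from a binary stream.
# Assumes the file stores little-endian 32-bit integers.
function Config(io::IO)
    header = read!(io, Vector{Int32}(undef, 7))
    return Config(ltoh.(header)...)
end

# Outer constructor: open a checkpoint file and parse its header.
Config(path::AbstractString) = open(io -> Config(io), path)
```

Outer constructors like these keep the struct definition itself free of I/O logic, while `Config("model.bin")` remains a single call for the user.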

documentation

Add documentation on how to set up and use the package.
An example would also be nice.
