learn-llm's Introduction

learn-llm

This project is dedicated to learning and developing large language models (LLMs). It encompasses various stages of training and testing, including pretraining, supervised fine-tuning (SFT), and reinforcement learning (RL).

Repos

  • pretrain: pretraining code
  • sft: supervised fine-tuning
  • rl: Reinforcement learning
  • agents: some agents

learn-llm's People

Contributors

hengjiustc


learn-llm's Issues

Time

Hello, roughly how many GPUs would it take, and how long would it run, to train a 1B model on FineWeb for one epoch?
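A rough way to reason about this kind of question is the common approximation that training a dense transformer costs about 6 FLOPs per parameter per token. The sketch below is only a back-of-the-envelope estimate: the function name `estimate_gpu_hours` is hypothetical, and the token count, per-GPU peak throughput, and utilization are illustrative assumptions, not measurements from this repo.

```python
def estimate_gpu_hours(n_params, n_tokens, peak_flops_per_gpu, mfu):
    """Estimate total GPU-hours for one pass over n_tokens.

    Uses the approximation: training FLOPs ~= 6 * parameters * tokens.
    mfu is the assumed fraction of peak throughput actually achieved.
    """
    total_flops = 6 * n_params * n_tokens
    effective_flops_per_second = peak_flops_per_gpu * mfu
    return total_flops / effective_flops_per_second / 3600


# Assumed inputs (all hypothetical, adjust to your setup):
gpu_hours = estimate_gpu_hours(
    n_params=1e9,               # 1B-parameter model
    n_tokens=100e9,             # assumed 100B-token subset, not the full dataset
    peak_flops_per_gpu=312e12,  # assumed peak of 312 TFLOP/s per GPU
    mfu=0.4,                    # assumed 40% utilization
)
print(f"about {gpu_hours:,.0f} GPU-hours")
```

Under these assumptions the estimate is on the order of 1,300 GPU-hours. Dividing by the number of GPUs gives wall-clock time only if scaling is close to linear, which is itself an assumption.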

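Question

Hello author, after looking at many examples I noticed that during pretraining, many authors pad input_ids without changing the pad_token_id positions in the labels to -100, which makes the loss drop very quickly. My question is whether the pad_token_id positions in the labels should be set to -100 at all: making this change leaves the loss very high and hard to bring down.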
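The effect described in this question can be illustrated with a small pure-Python sketch. It assumes the usual convention that label positions equal to -100 are skipped when averaging the loss (this is the default `ignore_index` of PyTorch's `torch.nn.CrossEntropyLoss`). The helper names and the probabilities are hypothetical, and the one-position label shift used in causal language modeling is omitted for brevity.

```python
import math

IGNORE_INDEX = -100  # sentinel for positions excluded from the loss


def mask_pad_labels(input_ids, pad_token_id):
    """Copy input_ids into labels, replacing padding with IGNORE_INDEX."""
    return [IGNORE_INDEX if tok == pad_token_id else tok for tok in input_ids]


def mean_cross_entropy(correct_token_probs, labels):
    """Average -log p(correct token) over positions that are not ignored.

    correct_token_probs[i] is the probability the model assigned to the
    correct token at position i (made-up numbers in this sketch).
    """
    losses = [
        -math.log(p)
        for p, label in zip(correct_token_probs, labels)
        if label != IGNORE_INDEX
    ]
    return sum(losses) / len(losses)


input_ids = [5, 8, 3, 0, 0, 0]  # three real tokens, three pads (pad id 0)
# Assumed model behavior: real tokens are hard, pad tokens are trivial.
probs = [0.2, 0.2, 0.2, 0.99, 0.99, 0.99]

unmasked_loss = mean_cross_entropy(probs, input_ids)
masked_loss = mean_cross_entropy(probs, mask_pad_labels(input_ids, 0))
print(f"unmasked: {unmasked_loss:.3f}  masked: {masked_loss:.3f}")
```

In this toy setup the unmasked average is lower only because easy-to-predict pad positions dilute it, while the masked value reflects the real tokens alone. One caveat, stated as an assumption about typical setups: if the pad token shares an id with the end-of-sequence token, masking by token id would also hide genuine end-of-sequence labels, so masking by attention mask may be safer.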
