
Large Language Models (LLMs) for Code

A collection of LLMs for code.

Continuously updated. :running:

If there are any errors or missing models, please contact us by opening a new issue or by e-mail: [email protected].

Contents

Code LLMs

Large Language Models (LLMs) for Code.

The Code LLMs referred to here are large language models specially trained for code-related tasks. Their training corpus may contain not only code but also natural language.

The Code LLMs listed here do not include general-purpose large language models, even though general-purpose LLMs are also able to complete code-related tasks.

Timeline of Code LLMs

[Image: timeline of Code LLMs]

Parameters of Code LLMs

[Image: parameters of Code LLMs]

Models

  • Codex [OpenAI] [2021.07] [Close]

    πŸ“ƒEvaluating Large Language Models Trained on Code

    🏴introduction

    Model Architecture: Decoder Only, GPT Family
    Params: 12B
    Training Data: Collected [Code: 159GB]
    Training Time: -
    Languages: Python [Multilingual]
    Evaluation: HumanEval, APPS
    Supported Tasks: Code Generation, Docstring Generation
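
The Codex paper also introduced HumanEval and the unbiased pass@k metric that most later entries in this list report. A minimal sketch of that estimator (the formula is from the paper; the function name is ours):

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k from the Codex paper: 1 - C(n-c, k) / C(n, k),
    with n samples per problem and c of them correct.
    Computed as a running product to avoid huge binomial coefficients."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one correct sample
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# e.g. 200 samples, 13 correct: pass@1 reduces to 13/200
print(pass_at_k(200, 13, 1))  # 0.065
```
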
  • Tabnine [Close]

    πŸ”—AI assistant for software developers

    🏴introduction

    Model Architecture: LLM
    Params: -
    Training Data: -
    Training Time: -
    Languages: -
    Evaluation: -
    Supported Tasks: Whole line completions, Full-function completions, Natural language to code completions
  • AlphaCode [DeepMind] [2022.03] [Close]

    πŸ“ƒCompetition-Level Code Generation with AlphaCode

    🏴introduction

    Model Architecture: Encoder-Decoder
    Params: 41B
    Training Data: Collected [Code: 715.1GB]
    Training Time: -
    Languages: 12 langs
    Evaluation: HumanEval, APPS, CodeContests
    Supported Tasks: Competition-Level Code Generation
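
AlphaCode's recipe pairs large-scale sampling with filtering on each problem's public example tests, followed by behavioural clustering of the survivors. Below is a toy sketch of the filtering step only; the `run` executor is a hypothetical sandbox, not code from the paper.

```python
from typing import Callable, List, Tuple

def filter_on_example_tests(
    programs: List[str],
    example_tests: List[Tuple[str, str]],  # (stdin, expected stdout) pairs
    run: Callable[[str, str], str],        # hypothetical sandboxed executor
) -> List[str]:
    """Discard sampled programs that fail any public example test.
    AlphaCode then clusters the survivors by behaviour on generated
    inputs and submits one representative per cluster (omitted here)."""
    return [
        prog for prog in programs
        if all(run(prog, stdin) == expected for stdin, expected in example_tests)
    ]
```
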
  • PaLM-Coder [Google] [2022.04] [Close]

    πŸ“ƒPaLM: Scaling Language Modeling with Pathways

    🏴introduction

    Model Architecture: Decoder Only
    Params: 8B, 62B, 540B
    Training Data: Collected [Text: 741B tokens, Code: 39GB (780B tokens trained)]
    Training Time: 6144 TPU v4 chips
    Languages: Multiple
    Evaluation: HumanEval, MBPP, TransCoder, DeepFix
    Supported Tasks: Code Generation, Code Translation, Code Repair
  • PolyCoder [CMU] [2022.02] [Open]

    πŸ“ƒA Systematic Evaluation of Large Language Models of Code

    🏴introduction

    Model Architecture: Decoder Only, GPT Family
    Params: 2.7B
    Training Data: Collected [Code: 253.6GB]
    Training Time: - 
    Languages: 12 langs
    Evaluation: HumanEval
    Supported Tasks: Code Generation
  • GPT-Neo [EleutherAI] [2021.03] [Open]

    πŸ“ƒGPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow

    🏴introduction

    Model Architecture: Decoder Only, GPT Family
    Params: 1.3B, 2.7B
    Training Data: The Pile [Text: 730GB, Code: 96GB (400B tokens trained)]
    Training Time: -
    Languages: Multiple
    Evaluation: HumanEval
    Supported Tasks: Code Generation
  • GPT-NeoX [EleutherAI] [2022.04] [Open]

    πŸ“ƒGPT-NeoX-20B: An Open-Source Autoregressive Language Model

    🏴introduction

    Model Architecture: Decoder Only, GPT Family
    Params: 20B
    Training Data: The Pile [Text: 730GB, Code: 95GB (473B tokens trained)]
    Training Time: -
    Languages: Multiple
    Evaluation: HumanEval
    Supported Tasks: Code Generation
  • GPT-J [EleutherAI] [2021.06] [Open]

    πŸ”—GPT-J-6B: 6B JAX-Based Transformer

    🏴introduction

    Model Architecture: Decoder Only, GPT Family
    Params: 6B
    Training Data: The Pile [Text: 730GB, Code: 96GB (473B tokens trained)]
    Training Time: -
    Languages: Multiple
    Evaluation: HumanEval
    Supported Tasks: Code Generation
  • InCoder [Meta] [2022.04] [Open]

    πŸ“ƒInCoder: A Generative Model for Code Infilling and Synthesis

    🏴introduction

    Model Architecture: Decoder Only
    Params: 1.3B, 6.7B
    Training Data: Collected [Code: 159GB, StackOverflow: 57GB (60B tokens trained)]
    Training Time: -
    Languages: 28 langs
    Evaluation: HumanEval, MBPP, CodeXGLUE
    Supported Tasks: Infilling Lines of Code (HumanEval), Docstring Generation (CodeXGLUE), Return Type Prediction, Variable Name Prediction
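
InCoder is trained with a causal-masking objective, so it can infill a span by replacing it with a sentinel token and generating the missing text at the end of the sequence. A rough sketch against the public facebook/incoder-1B checkpoint; the sentinel format follows the authors' released example code, so treat the exact prompt layout as an assumption.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "facebook/incoder-1B"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Mark the span to infill with a sentinel, then repeat the sentinel at the
# end so the model generates the missing span there (until <|endofmask|>).
left = "def count_words(path):\n    "
right = "\n    return len(words)\n"
prompt = left + "<|mask:0|>" + right + "<|mask:0|>"

inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=48, do_sample=True, temperature=0.2)
print(tokenizer.decode(out[0]))  # infilled span appears after the final sentinel
```
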
  • CodeGen [Salesforce] [2022.03] [Open] 🌟popular

    πŸ“ƒCodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis

    🏴introduction

    Model Architecture: Decoder Only
    Params: 6.1B, 16.1B
    Training Data: The Pile, BigQuery, BigPython [Code: 150B tokens, Text: 355B tokens]
    Training Time: -
    Languages: CodeGen-Multi (6 langs), CodeGen-Mono (Python)
    Evaluation: HumanEval, MTPB
    Supported Tasks: Single-Turn Code Generation, Multi-Turn Code Generation
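
The CodeGen checkpoints are released on Hugging Face in Multi and Mono variants across several sizes. A minimal left-to-right generation sketch, assuming the Salesforce/codegen-350M-mono checkpoint name; the prompt and sampling settings are illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Salesforce/codegen-350M-mono"  # Mono = Python-only variant
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

prompt = "# return the n-th Fibonacci number\ndef fib(n):"
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.2,
    pad_token_id=tokenizer.eos_token_id,  # CodeGen has no dedicated pad token
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
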
  • CodeGeeX [THU] [2022.09] [Open]

    πŸ“ƒCodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X

    🏴introduction

    Model Architecture: Decoder Only, GPT Family
    Params: 13B
    Training Data: The Pile, CodeParrot, Collected [Code: 158B tokens (850B tokens trained)]
    Training Time: 1536 Ascend 910 AI processors (32GB) with MindSpore (v1.7.0), two months
    Languages: 23 langs
    Evaluation: HumanEval-X, HumanEval, MBPP, CodeXGLUE, XLCoST
    Supported Tasks: Multilingual Code Generation, Code Translation
  • aiXcoder [PKU] [Close]

    🔗aiXcoder

    🏴introduction

    Model Architecture: -
    Params: 13B?
    Training Data: -
    Training Time: -
    Languages: Multiple
    Evaluation: -
    Supported Tasks: Code Generation, Code Completion, Code Search
  • PanGu-Coder [Huawei Noah’s Ark Lab] [2022.07] [Close]

    πŸ“ƒPanGu-Coder: Program Synthesis with Function-Level Language Modeling

    🏴introduction

    Model Architecture: PanGu-α architecture, Decoder Only
    Params: 2.6B
    Training Data: Collected (147GB)
    Training Time: -
    Languages: Python
    Evaluation: HumanEval, MBPP
    Supported Tasks: Code Generation
  • ERNIE-Code [Baidu] [2022.12] [Close]

    ⚠️ We don't consider ERNIE-Code a Code LLM in the strict sense.

    πŸ“ƒERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages

    🏴introduction

    Model Architecture: Encoder-Decoder, T5-base
    Params: 560M
    Training Data: CodeSearchNet, NL Corpus
    Training Time: -
    Languages: Multiple
    Evaluation: mCoNaLa, Bugs2Fix, Microsoft Docs
    Supported Tasks: Multilingual Code-to-Text, Text-to-Code, Code-to-Code, and Text-to-Text Generation.

Improve Code LLMs

Dataset

Benchmark

Future

Future development
