StableCode: Stability AI Developer Productivity

Stochastic Parrot: “a colorful parrot with glasses typing away at a computer, flat icon, vector”, generated with Stable Diffusion XL (SDXL 0.9)

This repository contains Stability AI's ongoing development of the StableCode series of code models and will be continuously updated with new checkpoints. The following provides an overview of all currently available models. More coming soon.

News

January 16, 2024

  • Released the next version of Stable Code for developers. Catch the release blog post here.

August 8, 2023

Released the initial suite of StableCode-Alpha models. Catch the release blog post here. The release includes:

  • StableCode-Completion-Alpha-3B
  • StableCode-Completion-Alpha-3B-4k
  • StableCode-Instruct-Alpha-3b

Models

StableCode-Completion-Alpha suite of Models

  • StableCode-Completion-Alpha-3B - A 3 billion parameter decoder-only code completion model with a 16k context length, pre-trained on a diverse set of programming languages drawn from the top languages in the 2023 Stack Overflow Developer Survey. It is trained on a specially augmented version of the starcoder dataset.
  • StableCode-Completion-Alpha-3B-4K - A 3 billion parameter decoder-only code completion model with a 4k context length, pre-trained on the same set of top programming languages from the Stack Overflow Developer Survey.

Training Details

Following similar work, we use a multi-stage approach to context length extension (Nijkamp et al., 2023), scheduling 390 billion tokens at a context length of 4096 followed by 100 billion tokens at 16k. We found that sequence length warmup (Li et al., 2022) helped stabilize early spikes during the first ~80 billion tokens of pre-training. However, it was not applied to the final runs due to significant throughput penalties as sequence lengths grew across the curriculum.
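
To make the schedule concrete, here is a minimal sketch of the staged context-length curriculum described above (390 billion tokens at 4096, then 100 billion tokens at 16,384). The function and constants are illustrative only, not the actual training code.

# Illustrative sketch of the two-stage context-length schedule; not the real training loop.
STAGES = [
    (390_000_000_000, 4096),   # stage 1: 390B tokens at a 4k context length
    (100_000_000_000, 16384),  # stage 2: 100B tokens at a 16k context length
]

def context_length_at(tokens_seen: int) -> int:
    """Return the scheduled context length after `tokens_seen` training tokens."""
    consumed = 0
    for stage_tokens, ctx_len in STAGES:
        consumed += stage_tokens
        if tokens_seen < consumed:
            return ctx_len
    return STAGES[-1][1]  # past the schedule: keep the final context length

assert context_length_at(50_000_000_000) == 4096
assert context_length_at(400_000_000_000) == 16384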

Training Data

The training is done in two stages. The initial pretraining stage covers the top 12 languages, selected with reference to the Stack Overflow Developer Survey: Java, JavaScript, Python, TypeScript, PHP, SQL, Rust, C, Markdown, Go, C++, and Shell.

This is followed by continued pretraining on the top 6 languages, so that the model specializes in them: Java, JavaScript, Python, C, C++, and Go.
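
As an illustration, the two-stage language selection could be expressed with the `datasets` library roughly as follows; the dataset name and the "lang" metadata field are placeholders for this sketch, not the actual data pipeline.

from datasets import load_dataset

STAGE1_LANGS = {"Java", "JavaScript", "Python", "TypeScript", "PHP", "SQL",
                "Rust", "C", "Markdown", "Go", "C++", "Shell"}
STAGE2_LANGS = {"Java", "JavaScript", "Python", "C", "C++", "Go"}

# Hypothetical code corpus with a per-example language tag.
corpus = load_dataset("my-org/code-corpus", split="train")

stage1 = corpus.filter(lambda ex: ex["lang"] in STAGE1_LANGS)  # broad 12-language pretraining mix
stage2 = corpus.filter(lambda ex: ex["lang"] in STAGE2_LANGS)  # 6-language specialization mix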

Evaluation

The following zero-shot evaluations are performed with the awesome BigCode Evaluation Harness:

| Name | HuggingFace Name | Type | Context Length | HumanEval pass@1 |
|------|------------------|------|----------------|------------------|
| StableCode-Completion-Alpha-3B | stabilityai/stablecode-completion-alpha-3b | Base 3B | 16384 | 20.18 |
| StableCode-Completion-Alpha-3B-4K | stabilityai/stablecode-completion-alpha-3b-4k | Base 3B | 4096 | 17.68 |
| StableCode-Instruct-Alpha-3B | stabilityai/stablecode-instruct-alpha-3b | Instruction Tuned 3B | 4096 | 26.89 |
| StableCode-Completion-Alpha-3B v1.1 | [email protected] | Base 3B | 16384 | 22.06 |
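
For reference, pass@1 is the fraction of HumanEval problems for which a sampled completion passes the unit tests. When more than one sample per problem is drawn, the usual unbiased pass@k estimator (Chen et al., 2021) is used; a minimal sketch, assuming you already have the per-problem sample counts:

import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of pass@k given n samples per problem, c of which passed."""
    if n - c < k:
        return 1.0
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# Example: 200 samples drawn for one problem, 41 of them pass the tests.
print(pass_at_k(200, 41, 1))  # ~0.205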

Quickstart

All StableCode models are hosted on the Hugging Face hub. Check out this notebook to run inference with limited GPU capabilities.

Get started on generating code with StableCode-Completion-Alpha by using the following code snippet:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, StoppingCriteria, StoppingCriteriaList

# Load the tokenizer and model, then move the model to the GPU in half precision.
tokenizer = AutoTokenizer.from_pretrained("stabilityai/stablecode-completion-alpha-3b")
model = AutoModelForCausalLM.from_pretrained("stabilityai/stablecode-completion-alpha-3b")
model.half().cuda()

class StopOnTokens(StoppingCriteria):
    """Stop generation as soon as the most recent token is an end-of-text/special token."""
    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        stop_ids = {50278, 50279, 50277, 1, 0}
        return input_ids[0][-1].item() in stop_ids  # compare as a plain Python int


prompt = "import torch\nimport torch.nn as nn"

# Tokenize the prompt, move it to the GPU, and sample a completion.
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
tokens = model.generate(
  **inputs,
  max_new_tokens=64,
  temperature=0.7,
  do_sample=True,
  stopping_criteria=StoppingCriteriaList([StopOnTokens()])
)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))
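
If GPU memory is limited (the scenario the notebook above targets), one option is to load the weights in half precision and let `accelerate` place them automatically. A minimal sketch, with the prompt and generation settings chosen arbitrarily:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablecode-completion-alpha-3b"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# device_map="auto" requires the `accelerate` package and will spread layers
# across the available GPU(s) and CPU if the model does not fit on one device.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

inputs = tokenizer("def fibonacci(n):", return_tensors="pt").to(model.device)
tokens = model.generate(**inputs, max_new_tokens=48, do_sample=True, temperature=0.7)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))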

Licenses

  • Base model checkpoints (StableCode-Completion-Alpha-3B) are licensed under Apache 2.0

  • Instruct-tuned checkpoints (StableCode-Instruct-Alpha-3B) are licensed under the StableCode Research License, Copyright (c) Stability AI Ltd. All Rights Reserved.

  • All code in this repository is licensed under the Apache License 2.0.

