Claude2-Alpaca: Instruction tuning datasets distilled from Claude2

This is the repo for the Claude2-Alpaca project, which aims to build and share an instruction-following LLaMA model. The repo contains:

  • The 52k Claude-2 prompts and responses used for finetuning
  • The code for generating the data
  • The code for finetuning 7B and 13B models

Overview

Current open-source instruction tuning datasets are usually distilled from the GPT family: WizardLM distills its data from GPT-3.5-turbo, GPT4LLM from GPT-4, and Alpaca from text-davinci-003. We would like to increase the diversity of instruction tuning datasets and give the community more options!

In this repo, we use the same 52k Alpaca prompts to query Claude-2 and obtain the claude2-alpaca dataset. We also include the instruction-tuned LLaMA-2 models, the training code for re-implementation, and the results.

Training

We include the training script for 7B and 13B models:

# make sure the paths in train.sh are correct (use your own path to the Llama-2 weights and your own output path)
bash train.sh

Data Generation

export ANTHROPIC_API_KEY=xxx # export your claude key here
python generate_data.py
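The contents of generate_data.py are not shown here, so the following is only a minimal sketch of the distillation loop described in the Overview: take each Alpaca record, turn it into a query, and store the model's answer as the new output. It assumes the standard Alpaca record fields (instruction, input, output). The names build_query, distill, and ask_model are hypothetical, and ask_model stands in for whatever function actually calls Claude-2 with your ANTHROPIC_API_KEY.

```python
from typing import Callable, Dict, List


def build_query(record: Dict[str, str]) -> str:
    """Turn one Alpaca-style record into a single query string (hypothetical helper)."""
    instruction = record["instruction"].strip()
    context = record.get("input", "").strip()
    # Alpaca records with an empty "input" field are instruction-only.
    if context:
        return f"{instruction}\n\n{context}"
    return instruction


def distill(
    records: List[Dict[str, str]],
    ask_model: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Re-answer every record with `ask_model`, keeping the original prompt fields.

    `ask_model` is an assumption: any callable that maps a query string to a
    response string, for example a thin wrapper around a Claude-2 API call.
    """
    distilled = []
    for record in records:
        response = ask_model(build_query(record))
        distilled.append(
            {
                "instruction": record["instruction"],
                "input": record.get("input", ""),
                "output": response,
            }
        )
    return distilled


# Usage with a stand-in model, so the sketch runs without network access.
sample = [
    {"instruction": "Translate to French.", "input": "Good morning", "output": "old"},
    {"instruction": "Name a prime number.", "input": "", "output": "old"},
]
result = distill(sample, ask_model=lambda query: f"[answer to: {query}]")
```

Passing the model call in as a function keeps the loop independent of any particular client library, so the same code can be tried with a stub before spending API credits.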

Results

We use the generated data to fine-tune 7B/13B LLaMA-2 models and show the results here:

Model              Average  ARC    HellaSwag  MMLU   TruthfulQA  Alpaca_Eval  Avg Length
Llama-2-7b-chat    56.335   52.9   78.55      48.32  45.57       71.37        1479
Llama-2-13b-chat   59.935   59.04  81.94      54.64  44.12       81.09        1513
claude_alpaca-7b   57.78    56.66  81.17      46.58  46.71       71.23        1066
claude_alpaca-13b  61.29    61.18  84.08      55.74  44.18       78.93        1127

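Compared to Llama-2-chat, our models achieve a higher Average score at both sizes (57.78 vs. 56.335 for 7B and 61.29 vs. 59.935 for 13B), with a shorter Avg Length. Llama-2-chat still scores higher on Alpaca_Eval (71.37 vs. 71.23 for 7B and 81.09 vs. 78.93 for 13B).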
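The README does not say how the Average column is computed. Judging from the numbers, it appears to be the plain mean of ARC, HellaSwag, MMLU, and TruthfulQA, with Alpaca_Eval left out. The short check below, with scores copied from the table above, is a sketch of that assumption rather than the authors' evaluation code; the names benchmarks, reported, and average_score are hypothetical.

```python
from statistics import mean
from typing import Dict, List

# Per-model scores copied from the results table, in the order
# ARC, HellaSwag, MMLU, TruthfulQA (Alpaca_Eval is deliberately excluded).
benchmarks: Dict[str, List[float]] = {
    "Llama-2-7b-chat": [52.9, 78.55, 48.32, 45.57],
    "Llama-2-13b-chat": [59.04, 81.94, 54.64, 44.12],
    "claude_alpaca-7b": [56.66, 81.17, 46.58, 46.71],
    "claude_alpaca-13b": [61.18, 84.08, 55.74, 44.18],
}

# Values of the Average column as reported in the table.
reported: Dict[str, float] = {
    "Llama-2-7b-chat": 56.335,
    "Llama-2-13b-chat": 59.935,
    "claude_alpaca-7b": 57.78,
    "claude_alpaca-13b": 61.29,
}


def average_score(scores: List[float]) -> float:
    """Unweighted mean of the four benchmark scores (assumed definition)."""
    return mean(scores)


for name, scores in benchmarks.items():
    print(f"{name}: computed {average_score(scores):.3f}, reported {reported[name]}")
```

Under this assumption the computed means agree with the reported Average values to within rounding for all four models.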

The claude2 alpaca dataset used for training can be found here: claude2_alpaca
The trained models can be found here: claude_alpaca-7b and claude_alpaca-13b

Authors

All grad students below contributed equally.

Special thanks to Ping-yeh Chiang for sharing their FSDP model fine-tuning script, which we utilized in this project.

Citation

Please cite the repo if you use the data or code in this repo.

@misc{claude2-alpaca,
  author = {Lichang Chen and Khalid Saifullah and Ming Li and Tianyi Zhou and Heng Huang},
  title = {Claude2-Alpaca: Instruction tuning datasets distilled from claude},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/Lichang-Chen/claude2-alpaca}},
}

TODO:

We include the TODO list for our project. If you are also interested in any of the following research directions, do not hesitate to contact us (we are open to any kind of collaboration)!

  • Investigate the bias of current model-based evaluations. GPT-4 and Claude-2 may each prefer models distilled from their own outputs.
  • Transfer Attack
  • The synergy of different models distilled from different sources.
  • Project Page
