Claude2-Alpaca: Instruction tuning datasets distilled from Claude2

This is the repo for the Claude2-Alpaca project, which aims to build and share an instruction-following LLaMA model. The repo contains 52k prompts and responses. The repo contains:

The 52k claude-2 data used for finetuning
The code for generating the data
The code for finetuning 7B and 13B models

Overview

The current open-sourced instruction tuning datasets usually distilled the data from the GPT families, e.g., WizardLM distills the data from GPT-3.5-turbo; GPT4LLM distills the data from GPT-4 and Alpaca distills the data from the Text-Davinci-003. We would like to increase the diversity of the instruction tuning dataset and provide the community with more options!

In this repo, we use the same Alpaca 52k prompts to query the Claude-2 and obtain the claude2-alpaca dataset. We also include the instruction-tuned LLaMA-2 models, training code for re-implementation, and the results.

Training

We include the training script for 7B and 13B models:

# make sure the path in the train.sh is correct (use your own path to llama-2's weight and the output path.)
bash train.sh

Data Generation

export ANTHROPIC_API_KEY=xxx # export your claude key here
python generate_data.py

Results

We use the generated data to fine-tune 7B/13B LLaMA-2 models and show the results here:

	Average	ARC	HellaSwag	MMLU	TruthfulQA	Alpaca_Eval	Avg Length
Llama-2-7b-chat	56.335	52.9	78.55	48.32	45.57	71.37	1479
Llama-2-13b-chat	59.935	59.04	81.94	54.64	44.12	81.09	1513

claude_alpaca-7b	57.78	56.66	81.17	46.58	46.71	71.23	1066
claude_alpaca-13b	61.29	61.18	84.08	55.74	44.18	78.93	1127

Compared to the llama2-chat, our models can have better average performance.

The claude2 alpaca dataset used for training can be found here: claude2_alpaca
The trained models can be found here: claude_alpaca-7b and claude_alpaca-13b

Authors

All grad students below contributed equally.

Special thanks to Ping-yeh Chiang for sharing their FSDP model fine-tuning script, which we utilized in this project.

Citation

Please cite the repo if you use the data or code in this repo.

@misc{claude2-alpaca,
  author = {Lichang Chen and Khalid Saifullah and Ming Li and Tianyi Zhou and Heng Huang},
  title = {Claude2-Alpaca: Instruction tuning datasets distilled from claude},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/Lichang-Chen/claude2-alpaca}},
}

TODO:

We include the TODO list for our project. If you also are interested in the following research project, do not hesitate to contact us (we are open to any kind of collaboration)!

Investigate the bias of the current model-based evaluations. GPT-4 and Claude-2 may have a preference for its distilled models.
Transfer Attack
The synergy of different models distilled from different sources.
Project Page

lichang-chen / claude2-alpaca Goto Github PK

claude2-alpaca's Introduction

Claude2-Alpaca: Instruction tuning datasets distilled from Claude2

Overview

Training

Data Generation

Results

Authors

Citation

TODO:

claude2-alpaca's People

Contributors

Stargazers

Watchers

Forkers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent