This project forked from instruction-tuning-with-gpt-4/gpt-4-llm

Instruction Tuning with GPT-4

Home Page: https://instruction-tuning-with-gpt-4.github.io/

License: Apache License 2.0

Instruction Tuning with GPT-4

Baolin Peng*, Chunyuan Li*, Pengcheng He*, Michel Galley, Jianfeng Gao (*Equal Contribution)

[Project Page] [Paper]


Pronounced "GPT-4-LLM" or "GPT-for-LLM". The image was generated by GLIGEN.

This is the repo for GPT-4-LLM, which aims to share data generated by GPT-4 for building instruction-following LLMs with supervised learning and reinforcement learning. The repo contains:

  • English Instruction-Following Data generated by GPT-4 using Alpaca prompts for fine-tuning LLMs.
  • Chinese Instruction-Following Data generated by GPT-4 using Chinese prompts translated from Alpaca by ChatGPT.
  • Comparison Data ranked by GPT-4 to train reward models.
  • Answers on Unnatural Instructions Data from GPT-4 to quantify the gap between GPT-4 and instruction-tuned models at scale.

Usage and License Notices: The data is intended and licensed for research use only. The dataset is licensed under CC BY-NC 4.0 (allowing only non-commercial use), and models trained using the dataset should not be used outside of research purposes.

🔥 News

  • [2023.04.17] Visual instruction tuning with GPT-4 is released! Please check out the multimodal model LLaVA: [Project Page] [Paper] [Demo] [Code] [Data] [Model]
  • [2023.04.15] Updated comparison data, including three model responses and GPT-4 evaluation scores.
  • [2023.04.06] Paper and data are released.

Overview

Large Language Models (LLMs) have shown impressive generalization capabilities, such as in-context learning and chain-of-thought reasoning. To enable LLMs to follow natural language instructions and complete real-world tasks, researchers have been exploring methods for instruction-tuning LLMs. To advance the state of the art of instruction-tuning for LLMs, we present the first attempt to use GPT-4 to generate instruction-following data for LLM fine-tuning.

Data Release

  • alpaca_gpt4_data.json contains 52K instruction-following examples generated by GPT-4 with the Alpaca prompts. This JSON file has the same format as the Alpaca data, except that the output is generated by GPT-4:

    • instruction: str, describes the task the model should perform. Each of the 52K instructions is unique.
    • input: str, optional context or input for the task.
    • output: str, the answer to the instruction as generated by GPT-4.
  • alpaca_gpt4_data_zh.json contains 52K instruction-following examples generated by GPT-4 with Alpaca prompts translated into Chinese by ChatGPT. This JSON file has the same format.

  • comparison_data.json contains ranked responses from three models (GPT-4, GPT-3.5 and OPT-IML), obtained by asking GPT-4 to rate their quality.

    • user_input: str, the prompt used for querying the LLMs.
    • completion_a: str, a model completion which is ranked higher than completion_b.
    • completion_b: str, a different model completion which has a lower quality score.
  • unnatural_instruction_gpt4_data.json contains 9K instruction-following examples generated by GPT-4 with prompts from Unnatural Instructions. This JSON file has the same format as the Alpaca data.
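The field lists above can be checked mechanically before training. The following is a minimal sketch, assuming each file is a JSON array of objects with the fields described above; the helper names (`load_json`, `check_records`, `to_preference_pairs`) are hypothetical and not part of this repo.

```python
import json
from typing import Iterable, List, Set, Tuple

# Field names taken from the data description above.
ALPACA_KEYS = {"instruction", "input", "output"}
COMPARISON_KEYS = {"user_input", "completion_a", "completion_b"}


def load_json(path: str) -> list:
    """Read a JSON file that is assumed to hold a list of records."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def check_records(records: Iterable[dict], required: Set[str]) -> List[dict]:
    """Return the records as a list, raising if one lacks a required field."""
    checked = []
    for i, rec in enumerate(records):
        missing = required - rec.keys()
        if missing:
            raise ValueError(f"record {i} is missing fields: {sorted(missing)}")
        checked.append(rec)
    return checked


def to_preference_pairs(records: Iterable[dict]) -> List[Tuple[str, str, str]]:
    """Map comparison records to (prompt, preferred, less preferred) tuples.

    completion_a is documented as the higher-ranked completion, so it is
    placed in the preferred slot.
    """
    return [(r["user_input"], r["completion_a"], r["completion_b"]) for r in records]
```

For example, `check_records(load_json("data/alpaca_gpt4_data.json"), ALPACA_KEYS)` would validate the English data, assuming the file sits under `data/` as in the training command in this README.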

How Good is the Data

Human evaluation was performed on model generation results using Amazon Mechanical Turk, following the Helpfulness, Honesty and Harmlessness criteria proposed by Anthropic. The results are summarized as follows:

  • Two instruction-tuned LLaMA models were compared, fine-tuned on data generated by GPT-4 and GPT-3 respectively.
  • LLaMA-GPT-4 performs substantially better than LLaMA-GPT-3 in the "Helpfulness" criterion.
  • LLaMA-GPT-4 performs similarly to the original GPT-4 in all three criteria, suggesting a promising direction for developing state-of-the-art instruction-following LLMs.

[Figures: LLaMA-GPT4 vs Alpaca (i.e., LLaMA-GPT3); LLaMA-GPT4 vs GPT-4]

Fine-tuning with the data

We follow the same recipe as Alpaca to fine-tune LLaMA, using standard Hugging Face training code.
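Part of that recipe is turning each record into a prompt string. The sketch below uses the prompt templates from the Stanford Alpaca repo as we recall them; treat the exact wording as an assumption and check it against the Alpaca training code before relying on it. The function name `build_prompt` is hypothetical.

```python
# Templates modelled on the Stanford Alpaca repo (assumed wording).
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)


def build_prompt(record: dict) -> str:
    """Format one record, using the no-input template when input is empty."""
    template = PROMPT_WITH_INPUT if record.get("input") else PROMPT_NO_INPUT
    return template.format_map(record)


if __name__ == "__main__":
    example = {"instruction": "Translate to French.", "input": "Good morning", "output": "Bonjour"}
    print(build_prompt(example))
```

During supervised fine-tuning, the `output` field would be appended after `### Response:` as the training target.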

To reproduce our results with LLaMA 7B, first set up the Alpaca repo and then run the following commands:

## cmd we used to train LLaMA on 16*V100
torchrun --nproc_per_node=16 \
--master_port=12345 train.py \
--model_name_or_path PATH/TO/LLaMA \
--data_path ./data/alpaca_gpt4_data.json \
--output_dir PATH/TO/SAVE \
--num_train_epochs 3 \
--per_device_train_batch_size 1 \
--per_device_eval_batch_size 1 \
--gradient_accumulation_steps 4 \
--evaluation_strategy "no" \
--save_strategy "steps" \
--save_steps 200 \
--save_total_limit 1 \
--learning_rate 2e-5 \
--weight_decay 0. \
--warmup_ratio 0.03 \
--lr_scheduler_type "cosine" \
--logging_steps 1 \
--deepspeed configs/ds_config.json
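When running on a different number of GPUs, it helps to keep the effective batch size in mind. The flags above imply 16 GPUs × 1 example per device × 4 accumulation steps = 64 examples per optimizer step. The sketch below reproduces that arithmetic; the dataset size of 52,000 is a round-number assumption based on the "52K" figure, and the step counts are estimates, since the Hugging Face Trainer may round slightly differently.

```python
import math


def effective_batch_size(num_gpus: int, per_device_batch: int, grad_accum_steps: int) -> int:
    """Examples consumed per optimizer step across all devices."""
    return num_gpus * per_device_batch * grad_accum_steps


def approx_optimizer_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Rough number of optimizer steps for the whole run."""
    return math.ceil(num_examples / batch_size) * epochs


# Values from the command above; 52_000 is an assumed dataset size.
batch = effective_batch_size(num_gpus=16, per_device_batch=1, grad_accum_steps=4)
steps = approx_optimizer_steps(num_examples=52_000, batch_size=batch, epochs=3)
warmup = math.ceil(steps * 0.03)  # --warmup_ratio 0.03

print(f"effective batch size: {batch}")
print(f"approx. optimizer steps: {steps}, of which warmup: {warmup}")
```

To keep the same effective batch size on, say, 8 GPUs, one option is to double `--gradient_accumulation_steps` to 8.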

To evaluate the results, we highly recommend that users refer to Vicuna, which provides excellent serving scripts and evaluation pipelines.

Collect results and reproduce figure plots

The results can be plotted using the included IPython notebook plots/main_plots.ipynb. Start the IPython Notebook server:

$ cd plots
$ ipython notebook

Select the main_plots.ipynb notebook and execute the included code. Without modification, the notebook already contains a copy of our extracted results, and the script will output the figures from the paper. Some related data for the plots is provided in data, and the generated plots are saved in plots/output. If you have run your own training and wish to plot the results, you will need to organize them in the same format.

Shortcut: to skip all the work and just see the results, take a look at this notebook with cached plots.

Citation

@article{peng2023instruction,
  title={Instruction Tuning with GPT-4},
  author={Peng, Baolin and Li, Chunyuan and He, Pengcheng and Galley, Michel and Gao, Jianfeng},
  journal={arXiv preprint arXiv:2304.03277},
  year={2023}
}

Related Projects

Acknowledgement

This repo benefits from LLaMA, Alpaca, and Vicuna. Thanks for their wonderful work.
