georgian-io / llm-finetuning-toolkit
Toolkit for fine-tuning, ablating and unit-testing open-source LLMs.
License: Apache License 2.0
llmtune run --verbose
llmtune run -v
See #115 (comment)
There is a bunch of unused code relating to accelerate. It should be removed to keep the code cleaner.
If I have, say, train_size=0.8 and test_size=0.1, maybe we can get calc_val_split = 1 - 0.1 - 0.8 = 0.1 and use that as the validation split. Maybe also apply something like max(calc_val_split, 0.05) to prevent the val split from getting too small.
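A minimal sketch of that heuristic (the function name is hypothetical; the values mirror the example above):

```python
# Sketch of the proposed validation-split heuristic; max() acts as a floor so
# the validation split never collapses below 5%.
def calc_val_split(train_size: float, test_size: float, floor: float = 0.05) -> float:
    remainder = 1.0 - test_size - train_size  # e.g. 1 - 0.1 - 0.8 = 0.1
    return max(remainder, floor)

print(calc_val_split(0.8, 0.1))  # 0.1
```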
I tried following the same framework used for training the other LLMs (Falcon, Mistral, etc.) with SFTTrainer to train the FlanT5 model as well.
But the results are bad, as if the LLM doesn't learn anything.
Training it with the Seq2Seq method works. Why did you use this method for FlanT5 and SFTTrainer for all the other LLMs?
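One plausible explanation (my assumption, not confirmed by the maintainers): FlanT5 is an encoder-decoder model, while SFTTrainer is built around decoder-only causal LMs such as Falcon and Mistral, so the seq2seq path is the natural fit. A minimal sketch of that route:

```python
# Sketch of seq2seq fine-tuning for an encoder-decoder model such as FlanT5;
# the dataset is omitted and would need "input_ids" and "labels" columns.
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="out"),
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    # train_dataset=tokenized_train,  # user-supplied tokenized dataset
)
```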
This improves the user experience when the desired output is JSON.
E.g.
prompt_stub: >-
  {
    "foo": {col_1},
    "bar": {col_2}
  }
The current approach would not work as we're capturing everything inside {}
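One possible fix (a sketch, not the toolkit's actual implementation): only treat {name} as a placeholder when name is a known dataset column, so literal JSON braces survive. Column names here are illustrative.

```python
import re

def render(template: str, row: dict) -> str:
    # Substitute {col} only for real columns; leave all other braces untouched.
    return re.sub(
        r"\{(\w+)\}",
        lambda m: str(row[m.group(1)]) if m.group(1) in row else m.group(0),
        template,
    )

template = '{\n  "foo": {col_1},\n  "bar": {col_2}\n}'
print(render(template, {"col_1": "a", "col_2": "b"}))
```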
We should cover most of our functionality with unit tests and report the coverage back to the README.
The command llmtune inference [experiment_dir] aims to provide a versatile interface for running inference on pre-trained language models, allowing users to:

llmtune inference [experiment_dir] [options]

experiment_dir: The experiment directory from fine-tuning experiments
--dataset [dataset_path]: Path to a dataset (e.g., CSV, JSON, or Hugging Face)
--text-input [text]: An arbitrary text input to run inference on. This option can be used for a single text input or for quick manual inference.
--column [name=value]: Allows specification of a column name and value for custom inputs. This option can be used multiple times to specify different column values.
Inference on a dataset:
llmtune inference ./my_experiment --dataset ./data/my_dataset.csv
Inference on arbitrary text:
llmtune inference ./my_experiment --text-input "This is an example text input for inference."
Inference with specific input values:
llmtune inference ./my_experiment --column column_1="foo" --column column_2="bar"
Related to: #160
The basic "quickstart" example downloads Mistral-7B-Instruct-v0.2, which is ~15GB and took me over 20 minutes to download. A smaller model should be used as a quickstart example.
To Reproduce
Steps to reproduce the behavior:
The basic version of the quickstart should be, in my opinion, a 10 minute (max) process and not require so much disk space.
Environment:
Describe the bug
At dataset creation, the generated dataset will always use the cached version, despite changes in the file.
To Reproduce
Modify the dataset file and re-run toolkit.py: toolkit.py will not create a new dataset with the desired changes.
Expected behavior
Environment:
Ubuntu
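A possible workaround, assuming the toolkit loads data via Hugging Face datasets (the file path is hypothetical): force a fresh build instead of reusing the cache.

```python
from datasets import load_dataset

data = load_dataset(
    "csv",
    data_files="data/my_dataset.csv",   # hypothetical path
    download_mode="force_redownload",   # bypass the cached version
)
```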
Change this file to include JSONL and pass the correct parameter as mentioned above. https://github.com/georgian-io/LLM-Finetuning-Toolkit/blob/main/llmtune/data/ingestor.py
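A hedged sketch of what a JSONL ingestor could look like, modeled loosely on llmtune/data/ingestor.py; the class and method names here are assumptions, not the repo's actual API.

```python
from datasets import Dataset, load_dataset

class JsonlIngestor:
    def __init__(self, path: str):
        self.path = path

    def to_dataset(self) -> Dataset:
        # The `datasets` "json" loader reads JSON Lines files natively.
        return load_dataset("json", data_files=self.path, split="train")
```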
I tried to go through the README file as mentioned, and once I execute llama2_baseline_inference.py I am thrown the error:
ImportError: Using `load_in_8bit=True` requires Accelerate: `pip install accelerate` and the latest version of bitsandbytes: `pip install -i https://test.pypi.org/simple/ bitsandbytes` or `pip install bitsandbytes`
even though the packages are installed in my environment. I was able to circumvent this problem by upgrading my datasets library using pip install -U datasets, and then I received another error, as given in this link.
To avoid this issue, I downgraded my transformers library to 4.3, and currently I am unable to download some of the checkpoints. I feel the package versions need to be revamped to the latest releases.
Ran:
llmtune generate config
llmtune run ./config.yml
Things worked well (once I fixed my mistake with Mistral/huggingface repo permissions). The job ran very fast and put results into the "experiment" directory. But the experiment/XXX/results/ directory only has a "results.csv" file in it. I expected there to be results from the qa/llm_tests section in the config.yml file, which looks like this:
qa:
  llm_tests:
    - jaccard_similarity
    - dot_product
    - rouge_score
    - word_overlap
    - verb_percent
    - adjective_percent
    - noun_percent
    - summary_length
Do I have to do something extra to get the qa to run?
Move config.yml out of the source repo. We can write a simple script to download the file and output it to the user's current working directory.

It would be nice if this project had an official Docker image and a docker-compose example - that would make trying it out easier for a lot of folks.
Is your feature request related to a problem? Please describe.
I'm working on a problem that requires me to split my data in a specific way (based on dates). Right now the config only allows a single dataset to be provided, and it internally does a train-test split based on the values provided for the test_size and train_size parameters.
Describe the solution you'd like
Ideally, an option to specify paths to both train and test data.
Describe alternatives you've considered
The alternative would be to add in support for other types of data splitting which I don't think makes sense for this repo to include.
Additional context
None
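For reference, a workaround sketch using the Hugging Face datasets API directly (file paths are hypothetical); a config option for this feature could map onto the same idea:

```python
# Load pre-split train/test files instead of letting the toolkit split one
# dataset; the "csv" loader accepts a dict of named splits.
from datasets import load_dataset

data = load_dataset(
    "csv",
    data_files={"train": "data/train.csv", "test": "data/test.csv"},
)
print(data["train"].num_rows, data["test"].num_rows)
```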
Is your feature request related to a problem? Please describe.
prompt and prompt_stub
Describe the solution you'd like
It would be nice to be able to specify where the config file is.
Ensure all releases, style checks, and unit tests can be run via CI, blocking any PRs that fail CI.
For Docker packages, use: https://github.com/orgs/georgian-io/packages
For PyPI packages, use: https://pypi.org/
Hello
I ran the falcon classification task using the following command:
!python falcon_classification.py --lora_r 64 --epochs 1 --dropout 0.1 # finetune Falcon-7B on newsgroup classification dataset
Upon inspecting the model, I find that many of the layers are full rank, not the lower rank I specified.
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("experiments/classification-sampleFraction-0.99_epochs-1_rank-64_dropout-0.1/assets")
Here is a screenshot showing this.
Is this expected behavior?
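A quick way to check (a sketch; it assumes the adapter was saved with PEFT and that the base model is Falcon-7B):

```python
# Inspect whether LoRA adapter matrices carry the expected low rank; lora_A /
# lora_B parameters should each have one dimension equal to the rank (64).
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("tiiuae/falcon-7b")
model = PeftModel.from_pretrained(
    base,
    "experiments/classification-sampleFraction-0.99_epochs-1_rank-64_dropout-0.1/assets",
)
for name, param in model.named_parameters():
    if "lora_" in name:
        print(name, tuple(param.shape))
```

Note that if the toolkit merges the adapter into the base weights before saving, the saved tensors will look full rank, which could explain what the screenshot shows.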
After installation, run:
llmtune generate config
==> works fine
llmtune run ./config.yml
==> get this error
OSError: You are trying to access a gated repo.
Make sure to request access at https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2 and pass a token having permission to this repo either by logging in with huggingface-cli login or by passing token=<your_token>.
So then I do:
huggingface-cli login
===> login successfully
llmtune run ./config.yml
==> get same error
Any ideas?
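One thing worth checking (an assumption on my part, not toolkit-specific advice): huggingface-cli login only helps once your access request on the model page has actually been approved. You can also pass a token to the process explicitly:

```python
# Authenticate programmatically with huggingface_hub; the token string is a
# placeholder. Setting the HF_TOKEN environment variable works as well.
from huggingface_hub import login

login(token="hf_...")  # replace with a token that has access to the repo
```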
Describe the bug
I'm trying to run this toolkit in a Colab notebook with a T4 GPU and ran into errors. To get it working, I needed to set bf16 and tf32 to false, and fp16 to true. There's already a note for bf16 and fp16; maybe we can add a note for tf32 as well.
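For reference, a sketch of the flags that worked on the T4 (names follow transformers.TrainingArguments; the toolkit's config keys presumably map onto these):

```python
# T4 GPUs predate Ampere, so they support neither bfloat16 nor TF32;
# float16 mixed precision is the workable option.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    bf16=False,
    tf32=False,
    fp16=True,
)
```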
How are latency and throughput being measured for Llama 2 7B model inference benchmarking using TGI? Reference
There is no good, easy-to-start, end-to-end distributed training example on the web. Plus, there are so many ways of doing this: via raw PyTorch, via Ray Train, via TorchX, via Accelerate, or via DeepSpeed.
How could I do this with the toolkit?
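To make the question concrete, here is a minimal sketch of just one of those options (Hugging Face Accelerate, with a toy model and data); it is illustrative only, not a toolkit feature:

```python
# Launch with: accelerate launch this_script.py
import torch
from accelerate import Accelerator
from torch.utils.data import DataLoader, TensorDataset

accelerator = Accelerator()
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loader = DataLoader(
    TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,))),
    batch_size=8,
)
# prepare() wraps everything for the current distributed configuration.
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

for x, y in loader:
    loss = torch.nn.functional.cross_entropy(model(x), y)
    accelerator.backward(loss)
    optimizer.step()
    optimizer.zero_grad()
```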
Make sure we have:
- a Docker image for the toolkit
- a PyPI package for the toolkit
Are we supporting the flash_attention feature? https://github.com/Dao-AILab/flash-attention/tree/main
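If this refers to loading models with FlashAttention enabled, recent transformers versions expose it at load time (a sketch; requires the flash-attn package and a supported GPU, and the model name is just an example):

```python
from transformers import AutoModelForCausalLM

# attn_implementation="flash_attention_2" asks transformers to use the
# flash-attn kernels instead of the default attention implementation.
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    attn_implementation="flash_attention_2",
)
```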
It's much better to publish documentation on a dedicated static hosting solution.
https://docs.github.com/en/pages/getting-started-with-github-pages or https://medium.com/swlh/publish-a-static-website-in-a-day-with-mkdocs-and-netlify-3cc076d0efaf
I have some examples of historical chat logs. How should I ideally incorporate these into the input?
Is your feature request related to a problem? Please describe.
Describe the solution you'd like
Describe alternatives you've considered
Running black from time to time on the whole repo, but that is not the best solution for collaboration.

Ensure that we include a Makefile containing all the necessary development commands, such as how to run tests, perform releases, and execute style checks, among others.
For a great example, see the Makefile at: https://github.com/huggingface/transformers/blob/main/Makefile
Is it possible to provide a config file that shows how to run inference on an already fine-tuned model?
I have run the starter config, and it looks like the final PEFT model weights are in experiment/XXX/weights/.
So how do I re-run inference only (and possibly qa checks) on that model?
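In the meantime, a sketch of running generation directly against the saved weights (assuming they were saved as a PEFT adapter with the tokenizer alongside; the path mirrors the one above):

```python
# AutoPeftModelForCausalLM loads the base model plus the LoRA adapter in one
# call, so no separate base-model load is needed.
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained("experiment/XXX/weights/")
tokenizer = AutoTokenizer.from_pretrained("experiment/XXX/weights/")

inputs = tokenizer("Example prompt", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```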