mshumer / gpt-llm-trainer Goto Github PK
View Code? Open in Web Editor NEWLicense: MIT License
License: MIT License
`import torch
tokenizer_custom = GPT2Tokenizer.from_pretrained("gpt2")
model_custom = GPT2LMHeadModel.from_pretrained('gpt2')
generated_custom = tokenizer_custom.encode("The Manhattan bridge")
context_custom = torch.tensor([generated_custom])
past_custom = None
for j in range(100):
print(j)
output_custom, past_custom = model_custom(context_custom, past=past_custom)
token_custom = torch.argmax(output_custom[..., -1, :])
generated_custom += [token_custom.tolist()]
context_custom = token_custom.unsqueeze(0)
sequence_custom = tokenizer_custom.decode(generated_custom)
print(sequence_custom)`
Please help me correct this
'''
KeyError Traceback (most recent call last)
in <cell line: 45>()
45 for i in range(number_of_examples):
46 print(f'Generating example {i}')
---> 47 example = generate_example(prompt, prev_examples, temperature)
48 print(example)
49 prev_examples.append(example)
in generate_example(prompt, prev_examples, temperature)
39 print(response.json())
40
---> 41 return '' + response.json()['content'][0]['text'].split('')[1]
42
43 # Generate examples
KeyError: 'content'
'''
Hello,
The concept of 'LLM Knowledge Distillation', intrinsic to gpt-llm-trainer, isn't explicitly highlighted in the repository. Adding it to the Readme or the Github Topic Tags could introduce multiple benefits. It could help users understand the core mechanism employed and enhance the discoverability of this repository for those seeking similar solutions.
Thanks for considering my suggestion!
Can we modify the code to use gpt 3.5 instead of gpt 4, most people don't have access, and to make it a levelled field we may use double the examples?
I have Colab+ and avalable to me are
v100, t4, tpu
Update: TPU doesn't work using the off-the-shelf version of the script as it assumes an NVIDIA GPU.
which should I use/which will be fastest?
OutOfMemoryError Traceback (most recent call last)
in <cell line: 8>()
6
7 # Reload model in FP16 and merge it with LoRA weights
----> 8 base_model = AutoModelForCausalLM.from_pretrained(
9 model_name,
10 low_cpu_mem_usage=True,
4 frames
/usr/local/lib/python3.10/dist-packages/accelerate/utils/modeling.py in set_module_tensor_to_device(module, tensor_name, device, value, dtype, fp16_statistics)
296 module._parameters[tensor_name] = param_cls(new_value, requires_grad=old_value.requires_grad)
297 elif isinstance(value, torch.Tensor):
--> 298 new_value = value.to(device)
299 else:
300 new_value = torch.tensor(value, device=device)
OutOfMemoryError: CUDA out of memory. Tried to allocate 314.00 MiB (GPU 0; 15.77 GiB total capacity; 14.32 GiB already allocated; 2.12 MiB free; 14.45 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
APIRemovedInV1:
You tried to access openai.ChatCompletion, but this is no longer supported in openai>=1.0.0 - see the README at https://github.com/openai/openai-python for the API.
You can run openai migrate
to automatically upgrade your codebase to use the 1.0.0 interface.
Alternatively, you can pin your installation to the old version, e.g. pip install openai==0.28
A detailed migration guide is available here: openai/openai-python#742
APIRemovedInV1Proxy: def call(*_args: Any, **_kwargs: Any) -> Any
openai.lib._old_api.APIRemovedInV1Proxy instance
First, Matt, congratulations on the success thus far of your projects. I am sure you will soon have Sam Altman unable to sleep :-). In the readme, you said:
It'll take some time (from 10 minutes to a couple of hours, depending on how many examples you generate), but soon, you'll have your fine-tuned model!
Can you give some ballpark cost estimates / ranges for this? Thx.
im interested in developing a model to generate the data for this pipeline, in fact, ive spent the better part of the last several months working on a very large and very scaled up system to do just exactly this thing.
i was wondering if you had a minute this week to talk and maybe compare notes?
Hello @mshumer . I am trying to run the code on colab and running into CUDA out of memory error as below :
OutOfMemoryError Traceback (most recent call last)
in <cell line: 14>()
12
13 # Reload model in FP16 and merge it with LoRA weights
---> 14 base_model = AutoModelForCausalLM.from_pretrained(
15 model_name,
16 low_cpu_mem_usage=True,
4 frames
/usr/local/lib/python3.10/dist-packages/accelerate/utils/modeling.py in set_module_tensor_to_device(module, tensor_name, device, value, dtype, fp16_statistics)
296 module._parameters[tensor_name] = param_cls(new_value, requires_grad=old_value.requires_grad)
297 elif isinstance(value, torch.Tensor):
--> 298 new_value = value.to(device)
299 else:
300 new_value = torch.tensor(value, device=device)
OutOfMemoryError: CUDA out of memory. Tried to allocate 250.00 MiB (GPU 0; 14.75 GiB total capacity; 13.52 GiB already allocated; 48.81 MiB free; 13.63 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Its happening at "Merge the model and store in Google Drive" step.
I am getting the following message: The model gpt-4
does not exist or you do not have access to it. eventhoguh i upgraded to gpt 4. Please help me out
Hi,
Doesn’t finetuning on Claude’s output violate their terms of use?
Wish you could just upload your own jsonl instead of having to generate them in order to use the script, it's like you have to go step by step even if you want to start with the 'Upload the file to OpenAI' step
NousResearch/llama-2-7b-chat-hf is no longer available and when I got to this stage in the project, there was no model available...and the wholöe process needed to be restart from scratch. What llama 2 model would you recommend in its place. Tried the Bloke uncensored and ran into trouble using that one, and since it takes so long getting everything in place for that stage to fail is quite annoying and expensive on colab. So perhaps someone can help suggest a model that is loadable for this project.
is this possible without openai?
Hello!
the cell after"Load Datasets and Train" throw me a prompt to enter API details for wandb.ai. here is the message I get:
Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.
wandb: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
wandb: You can find your API key in your browser here: https://wandb.ai/authorize
wandb: Paste an API key from your profile and hit enter, or press ctrl+c to quit:
I want to run everything locally, I already replace the codes that goes to openai, but i don't see where this wandb is call and how to avoid it!
Before I'm looking at (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
Can you help with directions on how to not use wandb?
thank you
I was getting the RateLimitReached Error yesterday (after around the 80th generation each prompt is around 10000 tokens). My simple workaround is below, but is there a better way?
def generate_examples(tokenizer, prompt, number_of_examples):
# Generate examples
prev_examples = []
for i in range(number_of_examples):
try:
print(f'Generating example {i}')
prompt_tokens = tokenizer.tokenize(prompt)
prev_examples_tokens = [tokenizer.tokenize(example) for example in prev_examples]
total_tokens = len(prompt_tokens) + sum(len(tokens) for tokens in prev_examples_tokens)
print(f'Tokens in prompt and previous examples: {total_tokens}')
example = generate_example(prompt, prev_examples, temperature)
print(example)
prev_examples.append(example)
# if i % 5 == 0:
# time.sleep(10)
except openai.error.RateLimitError:
print("RATELIMITREACHED: waiting 10 seconds")
time.sleep(10)
here is a local version: https://github.com/xiscoding/local_gpt_llm_trainer
In the first cell I got this error message: InvalidRequestError: The model gpt-4
does not exist or you do not have access to it. Learn more: https://help.openai.com/en/articles/7102672-how-can-i-access-gpt-4.
Has anyone else seen this error or know to how fix it?
How to expand the system to limit the generation of fine-tuning samples based on a given set of corpus documents, rather than blindly fabricating them。
For example, generating fine-tuning samples for disease diagnosis, I hope it is based on the case in the uploaded real diagnosis report
the model in point 1 and point 2 shown below is diff, i've compared their respective generated text.. it's really different.
1.just aft 4bit training->gen = pipeline('text-generation', model=model, tokenizer=tokenizer, max_length=max_length)
2.model = PeftModel.from_pretrained(base_model, new_model)
model = model.merge_and_unload()
gen = pipeline('text-generation', model=model, tokenizer=tokenizer, max_length=max_length)
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.