tomekkorbak / pretraining-with-human-feedback
Code accompanying the paper Pretraining Language Models with Human Preferences
Home Page: https://arxiv.org/abs/2302.08582
License: MIT License
I'm trying to run python train.py --task configs/toxicity/pretrain.yml --method configs/toxicity/mle.yml
and I'm getting the following output, which ends with an error:
setting gradient_accumulation_steps=16 based on effective_batch_size=64 and instantaneous_bsz=80 (world_size=1, n_gpu=10)
setting max_steps=50354 based on num_tokens=3.30e+09 and tokens_already_seen=0.00e+00
Setting train_dataloader.batch_size=80
Setting state.tokens_seen=0.00e+00
Generating samples, scenario unconditional, batch 1 of 8
Generating samples, scenario unconditional, batch 2 of 8
Generating samples, scenario unconditional, batch 3 of 8
Generating samples, scenario unconditional, batch 4 of 8
Generating samples, scenario unconditional, batch 5 of 8
Generating samples, scenario unconditional, batch 6 of 8
Generating samples, scenario unconditional, batch 7 of 8
Generating samples, scenario unconditional, batch 8 of 8
Using pad_token, but it is not set yet.
max_steps is given, it will override any value given in num_train_epochs
Using amp half precision backend
/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/site-packages/transformers/optimization.py:306: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
warnings.warn(
***** Running training *****
Num examples = 64453120
Num Epochs = 9223372036854775807
Instantaneous batch size per device = 8
Total train batch size (w. parallel, distributed & accumulation) = 1280
Gradient Accumulation steps = 16
Total optimization steps = 50354
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Traceback (most recent call last):
File "/lfs/hyperturing1/0/schundi/pretraining-with-human-feedback/train.py", line 153, in <module>
train(args.checkpoint_path, config=config)
File "/lfs/hyperturing1/0/schundi/pretraining-with-human-feedback/train.py", line 129, in train
trainer.train(resume_from_checkpoint=checkpoint_path)
File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/site-packages/transformers/trainer.py", line 1343, in train
self.control = self.callback_handler.on_train_begin(args, self.state, self.control)
File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/site-packages/transformers/trainer_callback.py", line 347, in on_train_begin
return self.call_event("on_train_begin", args, state, control)
File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/site-packages/transformers/trainer_callback.py", line 388, in call_event
result = getattr(callback, event)(
File "/lfs/hyperturing1/0/schundi/pretraining-with-human-feedback/apo/callbacks.py", line 88, in on_train_begin
self.run(args, state, control, model, tokenizer, **kwargs)
File "/lfs/hyperturing1/0/schundi/pretraining-with-human-feedback/apo/callbacks.py", line 171, in run
self.generate_and_score(model, tokenizer, step=state.global_step)
File "/lfs/hyperturing1/0/schundi/pretraining-with-human-feedback/apo/callbacks.py", line 203, in generate_and_score
for name, value in metric.score_texts(texts=samples.continuations).items()
File "/lfs/hyperturing1/0/schundi/pretraining-with-human-feedback/apo/metrics.py", line 80, in score_texts
pool = Pool(os.cpu_count())
File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/multiprocessing/context.py", line 119, in Pool
return Pool(processes, initializer, initargs, maxtasksperchild,
File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/multiprocessing/pool.py", line 212, in __init__
self._repopulate_pool()
File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/multiprocessing/pool.py", line 303, in _repopulate_pool
return self._repopulate_pool_static(self._ctx, self.Process,
File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/multiprocessing/pool.py", line 326, in _repopulate_pool_static
w.start()
File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/multiprocessing/context.py", line 277, in _Popen
return Popen(process_obj)
File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/multiprocessing/popen_fork.py", line 66, in _launch
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
Have you encountered this error before, or have advice to get past it?
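The traceback shows the failure happening when score_texts creates Pool(os.cpu_count()), that is, one forked worker per CPU core. One possible workaround, assuming the machine has many cores and that forking the large training process once per core is what exhausts memory, is to cap the number of workers. The sketch below is not the repository's code: bounded_worker_count, score_in_pool, text_length, and the cap of 8 are hypothetical.

```python
import os
from multiprocessing import Pool


def bounded_worker_count(max_workers=8):
    """Return a worker count capped at max_workers (hypothetical default)."""
    available = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(available, max_workers))


def text_length(text):
    # Stand-in for a real scoring function; defined at module level so that
    # worker processes can pickle it.
    return len(text)


def score_in_pool(texts, max_workers=8):
    """Score texts using a pool with a bounded number of workers."""
    with Pool(bounded_worker_count(max_workers)) as pool:
        return pool.map(text_length, texts)


if __name__ == "__main__":
    print(score_in_pool(["a", "bb", "ccc"], max_workers=2))
```

Fewer workers means fewer forked copies of the parent process, at the cost of slower scoring.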
Approximately how much GPU memory is required to pretrain? We're running on a single GPU but we're receiving the following error, even with batch size 1:
RuntimeError: CUDA out of memory. Tried to allocate 296.00 MiB (GPU 0; 10.76 GiB total capacity; 6.34 GiB already allocated; 206.56 MiB free; 9.41 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Do we just need to move to a GPU with more memory? Or are we doing something wrong?
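The error message itself points to max_split_size_mb. A minimal sketch of setting it from Python follows. The value 128 is an arbitrary assumption, set_cuda_alloc_conf is a hypothetical helper, and the variable should be set before CUDA is initialized. This only addresses fragmentation, so it may not help if the model and activations genuinely do not fit in 10.76 GiB.

```python
import os


def set_cuda_alloc_conf(max_split_size_mb=128):
    """Set PYTORCH_CUDA_ALLOC_CONF; call before the first CUDA allocation.

    The 128 MiB default is an illustrative assumption, not a value
    recommended by the repository.
    """
    value = f"max_split_size_mb:{max_split_size_mb}"
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = value
    return value


set_cuda_alloc_conf(128)
# import torch  # import and use torch only after the variable is set
```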
These two lines throw an error:
AttributeError: 'dict' object has no attribute 'input_ids'
The cause is that inputs is a dict with string keys, but the code accesses input_ids as an attribute.
https://github.com/tomekkorbak/pretraining-with-human-feedback/blob/master/apo/trainer.py#L36-L37
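The mismatch described above can be reproduced without the repository: a plain dict supports key lookup but not attribute access. The helper get_input_ids below is hypothetical and only illustrates one way to accept both dict-style and attribute-style batches.

```python
from types import SimpleNamespace


def get_input_ids(inputs):
    """Return input_ids from a dict or from an attribute-style batch."""
    if isinstance(inputs, dict):
        return inputs["input_ids"]  # plain dicts need key lookup
    return inputs.input_ids  # objects exposing input_ids as an attribute


batch_as_dict = {"input_ids": [[101, 102]], "attention_mask": [[1, 1]]}
batch_as_object = SimpleNamespace(input_ids=[[101, 102]])

try:
    batch_as_dict.input_ids  # reproduces the reported failure
except AttributeError as err:
    print(err)

print(get_input_ids(batch_as_dict))
print(get_input_ids(batch_as_object))
```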
When I run the code via:
python train.py --task configs/toxicity/pretrain.yml --method configs/toxicity/mle.yml
I am getting this error:
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Using pad_token, but it is not set yet.
setting gradient_accumulation_steps=8 based on effective_batch_size=64 and instantaneous_bsz=8 (world_size=1, n_gpu=1)
setting max_steps=50354 based on num_tokens=3.30e+09 and tokens_already_seen=0.00e+00
max_steps is given, it will override any value given in num_train_epochs
Setting train_dataloader.batch_size=8
Using amp half precision backend
/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/optimization.py:306: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
warnings.warn(
***** Running training *****
Num examples = 3222656
Num Epochs = 9223372036854775807
Instantaneous batch size per device = 8
Total train batch size (w. parallel, distributed & accumulation) = 64
Gradient Accumulation steps = 8
Total optimization steps = 50354
Traceback (most recent call last):
File "/data1/debajyoti/test/pre_human_feedback/pretraining-with-human-feedback/train.py", line 153, in <module>
train(args.checkpoint_path, config=config)
File "/data1/debajyoti/test/pre_human_feedback/pretraining-with-human-feedback/train.py", line 129, in train
trainer.train(resume_from_checkpoint=checkpoint_path)
File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/trainer.py", line 1343, in train
self.control = self.callback_handler.on_train_begin(args, self.state, self.control)
File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/trainer_callback.py", line 347, in on_train_begin
return self.call_event("on_train_begin", args, state, control)
File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/trainer_callback.py", line 388, in call_event
result = getattr(callback, event)(
File "/data1/debajyoti/test/pre_human_feedback/pretraining-with-human-feedback/apo/callbacks.py", line 135, in on_train_begin
tokens_already_seen = kwargs.get('train_dataloader').dataset.datapipe.skip_tokens
File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/torch/utils/data/datapipes/datapipe.py", line 129, in __getattr__
raise AttributeError(f"'{self.__class__.__name__}' object has no attribute '{attribute_name}")
AttributeError: '_IterDataPipeSerializationWrapper' object has no attribute 'datapipe
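The traceback shows the dataloader's dataset being an _IterDataPipeSerializationWrapper with no datapipe attribute, which suggests that the installed torch version wraps the datapipe differently from what apo/callbacks.py expects. A defensive lookup could look like the sketch below. The attribute name _datapipe is an assumption about the wrapper's internals and should be checked against the installed torch version; find_skip_tokens and the stand-in classes are hypothetical.

```python
def find_skip_tokens(dataset, candidate_attrs=("datapipe", "_datapipe")):
    """Walk through wrapper objects until one exposes skip_tokens.

    candidate_attrs lists attribute names a wrapper might use to hold the
    wrapped datapipe; "_datapipe" is an assumption, not a documented API.
    """
    current = dataset
    seen = set()
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        if hasattr(current, "skip_tokens"):
            return current.skip_tokens
        for name in candidate_attrs:
            inner = getattr(current, name, None)
            if inner is not None:
                current = inner
                break
        else:
            current = None  # no known wrapper attribute found
    return None


class FakePipe:
    """Stand-in for the repository's datapipe."""
    skip_tokens = 1024


class FakeWrapper:
    """Stand-in for a serialization wrapper around a datapipe."""
    def __init__(self, pipe):
        self._datapipe = pipe


print(find_skip_tokens(FakeWrapper(FakePipe())))  # 1024
```

Pinning torch to the version the repository was developed against may be the simpler fix, if that version is known.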
Hi,
Thanks for the interesting work. Are you planning to release the pretrained model checkpoints? They will be very helpful. Thank you.
In configs/toxicity/conditional.yml, we have the lines
dataset:
  conditional_training_config:
    threshold: 0.00056
    aligned_prefix: "<|aligned|>"
    misaligned_prefix: "<|misaligned|>"
    drop_token_fraction: 0.01
Why is the toxicity threshold here 0.00056? This is incredibly low. Only sentences with toxicity scores lower than 0.00056 would be marked as non-toxic. Everything greater than or equal to that would be marked as toxic.
Don't we only want documents to be marked as toxic when their toxicity is, let's say, 0.9 or greater? (I chose 0.9 arbitrarily as an example). Generally speaking, 0.00056 seems to be quite a low threshold and I'm worried that this might hurt performance.
Can you explain the thought process that went into making the toxicity threshold 0.00056? Is this simply what got the best results?
Thanks!
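For reference, the thresholding behaviour described in the question can be sketched as follows. This illustrates the question's reading of the config (scores below the threshold get the aligned prefix, everything else gets the misaligned one) and is not the repository's implementation; add_control_prefix is a hypothetical name.

```python
THRESHOLD = 0.00056
ALIGNED_PREFIX = "<|aligned|>"
MISALIGNED_PREFIX = "<|misaligned|>"


def add_control_prefix(text, score, threshold=THRESHOLD):
    """Prepend a control token chosen by comparing score to threshold."""
    prefix = ALIGNED_PREFIX if score < threshold else MISALIGNED_PREFIX
    return prefix + text


print(add_control_prefix("Have a nice day.", score=0.0001))
print(add_control_prefix("Some borderline text.", score=0.01))
```

Under this reading the threshold controls what fraction of the data receives each prefix. Whether 0.00056 corresponds to a particular percentile of the score distribution is not stated in the config.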
I'm getting a weird error when using the dataloader's default of shuffle=True. Can you please help me debug why this is occurring?
Traceback (most recent call last):
File "/lfs/hyperturing2/0/rschaef/KoyejoLab-Pretrain-Human-Feedback/train.py", line 163, in <module>
train(args.checkpoint_path, config=config)
File "/lfs/hyperturing2/0/rschaef/KoyejoLab-Pretrain-Human-Feedback/train.py", line 139, in train
trainer.train(resume_from_checkpoint=checkpoint_path)
File "/lfs/hyperturing2/0/rschaef/miniconda3/envs/pretrain_hf/lib/python3.9/site-packages/transformers/trainer.py", line 1196, in train
train_dataloader = self.get_train_dataloader()
File "/lfs/hyperturing2/0/rschaef/KoyejoLab-Pretrain-Human-Feedback/apo/trainer.py", line 118, in get_train_dataloader
return DataLoader(
File "/lfs/hyperturing2/0/rschaef/miniconda3/envs/pretrain_hf/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 228, in __init__
python-BaseException
raise ValueError(
ValueError: DataLoader with IterableDataset: expected unspecified shuffle option, but got shuffle=True
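The ValueError indicates that DataLoader rejects shuffle=True when the dataset is an IterableDataset, since a stream has no indices to permute. A common alternative is to leave shuffle unset and shuffle approximately inside the stream with a fixed-size buffer. The pure-Python sketch below illustrates that idea; shuffle_buffer, the buffer size, and the seed are hypothetical and not taken from the repository.

```python
import random


def shuffle_buffer(stream, buffer_size=4, seed=0):
    """Yield items from stream in approximately shuffled order."""
    rng = random.Random(seed)
    buffer = []
    for item in stream:
        if len(buffer) < buffer_size:
            buffer.append(item)
            continue
        # Emit a random buffered item and put the new item in its place.
        index = rng.randrange(buffer_size)
        yield buffer[index]
        buffer[index] = item
    # Stream exhausted: flush what is left in random order.
    rng.shuffle(buffer)
    yield from buffer


shuffled = list(shuffle_buffer(range(10)))
print(shuffled)
```

A larger buffer gives a better approximation of a full shuffle at the cost of more memory.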
Thank you for your interesting work and code. However, I cannot find the training dataset and therefore cannot run the code. Could you please share the training dataset used in your paper?