OFA-Sys / gsm8k-ScRel

Code and data for "Scaling Relationship on Learning Mathematical Reasoning with Large Language Models"
Paper: https://arxiv.org/abs/2308.01825
Do you have any plans to release the training data?
It seems nobody has tried your 13b2-u13b version yet, and I may be the first. I got 'RuntimeError: mat1 and mat2 shapes cannot be multiplied (111x5120 and 1x2560)' during inference, while the 7b version works fine.
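For anyone hitting the same error: 5120 is the LLaMA-13B hidden size and 2560 is exactly half of it, so the mismatch could point to a half-merged or tensor-parallel-split checkpoint. A minimal diagnostic sketch, assuming a locally downloaded checkpoint (both paths below are hypothetical placeholders):

import torch
from transformers import AutoConfig

# Hypothetical paths; substitute the real checkpoint directory and shard file.
config = AutoConfig.from_pretrained("path/to/13b2-u13b")
state = torch.load("path/to/13b2-u13b/pytorch_model-00001-of-00003.bin",
                   map_location="cpu")

# Flag any 2-D weight whose shape does not involve the expected hidden
# size (5120 for 13B); a 2560 dimension would match the mat1/mat2 error.
for name, tensor in state.items():
    if tensor.ndim == 2 and config.hidden_size not in tensor.shape:
        print(name, tuple(tensor.shape))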
Hi there, is there any chance you could share your RFT-7B model with us?
Hi, I want to reproduce the results of the RFT model for LLaMA 13B. Do you have any plans for that?
Thanks for this great work. I have two questions. First, the generation code for 7B/13B seems to be missing. Second, about the specific hyperparameter settings: the default hyperparameters in single_inference_30b.py are not suitable for generating diverse reasoning paths (see the sampling sketch after this issue).
Thank you for your help!
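Regarding the second question, a minimal sketch of sampling multiple reasoning paths with standard Hugging Face generation arguments; the temperature/top_p values are illustrative assumptions, not the repo's settings:

from transformers import AutoModelForCausalLM, LlamaTokenizer

model_path = "OFA-Sys/gsm8k-rft-llama7b-u13b"  # one of the released checkpoints
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

inputs = tokenizer("Question: ...\nAnswer:", return_tensors="pt").to(model.device)

# do_sample=True with a nonzero temperature is what yields *different*
# reasoning paths across the returned sequences; greedy decoding would
# return the same path every time.
outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.7,        # illustrative value
    top_p=0.9,              # illustrative value
    num_return_sequences=8, # number of paths per question
    max_new_tokens=512,
)
paths = tokenizer.batch_decode(outputs, skip_special_tokens=True)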
Hi,
I am trying to run the script to fine-tune the 70B model, and I am getting this error:
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 2699234) of binary:
Any idea what could be the issue?
I am able to train 13B models, and I followed all the dependency versions mentioned in past issues.
Thanks.
Hi, after completing SFT and multi-path reasoning, I have some doubts about the data under the data/rft path in your GitHub code base. How were these data generated? I see that four datasets are produced by the "filter reasoning paths" step; were the files under data/rft created from those four datasets?
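For context, a rough sketch of the rejection-sampling filter the paper describes: keep sampled paths whose final answer matches the ground truth, then deduplicate by the calculator equations they contain. The helper names are hypothetical and the repo's exact dedup criterion may differ:

import re

def extract_answer(text):
    # Assumes a "The answer is N" convention; adjust to the repo's format.
    m = re.search(r"The answer is\s*(-?[\d,.]+)", text)
    return m.group(1).replace(",", "") if m else None

def filter_paths(gold_answer, sampled_paths):
    # Rejection sampling: keep only correct paths, deduplicated by the
    # set of <<...>> calculator equations as a proxy for distinct reasoning.
    kept, seen = [], set()
    for path in sampled_paths:
        if extract_answer(path) != gold_answer:
            continue
        key = tuple(sorted(set(re.findall(r"<<[^>]*>>", path))))
        if key in seen:
            continue
        seen.add(key)
        kept.append(path)
    return kept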
When both of these values are False, shouldn't the model's output be identical on every run?
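Assuming the two values in question are sampling switches like do_sample (an assumption; the issue does not name them): with sampling disabled, generate falls back to greedy decoding, which is deterministic for fixed weights and inputs. Reusing the model and inputs from the sampling sketch above:

# Greedy decoding: repeated calls return identical tokens, so any
# run-to-run variation must come from elsewhere (e.g. non-deterministic
# CUDA kernels or a still-enabled sampling flag).
out1 = model.generate(**inputs, do_sample=False, max_new_tokens=256)
out2 = model.generate(**inputs, do_sample=False, max_new_tokens=256)
assert (out1 == out2).all()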
Hi, thank you for your excellent work!
I would like to know if augmented datasets like MuggleMath or RFT are suitable for pre-training?
Does SFT include an instruction-tuning process?
Could you please directly release the RFT datasets that contain the various reasoning paths?
This issue is closely related to #9 and #8. However, after taking their insights into consideration, I still only achieve scores of 24.86 (llama-7b) and 26.99 (llama2-7b) when training on the GSM8K training set (7.4K examples, 3 epochs), versus the 41.6% reported in the paper. Here are the specifics:
Environment:
Hardware: 2 X A100 80G GPUs
Software: transformers==4.29.2
Training configuration:
CUDA_VISIBLE_DEVICES=0,1 python3 -m torch.distributed.launch --master_addr ${MASTER_ADDR} \
    --master_port ${MASTER_PORT} --nproc_per_node=2 --use_env train.py \
    --model_name_or_path $MODEL_PATH \
    --data_path $2 \
    --bf16 True \
    --output_dir $SAVE_PATH \
    --num_train_epochs 1 \
    --per_device_train_batch_size 32 \
    --per_device_eval_batch_size 32 \
    --gradient_accumulation_steps 16 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 50000 \
    --save_total_limit 1 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --fsdp "full_shard auto_wrap" \
    --fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
    --tf32 True \
    --gradient_checkpointing True
For both training and testing, I used the tokenizer from huggyllama/llama-7b. No significant issues were detected during training. However, I suspect some underlying difference in environment or methodology may be causing this performance gap.
I would appreciate any insights or suggestions to help bridge this discrepancy and achieve the expected performance.
Thanks
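One factor worth checking (an observation from the numbers above, not an official answer): with 2 GPUs, a per-device batch of 32, and 16 accumulation steps, the effective batch size is 1024, which leaves very few optimizer updates over the ~7.4K-example GSM8K training set:

# Effective batch = per-device batch * grad accumulation * number of GPUs.
num_examples = 7473  # GSM8K training-set size
for per_device, accum, gpus in [(32, 16, 2), (32, 2, 8)]:  # reported setup vs. a hypothetical 8-GPU one
    eff = per_device * accum * gpus
    print(f"effective batch {eff}: ~{num_examples // eff} updates per epoch")
# effective batch 1024: ~7 updates per epoch
# effective batch 512: ~14 updates per epoch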
Summary:
A score of 49+ can be reproduced. Two things to note: 1. use LlamaTokenizer; 2. the pad_token behaves incorrectly, and its interference must be excluded.
https://github.com/Haskely/gsm8k-rft-llama7b-u13b_evaluation/tree/main
Using
from transformers import AutoTokenizer
model_path = "OFA-Sys/gsm8k-rft-llama7b-u13b"
tokenizer = AutoTokenizer.from_pretrained(model_path)
raises the following error:
File "/data/public/zhangzixin/conda_envs/nova/lib/python3.11/site-packages/transformers/tokenization_utils_fast.py", line 250, in convert_tokens_to_ids
return self._convert_token_to_id_with_added_voc(tokens)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/public/zhangzixin/conda_envs/nova/lib/python3.11/site-packages/transformers/tokenization_utils_fast.py", line 257, in _convert_token_to_id_with_added_voc
return self.unk_token_id
^^^^^^^^^^^^^^^^^
File "/data/public/zhangzixin/conda_envs/nova/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1155, in unk_token_id
return self.convert_tokens_to_ids(self.unk_token)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/public/zhangzixin/conda_envs/nova/lib/python3.11/site-packages/transformers/tokenization_utils_fast.py", line 250, in convert_tokens_to_ids
return self._convert_token_to_id_with_added_voc(tokens)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/public/zhangzixin/conda_envs/nova/lib/python3.11/site-packages/transformers/tokenization_utils_fast.py", line 257, in _convert_token_to_id_with_added_voc
return self.unk_token_id
^^^^^^^^^^^^^^^^^
File "/data/public/zhangzixin/conda_envs/nova/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1155, in unk_token_id
return self.convert_tokens_to_ids(self.unk_token)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/public/zhangzixin/conda_envs/nova/lib/python3.11/site-packages/transformers/tokenization_utils_fast.py", line 250, in convert_tokens_to_ids
return self._convert_token_to_id_with_added_voc(tokens)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/public/zhangzixin/conda_envs/nova/lib/python3.11/site-packages/transformers/tokenization_utils_fast.py", line 257, in _convert_token_to_id_with_added_voc
return self.unk_token_id
^^^^^^^^^^^^^^^^^
RecursionError: maximum recursion depth exceeded
But when I check this repository's source code, the loading method is the same:
Line 185 in f4d0176
My transformers version: transformers 4.31.0
P.S. Manually using LlamaTokenizer.from_pretrained(model_path) does not raise an error, so I am scoring with this approach for now.
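In code, the workaround from the summary above, plus a common pad_token guard (the eos fallback is a general convention, not something this repo prescribes):

from transformers import LlamaTokenizer

model_path = "OFA-Sys/gsm8k-rft-llama7b-u13b"
# Loading the slow LlamaTokenizer avoids the unk_token recursion that
# AutoTokenizer hits on this checkpoint.
tokenizer = LlamaTokenizer.from_pretrained(model_path)
# The summary also flags the pad_token as problematic; one common guard
# is falling back to eos so padding does not corrupt generation.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token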
Hello, I'm trying to reproduce your results for two settings with llama2-7b, but I cannot reach scores as high as those reported in the paper.
By the way, while training on 8 NVIDIA A800 80G GPUs, I always got torch.cuda.OutOfMemoryError, so I halved the micro-batch-size-per-gpu and doubled the gradient-accumulation-steps.
Is this because we are using different GPUs/environments?
Could you please share a requirements.txt for your environment, or specific checkpoints/seeds, to help reproduce your results?
Thanks!
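For what it's worth, halving the per-GPU micro batch while doubling gradient accumulation keeps the effective batch size constant, so that change alone should not explain a score gap (illustrative numbers below):

def effective_batch(per_device, grad_accum, num_gpus):
    return per_device * grad_accum * num_gpus

# Halved micro batch + doubled accumulation on the same 8 GPUs:
assert effective_batch(32, 16, 8) == effective_batch(16, 32, 8)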
[Embedded code references at commit f4d0176; the snippets themselves were not captured: lines 43 to 45 of an uncaptured file; gsm8k-ScRel/train_llama_30b_65b.py, lines 51 to 53; gsm8k-ScRel/train_llama2_70b.py, lines 51 to 53; gsm8k-ScRel/group_test_7b_13b.py, lines 109 to 110; whereas gsm8k-ScRel/single_inference_30b.py, lines 114 to 115, and gsm8k-ScRel/single_inference_65b.py, lines 113 to 114.]
Could you please provide the official environment for your project, e.g. a requirements.txt?
Hello,
Thank you for sharing the invaluable code. While attempting to replicate your work, I noticed that the test_7b_13b.sh script references a test.py file, but it seems to be missing from the repository. Would you be able to add this file? It would be immensely helpful for researchers like us who are trying to replicate your work.
Best