jerry1993-tech / cornucopia-llama-fin-chinese

Cornucopia (聚宝盆): a series of open-source, commercially usable Chinese financial large language models, together with an efficient, lightweight training framework for vertical-domain LLMs (pretraining, SFT, RLHF, quantization, etc.)

Home Page: https://zhuanlan.zhihu.com/p/633736418

License: Apache License 2.0

Python 85.36% Shell 14.64%
chinese finance large-language-models llama nlp qa rlhf sft text-generation transformers

cornucopia-llama-fin-chinese's People

Contributors

jerry1993-tech

cornucopia-llama-fin-chinese's Issues

Error when running infer.sh

The error output is as follows:
###infering###
((), (), (), ()) tensor([0, 0, 0, 0], device='cuda:0')
Traceback (most recent call last):
File "/home/pxc/Cornucopia-LLaMA-Fin-Chinese/infer.py", line 168, in <module>
main()
File "/home/pxc/Cornucopia-LLaMA-Fin-Chinese/infer.py", line 154, in main
infer_from_json(args.instruct_dir)
File "/home/pxc/Cornucopia-LLaMA-Fin-Chinese/infer.py", line 145, in infer_from_json
model_output = evaluate(instruction)
File "/home/pxc/Cornucopia-LLaMA-Fin-Chinese/infer.py", line 120, in evaluate
generation_output = model.generate(
File "/home/zengbo/anaconda3/envs/Cornucopia/lib/python3.10/site-packages/peft-0.5.0.dev0-py3.10.egg/peft/peft_model.py", line 1002, in generate
File "/home/zengbo/anaconda3/envs/Cornucopia/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/zengbo/anaconda3/envs/Cornucopia/lib/python3.10/site-packages/transformers/generation/utils.py", line 1628, in generate
return self.beam_search(
File "/home/zengbo/anaconda3/envs/Cornucopia/lib/python3.10/site-packages/transformers/generation/utils.py", line 3010, in beam_search
beam_indices = tuple((beam_indices[beam_idx[i]] + (beam_idx[i],) for i in range(len(beam_indices))))
File "/home/zengbo/anaconda3/envs/Cornucopia/lib/python3.10/site-packages/transformers/generation/utils.py", line 3010, in <genexpr>
beam_indices = tuple((beam_indices[beam_idx[i]] + (beam_idx[i],) for i in range(len(beam_indices))))
IndexError: tuple index out of range

=============================
If we skip model.generate and output the model's raw inference results directly, the returned values contain a large number of NaNs.
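The IndexError is raised inside transformers' beam-search bookkeeping. As a first debugging step (a hedged suggestion, not a confirmed fix), it can help to bypass beam search entirely and check whether greedy decoding succeeds; the kwargs below are standard transformers generation parameters, and passing them through to model.generate() in evaluate() is an assumption about infer.py's structure:

```python
# Workaround sketch: disable beam search so that the beam_indices
# bookkeeping in transformers' beam_search() is never exercised.
generation_kwargs = {
    "num_beams": 1,          # greedy decoding instead of beam search
    "do_sample": False,      # deterministic output for debugging
    "max_new_tokens": 256,   # cap the response length
}

# In infer.py's evaluate() one would then call, e.g.:
# generation_output = model.generate(input_ids=input_ids, **generation_kwargs)
```

If greedy decoding also produces NaNs, the problem is likely in the weights or dtype rather than in the decoding loop.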

How to evaluate

Dear author, how should the quality of the fine-tuned model be evaluated?
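One common automatic starting point for Chinese QA, pending any official evaluation script from the authors, is character-level F1 between the model answer and a reference answer (word segmentation in Chinese is ambiguous, so character overlap is a reasonable proxy); manual review is still needed for fluency and factuality. A minimal sketch:

```python
from collections import Counter

def char_f1(prediction: str, reference: str) -> float:
    """Character-level F1 between a model answer and a reference answer."""
    pred_chars = Counter(prediction)
    ref_chars = Counter(reference)
    # Count characters appearing in both strings (multiset intersection).
    overlap = sum((pred_chars & ref_chars).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(prediction)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)
```

Averaging this over a held-out set of (question, reference answer) pairs gives a rough score for comparing checkpoints.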

Hello, can an A10 with 24 GB of VRAM run inference with the current model?

Hello, author. First of all, great respect for your work, and thank you for open-sourcing it.
I'm very interested and would like to see how the model performs, but I'm not sure whether an A10, a relatively older GPU, can run inference with this model (I'm not considering training for now).
Since it is a significant expense, I'd like to confirm that this configuration is sufficient before purchasing.
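A back-of-envelope estimate (assuming a 7B-parameter LLaMA base, which is what this repo's released adapters target) suggests 24 GB is enough for inference; the 20% overhead factor for activations, KV cache and CUDA context is a rough assumption, not a measurement:

```python
def estimate_vram_gb(n_params: float, bytes_per_param: float,
                     overhead: float = 1.2) -> float:
    """Rough inference VRAM estimate: weight bytes plus ~20% overhead
    for activations, KV cache and CUDA context. Real usage varies."""
    return n_params * bytes_per_param * overhead / 1024**3

SEVEN_B = 7e9
fp16_gb = estimate_vram_gb(SEVEN_B, 2)  # fp16: roughly 15-16 GB, fits in 24 GB
int8_gb = estimate_vram_gb(SEVEN_B, 1)  # 8-bit quantized: roughly 8 GB
```

By this estimate an A10 (24 GB) should handle fp16 inference of a 7B model, with 8-bit loading as a fallback.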

Exception when launching the model with bash ./scripts/infer.sh

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

/home/ubuntu/miniconda3/envs/LLaMA/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /home/ubuntu/miniconda3/envs/LLaMA did not contain libcudart.so as expected! Searching further paths...
warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 7.0
CUDA SETUP: Detected CUDA version 112
/home/ubuntu/miniconda3/envs/LLaMA/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: Compute capability < 7.5 detected! Only slow 8-bit matmul is supported for your GPU!
warn(msg)
CUDA SETUP: Loading binary /home/ubuntu/miniconda3/envs/LLaMA/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda112_nocublaslt.so...
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'LLaMATokenizer'.
The class this function is called from is 'LlamaTokenizer'.
You are using the legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This means that tokens that come after special tokens will not be properly handled. We recommend you to read the related pull request available at huggingface/transformers#24565
Traceback (most recent call last):
File "/data/Cornucopia-LLaMA-Fin-Chinese/infer.py", line 143, in <module>
main()
File "/data/Cornucopia-LLaMA-Fin-Chinese/infer.py", line 46, in main
tokenizer = LlamaTokenizer.from_pretrained(args.base_model)
File "/home/ubuntu/miniconda3/envs/LLaMA/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1846, in from_pretrained
return cls._from_pretrained(
File "/home/ubuntu/miniconda3/envs/LLaMA/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2009, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/home/ubuntu/miniconda3/envs/LLaMA/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama.py", line 128, in __init__
self.sp_model.Load(vocab_file)
File "/home/ubuntu/miniconda3/envs/LLaMA/lib/python3.10/site-packages/sentencepiece/__init__.py", line 905, in Load
return self.LoadFromFile(model_file)
File "/home/ubuntu/miniconda3/envs/LLaMA/lib/python3.10/site-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
RuntimeError: Internal: src/sentencepiece_processor.cc(1101) [model_proto->ParseFromArray(serialized.data(), serialized.size())]
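This ParseFromArray failure usually means the sentencepiece `tokenizer.model` file is truncated or is a git-lfs pointer stub rather than the real binary (a common diagnosis for this error, offered here as a hedged suggestion). A small pre-flight check, with the failure conditions as assumptions about how the file got corrupted:

```python
import os

def check_sp_model(path: str) -> None:
    """Sanity-check a sentencepiece tokenizer.model before loading it."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"missing tokenizer model: {path}")
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(64)
    # A git-lfs pointer stub is a short text file, not a protobuf blob.
    if head.startswith(b"version https://git-lfs"):
        raise ValueError(f"{path} is a git-lfs pointer; run `git lfs pull`")
    if size < 1024:
        raise ValueError(f"{path} is suspiciously small ({size} bytes)")
```

Running this on the base model's tokenizer path before `LlamaTokenizer.from_pretrained` narrows the problem down to the file itself.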

A question about dataset size

Thank you very much for this meaningful work. Could you share how much instruction-tuning data was used?
Also, have you investigated how much instruction-tuning data is enough?
Many thanks.

Do we have any opportunities for cooperation?

Hello, we are an algorithm research team currently working on large language models in the field of new energy. We are very interested in your research. Could your approach be transferred to our vertical domain of new energy? Are there any opportunities for further cooperation?

Error running infer.sh

Running infer.sh as described in README.md produces the following error:
###infering###
((), (), (), ()) tensor([0, 0, 0, 0], device='cuda:0')
Traceback (most recent call last):
File "/home/pxc/Cornucopia-LLaMA-Fin-Chinese/infer.py", line 168, in <module>
main()
File "/home/pxc/Cornucopia-LLaMA-Fin-Chinese/infer.py", line 154, in main
infer_from_json(args.instruct_dir)
File "/home/pxc/Cornucopia-LLaMA-Fin-Chinese/infer.py", line 145, in infer_from_json
model_output = evaluate(instruction)
File "/home/pxc/Cornucopia-LLaMA-Fin-Chinese/infer.py", line 120, in evaluate
generation_output = model.generate(
File "/home/zengbo/anaconda3/envs/Cornucopia/lib/python3.10/site-packages/peft-0.5.0.dev0-py3.10.egg/peft/peft_model.py", line 1002, in generate
File "/home/zengbo/anaconda3/envs/Cornucopia/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/zengbo/anaconda3/envs/Cornucopia/lib/python3.10/site-packages/transformers/generation/utils.py", line 1628, in generate
return self.beam_search(
File "/home/zengbo/anaconda3/envs/Cornucopia/lib/python3.10/site-packages/transformers/generation/utils.py", line 3010, in beam_search
beam_indices = tuple((beam_indices[beam_idx[i]] + (beam_idx[i],) for i in range(len(beam_indices))))
File "/home/zengbo/anaconda3/envs/Cornucopia/lib/python3.10/site-packages/transformers/generation/utils.py", line 3010, in <genexpr>
beam_indices = tuple((beam_indices[beam_idx[i]] + (beam_idx[i],) for i in range(len(beam_indices))))
IndexError: tuple index out of range

===============================
If the model's outputs are taken directly without using model.generate(), the output contains a large number of NaNs.
What could be the cause? Thanks.
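Before debugging decoding, it is worth quantifying the NaNs in the raw outputs; widespread NaN logits usually point to fp16 overflow or mismatched/corrupted weights rather than a decoding bug (a common diagnosis, not a confirmed cause here). A dependency-free sketch for a flat list of values copied off the GPU:

```python
import math

def count_nans(values) -> int:
    """Count NaNs in a flat sequence of floats (e.g. logits moved to CPU
    via tensor.flatten().tolist())."""
    return sum(1 for v in values
               if isinstance(v, float) and math.isnan(v))
```

If nearly all logits are NaN, try loading the model in fp32 (or verify the merged weights) before blaming model.generate().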

infer_result folder is missing

Running the multi-model comparison command:
bash ./scripts/comparison_test.sh
Error output:
only_ori_llama
./scripts/comparison_test.sh: line 31: infer_result/o_tmp.txt: No such file or directory
lora-llama-fin-ori-fb
./scripts/comparison_test.sh: line 33: infer_result/a_tmp.txt: No such file or directory
lora-llama-fin-Linly-zh
./scripts/comparison_test.sh: line 35: infer_result/m_tmp.txt: No such file or directory
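Each failing line is a bash redirection into `infer_result/`, which suggests (a hedged guess from the error messages, not confirmed from the script source) that the script writes there but never creates the directory. Creating it first avoids the "No such file or directory" errors:

```shell
# comparison_test.sh redirects o_tmp.txt / a_tmp.txt / m_tmp.txt into
# infer_result/, but the directory apparently is not created by the script.
mkdir -p infer_result
```

Then re-run `bash ./scripts/comparison_test.sh`.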

Is there a way to do multi-turn QA?

[image]

From the screenshot, the prompt is constructed in a way that seems to allow only single-turn QA each time. Has anyone who tested this reached a conclusion? Is multi-turn QA possible?
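One common workaround for single-turn instruction templates is to fold earlier turns into the instruction text. The sketch below is hypothetical (the function name and turn format are my own, not the repo's official prompt format), and whether the fine-tuned model handles such concatenated history well would need testing:

```python
def build_multiturn_prompt(history, question):
    """Fold earlier (question, answer) turns into a single instruction
    string, so a single-turn template can carry conversational context.
    history: list of (user_question, model_answer) pairs."""
    lines = []
    for q, a in history:
        lines.append(f"User: {q}")
        lines.append(f"Assistant: {a}")
    lines.append(f"User: {question}")
    lines.append("Assistant:")
    return "\n".join(lines)
```

The resulting string would then be inserted where the single-turn instruction normally goes; context length limits cap how many turns fit.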

Questions about testing your model

Hello, author! I'm very interested in your work. I took the weights you released and tested the model, but found the results not very good. I asked several of the questions mentioned in your documentation; my test record is below.
[image]
The llama-7b model itself is very prone to producing nonsense. I'm currently doing work similar to yours: using LoRA fine-tuning on alpaca-7b, I find the results far better than with llama. Others have also done the work of expanding the Chinese vocabulary, and LoRA training on top of that improves results substantially. Will you try this next? In my experiments, LoRA fine-tuning on chinese-alpaca works much better than fine-tuning on llama or vicuna.

WeChat

I would like to get your WeChat QR code.
