
a-baoyang / alpaca-7b-chinese


Finetune LLaMA-7B with Chinese instruction datasets

License: Creative Commons Zero v1.0 Universal

Languages: Python 98.81%, Shell 1.19%
Topics: chatgpt, deep-learning, llm, nlp, pytorch, alpaca, fine-tuning, lora, instruction-following

alpaca-7b-chinese's People

Contributors

a-baoyang


alpaca-7b-chinese's Issues

expected scalar type Float but found Half

Hello,
when I run python api.py and test it, I hit this problem:
RuntimeError: expected scalar type Float but found Half

======================================
----> 1 model.generate(instruction=item["instruction"], input=item["input"])

File /data/dongxz/research/alpaca-7b-chinese/serve/model.py:139, in ModelServe.generate(self, instruction, input, temperature, top_p, top_k, num_beams, max_new_tokens, **kwargs)
137 print("generating...")
138 with torch.no_grad():
--> 139 generation_output = self.model.generate(
140 input_ids=input_ids,
141 generation_config=generation_config,
142 return_dict_in_generate=True,
143 output_scores=True,
144 max_new_tokens=max_new_tokens,
145 )
146 s = generation_output.sequences[0]
147 output = self.tokenizer.decode(s)

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/peft/peft_model.py:731, in PeftModelForCausalLM.generate(self, **kwargs)
729 try:
730 if not isinstance(peft_config, PromptLearningConfig):
--> 731 outputs = self.base_model.generate(**kwargs)
732 else:
733 if "input_ids" not in kwargs:

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
112 @functools.wraps(func)
113 def decorate_context(*args, **kwargs):
114 with ctx_factory():
--> 115 return func(*args, **kwargs)

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/transformers/generation/utils.py:1524, in GenerationMixin.generate(self, inputs, generation_config, logits_processor, stopping_criteria, prefix_allowed_tokens_fn, synced_gpus, streamer, **kwargs)
1517 input_ids, model_kwargs = self._expand_inputs_for_generation(
1518 input_ids=input_ids,
1519 expand_size=generation_config.num_beams,
1520 is_encoder_decoder=self.config.is_encoder_decoder,
1521 **model_kwargs,
1522 )
1523 # 13. run beam search
-> 1524 return self.beam_search(
1525 input_ids,
1526 beam_scorer,
1527 logits_processor=logits_processor,
1528 stopping_criteria=stopping_criteria,
1529 pad_token_id=generation_config.pad_token_id,
1530 eos_token_id=generation_config.eos_token_id,
1531 output_scores=generation_config.output_scores,
1532 return_dict_in_generate=generation_config.return_dict_in_generate,
1533 synced_gpus=synced_gpus,
1534 **model_kwargs,
1535 )
1537 elif is_beam_sample_gen_mode:
1538 # 11. prepare logits warper
1539 logits_warper = self._get_logits_warper(generation_config)

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/transformers/generation/utils.py:2810, in GenerationMixin.beam_search(self, input_ids, beam_scorer, logits_processor, stopping_criteria, max_length, pad_token_id, eos_token_id, output_attentions, output_hidden_states, output_scores, return_dict_in_generate, synced_gpus, **model_kwargs)
2806 break
2808 model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs)
-> 2810 outputs = self(
2811 **model_inputs,
2812 return_dict=True,
2813 output_attentions=output_attentions,
2814 output_hidden_states=output_hidden_states,
2815 )
2817 if synced_gpus and this_peer_finished:
2818 cur_len = cur_len + 1

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/accelerate/hooks.py:165, in add_hook_to_module.<locals>.new_forward(*args, **kwargs)
163 output = old_forward(*args, **kwargs)
164 else:
--> 165 output = old_forward(*args, **kwargs)
166 return module._hf_hook.post_forward(module, output)

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:687, in LlamaForCausalLM.forward(self, input_ids, attention_mask, position_ids, past_key_values, inputs_embeds, labels, use_cache, output_attentions, output_hidden_states, return_dict)
684 return_dict = return_dict if return_dict is not None else self.config.use_return_dict
686 # decoder outputs consists of (dec_features, layer_state, dec_hidden, dec_attn)
--> 687 outputs = self.model(
688 input_ids=input_ids,
689 attention_mask=attention_mask,
690 position_ids=position_ids,
691 past_key_values=past_key_values,
692 inputs_embeds=inputs_embeds,
693 use_cache=use_cache,
694 output_attentions=output_attentions,
695 output_hidden_states=output_hidden_states,
696 return_dict=return_dict,
697 )
699 hidden_states = outputs[0]
700 logits = self.lm_head(hidden_states)

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/accelerate/hooks.py:165, in add_hook_to_module.<locals>.new_forward(*args, **kwargs)
163 output = old_forward(*args, **kwargs)
164 else:
--> 165 output = old_forward(*args, **kwargs)
166 return module._hf_hook.post_forward(module, output)

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:577, in LlamaModel.forward(self, input_ids, attention_mask, position_ids, past_key_values, inputs_embeds, use_cache, output_attentions, output_hidden_states, return_dict)
569 layer_outputs = torch.utils.checkpoint.checkpoint(
570 create_custom_forward(decoder_layer),
571 hidden_states,
(...)
574 None,
575 )
576 else:
--> 577 layer_outputs = decoder_layer(
578 hidden_states,
579 attention_mask=attention_mask,
580 position_ids=position_ids,
581 past_key_value=past_key_value,
582 output_attentions=output_attentions,
583 use_cache=use_cache,
584 )
586 hidden_states = layer_outputs[0]
588 if use_cache:

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/accelerate/hooks.py:165, in add_hook_to_module.<locals>.new_forward(*args, **kwargs)
163 output = old_forward(*args, **kwargs)
164 else:
--> 165 output = old_forward(*args, **kwargs)
166 return module._hf_hook.post_forward(module, output)

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:292, in LlamaDecoderLayer.forward(self, hidden_states, attention_mask, position_ids, past_key_value, output_attentions, use_cache)
289 hidden_states = self.input_layernorm(hidden_states)
291 # Self Attention
--> 292 hidden_states, self_attn_weights, present_key_value = self.self_attn(
293 hidden_states=hidden_states,
294 attention_mask=attention_mask,
295 position_ids=position_ids,
296 past_key_value=past_key_value,
297 output_attentions=output_attentions,
298 use_cache=use_cache,
299 )
300 hidden_states = residual + hidden_states
302 # Fully Connected

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/accelerate/hooks.py:165, in add_hook_to_module.<locals>.new_forward(*args, **kwargs)
163 output = old_forward(*args, **kwargs)
164 else:
--> 165 output = old_forward(*args, **kwargs)
166 return module._hf_hook.post_forward(module, output)

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:196, in LlamaAttention.forward(self, hidden_states, attention_mask, position_ids, past_key_value, output_attentions, use_cache)
185 def forward(
186 self,
187 hidden_states: torch.Tensor,
(...)
192 use_cache: bool = False,
193 ) -> Tuple[torch.Tensor, Optional[torch.Tensor], Optional[Tuple[torch.Tensor]]]:
194 bsz, q_len, _ = hidden_states.size()
--> 196 query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
197 key_states = self.k_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
198 value_states = self.v_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/peft/tuners/lora.py:710, in Linear8bitLt.forward(self, x)
706 if x.dtype != torch.float32:
707 x = x.float()
708 output = (
709 self.lora_B[self.active_adapter](
--> 710 self.lora_A[self.active_adapter](x)
711 ).to(expected_dtype)
712 * self.scaling[self.active_adapter]
713 )
714 else:
715 output = (
716 self.lora_B[self.active_adapter](
717 self.lora_A[self.active_adapter](x)
718 )
719 * self.scaling[self.active_adapter]
720 )

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File /data/anaconda3/envs/dongxz_chatglm_v1/lib/python3.9/site-packages/torch/nn/modules/linear.py:114, in Linear.forward(self, input)
113 def forward(self, input: Tensor) -> Tensor:
--> 114 return F.linear(input, self.weight, self.bias)

RuntimeError: expected scalar type Float but found Half
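
A commonly reported workaround for this dtype mismatch in peft's 8-bit LoRA path (a community suggestion, not an official fix from this repo) is to run generation under torch.autocast, so the float32 activations produced inside Linear8bitLt can be multiplied with the half-precision LoRA weights. A minimal sketch, reusing the names from the traceback above:

```python
# Sketch only: model, input_ids, generation_config and max_new_tokens are the same
# objects used in serve/model.py's ModelServe.generate shown in the traceback.
import torch

with torch.no_grad(), torch.autocast("cuda"):
    generation_output = model.generate(
        input_ids=input_ids,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=max_new_tokens,
    )
```

Upgrading peft or keeping the LoRA adapter weights in float32 are other remedies reported for the same error.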

TypeError: not a string

(gh_alpaca-7b-chinese) ub2004@ub2004-B85M-A0:~/llm_dev/alpaca-7b-chinese/finetune$ python3 finetune.py --base_model bigscience/bloomz-7b1-mt --data_dir /home/ub2004/llm_dev/alpaca-7b-chinese/data/general/alpaca-en-zh.json --output_dir ../finetuned/bloomz-7b1-mt_alpaca-en-zh --lora_target_modules '["query_key_value"]'

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/home/ub2004/anaconda3/envs/gh_alpaca-7b-chinese/lib')}
warn(msg)
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /home/ub2004/anaconda3/envs/gh_alpaca-7b-chinese did not contain libcudart.so as expected! Searching further paths...
warn(msg)
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('@/tmp/.ICE-unix/2101,unix/ub2004-B85M-A0'), PosixPath('local/ub2004-B85M-A0')}
warn(msg)
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/etc/xdg/xdg-ubuntu')}
warn(msg)
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('1'), PosixPath('0')}
warn(msg)
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/org/gnome/Terminal/screen/9452ad9e_f9da_4eba_aff1_19fe374cdc1e')}
warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 6.1
CUDA SETUP: Detected CUDA version 117
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: Compute capability < 7.5 detected! Only slow 8-bit matmul is supported for your GPU!
warn(msg)
CUDA SETUP: Loading binary /home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda117_nocublaslt.so...
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.15) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
Finetune parameters:
base_model: bigscience/bloomz-7b1-mt
model_type: llama
data_dir: /home/ub2004/llm_dev/alpaca-7b-chinese/data/general/alpaca-en-zh.json
output_dir: ../finetuned/bloomz-7b1-mt_alpaca-en-zh
batch_size: 128
micro_batch_size: 1
num_epochs: 20
learning_rate: 0.0003
cutoff_len: 512
val_set_size: 2000
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules: ['query_key_value']
train_on_inputs: True
group_by_length: True

-1 -1 -1 False
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'BloomTokenizerFast'.
The class this function is called from is 'LlamaTokenizer'.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/ub2004/llm_dev/alpaca-7b-chinese/finetune/finetune.py:259 in <module> │
│ │
│ 256 │
│ 257 │
│ 258 if __name__ == "__main__": │
│ ❱ 259 │ main() │
│ 260 │
│ │
│ /home/ub2004/.local/lib/python3.8/site-packages/click/core.py:1130 in __call__ │
│ │
│ 1127 │ │
│ 1128 │ def __call__(self, *args: t.Any, **kwargs: t.Any) -> t.Any: │
│ 1129 │ │ """Alias for :meth:`main`.""" │
│ ❱ 1130 │ │ return self.main(*args, **kwargs) │
│ 1131 │
│ 1132 │
│ 1133 class Command(BaseCommand): │
│ │
│ /home/ub2004/.local/lib/python3.8/site-packages/click/core.py:1055 in main │
│ │
│ 1052 │ │ try: │
│ 1053 │ │ │ try: │
│ 1054 │ │ │ │ with self.make_context(prog_name, args, **extra) as ctx: │
│ ❱ 1055 │ │ │ │ │ rv = self.invoke(ctx) │
│ 1056 │ │ │ │ │ if not standalone_mode: │
│ 1057 │ │ │ │ │ │ return rv │
│ 1058 │ │ │ │ │ # it's not safe to ctx.exit(rv) here! │
│ │
│ /home/ub2004/.local/lib/python3.8/site-packages/click/core.py:1404 in invoke │
│ │
│ 1401 │ │ │ echo(style(message, fg="red"), err=True) │
│ 1402 │ │ │
│ 1403 │ │ if self.callback is not None: │
│ ❱ 1404 │ │ │ return ctx.invoke(self.callback, **ctx.params) │
│ 1405 │ │
│ 1406 │ def shell_complete(self, ctx: Context, incomplete: str) -> t.List["CompletionItem"]: │
│ 1407 │ │ """Return a list of completions for the incomplete value. Looks │
│ │
│ /home/ub2004/.local/lib/python3.8/site-packages/click/core.py:760 in invoke │
│ │
│ 757 │ │ │
│ 758 │ │ with augment_usage_errors(__self): │
│ 759 │ │ │ with ctx: │
│ ❱ 760 │ │ │ │ return __callback(*args, **kwargs) │
│ 761 │ │
│ 762 │ def forward( │
│ 763 │ │ __self, __cmd: "Command", *args: t.Any, **kwargs: t.Any # noqa: B902 │
│ │
│ /home/ub2004/llm_dev/alpaca-7b-chinese/finetune/finetune.py:188 in main │
│ │
│ 185 │ │ device = torch.device("cuda" if torch.cuda.is_available() else "cpu") │
│ 186 │ │ device_map = "auto" │
│ 187 │ │
│ ❱ 188 │ tokenizer, model = decide_model(args=local_args, device_map=device_map) │
│ 189 │ data = load_dataset("json", data_files=data_dir) │
│ 190 │
│ 191 │
│ │
│ /home/ub2004/llm_dev/alpaca-7b-chinese/finetune/finetune.py:66 in decide_model │
│ │
│ 63 │ │ │ device_map=device_map │
│ 64 │ │ ) │
│ 65 │ else: │
│ ❱ 66 │ │ tokenizer = _MODEL_CLASSES[model_type].tokenizer.from_pretrained(args.base_model │
│ 67 │ │ model = _MODEL_CLASSES[model_type].model.from_pretrained( │
│ 68 │ │ │ args.base_model, │
│ 69 │ │ │ load_in_8bit=True, │
│ │
│ /home/ub2004/.local/lib/python3.8/site-packages/transformers/tokenization_utils_base.py:1811 in │
│ from_pretrained │
│ │
│ 1808 │ │ │ else: │
│ 1809 │ │ │ │ logger.info(f"loading file {file_path} from cache at {resolved_vocab_fil │
│ 1810 │ │ │
│ ❱ 1811 │ │ return cls._from_pretrained( │
│ 1812 │ │ │ resolved_vocab_files, │
│ 1813 │ │ │ pretrained_model_name_or_path, │
│ 1814 │ │ │ init_configuration, │
│ │
│ /home/ub2004/.local/lib/python3.8/site-packages/transformers/tokenization_utils_base.py:1965 in │
│ _from_pretrained │
│ │
│ 1962 │ │ │
│ 1963 │ │ # Instantiate tokenizer. │
│ 1964 │ │ try: │
│ ❱ 1965 │ │ │ tokenizer = cls(*init_inputs, **init_kwargs) │
│ 1966 │ │ except OSError: │
│ 1967 │ │ │ raise OSError( │
│ 1968 │ │ │ │ "Unable to load vocabulary from file. " │
│ │
│ /home/ub2004/.local/lib/python3.8/site-packages/transformers/models/llama/tokenization_llama.py: │
│ 96 in __init__ │
│ │
│ 93 │ │ self.add_bos_token = add_bos_token │
│ 94 │ │ self.add_eos_token = add_eos_token │
│ 95 │ │ self.sp_model = spm.SentencePieceProcessor(**self.sp_model_kwargs) │
│ ❱ 96 │ │ self.sp_model.Load(vocab_file) │
│ 97 │ │
│ 98 │ def __getstate__(self): │
│ 99 │ │ state = self.__dict__.copy() │
│ │
│ /home/ub2004/.local/lib/python3.8/site-packages/sentencepiece/__init__.py:905 in Load │
│ │
│ 902 │ │ raise RuntimeError('model_file and model_proto must be exclusive.') │
│ 903 │ if model_proto: │
│ 904 │ │ return self.LoadFromSerializedProto(model_proto) │
│ ❱ 905 │ return self.LoadFromFile(model_file) │
│ 906 │
│ 907 │
│ 908 # Register SentencePieceProcessor in _sentencepiece: │
│ │
│ /home/ub2004/.local/lib/python3.8/site-packages/sentencepiece/__init__.py:310 in LoadFromFile │
│ │
│ 307 │ │ return _sentencepiece.SentencePieceProcessor_serialized_model_proto(self) │
│ 308 │ │
│ 309 │ def LoadFromFile(self, arg): │
│ ❱ 310 │ │ return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg) │
│ 311 │ │
│ 312 │ def _EncodeAsIds(self, text, enable_sampling, nbest_size, alpha, add_bos, add_eos, r │
│ 313 │ │ return _sentencepiece.SentencePieceProcessor__EncodeAsIds(self, text, enable_sam │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: not a string
(gh_alpaca-7b-chinese) ub2004@ub2004-B85M-A0:~/llm_dev/alpaca-7b-chinese/finetune$
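
The likely cause is visible in the "Finetune parameters" printout above: model_type is still llama while base_model is bigscience/bloomz-7b1-mt, so decide_model builds a LlamaTokenizer for a Bloom checkpoint. That checkpoint ships no SentencePiece tokenizer.model file, vocab_file therefore arrives as None, and sentencepiece's LoadFromFile raises "TypeError: not a string". A minimal sketch of the mismatch using only standard transformers APIs (not repo code):

```python
# The bloomz checkpoint resolves to a fast Bloom tokenizer, not a SentencePiece one;
# forcing LlamaTokenizer onto it passes vocab_file=None to sentencepiece and fails.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bigscience/bloomz-7b1-mt")
print(type(tok).__name__)  # BloomTokenizerFast
```

Launching finetune.py with a model type that matches the Bloom base model (so _MODEL_CLASSES selects the Bloom tokenizer and model classes) should avoid the error, assuming the script exposes such an option.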

libbitsandbytes_cpu.so: undefined symbol: cget_col_row_stats

(gh_alpaca-7b-chinese) ub2004@ub2004-B85M-A0:~/llm_dev/alpaca-7b-chinese/finetune$ python3 finetune.py --base_model decapoda-research/llama-7b-hf --data_dir ../data/alpaca-en-zh.json --output_dir ../finetuned/llama-7b-hf_alpaca-en-zh --lora_target_modules '["q_proj", "v_proj"]'

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

bin /home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cpu.so
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cextension.py:33: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
warn("The installed version of bitsandbytes was compiled without GPU support. "
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/home/ub2004/anaconda3/envs/gh_alpaca-7b-chinese/lib')}
warn(msg)
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: /home/ub2004/anaconda3/envs/gh_alpaca-7b-chinese did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
warn(msg)
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('@/tmp/.ICE-unix/1648,unix/ub2004-B85M-A0'), PosixPath('local/ub2004-B85M-A0')}
warn(msg)
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/etc/xdg/xdg-ubuntu')}
warn(msg)
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('1'), PosixPath('0')}
warn(msg)
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/org/gnome/Terminal/screen/50429ffc_73e7_436a_ae26_12ca37ff5ff1')}
warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/cuda/lib64')}
warn(msg)
ERROR: python3: undefined symbol: cudaRuntimeGetVersion
CUDA SETUP: libcudart.so path is None
CUDA SETUP: Is seems that your cuda installation is not in your path. See TimDettmers/bitsandbytes#85 for more information.
CUDA SETUP: CUDA version lower than 11 are currently not supported for LLM.int8(). You will be only to use 8-bit optimizers and quantization routines!!
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!
warn(msg)
CUDA SETUP: Highest compute capability among GPUs detected: 6.1
CUDA SETUP: Detected CUDA version 00
/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: WARNING: Compute capability < 7.5 detected! Only slow 8-bit matmul is supported for your GPU!
warn(msg)
CUDA SETUP: Loading binary /home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cpu.so...
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.15) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
Finetune parameters:
base_model: decapoda-research/llama-7b-hf
model_type: llama
data_dir: ../data/alpaca-en-zh.json
output_dir: ../finetuned/llama-7b-hf_alpaca-en-zh
batch_size: 128
micro_batch_size: 4
num_epochs: 20
learning_rate: 0.0003
cutoff_len: 512
val_set_size: 2000
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules: ['q_proj', 'v_proj']
train_on_inputs: True
group_by_length: True

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'LLaMATokenizer'.
The class this function is called from is 'LlamaTokenizer'.
Overriding torch_dtype=None with torch_dtype=torch.float16 due to requirements of bitsandbytes to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument at all to remove this warning.
Loading checkpoint shards: 0%| | 0/33 [00:00<?, ?it/s]
Traceback (most recent call last):
File "finetune.py", line 245, in
main()
File "/usr/lib/python3/dist-packages/click/core.py", line 764, in call
return self.main(*args, **kwargs)
File "/usr/lib/python3/dist-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/usr/lib/python3/dist-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/lib/python3/dist-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "finetune.py", line 173, in main
tokenizer, model = decide_model(args=local_args, device_map=device_map)
File "finetune.py", line 63, in decide_model
model = _MODEL_CLASSES[model_type].model.from_pretrained(
File "/home/ub2004/.local/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2795, in from_pretrained
) = cls._load_pretrained_model(
File "/home/ub2004/.local/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3123, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/ub2004/.local/lib/python3.8/site-packages/transformers/modeling_utils.py", line 706, in _load_state_dict_into_meta_model
set_module_8bit_tensor_to_device(
File "/home/ub2004/.local/lib/python3.8/site-packages/transformers/utils/bitsandbytes.py", line 78, in set_module_8bit_tensor_to_device
new_value = bnb.nn.Int8Params(new_value, requires_grad=False, has_fp16_weights=has_fp16_weights).to(device)
File "/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/nn/modules.py", line 227, in to
return self.cuda(device)
File "/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/nn/modules.py", line 191, in cuda
CB, CBt, SCB, SCBt, coo_tensorB = bnb.functional.double_quant(B)
File "/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/functional.py", line 1642, in double_quant
row_stats, col_stats, nnz_row_ptr = get_colrow_absmax(
File "/home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/functional.py", line 1531, in get_colrow_absmax
lib.cget_col_row_stats(ptrA, ptrRowStats, ptrColStats, ptrNnzrows, ct.c_float(threshold), rows, cols)
File "/usr/lib/python3.8/ctypes/init.py", line 386, in getattr
func = self.getitem(name)
File "/usr/lib/python3.8/ctypes/init.py", line 391, in getitem
func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /home/ub2004/.local/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cget_col_row_stats
(gh_alpaca-7b-chinese) ub2004@ub2004-B85M-A0:~/llm_dev/alpaca-7b-chinese/finetune$
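
The CUDA SETUP log above shows what went wrong: libcudart.so was never found ("libcudart.so path is None"), so bitsandbytes fell back to its CPU-only binary, which does not export cget_col_row_stats, the kernel needed for 8-bit weight loading. A quick diagnostic sketch using only standard PyTorch and ctypes calls:

```python
# Verify that a CUDA runtime is actually visible to the process before loading in 8-bit.
import ctypes.util
import torch

print(torch.cuda.is_available())            # must be True for load_in_8bit=True
print(torch.version.cuda)                   # CUDA version this PyTorch build targets
print(ctypes.util.find_library("cudart"))   # None means libcudart.so is not on the loader path
```

If the last call prints None, pointing LD_LIBRARY_PATH at /usr/local/cuda/lib64 (the directory an earlier run found libcudart.so in) or installing the cudatoolkit package before re-running finetune.py typically lets bitsandbytes load its CUDA binary instead of libbitsandbytes_cpu.so.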
