The flan-alpaca-lora from reason-wang

Question about training loss

Hi, I'm very interested in your project. During training, I found that the training loss is very large (more than 30). Is this normal?

Training script takes more than 2 hours to finish

Hi. Thanks for your nice work!

I tried running your training script on an RTX 3090 with the exact dependencies you suggested. It took more than 2 hours to finish instead of 20 minutes. I also tried training flan-t5-large, and that took more than 4 hours. What could be the reasons for this?

NameError: name 'bnb' is not defined

Getting the following error:

in <cell line: 8>:8

/usr/local/lib/python3.10/dist-packages/peft/peft_model.py:143 in from_pretrained

  140         if config.task_type not in MODEL_TYPE_TO_PEFT_MODEL_MAPPING.keys():
  141             model = cls(model, config)
  142         else:
❱ 143             model = MODEL_TYPE_TO_PEFT_MODEL_MAPPING[config.task_type](model, config)
  144
  145         # load weights if any
  146         if os.path.exists(os.path.join(model_id, WEIGHTS_NAME)):

/usr/local/lib/python3.10/dist-packages/peft/peft_model.py:642 in __init__

  639     """
  640
  641     def __init__(self, model, peft_config: PeftConfig):
❱ 642         super().__init__(model, peft_config)
  643         self.base_model_prepare_inputs_for_generation = self.base_model.prepare_inputs_f
  644         self.base_model.prepare_inputs_for_generation = self.prepare_inputs_for_generati
  645         self.base_model_prepare_encoder_decoder_kwargs_for_generation = (

/usr/local/lib/python3.10/dist-packages/peft/peft_model.py:79 in __init__

   76         if isinstance(self.peft_config, PromptLearningConfig):
   77             self._setup_prompt_encoder()
   78         else:
❱  79             self.base_model = LoraModel(peft_config, model)
   80         if getattr(self.peft_config, "modules_to_save", None) is not None:
   81             self.modules_to_save = self.peft_config.modules_to_save
   82             _set_trainable(self)

/usr/local/lib/python3.10/dist-packages/peft/tuners/lora.py:118 in __init__

  115         super().__init__()
  116         self.peft_config = config
  117         self.model = model
❱ 118         self._find_and_replace()
  119         mark_only_lora_as_trainable(self.model, self.peft_config.bias)
  120         self.forward = self.model.forward
  121

/usr/local/lib/python3.10/dist-packages/peft/tuners/lora.py:148 in _find_and_replace

  145                     is_target_modules_in_base_model = True
  146                 parent, target, target_name = self._get_submodules(key)
  147                 bias = target.bias is not None
❱ 148                 if loaded_in_8bit and isinstance(target, bnb.nn.Linear8bitLt):
  149                     kwargs.update(
  150                         {
  151                             "has_fp16_weights": target.state.has_fp16_weights,
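
The last frame in the traceback fails on the name `bnb` while handling a model loaded in 8-bit. One possible cause, assuming this peft version only binds `bnb` when the `bitsandbytes` package can be found at import time, is that `bitsandbytes` is not installed in the notebook environment. The sketch below is not taken from the project; it is a standard-library check (the helper name `is_package_available` is made up for illustration) that reports whether the relevant packages are importable before the model is loaded:

```python
import importlib.util


def is_package_available(name: str) -> bool:
    """Return True if a top-level package called `name` is importable here."""
    # find_spec returns None when no such top-level module is on the path.
    return importlib.util.find_spec(name) is not None


# Hypothetical pre-flight check: package names are taken from the traceback.
for package in ("bitsandbytes", "peft"):
    status = "found" if is_package_available(package) else "NOT found"
    print(f"{package}: {status}")
```

If `bitsandbytes` is reported as not found, installing it and restarting the runtime before importing peft is a reasonable first thing to try; whether that resolves the error in this particular setup is not confirmed here.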

Further fine tuning flan-alpaca-gpt4-lora-xl

Can reasonwang/flan-alpaca-gpt4-lora-xl be further fine tuned?

If yes, what would be the steps for it?

flan-alpaca-gpt4-lora-xl on Google colab

Can reasonwang/flan-alpaca-gpt4-lora-xl be run on Google colab?

If yes, what setting / config would it require?

I tried an A100 and it seems to fail 😔
