
Weird result about minigpt-4 (42 comments, closed)

vision-cair commented on August 21, 2024
Weird result

from minigpt-4.

Comments (42)

alibabadoufu commented on August 21, 2024

does the hash match ?

Sorry, I was being dumb. I can simply use the huggingface path and it will download the correct weights by itself. There is no need to download the weights manually.
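If you do download weight shards manually, it is worth comparing each file's checksum against the one shown on the Hugging Face "Files and versions" page. A minimal sketch (the chunked read just keeps memory bounded for multi-GB shards):

```python
import hashlib

def sha256sum(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a possibly multi-GB file in fixed-size chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            h.update(chunk)
    return h.hexdigest()
```

Compare the result against the SHA256 listed next to each `.bin` file on the model page.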

image


TsuTikgiau commented on August 21, 2024

@vtddggg @VvanGemert Thanks for your interest! Yes, as LLAMA doesn't allow distributing its weights, models based on LLAMA have to find ways around this rule, such as releasing 'delta' weights instead of the directly working weights: direct working weights = delta weights + original LLAMA weights. This is also the case for Vicuna, as they explain in their instructions. Vicuna provides a script to convert the delta weights once you have the LLAMA weights. I'm currently writing a short introduction for the preparation of Vicuna.
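The arithmetic here is just elementwise addition per parameter. A toy sketch, with plain Python lists standing in for the torch tensors that FastChat's apply_delta script actually operates on:

```python
# working weight = original llama weight + delta weight, per parameter
# (toy stand-ins; the real checkpoints are dicts of torch tensors)
base = {"w": [1.0, 2.0], "v": [0.5]}      # original llama weights
delta = {"w": [0.1, -0.2], "v": [0.25]}   # released delta weights
merged = {k: [b + d for b, d in zip(base[k], delta[k])] for k in base}
```

This is why downloading the delta checkpoint alone and loading it directly produces garbage: without the base weights, the numbers are meaningless.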


TsuTikgiau commented on August 21, 2024

@Andy1621 Thanks for your interest! The weight you download from huggingface is the delta weight, not the directly working weight. You need to follow their instructions to add the delta weight back to the original LLAMA weight to get the final working weight. The reason Vicuna doesn't release the directly working weight is that we cannot distribute the LLAMA weight under their rules. I'm currently writing a guide for preparing the Vicuna weight.


alibabadoufu commented on August 21, 2024

@alibabadoufu Their v0.1.10 tag should work. What is the weird thing you found?

This is my result:

image

Here are the steps I used to get the Vicuna weights:
git clone --depth 1 --branch v0.1.10 https://github.com/lm-sys/FastChat.git
git lfs install
git clone https://huggingface.co/decapoda-research/llama-13b-hf

correct the name in the config.json and tokenizer_config.json

Install FastChat from source as specified in the FastChat github repo

Execute the following command:

python3 -m fastchat.model.apply_delta \
    --base xxx/llama-13b-hf \
    --target xxx/MiniGPT-4/vicuna-7b \
    --delta lmsys/vicuna-13b-delta-v0

(ignore the paths; I double-checked that I set them correctly)
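After apply_delta finishes, a quick sanity check of the output directory can catch an interrupted or misconfigured merge. A minimal sketch; the file names below are assumptions based on a typical Hugging Face checkpoint layout, so adjust them to whatever the script actually wrote:

```python
import os

def missing_checkpoint_files(model_dir: str) -> list:
    """Return expected files that are absent from a merged checkpoint dir.

    The required-file list is an assumption (typical HF layout), not the
    authoritative output of fastchat's apply_delta.
    """
    required = ["config.json", "tokenizer_config.json"]
    return [f for f in required
            if not os.path.isfile(os.path.join(model_dir, f))]
```

An empty return value means the basic files are at least present; it says nothing about whether the tensors inside are correct.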


zerocore2340 commented on August 21, 2024

For anyone else having this issue: the version of FastChat matters! Make sure to install
[email protected]


TsuTikgiau commented on August 21, 2024

I updated the code to remove <s> now.


TsuTikgiau commented on August 21, 2024

Thanks for your interest! We took a screenshot of this image and checked the model output. It looks normal on my side.
Screenshot from 2023-04-17 18-39-07
I think one potential reason for your bug could be that you loaded the wrong weights. May I ask how you set up your Vicuna? Vicuna released a new version a few days ago, but we are currently using the old version (v0; check the readme). So if you load the new Vicuna, it might not work.


alibabadoufu commented on August 21, 2024

@TsuTikgiau I suspect it's because the old Vicuna weights got corrupted when I git cloned them. I need to fix that first to confirm it's the problem.


alibabadoufu commented on August 21, 2024

Thanks for your interest! We took a screenshot of this image and checked the model output. It looks normal on my side. Screenshot from 2023-04-17 18-39-07 I think one potential reason for your bug could be that you loaded the wrong weights. May I ask how you set up your Vicuna? Vicuna released a new version a few days ago, but we are currently using the old version (v0; check the readme). So if you load the new Vicuna, it might not work.

When I try to git clone https://huggingface.co/lmsys/vicuna-13b-delta-v0, I always encounter the following issue:

Encountered 3 file(s) that may not have been copied correctly on Windows:
	pytorch_model-00003-of-00003.bin
	pytorch_model-00001-of-00003.bin
	pytorch_model-00002-of-00003.bin

See: `git lfs help smudge` for more details.

Have you experienced the same issue before? If yes, would you suggest some solutions?
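One thing worth checking after a flaky clone: whether the `.bin` files are real weights or just unsmudged Git LFS pointer stubs (small text files that begin with a `version https://git-lfs.github.com/spec/...` line). A quick sketch:

```python
def is_lfs_pointer(path: str) -> bool:
    """True if `path` looks like an unsmudged Git LFS pointer file
    rather than the actual binary weights."""
    with open(path, "rb") as f:
        head = f.read(200)
    return head.startswith(b"version https://git-lfs.github.com/spec/")
```

If this returns True for a multi-gigabyte shard, `git lfs pull` inside the repo (or a manual re-download) is needed before the weights are usable.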


VvanGemert commented on August 21, 2024

@alibabadoufu I've got the same issue. I've checked out these weights and they do work. https://huggingface.co/fasthuggy/vicuna-13b-delta-v1.1-fastchat-conversion/tree/main

I only get results with strike-through text.


TsuTikgiau commented on August 21, 2024

@VvanGemert Thanks for your interest! The weights you shared in the link are Vicuna v1.1, which was released only a few days ago. Our current model is based on Vicuna v0, as we say in the readme. Therefore, these weights don't work for us currently. We plan to train a new version of MiniGPT-4 based on the new Vicuna v1.1 soon.


maciekpoplawski commented on August 21, 2024

@alibabadoufu do you have git lfs installed? Remember, you can also download the files manually. I just hit a crash in git LFS and I'm downloading the files manually.
Here: https://huggingface.co/lmsys/vicuna-13b-delta-v0/tree/main -> Files and versions
image
bottom left - Download


alibabadoufu commented on August 21, 2024

@alibabadoufu do you have git lfs installed? Remember, you can also download the files manually. I just hit a crash in git LFS and I'm downloading the files manually. Here: https://huggingface.co/lmsys/vicuna-13b-delta-v0/tree/main -> Files and versions image bottom left - Download

I tried doing that but the downloaded file was corrupted as well.
image

The file size listed on the page and the downloaded file size do not match.


zerocore2340 commented on August 21, 2024

does the hash match ?


Richar-Du commented on August 21, 2024

I use the vicuna-v0 but the result is still strike-through text.
image

Meanwhile, I noticed that when the model generates text, a warning is raised:
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
However, I have set padding_side='left' in the init().

Could you examine why this happened? Thanks in advance :)
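For context on why this warning matters: a decoder-only model continues generating from the last position of the input, so padding must go on the left, otherwise the model ends up continuing from pad tokens. A toy sketch of the two padding sides (not the transformers tokenizer itself):

```python
def pad_ids(ids, length, pad_id=0, side="left"):
    """Pad a token-id list to `length`. Decoder-only generation wants
    left padding so the real tokens sit at the end of the sequence."""
    pads = [pad_id] * (length - len(ids))
    return pads + ids if side == "left" else ids + pads
```

If the warning still fires despite setting padding_side='left' in the init, one possibility is that another tokenizer instance is constructed elsewhere in the code with the default right padding, but that is speculation about this particular bug.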


vtddggg commented on August 21, 2024

I use the vicuna-v0 but the result is still strike-through text. image

Meanwhile, I noticed that when the model generates text, a warning is raised: A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer. However, I have set padding_side='left' in the init().

Could you examine why this happened? Thanks in advance :)

I also hit this warning, and the model gives meaningless outputs:

<s> Moreoveravas зв $\{�added CentralONEyou)(partial \] †<s> 村 :)� журна�� sometimes \(\</s>


alibabadoufu commented on August 21, 2024

But I found that even though I downloaded the weights directly in the way I described above, the result is still the same:

image


pixeli99 commented on August 21, 2024

But I found that even though I downloaded the weights directly in the way I described above, the result is still the same:

image

I think the problem is:
image


vtddggg commented on August 21, 2024

As said by @pixeli99 , do we need to apply the given delta model on llama to get true vicuna weights? @TsuTikgiau


VvanGemert commented on August 21, 2024

As said by @pixeli99 , do we need to apply the given delta model on llama to get true vicuna weights? @TsuTikgiau

I'm trying this right now. It will take some time to combine the weights.


pixeli99 commented on August 21, 2024

I want to know: do I have to fill out the form before I can get the LLAMA weights?
image


vtddggg commented on August 21, 2024

I want to know if I must fill out the form before I can get the weight of llama

Check this


Andy1621 commented on August 21, 2024

I just met the same issue.
image
And I found some keys in Vicuna are fine-tuned.
image
Is it right to directly download the checkpoint from huggingface and load the special keys from pretrained_minigpt4.pth?
Maybe you could provide a complete MiniGPT-4 checkpoint to make reproducing the results easier.


pixeli99 commented on August 21, 2024

I want to know if I must fill out the form before I can get the weight of llama

Check this

tks


vtddggg commented on August 21, 2024

@TsuTikgiau Thanks for your guidance! We will try this.
BTW, does the mini-gpt4 model support in-context learning (like Flamingo)?


Andy1621 commented on August 21, 2024

@TsuTikgiau Thanks for your suggestion! I will try it later.


TsuTikgiau commented on August 21, 2024

@alibabadoufu @Richar-Du Thanks for your interest! Another user in this issue, @Andy1621, shows similar issues to your cases. According to his explanation, he didn't add the delta Vicuna weight back to LLAMA to get the final Vicuna weight. Under LLAMA's rules we cannot distribute LLAMA's weights; that is the reason why models like Vicuna or Alpaca-LoRA have to release the delta weight instead of the directly working weight. I'm currently preparing a guide for the Vicuna preparation, but if you are in a hurry, you can directly follow Vicuna's instructions to create the final working weight.


alibabadoufu commented on August 21, 2024

@alibabadoufu @Richar-Du Thanks for your interest! Another user in this issue, @Andy1621, shows similar issues to your cases. According to his explanation, he didn't add the delta Vicuna weight back to LLAMA to get the final Vicuna weight. Under LLAMA's rules we cannot distribute LLAMA's weights; that is the reason why models like Vicuna or Alpaca-LoRA have to release the delta weight instead of the directly working weight. I'm currently preparing a guide for the Vicuna preparation, but if you are in a hurry, you can directly follow Vicuna's instructions to create the final working weight.

Thanks so much for your response. May I know which commit you used to convert the weights? I was using the latest FastChat commit, but I think it no longer supports conversion of the v0 Vicuna weights. I tried the v0.1 tag but the result seems weird.


TsuTikgiau commented on August 21, 2024

@vtddggg We haven't tested the in-context learning ability yet. However, it is possible to construct such a setting using our code and check the performance. You can check the 'Chat.answer' function in the file 'minigpt4/conversation/conversation.py' to see how we add the image embedding to the text embedding if you want to build an in-context learning test.


TsuTikgiau commented on August 21, 2024

@alibabadoufu Their v0.1.10 tag should work. What is the weird thing you found?


TsuTikgiau commented on August 21, 2024

@alibabadoufu 🤔 your steps look OK to me. I just created a guide, PrepareVicuna.md, under the root path. I also remade the Vicuna weight from scratch just now, and it works fine in my case. Maybe you can first check the guide and see if there is anything different from what you did? May I ask if you see any strange outputs when you apply your delta to LLAMA? And just to double-check, you also load the pretrained single layer we provide in the readme, right?


alibabadoufu commented on August 21, 2024

@alibabadoufu 🤔 your steps look OK to me. I just created a guide, PrepareVicuna.md, under the root path. I also remade the Vicuna weight from scratch just now, and it works fine in my case. Maybe you can first check the guide and see if there is anything different from what you did? May I ask if you see any strange outputs when you apply your delta to LLAMA? And just to double-check, you also load the pretrained single layer we provide in the readme, right?

Thanks for the efforts :D I will give PrepareVicuna.md a try later.

By the way, the pretrained single-layer, are you referring to this step?

3. Prepare the pretrained MiniGPT-4 checkpoint

To play with our pretrained model, download the pretrained checkpoint [here](https://drive.google.com/file/d/1a4zLvaiDBr-36pasffmgpvH5P7CKmpze/view?usp=share_link). Then, set the path to the pretrained checkpoint in the evaluation config file in [eval_configs/minigpt4_eval.yaml](https://github.com/Vision-CAIR/MiniGPT-4/blob/main/eval_configs/minigpt4_eval.yaml#L10) at Line 11.

If so, then I can confirm I followed this instruction for reproducing the work earlier.


Andy1621 commented on August 21, 2024

@alibabadoufu It works for me after using the correct weights of vicuna.
image
However, there is some strange strikethrough in my results. I'm not sure whether it's a bug in gradio or something else...
Here are my steps:

  1. Download the original LLAMA weight here.
  2. Use the following code to convert the weights (copy convert_llama_weights_to_hf.py from transformers):
python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/downloaded/llama/weights --model_size 7B --output_dir /output/path
  3. Produce the vicuna weights (using tag v0.1.10):
python3 -m fastchat.model.apply_delta \
    --base /path/to/llama-13b \
    --target /output/path/to/vicuna-13b \
    --delta lmsys/vicuna-13b-delta-v1.0


alibabadoufu commented on August 21, 2024

@alibabadoufu It works for me after using the correct weights of vicuna. image However, there is some strange strikethrough in my results. I'm not sure whether it's a bug in gradio or something else... Here are my steps:

  1. Download the original LLAMA weight here.
  2. Use the following code to convert the weights (copy convert_llama_weights_to_hf.py from transformers):
python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/downloaded/llama/weights --model_size 7B --output_dir /output/path
  3. Produce the vicuna weights (using tag v0.1.10):
python3 -m fastchat.model.apply_delta \
    --base /path/to/llama-13b \
    --target /output/path/to/vicuna-13b \
    --delta lmsys/vicuna-13b-delta-v1.0

I suspect the LLAMA weights on huggingface are not the same as the ones the authors originally used for this project. I will give it a try later following your instructions. Thanks a lot Andy!


alibabadoufu commented on August 21, 2024

@alibabadoufu It works for me after using the correct weights of vicuna. image However, there is some strange strikethrough in my results. I'm not sure whether it's a bug in gradio or something else... Here are my steps:

  1. Download the original LLAMA weight here.
  2. Use the following code to convert the weights (copy convert_llama_weights_to_hf.py from transformers):
python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/downloaded/llama/weights --model_size 7B --output_dir /output/path
  3. Produce the vicuna weights (using tag v0.1.10):
python3 -m fastchat.model.apply_delta \
    --base /path/to/llama-13b \
    --target /output/path/to/vicuna-13b \
    --delta lmsys/vicuna-13b-delta-v1.0

I think I found the reason.

In eval_configs/minigpt4_eval.yaml, if I activate the low_resource option, I get the same strange result I reported above. However, if I deactivate it like the following:

model:
  arch: mini_gpt4
  model_type: pretrain_vicuna
  freeze_vit: True
  freeze_qformer: True
  max_txt_len: 160
  end_sym: "###"
  low_resource: False
  prompt_path: "prompts/alignment.txt"
  prompt_template: '###Human: {} ###Assistant: '
  ckpt: '/data/jasper.laiwy/MiniGPT-4/pretrained_minigpt4.pth'

The result looks awesome now!

image

But the crossed-out text is still there.


TsuTikgiau commented on August 21, 2024

I guess the strikethrough issue might be caused by the tokenizer. Maybe you can print the output token ids for the text with a strikethrough; I can help check how my tokenizer decodes the ids and whether they contain anything strange. To print the token ids for debugging, you can simply add a print(output_token.cpu().numpy()) in Chat.answer in minigpt4/conversation/conversation.py


zerocore2340 commented on August 21, 2024

When trying to merge the delta for 13B, I am getting
RuntimeError: The size of tensor a (32000) must match the size of tensor b (32001) at non-singleton dimension 0

Any ideas or help ?


alibabadoufu commented on August 21, 2024

When trying to merge delta for 13B I am getting RuntimeError: The size of tensor a (32000) must match the size of tensor b (32001) at non-singleton dimension 0

Any ideas or help ?

You used the wrong FastChat version. I guess you were using the latest version, which doesn't support weight merging for the v0 Vicuna model. You need to check out the v0.1.10 tag. Please read through the comments above.


alibabadoufu commented on August 21, 2024

I added the print call here:

/data/jasper.laiwy/MiniGPT-4/minigpt4/conversation/conversation.py

  if output_token[0] == 0:
      output_token = output_token[1:]
  output_text = self.model.llama_tokenizer.decode(output_token, add_special_tokens=False)
  output_text = output_text.split('###')[0]  # remove the stop sign '###'
  output_text = output_text.split('Assistant:')[-1].strip()

  print(output_token.cpu().numpy())

  conv.messages[-1][1] = output_text
  return output_text, output_token.cpu().numpy()

Here is the output tokens:

[ 1 450 6114 297 278 1967 338 13407 297 4565 310 263
19571 591 4362 263 302 1151 28684 269 3466 10714 411 263
715 686 292 18873 1220 322 263 1183 261 1250 29889 2296
338 3063 472 902 17842 297 278 19571 29889 2277 29937]

Here is the output text:
The woman in the image is standing in front of a mirror wearing a nude colored slip dress with a plunging neckline and a sheer back. She is looking at her reflection in the mirror.

image


Andy1621 commented on August 21, 2024

@alibabadoufu You can simply remove the <s> in the output, which Markdown renders as strikethrough.

llm_message = llm_message.replace("<s>", "")
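A slightly more general variant (clean_llm_message is a hypothetical helper name, not from the MiniGPT-4 codebase) that also strips the closing </s> marker, since <s>…</s> is exactly the pair Markdown renders as strikethrough:

```python
def clean_llm_message(msg: str) -> str:
    # "<s>" / "</s>" are the tokenizer's BOS/EOS markers; if they survive
    # into the chat output, Markdown shows <s>...</s> as crossed-out text
    for tok in ("<s>", "</s>"):
        msg = msg.replace(tok, "")
    return msg.strip()
```

Stripping both markers covers outputs like the garbled sample earlier in the thread, which contained </s> as well.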


alibabadoufu commented on August 21, 2024

@alibabadoufu You can simply remove the <s> in the output, which Markdown renders as strikethrough.

llm_message = llm_message.replace("<s>", "")

Thanks :D I was looking for ways to remove it. Thanks for the suggestion. Now it works like a charm!

Thanks for all the team member's efforts in this project.


yongliang-wu commented on August 21, 2024

@TsuTikgiau Thanks for your guidance! We will try this. BTW, does mini-gpt4 model support in-context learning (like flamingo)?

@vtddggg Have you tried its in-context learning?

