unimse,lemei

Should modules/generation_utils.py overwrite transformers/generation_utils.py?

In modules/modeling_t5.py, modeling_t5_prefix.py and other files, you override the _prepare_encoder_decoder_kwargs_for_generation function (called via self._prepare_encoder_decoder_kwargs_for_generation), but the line input_ids = self._prepare_decoder_input_ids_for_generation(input_ids, decoder_start_token_id=decoder_start_token_id, bos_token_id=bos_token_id) still calls _prepare_decoder_input_ids_for_generation from transformers/generation_utils.py [note: its first argument should be an int batch size]. I see that in modules/generation_utils.py you changed the first argument of that function to a LongTensor [note: this matches the type of input_ids], but modules/modeling_t5.py does not use modules/generation_utils.py; it uses the default transformers/generation_utils.py.
So at runtime the call fails because the argument types do not match. How should I modify the code to make them compatible?
Thank you very much!
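
One possible workaround, sketched here under assumptions rather than taken from the repository, is to normalize the first argument at the call site so that it works whether the installed function expects an int batch size or receives a tensor. The helper name infer_batch_size is hypothetical, and the sketch assumes a batch-first layout (batch size in shape[0]); it only needs an object with a .shape attribute, so it runs without torch.

```python
def infer_batch_size(first_arg):
    """Return a batch size from an int or from a tensor-like object.

    Hypothetical helper: assumes a batch-first layout, i.e. shape[0]
    is the batch dimension. Not part of transformers or of this repo.
    """
    if isinstance(first_arg, int):
        return first_arg
    shape = getattr(first_arg, "shape", None)
    if shape is None or len(shape) == 0:
        raise TypeError(
            "expected an int or an object with a non-empty .shape, got "
            + type(first_arg).__name__
        )
    return int(shape[0])
```

The result could then be passed in place of input_ids when the installed transformers version expects an int, leaving the call unchanged otherwise.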

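About the adapter layer (multimodal fusion layer)

Hello, what is the reason for embedding the multimodal fusion layer in both the encoder and the decoder in the paper? If the multimodal layer is added only to the encoder, is the performance worse than adding it to both the encoder and the decoder?
Thanks!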

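Environment configuration problem

Hello, I set up my local environment following the configuration you provided. When the model reaches
if torch.cuda.is_available():
    model = self.model.to(DEVICE)
it raises RuntimeError: cuDNN error: CUDNN_STATUS_BAD_PARAM. Putting the model on the CPU works fine. Have you run into anything similar before?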

How do I reproduce the numbers?

Hi,

Could you please add more documentation and steps to reproduce the numbers you have reported in the paper? At the moment, based just on the code it's not clear what scripts to run first. It seems like a few essential details are missing.

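What does each part of the final pickle dataset represent?

Hello, I am planning to train this model on additional datasets, so I would like to know what each part of the final dataset represents.

Why is the first part empty?
What are parts 1-8, respectively? (I know the fourth is the text.)
Thank you very much for your answer!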

Reproduction

Hello,
Can you provide a complete and reproducible code for researchers? Thanks!

Has anyone successfully reproduced it?

Questions about model performance reproduction

Dear Author, I am trying to reproduce your results, but my current run falls short of the reported performance. Here is what I have done:

I set use_adapter, info_nce, use_cl, use_prefix_p, prefix_projection to True in the config and left the rest unchanged.
I am using the dataset as shown in the picture. I did not run preprocess.py on the MOSI and MOSEI datasets because I did not find the corresponding train/dev/test CSV files.
I have only run 50 epochs so far.

I think this is because I didn't run enough epochs or my dataset processing is wrong. I wonder if you can provide some suggestions? Much appreciated!

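I printed the keys of the pickle and there are no keys 0, 1, 2. Could you explain? Thanks.

Where can I download Laptops?

Hello, thank you for contributing your code.
I ran into a problem when running Sim_Process_v3.py:
it reports that Laptops/all_convert.json cannot be found.
Where can I download this file? Thank you for your help.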

Feature dimension mismatch

In all your shared feature files (like meld_data_0610.pkl), the feature of a single utterance is two-dimensional, which means the feature of a dialogue will be three-dimensional and a batch will result in a four-dimensional tensor.

However, in the model's input part, I noticed that visual and acoustic features are directly inputted into the RNN model, which means the batch-level visual and acoustic features are three-dimensional tensors. I did not find dimensionality reduction code in this repository. Could you please explain or update the code accordingly?

Why is the multimodal fusion layer also used in the decoder?

No such file or directory: 't5-large/pytorch_model.bin'

Do I need to download the pytorch_model.bin file myself, and if so, from where?

Error when running evaluation function for MELD and IEMOCAP

Hi @LeMei, I am encountering problems when attempting to run the evaluation functions for the IEMOCAP and MELD datasets when running the model for MOSELDMP. Specifically when using the "eval_emotionlines(meld_results, meld_truths)" function. Hope to hear from you soon.

Help: missing files and datasets.

"new_train_por_v4_0610_6c_sep_contexts.pkl"
"new_dev_por_v4_0610_6c_sep_contexts.pkl"
"new_test_por_v4_0610_6c_sep_contexts_v.pkl"
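None of these can be found. It is unclear which files main.py uses or how config.py works, and I could not find the MOSELDMP dataset. Could you give step-by-step instructions for running the code on the IEMOCAP dataset?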

How do I reproduce this? I followed the README and got an error

OSError: Can't load config for 'princeton-nlp/sup-simcse-bert-base-uncased'. Make sure that:

'princeton-nlp/sup-simcse-bert-base-uncased' is a correct model identifier listed on 'https://huggingface.co/models'
(make sure 'princeton-nlp/sup-simcse-bert-base-uncased' is not a path to a local directory with something else, in that case)
or 'princeton-nlp/sup-simcse-bert-base-uncased' is the correct path to a directory containing a config.json file

The error is shown above.
How can I fix it?

Can you give the code that can be easily reproduced?

meld_data_0610.pkl has only length 2

Hi, nice work!
I want to reproduce your work but ran into some issues.

I got meld_data_0610.pkl from the link https://drive.google.com/file/d/1pWH2xPVZFymxeJUrd6gF37qYbvmhh32s/view?usp=sharing in your readme

I tried to execute simcse/Sim_process_v3.py but I do not have Laptops directory.

So I tried to execute preprocess.py and create_dataset.py instead.

However, an error occurred in preprocess.py:
Traceback (most recent call last):
  File "src/preprocess.py", line 44, in <module>
    train_emotion_f, dev_emotion_f, test_emotion_f = emotion_features[0], emotion_features[1], emotion_features[2]
KeyError: 0
(I changed every '0424' string to '0610'.)
So I checked the length of emotion_features, and it is 2.

What should I do in this case?

Thank you.
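
A KeyError: 0 on an object of length 2 suggests the pickle holds a dict whose keys are not the integers 0, 1, 2. A minimal sketch for checking what the file actually contains before indexing into it is shown below; describe_pickle is a hypothetical helper, not part of the repository, and it says nothing about what the keys should be. Note that pickle.load can execute arbitrary code, so it should only be used on files from a trusted source.

```python
import pickle


def describe_pickle(path):
    """Load a pickle and report its top-level structure.

    Hypothetical helper. Returns (type name, keys) for a dict,
    (type name, length) for a list or tuple, and (type name, None)
    for anything else.
    """
    with open(path, "rb") as f:
        obj = pickle.load(f)  # only load files you trust
    if isinstance(obj, dict):
        return type(obj).__name__, list(obj.keys())
    if isinstance(obj, (list, tuple)):
        return type(obj).__name__, len(obj)
    return type(obj).__name__, None
```

Calling describe_pickle on meld_data_0610.pkl would show whether emotion_features is a dict and which keys it exposes, which is the information needed to adapt line 44 of preprocess.py.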

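Using the ABSA datasets

I see that Sim_Process_v3 uses the ABSA Restaurants and Laptops datasets, but they are not used in the paper or in the later code, and I could not find these two dataset files in the dataset links shared in the README. Can the related code simply be deleted, or could you tell me how to download and use them? Thanks.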

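About the training loss

Hello,
(1) In the training loss, besides the two contrastive losses, does the generation-task term L(task) refer only to the cross-entropy loss (i.e., torch.nn.CrossEntropyLoss)?
(2) The original T5 paper mentions a BERT-style masking loss similar to SpanBERT. Does the paper apply this objective, or does it use only the seq2seq objective?
Thank you very much!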

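How to use these files

Hello, I ran into some problems while reproducing the paper:

Where can I find these files used in Sim_Process_v3? And what are Restaurants and Laptops?
How should the pickle files you provided be used?
I look forward to your reply. Many thanks!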

Hello, do the extracted visual/audio features include context, or only the current utterance?

When will you release your code? I'm interested in it!

'../datasets/Laptops/all_convert.json' not found

Reproduction in 2023

Hello

I tried to run this repo and ran into some issues.

How to star

Datasets
I downloaded files for MOSI based on links provided in README.
In MOSI.zip I found the already-created U-labels file, new_MOSI-label-v3.csv.

I changed paths in config to fit my paths/to/data
(csv, mosi.pkl, etc....)

T5
I replaced each ../t5-base with t5-base, since that can be fetched automatically from the Hugging Face repo.

For pytorch_model.bin, I opened the corresponding T5 Hugging Face repo and downloaded the .bin file.
I then changed path/to/bin.

You can also clone the Hugging Face repo.

So after this I ran
python main.py --dataset=mosi --multi=False

And...

in file data_loader.py line 423

encoding = tokenizer( [task_prefix + sequence for sequence in inputs_seq], return_tensors="pt", padding=True )

This code raises an error because task_prefix + sequence tries to do str + list[str]. To fix this I used
task_prefix + ' '.join(sequence)
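
The fix described above can be sketched as a small helper that tolerates both cases, so that entries which are already strings are not split into characters by ' '.join. The name build_prompts and the example prefix are hypothetical; only the str + list[str] problem and the ' '.join workaround come from the report.

```python
def build_prompts(task_prefix, inputs_seq):
    """Prepend task_prefix to each input, joining token lists with spaces.

    Hypothetical helper: each entry of inputs_seq may be a str or a
    list of str. Lists are joined into one string before the prefix
    is added, which avoids the str + list[str] TypeError.
    """
    prompts = []
    for sequence in inputs_seq:
        if isinstance(sequence, str):
            text = sequence
        else:
            text = " ".join(sequence)
        prompts.append(task_prefix + text)
    return prompts


# Example with a made-up prefix:
# build_prompts("task: ", [["a", "b"], "c d"]) returns ["task: a b", "task: c d"]
```

The returned list could be passed to the tokenizer in place of the list comprehension in data_loader.py.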

next....

in modeling_t5_prefix.py line 1848

else: input_ids = self._prepare_decoder_input_ids_for_generation( input_ids, decoder_start_token_id=decoder_start_token_id, bos_token_id=bos_token_id )

An error occurs because, I assume, we pass a tensor where the function expects a batch size: it returns something like tensor.ones((batch_size, 1)........), but here input_ids is passed as batch_size with type tensor. I tried to fix this with input_ids.shape[1]

but then the next lines start to fail

line 1900

logits_processor = self._get_logits_processor() expects an optional logits_processor argument, which I set to None

and I finally ended up with this failure:

File "modules/modeling_t5_prefix.py", line 524, in forward
    scores += position_bias
RuntimeError: The size of tensor a (32) must match the size of tensor b (55) at non-singleton dimension 3

I stopped here because I don't want to keep fixing this. I assume the code needs refactoring by the author.

I could not install transformers==4.14.5 because that version does not exist, so I'm using transformers==4.16.0.

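I ran into a problem

How can this problem be solved?

How should this piece of code

Environment configuration problem

Hello,

Your README mentions both PyTorch 1.7 and tensorflow-gpu. Do both environments need to be installed, or just one of them? Can the TensorFlow version be 2.0?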

Unable to find '../t5-base/pytorch_model.bin'

Hi again @LeMei,

I'm getting this error when I run the main.py. I don't see any t5-base folder in the repository. Where can I find this file?
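
A pre-flight check can make this kind of error easier to diagnose by listing every expected file that is absent before main.py starts loading anything. The sketch below is an illustration only: find_missing is a hypothetical helper, and the list of paths to check would have to be collected by hand from config.py and main.py.

```python
from pathlib import Path


def find_missing(paths):
    """Return the entries of paths that do not exist as regular files.

    Hypothetical helper: paths is any iterable of path strings, for
    example the checkpoint and dataset paths referenced in a config.
    """
    return [p for p in paths if not Path(p).is_file()]
```

For example, find_missing(["../t5-base/pytorch_model.bin"]) returns that path unchanged when the file has not been downloaded, and an empty list once it is in place.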

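Do I need to download from both the Baidu and Google Drive links?

Are the contents of the two links the same? Why are the file names the same but the file sizes different?
Thank you very much for your answer.

KeyError: 0 in preprocess

Hello, I hit KeyError: 0 when running preprocess.py, at this statement:
train_emotion_f, dev_emotion_f, test_emotion_f = emotion_features[0], emotion_features[1], emotion_features[2]. Do you know how to fix it?

Is the final input to the T5 model only two modalities, audio and video? Looking at your dataloader, the text item fed to the model is an empty value.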

How do you set up the training and validation sets?

I noticed that you claim:

We integrate the training sets of MOSI,MOSEI, MELD, IEMOCAP to train the model and valid sets to select hyperparameters.

Did you fuse all the datasets into one training set, and did you use the same model weights on four benchmark reviews?
And did you also fuse the valid sets to get 1 best model for all 4 datasets?

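How are iemocap_data_0610.pkl, meld_data_0610.pkl, mosei_data_0610.pkl, and mosi_data_0610.pkl generated?

Hello,

How are the four files iemocap_data_0610.pkl, meld_data_0610.pkl, mosei_data_0610.pkl, and mosi_data_0610.pkl generated? I could not find where they are generated in the code. Could you provide the generation code? Many thanks.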

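Question about the paper's results

Are the MOSI results in the paper averaged over five random seeds?

Running main.py as described in the README fails with the error below. Could you help? Thanks.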

FileNotFoundError: [Errno 2] No such file or directory: '/home/dwh/unimse/datasets/MOSELDMP/new_moseldmp_train_align_v4_0424_a_6c_contexts.pkl'

It says new_moseldmp_train_align_v4_0424_a_6c_contexts.pkl is missing, and I could not find the code that generates this file.

The ABSA datasets are not provided

How would this apply to the Chinese dataset

Hello, I am very interested in this project.
1) When using T5, do you initialize it with the pretrained weights and then fine-tune them during multimodal training, or do you keep the T5 parameters frozen throughout?
2) If I want to apply it to a Chinese dataset, can I use mT5 instead of T5 for training?
3) How long did training take in the paper?
Thank you very much!

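Missing files in main.py

This is the error: the file does not exist, and I noticed that none of the files in this group of paths are in text. Should I replace them with other files, or do they need to be generated somewhere? Thanks!

Help: how do I fix this key 0 error? Thanks!

Missing variable definitions

Hello, how are these two variables defined? It looks like a problem caused by missing code.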

Sharing the features via Google drive

Hi again @LeMei, I see that you have made the features available via baidu and was wondering if it is possible to share the data via Google drive since I'm having issues with baidu services.

When will the paper be released?

File question

Hello, may I ask which file generates these .json files?

Model

Thank you for your work. Would you be able to release the final trained model?

FileNotFoundError: [Errno 2] No such file or directory: PycharmProjects\\pythonProject4\\UniMSE-main\\datasets\\MOSELDMP/new_moseldmp_train_align_v4_0424_a_6c_contexts.pkl' Start loading the data....

The file new_moseldmp_train_align_v4_0424_a_6c_contexts.pkl is missing. Where can I find it?
