Comments (10)
I misspoke: it's the position embedding dimension that doesn't match the input dimension. Debug info from my own run:

```
inputs_embeds.shape
torch.Size([8, 1, 256, 768])
position_embeddings.shape
torch.Size([8, 768])
token_type_embeddings.shape
torch.Size([8, 1, 256, 768])
```
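(These shapes already point at a likely cause: position_embeddings of [8, 768] is what you get when a tensor of shape [8], plausibly the labels, is routed into BertModel's position_ids parameter by positional argument order; the maintainer's reply below suggests the same. A minimal sketch reproducing the broadcast failure, assuming nothing beyond the shapes printed above:)

```python
import torch

# Shapes copied from the debug output above. Note the stray singleton dim in
# inputs_embeds (input_ids was apparently [8, 1, 256] rather than [8, 256]),
# and that position_embeddings is [8, 768], consistent with a tensor of
# shape [8] (e.g. the labels) landing in BertModel's position_ids argument.
inputs_embeds = torch.zeros(8, 1, 256, 768)
position_embeddings = torch.zeros(8, 768)
token_type_embeddings = torch.zeros(8, 1, 256, 768)

# Broadcasting aligns trailing dims, so [8, 768] is treated as [1, 1, 8, 768]:
# dim 2 pits 256 against 8, reproducing the RuntimeError quoted below.
embeddings = inputs_embeds + position_embeddings + token_type_embeddings
```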
```python
return {'input_ids': self.all_input_ids[index],
        'attention_mask': self.all_attention_mask[index],
        'labels': self.all_labels[index]}
```

Changing the inputs to the dict form above fixed that, but the new problem I ran into is:

```
x.size()
torch.Size([8, 1, 256, 768])
x.size()
torch.Size([8, 1, 256, 12, 64])
```

```python
def transpose_for_scores(self, x):
    new_x_shape = x.size()[:-1] + (self.num_attention_heads, self.attention_head_size)
    x = x.view(*new_x_shape)
    return x.permute(0, 2, 1, 3)
```

```
File "C:\Users\cgq\AppData\Roaming\Python\Python36\site-packages\transformers\modeling_bert.py", line 206, in transpose_for_scores
    return x.permute(0, 2, 1, 3)
RuntimeError: number of dims don't match in permute
```

After debugging I found the tensor has become 5-dimensional. What is causing this?
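(The 5-D shape follows directly from the extra singleton dimension already visible above: if input_ids is [8, 1, 256] instead of [8, 256], the hidden states reach transpose_for_scores as [8, 1, 256, 768]. A minimal sketch, assuming only the shapes printed in this comment:)

```python
import torch

num_attention_heads, attention_head_size = 12, 64  # BERT-base values

# Correct case: hidden states are [batch, seq_len, hidden], so the view in
# transpose_for_scores yields 4 dims and permute(0, 2, 1, 3) succeeds.
x = torch.zeros(8, 256, 768)
x = x.view(*(x.size()[:-1] + (num_attention_heads, attention_head_size)))
print(x.permute(0, 2, 1, 3).shape)  # torch.Size([8, 12, 256, 64])

# With the extra singleton dim (input_ids of shape [8, 1, 256]) the same view
# yields the 5-D [8, 1, 256, 12, 64] printed above, and permute(0, 2, 1, 3)
# would raise "number of dims don't match in permute". Squeezing the input
# first, e.g. all_input_ids = all_input_ids.squeeze(1), restores [N, seq_len].
x_bad = torch.zeros(8, 1, 256, 768)
x_bad = x_bad.view(*(x_bad.size()[:-1] + (num_attention_heads, attention_head_size)))
print(x_bad.shape)  # torch.Size([8, 1, 256, 12, 64])
```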
Is the problem solved? It looks like the issue is with the input order: normally inputs_embeds would never be passed in, so you shouldn't see an error that mentions inputs_embeds at all.
```
File "d:\Users\cgq\Anaconda3\Lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
File "d:\Users\cgq\Anaconda3\Lib\site-packages\transformers\modeling_bert.py", line 211, in forward
    embeddings = inputs_embeds + position_embeddings + token_type_embeddings
builtins.RuntimeError: The size of tensor a (256) must match the size of tensor b (8) at non-singleton dimension 2
```
Changing to dict-style inputs does fix this problem. The transformers version used in the example is fairly old, and the order of the model's forward parameters may differ in later versions; as long as the item order in your dataset matches the forward parameter order, it will work. As for the tensor dimensionality after your change: could you check the shape of all_input_ids?
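(To make the dict-based fix concrete, here is a minimal dataset sketch built around the `return {...}` from the comment above; the class name and constructor arguments are illustrative, not TextBrewer API:)

```python
import torch
from torch.utils.data import Dataset

class DistillDataset(Dataset):
    """Hypothetical wrapper around pre-tokenized tensors."""

    def __init__(self, all_input_ids, all_attention_mask, all_labels):
        # all_input_ids should be [N, seq_len]; an accidental [N, 1, seq_len]
        # is exactly what produces the 5-D permute error discussed above.
        self.all_input_ids = all_input_ids
        self.all_attention_mask = all_attention_mask
        self.all_labels = all_labels

    def __len__(self):
        return len(self.all_input_ids)

    def __getitem__(self, index):
        return {'input_ids': self.all_input_ids[index],
                'attention_mask': self.all_attention_mask[index],
                'labels': self.all_labels[index]}
```

The default DataLoader collate stacks each key across the batch, so batch['input_ids'] arrives as [batch, seq_len] and the batch can be unpacked with model(**batch), making the forward() parameter order of any particular transformers version irrelevant.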
Solved. It was an input problem, apparently caused by a transformers version incompatibility; I'm using 2.9.0.
Hi, could you paste your bert_config_T6? I've run into a problem similar to yours and can't figure out how to fix it; switching to version 2.9.0 didn't solve it for me either.
```
embeddings = inputs_embeds + position_embeddings + token_type_embeddings
RuntimeError: The size of tensor a (30) must match the size of tensor b (512) at non-singleton dimension 1

inputs_embeds.size() = [512, 30, 768]  # batch, seq_len, hid_dim
position_embeddings.size() = [512, 768]
token_type_embeddings = [512, 30, 768]
```

Could you tell me where exactly that `return` statement should go to fix the bug? I still haven't been able to resolve this dimension mismatch. I'm using hfl/roberta-wwm-ext.
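(The same positional-order diagnosis fits these shapes: a length-512 tensor, matching the batch size, appears to have landed in position_ids, which is why position_embeddings comes out as [512, 768]. The `return {...}` belongs in your Dataset's __getitem__, as in the DistillDataset sketch earlier in the thread. A usage sketch follows, with illustrative data and assuming the full hub id hfl/chinese-roberta-wwm-ext:)

```python
import torch
from torch.utils.data import DataLoader
from transformers import BertForSequenceClassification

# Illustrative only: a list of dicts stands in for a Dataset whose
# __getitem__ returns {'input_ids': ..., 'attention_mask': ..., 'labels': ...}.
examples = [{'input_ids': torch.randint(0, 21128, (30,)),       # [seq_len]
             'attention_mask': torch.ones(30, dtype=torch.long),
             'labels': torch.tensor(0)}
            for _ in range(16)]

model = BertForSequenceClassification.from_pretrained('hfl/chinese-roberta-wwm-ext')
loader = DataLoader(examples, batch_size=8)  # default collate -> dict batch
for batch in loader:
    outputs = model(**batch)  # keyword args: nothing can slide into position_ids
```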
Related Issues (20)
- pre-trained student weights HOT 3
- Where to find gs4210.pkl file or how to generate it ? thanks HOT 2
- interpreting intermediate matches HOT 5
- Show the progress bar when training. HOT 3
- Picking right layers HOT 3
- How about the distillation effect of gpt2 ? HOT 2
- Does it support translation model? HOT 2
- On VisionTransformer HOT 7
- About processing NER data HOT 2
- notebook_examples/msra_ner.ipynb errors out when run HOT 12
- Is there an example of distilling across different hidden sizes, e.g. from 768 down to 256? HOT 4
- The final trainer.evaluate() in msra_ner.ipynb reports CUDA out of memory; how much GPU memory does training need? Many thanks! HOT 2
- Hello, is there a demo of multi-task, multi-teacher distillation? HOT 4
- When distilling RoBERTa into TinyBERT, the intermediate hidden states are projected to a common dimension by linear layers before computing the MSE; at inference time those gradient-updated linear layers are unused, so are they there purely to match dimensions? HOT 2
- Evaluating the distilled model raises AxisError: axis 2 is out of bounds for array of dimension 1 HOT 5
- Can ChatGPT be distilled into BERT or T5? HOT 2
- Is the LLaMA model currently supported? HOT 2
- Is the BERT-of-Theseus distillation approach supported? HOT 3
- Student model weight initialization question HOT 2
- TextBrewer/src/textbrewer/distiller_utils.py get_outputs_from_batch fails to check dicts properly for maskedLM HOT 4