Comments (10)
I misspoke: it's the position embedding dimension that doesn't match the input dimension. Debug info from my own run:

```
inputs_embeds.shape
torch.Size([8, 1, 256, 768])
position_embeddings.shape
torch.Size([8, 768])
token_type_embeddings.shape
torch.Size([8, 1, 256, 768])
```
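(These shapes already point at a likely cause: position_embeddings of [8, 768] is what you get when a tensor of shape [8], plausibly the labels, is routed into BertModel's position_ids parameter by positional argument order; the maintainer's reply below suggests the same. A minimal sketch reproducing the broadcast failure, assuming nothing beyond the shapes printed above:)

```python
import torch

# Shapes copied from the debug output above. Note the stray singleton dim in
# inputs_embeds (input_ids was apparently [8, 1, 256] rather than [8, 256]),
# and that position_embeddings is [8, 768], consistent with a tensor of
# shape [8] (e.g. the labels) landing in BertModel's position_ids argument.
inputs_embeds = torch.zeros(8, 1, 256, 768)
position_embeddings = torch.zeros(8, 768)
token_type_embeddings = torch.zeros(8, 1, 256, 768)

# Broadcasting aligns trailing dims, so [8, 768] is treated as [1, 1, 8, 768]:
# dim 2 pits 256 against 8, reproducing the RuntimeError quoted below.
embeddings = inputs_embeds + position_embeddings + token_type_embeddings
```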
```python
return {'input_ids': self.all_input_ids[index],
        'attention_mask': self.all_attention_mask[index],
        'labels': self.all_labels[index]}
```

Changing the inputs to the dict form above fixed that, but the new problem I ran into is:

```
x.size()
torch.Size([8, 1, 256, 768])
x.size()
torch.Size([8, 1, 256, 12, 64])
```

```python
def transpose_for_scores(self, x):
    new_x_shape = x.size()[:-1] + (self.num_attention_heads, self.attention_head_size)
    x = x.view(*new_x_shape)
    return x.permute(0, 2, 1, 3)
```

```
File "C:\Users\cgq\AppData\Roaming\Python\Python36\site-packages\transformers\modeling_bert.py", line 206, in transpose_for_scores
    return x.permute(0, 2, 1, 3)
RuntimeError: number of dims don't match in permute
```

After debugging I found the tensor has become 5-dimensional. What is causing this?
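(The 5-D shape follows directly from the extra singleton dimension already visible above: if input_ids is [8, 1, 256] instead of [8, 256], the hidden states reach transpose_for_scores as [8, 1, 256, 768]. A minimal sketch, assuming only the shapes printed in this comment:)

```python
import torch

num_attention_heads, attention_head_size = 12, 64  # BERT-base values

# Correct case: hidden states are [batch, seq_len, hidden], so the view in
# transpose_for_scores yields 4 dims and permute(0, 2, 1, 3) succeeds.
x = torch.zeros(8, 256, 768)
x = x.view(*(x.size()[:-1] + (num_attention_heads, attention_head_size)))
print(x.permute(0, 2, 1, 3).shape)  # torch.Size([8, 12, 256, 64])

# With the extra singleton dim (input_ids of shape [8, 1, 256]) the same view
# yields the 5-D [8, 1, 256, 12, 64] printed above, and permute(0, 2, 1, 3)
# would raise "number of dims don't match in permute". Squeezing the input
# first, e.g. all_input_ids = all_input_ids.squeeze(1), restores [N, seq_len].
x_bad = torch.zeros(8, 1, 256, 768)
x_bad = x_bad.view(*(x_bad.size()[:-1] + (num_attention_heads, attention_head_size)))
print(x_bad.shape)  # torch.Size([8, 1, 256, 12, 64])
```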
Is the problem solved? It looks like the issue is with the input order: normally inputs_embeds would never be passed in, so you shouldn't see an error that mentions inputs_embeds at all.
```
File "d:\Users\cgq\Anaconda3\Lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
File "d:\Users\cgq\Anaconda3\Lib\site-packages\transformers\modeling_bert.py", line 211, in forward
    embeddings = inputs_embeds + position_embeddings + token_type_embeddings
builtins.RuntimeError: The size of tensor a (256) must match the size of tensor b (8) at non-singleton dimension 2
```
Changing to dict-style inputs does fix this problem. The transformers version used in the example is fairly old, and the order of the model's forward parameters may differ in later versions; as long as the item order in your dataset matches the forward parameter order, it will work. As for the tensor dimensionality after your change: could you check the shape of all_input_ids?
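(To make the dict-based fix concrete, here is a minimal dataset sketch built around the `return {...}` from the comment above; the class name and constructor arguments are illustrative, not TextBrewer API:)

```python
import torch
from torch.utils.data import Dataset

class DistillDataset(Dataset):
    """Hypothetical wrapper around pre-tokenized tensors."""

    def __init__(self, all_input_ids, all_attention_mask, all_labels):
        # all_input_ids should be [N, seq_len]; an accidental [N, 1, seq_len]
        # is exactly what produces the 5-D permute error discussed above.
        self.all_input_ids = all_input_ids
        self.all_attention_mask = all_attention_mask
        self.all_labels = all_labels

    def __len__(self):
        return len(self.all_input_ids)

    def __getitem__(self, index):
        return {'input_ids': self.all_input_ids[index],
                'attention_mask': self.all_attention_mask[index],
                'labels': self.all_labels[index]}
```

The default DataLoader collate stacks each key across the batch, so batch['input_ids'] arrives as [batch, seq_len] and the batch can be unpacked with model(**batch), making the forward() parameter order of any particular transformers version irrelevant.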
Solved. It was an input problem, apparently caused by a transformers version incompatibility; I'm using 2.9.0.
Hi, could you paste your bert_config_T6? I've run into a problem similar to yours and can't figure out how to fix it; switching to version 2.9.0 didn't solve it for me either.
```
embeddings = inputs_embeds + position_embeddings + token_type_embeddings
RuntimeError: The size of tensor a (30) must match the size of tensor b (512) at non-singleton dimension 1

inputs_embeds.size() = [512, 30, 768]  # batch, seq_len, hid_dim
position_embeddings.size() = [512, 768]
token_type_embeddings = [512, 30, 768]
```

Could you tell me where exactly that `return` statement should go to fix the bug? I still haven't been able to resolve this dimension mismatch. I'm using hfl/roberta-wwm-ext.
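(The same positional-order diagnosis fits these shapes: a length-512 tensor, matching the batch size, appears to have landed in position_ids, which is why position_embeddings comes out as [512, 768]. The `return {...}` belongs in your Dataset's __getitem__, as in the DistillDataset sketch earlier in the thread. A usage sketch follows, with illustrative data and assuming the full hub id hfl/chinese-roberta-wwm-ext:)

```python
import torch
from torch.utils.data import DataLoader
from transformers import BertForSequenceClassification

# Illustrative only: a list of dicts stands in for a Dataset whose
# __getitem__ returns {'input_ids': ..., 'attention_mask': ..., 'labels': ...}.
examples = [{'input_ids': torch.randint(0, 21128, (30,)),       # [seq_len]
             'attention_mask': torch.ones(30, dtype=torch.long),
             'labels': torch.tensor(0)}
            for _ in range(16)]

model = BertForSequenceClassification.from_pretrained('hfl/chinese-roberta-wwm-ext')
loader = DataLoader(examples, batch_size=8)  # default collate -> dict batch
for batch in loader:
    outputs = model(**batch)  # keyword args: nothing can slide into position_ids
```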
Related Issues (20)
- pre-trained student weights HOT 3
- Where to find gs4210.pkl file or how to generate it ? thanks HOT 2
- interpreting intermediate matches HOT 5
- Show the progress bar when training. HOT 3
- Picking right layers HOT 3
- How about the distillation effect of gpt2 ? HOT 2
- Does it support translation model? HOT 2
- On VisionTransformer HOT 7
- About processing NER data HOT 2
- notebook_examples/msra_ner.ipynb errors out when run HOT 12
- Is there an example of distilling across different hidden sizes, e.g. from 768 down to 256? HOT 4
- The final trainer.evaluate() in msra_ner.ipynb reports CUDA out of memory; how much GPU memory does training need? Many thanks! HOT 2
- Hello, is there a demo of multi-task, multi-teacher distillation? HOT 4
- When distilling RoBERTa into TinyBERT, the intermediate hidden states are projected to a common dimension by linear layers before computing the MSE; at inference time those gradient-updated linear layers are unused, so are they there purely to match dimensions? HOT 2
- Evaluating the distilled model raises AxisError: axis 2 is out of bounds for array of dimension 1 HOT 5
- Can ChatGPT be distilled into BERT or T5? HOT 2
- Is the LLaMA model currently supported? HOT 2
- Is the BERT-of-Theseus distillation approach supported? HOT 3
- Student model weight initialization question HOT 2
- TextBrewer/src/textbrewer/distiller_utils.py get_outputs_from_batch fails to check dicts properly for maskedLM HOT 4