fewshottagging's People

Contributors

atmahou

fewshottagging's Issues

Training data description

Hi Yutai, nice work!

I am planning to train and test the model on other datasets. Would you mind sharing a description of the format of the json files? Or does this repo offer any tools to create the json files from a basic data structure, such as plain text with its NLU labels?

Cheers : )
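
As a starting point for the conversion the question describes, here is a minimal sketch that reads plain text with one token and one label per line and writes it out as JSON. The key names `tokens` and `labels` are hypothetical and are not the repo's actual schema, which this thread does not document; the sample text is made up for illustration.

```python
import json


def read_tagged_text(text):
    """Parse blank-line-separated sentences, one 'token label' pair per line."""
    samples, tokens, labels = [], [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence, if there is one.
            if tokens:
                samples.append({"tokens": tokens, "labels": labels})
                tokens, labels = [], []
            continue
        token, label = line.rsplit(None, 1)  # the label is the last field
        tokens.append(token)
        labels.append(label)
    if tokens:  # flush the last sentence when the text has no trailing blank line
        samples.append({"tokens": tokens, "labels": labels})
    return samples


raw = """hydrogen B-substance
peroxide I-substance
reduced O

the B-animal
ants I-animal
varied O
"""

samples = read_tagged_text(raw)
print(json.dumps(samples, indent=2))
```

The hypothetical keys would still need to be mapped onto whatever structure the training scripts expect.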

Out of Memory

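When I run the model on my own data, the 1-shot data produces results, but 5-shot always runs out of memory. Is this because 5-shot produces more intermediate computation results?
Is there any way to solve this? (It still runs out of memory with grad_acc set to 4.) Thanks!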

GUM dataset

Hi,

Sorry to bother you, but I noticed some differences in the labels for entities between the version of the GUM dataset I obtained from https://github.com/amir-zeldes/gum and the version you used in the NER experiment. For example, in your xval_ner/ner_train_1.json file, the first sentence is annotated as ["hydrogen", "peroxide", "reduced", "infected", "ant", "fatalities", "by", "15", "and", "the", "ants", "varied", "their", "intake", "depending", "upon", "how", "high", "the", "peroxide", "concentration", "was"] with labels ["B-substance", "I-substance", "O", "B-abstract", "I-abstract", "I-abstract", "O", "B-quantity", "O", "B-animal", "I-animal", "O", "B-event", "I-event", "O", "O", "O", "O", "B-quantity", "I-quantity", "I-quantity", "O"]. However, in the version I obtained from the official website, the sentence is annotated as ["B-substance", "I-substance", "O", "B-event", "I-event", "I-event", "O", "B-event", "O", "B-animal", "I-animal", "O", "B-event", "I-event", "O", "O", "O", "O", "B-abstract", "I-abstract", "I-abstract", "O"].

I have checked the files multiple times, including the conllu and tsv formats, but I still cannot find the version you released. Could you please provide me with some hints or guidance on where I might have gone wrong?

Best regards.
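
To pin down exactly where the two annotations quoted above disagree, a small comparison helper can be used. This is only a sketch: the function name `diff_labels` is made up for illustration, and the token and label lists are copied from the post.

```python
def diff_labels(tokens, labels_a, labels_b):
    """Return (index, token, label_a, label_b) for every position that differs."""
    if not (len(tokens) == len(labels_a) == len(labels_b)):
        raise ValueError("tokens and both label lists must have the same length")
    return [
        (i, tok, a, b)
        for i, (tok, a, b) in enumerate(zip(tokens, labels_a, labels_b))
        if a != b
    ]


tokens = ["hydrogen", "peroxide", "reduced", "infected", "ant", "fatalities",
          "by", "15", "and", "the", "ants", "varied", "their", "intake",
          "depending", "upon", "how", "high", "the", "peroxide",
          "concentration", "was"]
repo_labels = ["B-substance", "I-substance", "O", "B-abstract", "I-abstract",
               "I-abstract", "O", "B-quantity", "O", "B-animal", "I-animal",
               "O", "B-event", "I-event", "O", "O", "O", "O", "B-quantity",
               "I-quantity", "I-quantity", "O"]
official_labels = ["B-substance", "I-substance", "O", "B-event", "I-event",
                   "I-event", "O", "B-event", "O", "B-animal", "I-animal",
                   "O", "B-event", "I-event", "O", "O", "O", "O", "B-abstract",
                   "I-abstract", "I-abstract", "O"]

diffs = diff_labels(tokens, repo_labels, official_labels)
for index, token, repo, official in diffs:
    print(f"{index:2d} {token:15s} repo={repo:12s} official={official}")
```

For this sentence the differences are confined to three spans: "infected ant fatalities", "15", and "the peroxide concentration".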

Error when running the multi-GPU command source ./scripts/run_L-Tapnet+CDT.sh 1,2 snips

Hello, I ran the multi-GPU command and got the error below. What is the cause?

Train-Batch Progress: 0%| | 0/5000 [00:03<?, ?it/s]
Epoch: 0%| | 0/1 [00:03<?, ?it/s]
Traceback (most recent call last):
  File "main.py", line 218, in <module>
    main()
  File "main.py", line 164, in main
    training_model, train_features, opt.warmup_epoch)
  File "/home/cike/zetaolian/FewShotTagging/utils/trainer.py", line 117, in do_train
    loss = self.do_forward(batch, model, epoch_id, step)
  File "/home/cike/zetaolian/FewShotTagging/utils/trainer.py", line 575, in do_forward
    label_output_mask,
  File "/home/cike/anaconda3/envs/zetaolian2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/cike/anaconda3/envs/zetaolian2/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/cike/anaconda3/envs/zetaolian2/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/cike/anaconda3/envs/zetaolian2/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/home/cike/anaconda3/envs/zetaolian2/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
StopIteration: Caught StopIteration in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/cike/anaconda3/envs/zetaolian2/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/home/cike/anaconda3/envs/zetaolian2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/cike/zetaolian/FewShotTagging/models/few_shot_seq_labeler.py", line 221, in forward
    support_token_ids, support_segment_ids, support_nwp_index, support_input_mask
  File "/home/cike/zetaolian/FewShotTagging/models/few_shot_seq_labeler.py", line 138, in get_context_reps
    support_nwp_index, support_input_mask
  File "/home/cike/anaconda3/envs/zetaolian2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/cike/zetaolian/FewShotTagging/models/fewshot_seqlabel/context_embedder_base.py", line 211, in forward
    support_token_ids, support_segment_ids, support_nwp_index, support_input_mask,
  File "/home/cike/zetaolian/FewShotTagging/models/fewshot_seqlabel/context_embedder_base.py", line 96, in concatenating_reps
    sequence_output, _ = self.bert(input_ids, segment_ids, input_mask, output_all_encoded_layers=False)
  File "/home/cike/anaconda3/envs/zetaolian2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/cike/anaconda3/envs/zetaolian2/lib/python3.7/site-packages/pytorch_pretrained_bert/modeling.py", line 708, in forward
    extended_attention_mask = extended_attention_mask.to(dtype=next(self.parameters()).dtype) # fp16 compatibility
StopIteration
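
The last frame of the traceback calls next(self.parameters()). Python's built-in next raises StopIteration when the iterator yields nothing, so the error means that parameters() was empty at that point. One possible explanation, which is an assumption and is not confirmed in this thread, is that the module copy running inside the DataParallel replica exposes no parameters. The sketch below reproduces only the pure-Python mechanism; `no_parameters` is a hypothetical stand-in and not a PyTorch API.

```python
def no_parameters():
    """Hypothetical stand-in for a parameters() call that yields nothing."""
    return iter([])


# next() without a default raises StopIteration on an empty iterator.
try:
    next(no_parameters())
    raised = False
except StopIteration:
    raised = True

# Passing a default value avoids the exception.
dtype = next(no_parameters(), "float32")

print(raised, dtype)
```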

Does this WARNING affect the experimental results?

07/03/2020 13:19:10 - WARNING - pytorch_pretrained_bert.optimization - Training beyond specified 't_total'. Learning rate multiplier set to 0.0. Please set 't_total' of WarmupLinearSchedule correctly.
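
To see what the warning describes, here is a minimal sketch of a warmup-then-linear-decay multiplier. It is written from the general shape of such schedules and is not copied from pytorch_pretrained_bert; the function name, the `warmup=0.1` default and the step counts are assumptions chosen for illustration.

```python
def warmup_linear_multiplier(step, t_total, warmup=0.1):
    """Learning rate multiplier: linear warmup, then linear decay to 0 at t_total."""
    progress = step / t_total
    if progress < warmup:
        return progress / warmup
    # Past t_total the decay would go negative, so it is clamped at 0.
    return max((progress - 1.0) / (warmup - 1.0), 0.0)


t_total = 100  # hypothetical number of optimizer steps the schedule was told about
for step in (5, 55, 100, 150):
    print(step, warmup_linear_multiplier(step, t_total))
```

Under this sketch, every optimizer step taken after `t_total` uses a learning rate of 0 and therefore leaves the weights unchanged, so whether results are affected depends on how many intended training steps fall beyond `t_total`.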

bug

File "./utils/model_helper.py", line 44, in make_model
trans_mat = opt.train_trans_mat
AttributeError: 'Namespace' object has no attribute 'train_trans_mat'
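
The error says that the parsed options object has no `train_trans_mat` attribute at the point where make_model reads it. A minimal sketch of the mechanism with argparse follows; the `--lr` option is hypothetical, and falling back to `None` is an assumption about what a sensible default would be, not the repo's behavior.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--lr", type=float, default=1e-5)  # hypothetical option
opt = parser.parse_args([])

# opt.train_trans_mat would raise AttributeError here, because the attribute
# was never defined by the parser or assigned afterwards.
has_attr = hasattr(opt, "train_trans_mat")

# getattr with a default reads the attribute without raising.
trans_mat = getattr(opt, "train_trans_mat", None)

print(has_attr, trans_mat)
```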

Error when running source ./scripts/run_L-Tapnet+CDT.sh 0 snips

The error message is as follows:

[START] set jobs on dataset [ snips ] on gpu [ 0 ]
[CLI]
Model: L-Tapnet-CDT.dec_crf.enc_bert.ems_tapnet-dbt.mlp__random_0.5.e_scl_learn0.01_none.lb_sep_scl_fix0.5.t_scl_none1_none.t_i_rand.-mk_tr_.sim_dot.lr_0.00001.up_lr_0.001.bs_4_4.sp_b_2.w_ep_1.ep_3
Task: snips.shots_1.cross_id_1.m_seed_10150
[CLI]
Epoch:   0%|                                              | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):                     | 0/5000 [00:00<?, ?it/s]
  File "main.py", line 218, in <module>
    main()
  File "main.py", line 164, in main
    training_model, train_features, opt.warmup_epoch)
  File "/home/liu-mh/FewShotTagging-master/utils/trainer.py", line 117, in do_train
    loss = self.do_forward(batch, model, epoch_id, step)
  File "/home/liu-mh/FewShotTagging-master/utils/trainer.py", line 575, in do_forward
    label_output_mask,
  File "/home/liu-mh/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/liu-mh/FewShotTagging-master/models/few_shot_seq_labeler.py", line 258, in forward
    mask=test_output_mask)
  File "/home/liu-mh/FewShotTagging-master/models/fewshot_seqlabel/conditional_random_field.py", line 298, in forward
    tags, mask)
  File "/home/liu-mh/FewShotTagging-master/models/fewshot_seqlabel/conditional_random_field.py", line 251, in _joint_likelihood
    emit_score = logits[i].gather(1, current_tag.view(batch_size, 1)).squeeze(1)
RuntimeError: Invalid index in gather at **/pytorch/aten/src/TH/generic/THTensorEvenMoreMath.cpp:457

What causes this problem?
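
The failing line gathers along dimension 1 of logits[i] using current_tag as the index, and gather rejects any index outside the size of that dimension. One plausible cause, stated here as an assumption, is a gold tag id that is negative or not smaller than the number of tags in the logits. A minimal pure-Python check for that condition, with a made-up helper name and made-up ids:

```python
def find_invalid_tags(tag_ids, num_tags):
    """Return (position, tag_id) for every id outside the range [0, num_tags)."""
    return [(i, t) for i, t in enumerate(tag_ids) if not 0 <= t < num_tags]


# Hypothetical example: logits with 5 columns, so valid tag ids are 0..4.
invalid = find_invalid_tags([0, 2, 5, -1], num_tags=5)
print(invalid)
```

Running such a check on the tag ids of a batch before the CRF call would show whether the data or the label mapping is the source of the out-of-range index.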

Code and dataset license

Can you add a LICENSE file to the repo clarifying the licenses for the code and the datasets used in the paper? This is essential for others to properly build on what you've done.
Thanks!

How to inspect prediction results?

Hi,

Thanks for releasing the code! May I know how to perform inference on the saved checkpoints? Basically, I want to take a look at the predicted BIO results of all the test sentences.

Thanks!

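About cross-validation

After looking through the code, I could not find the part that handles cross-validation. How should I set the parameters so that the model runs cross-validation?

About the data format

I would like to ask how the dataset format is organized. Why does every batch have its own support set, all from the same domain? Shouldn't there be one support set per train_set?

Evaluation metrics for experimental results

Besides the F1 score, do the experimental results include precision and recall? If so, where can I find them?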
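
For readers who want precision and recall alongside F1, here is a minimal sketch of span-level scoring over BIO sequences. It is an independent illustration and not the evaluation code used by the repo; the function names and the example sequences are made up.

```python
def extract_spans(labels):
    """Return the set of (start, end_exclusive, type) spans in a BIO sequence.

    An I- tag that does not continue a span of the same type is ignored.
    """
    spans, start, kind = set(), None, None
    for i, label in enumerate(list(labels) + ["O"]):  # sentinel closes the last span
        continues = label.startswith("I-") and kind == label[2:]
        if start is not None and not continues:
            spans.add((start, i, kind))
            start, kind = None, None
        if label.startswith("B-"):
            start, kind = i, label[2:]
    return spans


def precision_recall_f1(gold_seqs, pred_seqs):
    """Micro-averaged span-level precision, recall and F1."""
    tp = n_pred = n_gold = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        g, p = extract_spans(gold), extract_spans(pred)
        tp += len(g & p)  # a span counts only if boundaries and type both match
        n_pred += len(p)
        n_gold += len(g)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = [["B-a", "I-a", "O", "B-b"]]
pred = [["B-a", "I-a", "O", "O"]]
p, r, f1 = precision_recall_f1(gold, pred)
print(p, r, f1)
```

In this example the prediction finds one of the two gold spans and predicts nothing wrong, so precision is 1.0 and recall is 0.5.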

Dataset

Hi, yutai.

Why is the support set sometimes larger than the query set in your dataset?

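Questions about code details

Hello, and thank you very much for your work. After a quick look at the code, I have some questions about the following part of the original paper:
[image]

1. According to TapNet, Φ should be randomly initialized at the start, with one vector per tag. Is this line your implementation of it in the code?
[image]
The paper uses BIO-style labels, so shouldn't the dimensions of this matrix be (num_tags*2+1, bert_emb_dim)?

2. About the role of Φ: each tag corresponds to one vector, and the classes in the training and test sets do not overlap. Training can therefore only yield φi for the training classes, while at test time the φi for the test classes are still completely random. So what is Φ actually for?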

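There seems to be a bug?

I think I have found a bug.
[image]
It seems that a transpose should not be used here. The vh returned by the SVD has shape B x emb_dim x emb_dim (_, s, vh = torch.svd(error_every_class, some=False)), and for each batch element it is the column vectors that form the basis of the space. Here you seem to use the row vectors as the basis. With the M obtained this way, the final check of whether M projects error to 0 in the new space,
assert (torch.matmul(error_every_class, M) > 1E-6).sum().item() == 0
raises an error.
If the transpose is removed, it does not.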
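
For reference, the relation being tested can be written out as below. It assumes the documented convention of torch.svd, which returns V (not its transpose) such that input = U diag(S) Vᵀ. The symbols E (error_every_class for one batch element), r (its rank) and n (emb_dim) are introduced here for illustration.

```latex
E = U \Sigma V^{\top}, \qquad V = [\, v_1, \dots, v_n \,]

E v_j = \sigma_j u_j \quad (j \le r), \qquad E v_j = 0 \quad (j > r)

M = [\, v_{r+1}, \dots, v_n \,] \;\Longrightarrow\; E M = 0
```

Under this convention the null-space basis consists of columns of the returned matrix. Taking rows instead, which is what an extra transpose amounts to, gives an M for which E M is in general nonzero. The newer torch.linalg.svd returns Vᵀ rather than V, so which form is correct depends on which of the two functions is called.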

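Question about details

I would like to ask at what granularity your N-way is defined:

Is, for example, [Weather] treated as one class, or is it split so that [Weather] becomes [B-Weather] and [I-Weather], i.e. two classes?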
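
The two granularities in the question can be made concrete with a small sketch that expands N label names into their BIO tag set. The helper name and the example label names are made up, and the sketch does not say which counting the paper uses.

```python
def bio_tags(label_names):
    """Expand N label names into the 2N + 1 tags of a BIO scheme."""
    tags = ["O"]
    for name in label_names:
        tags += ["B-" + name, "I-" + name]
    return tags


labels = ["Weather", "City", "Time"]  # hypothetical 3-way label set
tags = bio_tags(labels)
print(len(labels), "labels ->", len(tags), "tags:", tags)
```

Counted at the label level this example is 3-way, while counted at the tag level it has 7 classes (2 * 3 + 1).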
