seominjoon / piqa

Phrase-Indexed Question Answering (PIQA)

Home Page: https://pi-qa.com

License: Apache License 2.0

Python 91.55% Shell 3.33% CSS 0.38% HTML 4.75%

emnlp2018

piqa's Issues

DataLoader with args.cache does not load elmo idx

https://github.com/uwnlp/piqa/blob/3a3404d82bf61a07241035eaf64be10233e266dd/squad/baseline/processor.py#L215-L237

The collate function checks self._elmo to decide whether ELMo is in use, but the Processor restored from the cache has _elmo set to False, even though the cache was saved while processing SA+ELMo.

The error log is as follows:

$ python main.py baseline --cuda --mode embed_question --iteration 501 --test_path $SQUAD_DEV_QUESTION_PATH --elmo --num_heads 2 --batch_size 32 --cache
...
 'train_path': '/home/jinhyuk/data/squad/train-v1.1.json',
 'train_steps': 0,
 'word_vocab_size': 10000}
Model loaded from /tmp/piqa/squad/save/501/model.pt
Saving embeddings
Traceback (most recent call last):
  File "main.py", line 277, in <module>
    main()
  File "main.py", line 258, in main
    embed(args)
  File "main.py", line 240, in embed
    question_output = model.get_question(**test_batch)
  File "/home/jinhyuk/github/piqa/squad/baseline/model.py", line 285, in get_question
    q = self.question_embedding(question_char_idxs, question_glove_idxs, question_word_idxs, ex=question_elmo_idxs)
  File "/home/jinhyuk/anaconda3/envs/p3.6/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jinhyuk/github/piqa/squad/baseline/model.py", line 98, in forward
    elmo, = self.elmo(ex)['elmo_representations']
  File "/home/jinhyuk/anaconda3/envs/p3.6/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jinhyuk/anaconda3/envs/p3.6/lib/python3.6/site-packages/allennlp/modules/elmo.py", line 133, in forward
    original_shape = inputs.size()
AttributeError: 'NoneType' object has no attribute 'size'
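
One possible way to avoid this kind of mismatch is to re-apply feature flags from the current run after restoring a cached processor, instead of trusting the flag stored in the cache. The sketch below is only an illustration under assumptions: the toy Processor class, the restore_processor helper, and the word_idxs / elmo_idxs keys are hypothetical stand-ins, not the repository's actual code.

```python
class Processor:
    """Toy stand-in for the real Processor (hypothetical, much simplified)."""

    def __init__(self, elmo=False):
        self._elmo = elmo

    def collate(self, examples):
        batch = {'word_idxs': [ex['word_idxs'] for ex in examples]}
        # ELMo indices are only batched when the flag is set; otherwise the
        # model receives None, which is what triggers the error above.
        if self._elmo:
            batch['elmo_idxs'] = [ex['elmo_idxs'] for ex in examples]
        else:
            batch['elmo_idxs'] = None
        return batch


def restore_processor(cached_state, elmo):
    """Rebuild a processor from cached attributes, then override its flags."""
    processor = Processor()
    processor.__dict__.update(cached_state)
    processor._elmo = elmo  # the current --elmo argument wins over the cache
    return processor


# A cache that was (wrongly) stored with _elmo=False, restored with --elmo on.
processor = restore_processor({'_elmo': False}, elmo=True)
batch = processor.collate([{'word_idxs': [1, 2], 'elmo_idxs': [[3], [4]]}])
print(batch['elmo_idxs'])
```

With the override in place, the batch carries the ELMo indices even though the cached flag was False.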

Error while running piqa_evaluate.py

$ python piqa_evaluate.py $SQUAD_DEV_PATH /tmp/piqa/context_emb/ /tmp/piqa/question_emb/
Traceback (most recent call last):
  File "piqa_evaluate.py", line 151, in <module>
    progress=args.progress)
  File "piqa_evaluate.py", line 123, in get_predictions
    m = sim.max(1)
  File "/home/jinhyuk/anaconda3/envs/p3.6/lib/python3.6/site-packages/numpy/core/_methods.py", line 28, in _amax
    return umr_maximum(a, axis, None, out, keepdims, initial)
numpy.core._internal.AxisError: axis 1 is out of bounds for array of dimension 1

Changing the code at

https://github.com/uwnlp/piqa/blob/c6e871919a6c664d53db58fd2d7c047843ce2e75/piqa_evaluate.py#L119-L125

to

        else:
            q_emb = q_emb['arr_0']
            c_emb = c_emb['arr_0']
            m = np.matmul(c_emb, q_emb.T)
            # m = sim.max(1)

        argmax = m.argmax(0)

resolves the error, but I don't know why the original code took that form.
With q_emb.shape = (1024,) and c_emb.shape = (1008, 1024), the product has shape (1008,),
so taking argmax(0) directly seems fine.
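
The shape problem can be reproduced and handled in isolation. The sketch below assumes, without confirmation from the source, that the original max(1) was written for a 2-D question array with one row per question vector. The helper name best_phrase_index and the toy arrays are hypothetical. np.atleast_2d lets the same code path accept either a single vector of shape (D,) or a stack of shape (Q, D).

```python
import numpy as np


def best_phrase_index(c_emb, q_emb):
    """Index of the context phrase with the largest inner product.

    c_emb: (N, D) phrase embeddings.
    q_emb: (D,) single question vector, or (Q, D) stack of question vectors.
    """
    q_emb = np.atleast_2d(q_emb)        # (D,) becomes (1, D); (Q, D) unchanged
    sim = np.matmul(c_emb, q_emb.T)     # (N, Q) similarity matrix
    m = sim.max(1)                      # best score per phrase, shape (N,)
    return int(m.argmax(0))


c_emb = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]])

# 1-D question vector: the case that raised AxisError with sim.max(1).
print(best_phrase_index(c_emb, np.array([0.0, 1.0])))                 # 1

# 2-D question array with two vectors.
print(best_phrase_index(c_emb, np.array([[0.0, 1.0], [1.0, 0.0]])))   # 2
```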

leaderboard?

Hey,
The arXiv paper links to this repo for the leaderboard, but none is available here. Have no valid submissions been received yet?

Thanks

faiss does not provide pip installation

I suggest installing faiss with Conda, since it cannot be installed with pip.
We may need to remove faiss from requirements.txt.
The Docker image works fine.

CUDA out of memory when using ELMo on a 12GB GPU

This did not happen previously, so I am currently investigating the issue.

For the initial run, nltk download needed

An error occurs on the first run if nltk.download('punkt') has not been called.
Maybe README.md or download.sh should be updated?
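
Besides documenting the step, the download could be guarded in code. The following is a minimal sketch, not the repository's implementation: the helper name ensure_nltk_resource and its parameters are hypothetical, and the resource names are left as arguments because they may differ across NLTK versions. It relies on nltk.data.find raising LookupError when a resource is missing.

```python
def ensure_nltk_resource(resource_path='tokenizers/punkt', package='punkt',
                         nltk_module=None):
    """Download an NLTK resource only if it is not already installed.

    Returns True if a download was triggered, False otherwise.
    nltk_module exists only so the helper can be exercised without NLTK.
    """
    if nltk_module is None:
        import nltk as nltk_module  # imported lazily; requires NLTK installed
    try:
        nltk_module.data.find(resource_path)  # raises LookupError if missing
        return False
    except LookupError:
        nltk_module.download(package)
        return True
```

Calling ensure_nltk_resource() once at the start of main.py, before any tokenization, would make the first run work without a manual download step.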

AllenNLP does not support Python 3.7 yet

Performance difference between evaluate.py vs piqa_evaluate.py

The two evaluation scripts report different scores:

$ python evaluate.py $SQUAD_DEV_PATH /tmp/piqa/pred.json 
{"exact_match": 52.81929990539262, "f1": 63.28879733489547}
$ python piqa_evaluate.py $SQUAD_DEV_PATH /tmp/piqa/context_emb/ /tmp/piqa/question_emb/
{"exact_match": 52.28949858088931, "f1": 62.72236634535493}

The difference is about 0.5 to 0.6 points; the tested model is LSTM+SA+ELMo.

TypeError: argument of type 'method' is not iterable

Training using

$ python main.py baseline --cuda

and testing with

$ python main.py baseline --cuda --mode test --iteration 501

produces the following error:

...
Model loaded from /tmp/piqa/squad/save/501/model.pt
Traceback (most recent call last):
  File "main.py", line 257, in <module>
    main()
  File "main.py", line 236, in main
    test(args)
  File "main.py", line 162, in test
    test_dataset = tuple(processor.preprocess(example) for example in test_examples)
  File "main.py", line 162, in <genexpr>
    test_dataset = tuple(processor.preprocess(example) for example in test_examples)
  File "/home/jinhyuk/github/piqa/squad/baseline/processor.py", line 124, in preprocess
    context_word_idxs = tuple(map(self._word2idx, context_words))
  File "/home/jinhyuk/github/piqa/squad/baseline/processor.py", line 276, in _word2idx
    return self._word2idx_dict[word] if word in self._word2idx_dict else 1
TypeError: argument of type 'method' is not iterable

Code version: fa113ab
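
In Python, this message appears when the right-hand side of "in" is a bound method rather than a container. The sketch below reproduces that situation with hypothetical toy classes. It does not claim to show how _word2idx_dict ended up as a method in the repository, only what the traceback implies about its type at that point. The fallback index 1 for unknown words comes from the traceback; the other index values are assumptions.

```python
class BrokenProcessor:
    """Hypothetical: _word2idx_dict is a method here, not a dict."""

    def _word2idx_dict(self):
        return {}

    def _word2idx(self, word):
        # 'word in <bound method>' raises TypeError.
        return self._word2idx_dict[word] if word in self._word2idx_dict else 1


class FixedProcessor:
    """Hypothetical: _word2idx_dict is a plain dict built up front."""

    def __init__(self, vocab):
        # Indices start at 2 here on the assumption that 0 and 1 are reserved.
        self._word2idx_dict = {word: i + 2 for i, word in enumerate(vocab)}

    def _word2idx(self, word):
        return self._word2idx_dict.get(word, 1)  # 1 for unknown words


try:
    BrokenProcessor()._word2idx('hello')
except TypeError as error:
    print('TypeError:', error)

fixed = FixedProcessor(['hello', 'world'])
print(fixed._word2idx('hello'), fixed._word2idx('unseen'))
```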

model.init(metadata) for test/embed

An error occurs when loading a model that uses ELMo, because model.elmo has not been initialized.
model.init(metadata) should be called in the test() and embed() functions, as it is in train().
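
One way to keep the three entry points consistent is to route model construction through a single helper, so that init(metadata) cannot be skipped. The sketch below is framework-free and purely illustrative: ToyModel, build_model, and the elmo_scale metadata key are hypothetical and stand in for the real model and its ELMo module.

```python
class ToyModel:
    """Stand-in for a model whose ELMo module is built lazily by init()."""

    def __init__(self, use_elmo=False):
        self.use_elmo = use_elmo
        self.elmo = None  # created by init(), not by __init__()

    def init(self, metadata):
        if self.use_elmo:
            # Stand-in for constructing the real ELMo module from metadata.
            scale = metadata['elmo_scale']
            self.elmo = lambda idxs: [scale * i for i in idxs]

    def forward(self, idxs):
        if self.use_elmo and self.elmo is None:
            raise RuntimeError('call init(metadata) before using the model')
        return self.elmo(idxs) if self.use_elmo else list(idxs)


def build_model(metadata, use_elmo):
    """Single construction path that train(), test() and embed() could share."""
    model = ToyModel(use_elmo=use_elmo)
    model.init(metadata)
    return model


model = build_model({'elmo_scale': 2}, use_elmo=True)
print(model.forward([1, 2, 3]))  # [2, 4, 6]
```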

About computation of Elmo

It seems that model.py maintains its own scale and mixing parameters for computing the averaged ELMo representation. However, from my understanding of allennlp.modules.elmo.Elmo, this is already taken care of by the Elmo class.

Is there any reason for re-mixing the already mixed Elmo representations returned? Or am I misunderstanding something?

Thanks!
Bhuwan
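
For reference, the scalar mix in question can be written out in a few lines. This is a NumPy sketch of the general formula, gamma times the softmax-weighted sum of layer representations, and not the code from model.py or AllenNLP; the function name scalar_mix and the toy inputs are assumptions. It also shows why re-mixing a single, already mixed representation can only rescale it: the softmax over one weight is 1, so only gamma has any effect.

```python
import numpy as np


def scalar_mix(layers, weights, gamma):
    """Return gamma * sum_k softmax(weights)[k] * layers[k]."""
    w = np.asarray(weights, dtype=float)
    s = np.exp(w - w.max())   # numerically stable softmax
    s /= s.sum()
    return gamma * sum(s_k * layer for s_k, layer in zip(s, layers))


# Three toy "layers" of shape (tokens, dim) with equal mixing weights.
layers = [np.full((2, 4), float(k)) for k in range(3)]
mixed = scalar_mix(layers, weights=[0.0, 0.0, 0.0], gamma=1.0)

# Mixing one already mixed representation a second time only rescales it.
remixed = scalar_mix([mixed], weights=[0.7], gamma=2.0)
print(np.allclose(remixed, 2.0 * mixed))  # True
```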
