
qanet-pytorch's People

Contributors

bangliu

qanet-pytorch's Issues

Resume checkpoint

Why, when I resume from a checkpoint to continue training, is the loss normal but the test result for the new epoch very low, like after the first epoch?
For example, I trained the model for 10 epochs, then resumed from the 10th checkpoint and continued with the 11th epoch. During the 11th epoch the training loss is normal and low, but when the epoch ends, the test result is very low, similar to the 1st epoch's result. Can you tell me the reason? Thank you very much.

File about data and Embedding

Hi, the repository contains no code for obtaining the SQuAD data or the embedding file, so I would like to know how to download those files and where to store them.

filter_func in SQuAD.py

On line 308 you wrote "# !!! use last answer as the target answer" and "start, end = example["y1s"][-1], example["y2s"][-1]".

However, in def filter_func(config, example) on line 40, the third condition is "(example["y2s"][0] - example["y1s"][0]) > config.ans_limit", which means the first answer is treated as the target answer (the index is 0, not -1)?
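The mismatch described above can be illustrated with a small sketch. The helper `answer_span_too_long`, the `ans_limit` value, and the example data are all hypothetical and are not taken from SQuAD.py; only the `y1s`/`y2s` keys and the two indices (0 and -1) come from the lines quoted in this issue.

```python
from types import SimpleNamespace


def answer_span_too_long(example, ans_limit, index):
    # Length check on the answer stored at position `index` of y1s/y2s.
    return (example["y2s"][index] - example["y1s"][index]) > ans_limit


config = SimpleNamespace(ans_limit=30)  # hypothetical limit

# Hypothetical example with two annotated answers of different lengths.
example = {"y1s": [5, 10], "y2s": [8, 60]}

# The quoted filter condition inspects the first answer (index 0) ...
checked_first = answer_span_too_long(example, config.ans_limit, index=0)
# ... while the quoted training target is the last answer (index -1).
checked_last = answer_span_too_long(example, config.ans_limit, index=-1)

print(checked_first, checked_last)
```

With data like this, the example passes a filter that checks index 0 even though the answer at index -1 exceeds the limit. Using the same index in both places would make the two checks agree.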

No softmax in Pointer class

Hello, I would like to know why you do not apply a softmax in the last lines of the Pointer class:
Y1 = mask_logits(self.w1(X1).squeeze(), mask)
Y2 = mask_logits(self.w2(X2).squeeze(), mask)
return Y1, Y2

Thank you!
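One possible explanation, offered as an assumption because the training loop is not shown in this issue: if the returned logits go into a loss that applies log-softmax internally (PyTorch's `F.cross_entropy` does this), an explicit softmax in Pointer would be redundant. Softmax also preserves the ordering of the scores, so the predicted position is the same either way. The plain-Python sketch below uses hypothetical helper names, and the `-1e30` entry only mimics a masked-out position.

```python
import math


def log_softmax(logits):
    # Numerically stable log-softmax over a 1-D list of scores.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def cross_entropy_from_logits(logits, target):
    # Negative log-likelihood of the target index, computed from raw logits.
    return -log_softmax(logits)[target]


logits = [2.0, -1.0, 0.5, -1e30]  # last entry mimics a masked position
probs = [math.exp(v) for v in log_softmax(logits)]

# The argmax is the same before and after softmax.
best_from_logits = max(range(len(logits)), key=logits.__getitem__)
best_from_probs = max(range(len(probs)), key=probs.__getitem__)

# The loss is computed directly from logits, with no separate softmax step.
loss = cross_entropy_from_logits(logits, target=0)
```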

No directory

Hello.
After the "Generating word embedding" step, I get a "no directory" error for datasets/processed/SQuAD/word_emb.pkl.
How can I solve it?

Add license

Hi, thanks for the implementation. Would love to use it but can't do it right now without a license. Would you mind adding one?

arguments are located on different GPUs

Hi BangLiu, thanks for this awesome code :)

I want to run QANet on 4 GPUs, so I changed the related settings as follows:

parser.add_argument('--with_cuda', default=True)
parser.add_argument('--multi_gpu', default=True)

model = nn.DataParallel(model, device_ids=[0, 1, 2, 3])

I am using torch 0.4.0 and CUDA 9.0, and I get the runtime error "arguments are located on different GPUs", as follows:

Traceback (most recent call last):
  File "QANet_main.py", line 633, in <module>
    p1, p2 = trainer.model(context_wids, context_cids, question_wids, question_cids)
  File "/home/fuyuwei/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/fuyuwei/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 114, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/fuyuwei/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 124, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/fuyuwei/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 65, in parallel_apply
    raise output
  File "/home/fuyuwei/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 41, in _worker
    output = module(*input, **kwargs)
  File "/home/fuyuwei/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/fuyuwei/CODE/_QANet/model/QANet_andy.py", line 353, in forward
    Ce = self.emb_enc(C, maskC, 1, 1)
  File "/home/fuyuwei/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/fuyuwei/CODE/_QANet/model/QANet_andy.py", line 224, in forward
    out = PosEncoder(x)
  File "/home/fuyuwei/CODE/_QANet/model/QANet_andy.py", line 55, in PosEncoder
    return (x + signal.to(device)).transpose(1, 2)
RuntimeError: arguments are located on different GPUs at /pytorch/aten/src/THC/generated/../generic/THCTensorMathPointwise.cu:233

It seems the error occurs in PosEncoder(), where the input x and signal.to(device) are on different GPUs.

Could you please provide some clues on how to solve this problem? Many thanks.
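A possible direction, sketched under assumptions: with nn.DataParallel each replica receives its inputs on its own GPU, so moving `signal` to one module-level `device` can leave it on a different GPU than `x`. Moving it to `x.device` keeps both operands on the same device. The helper name and shapes below are hypothetical; this is not the repository's PosEncoder, and the example runs on CPU only to show the pattern.

```python
import torch


def add_signal_on_input_device(x: torch.Tensor, signal: torch.Tensor) -> torch.Tensor:
    # Follow the device of this replica's input instead of a global `device`.
    return x + signal.to(x.device)


x = torch.zeros(2, 4, 8)   # illustrative (batch, length, channels) input
signal = torch.ones(4, 8)  # illustrative positional signal, broadcast over batch
out = add_signal_on_input_device(x, signal)
```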

embedding function

Why do we need "ch_emb = ch_emb.squeeze()" in the Embedding class in QANet_andy.py? When I test with batch_size = 1, this line makes ch_emb have fewer dimensions than wd_emb, which leads to an error when executing "emb = torch.cat([ch_emb, wd_emb], dim=1)".
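The shape effect described above can be sketched without a GPU. `Tensor.squeeze()` with no argument drops every axis of size 1, including a batch axis of size 1, whereas `squeeze(dim)` only touches the named axis. The helper below is a hypothetical plain-Python stand-in that mimics this shape rule, and the concrete shapes are assumptions rather than values from QANet_andy.py.

```python
def squeeze_shape(shape, dim=None):
    # Mimics the shape rule of Tensor.squeeze: with no `dim`, every size-1
    # axis is dropped; with `dim`, only that axis is dropped, and only if
    # its size is 1.
    if dim is None:
        return tuple(s for s in shape if s != 1)
    dim = dim % len(shape)
    return tuple(s for i, s in enumerate(shape) if not (i == dim and s == 1))


# Assumed layout: (batch, channels, length).
batch_32 = squeeze_shape((32, 64, 400))  # unchanged, still 3-D
batch_1 = squeeze_shape((1, 64, 400))    # batch axis lost, now 2-D

# Squeezing one named axis keeps the batch axis even when batch is 1.
only_last = squeeze_shape((1, 64, 400, 1), dim=-1)

print(batch_32, batch_1, only_last)
```

A 2-D ch_emb can no longer be concatenated with a 3-D wd_emb along dim=1, which matches the error reported for batch_size = 1.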

Questions about Back translation

Thank you for sharing your wonderful work.
I wonder whether I can find the back-translation code for data augmentation.
Is it contained in this repository?
I look forward to your reply, thanks.
