Replication on Quora dataset?

Hi! I am trying to run an experiment on the Quora dataset. I am using the dataset split provided by https://github.com/zhiguowang/BiMPM and created a quora.w2v file analogous to askubuntu.w2v and meta.w2v. Training crashes partway through the first epoch with the following output:

Using Theano backend.
INFO:Reading training sentence pairs from data/quora/train.tsv:
/ 298204 Elapsed Time: 0:10:34
/home/andrada.pumnea/anaconda3/lib/python3.6/site-packages/bs4/__init__.py:219: UserWarning: "b'.'" looks like a filename, not markup. You shouldprobably open this file and pass the filehandle intoBeautiful Soup.
  'Beautiful Soup.' % markup)
| 384347 Elapsed Time: 0:13:40
INFO:...read 384348 pairs in 820.31 seconds.
INFO:...class distribution: 0 = 245042 (63.8%) | 1 = 139306 (36.2%)
INFO:Reading validation sentence pairs from data/quora/dev.tsv:
| 9999 Elapsed Time: 0:00:21
INFO:...read 10000 pairs in 21.21 seconds.
INFO:...class distribution: 0 = 5000 (50.0%) | 1 = 5000 (50.0%)
INFO:Reading testing sentence pairs from data/quora/test.tsv:
| 9999 Elapsed Time: 0:00:21
INFO:...read 10000 pairs in 21.26 seconds.
INFO:...class distribution: 0 = 5000 (50.0%) | 1 = 5000 (50.0%)
INFO:Vectorizing data:
INFO:...fitted tokenizer in 14.60 seconds;
INFO:...found 103831 unique tokens;
INFO:Load embeddings from models/quora2.w2v:
INFO:...read 36111 word embeddings in 2.82 seconds;
INFO:...created embedding matrix with shape (103832, 200);
INFO:...cached matrix in file models/quora2.w2v.min.cache.npy.
INFO:Creating CNN model:
INFO:...model created.
INFO:Compiling model:
INFO:...model 0105d13fe81945018824e64905d8f7ad compiled with optimizer: <keras.optimizers.SGD object at 0x7fd9dd23cef0>, lr (sgd-only): 0.005, loss: mse.
Model summary:

Layer (type)                     Output Shape          Param #     Connected to
input_1 (InputLayer)             (None, None)          0
input_2 (InputLayer)             (None, None)          0
embedding_1 (Embedding)          (None, None, 200)     20766400    input_1[0][0]
                                                                   input_2[0][0]
convolution1d_1 (Convolution1D)  (None, None, 300)     180300      embedding_1[0][0]
                                                                   embedding_1[1][0]
globalmaxpooling1d_1 (GlobalMaxPo(None, 300)           0           convolution1d_1[0][0]
                                                                   convolution1d_1[1][0]
activation_1 (Activation)        (None, 300)           0           globalmaxpooling1d_1[0][0]
                                                                   globalmaxpooling1d_1[1][0]
merge_1 (Merge)                  (None, 1)             0           activation_1[0][0]
                                                                   activation_1[1][0]

Total params: 20946700

INFO:Train on 384348 samples, validate on 10000 samples
INFO:Epoch 1/1
2% (11127 of 384348) |### | Elapsed Time: 0:23:50 ETA: 13:16:51
Parameter 8 to routine SGEMM NTCSGEMV SGER was incorrect
Floating point exception (core dumped)
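
As a sanity check on the log above, the reported parameter counts can be reproduced from the reported shapes. This is only a sketch of the arithmetic: the convolution filter width is inferred from the parameter count rather than stated in the log, and the extra embedding row is presumably a reserved index 0, which is an assumption.

```python
# Figures copied from the log above.
unique_tokens = 103831    # "found 103831 unique tokens"
pretrained = 36111        # "read 36111 word embeddings"
vocab_rows = 103832       # embedding matrix rows (token count + 1; assumed reserved index 0)
embedding_dim = 200
conv_filters = 300
conv_params = 180300      # reported for convolution1d_1

# Embedding layer: one vector of size embedding_dim per row.
embedding_params = vocab_rows * embedding_dim

# A 1D convolution has (width * input_dim + 1) weights per filter; the +1 is the bias.
# Solving for the width is an inference from the counts, not a value from the log.
filter_width = (conv_params // conv_filters - 1) // embedding_dim

total_params = embedding_params + conv_params

# Share of the vocabulary that has a vector in the .w2v file.
coverage = pretrained / unique_tokens

print(embedding_params)      # 20766400, matches embedding_1
print(filter_width)          # 3
print(total_params)          # 20946700, matches "Total params"
print(f"{coverage:.1%}")     # 34.8%
```

So the summary is internally consistent, and roughly two thirds of the tokens in the vocabulary have no vector in the .w2v file.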

I am using Ubuntu 16.04.3.

Any idea why it happened and how it can be fixed?
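
One thing that may be worth ruling out: the Beautiful Soup warning in the log suggests that at least one question consists only of ".", which could end up as an empty token sequence. Whether that is related to the crash is not established. The sketch below lists such pairs; `simple_tokens` is a hypothetical stand-in for the project's own tokenizer, and `find_short_pairs` and the sample data are made up for illustration.

```python
import re


def simple_tokens(text):
    """Hypothetical stand-in tokenizer: lowercase runs of word characters."""
    return re.findall(r"\w+", text.lower())


def find_short_pairs(pairs, min_tokens=1):
    """Return indices of pairs where either question has fewer than min_tokens tokens."""
    short = []
    for i, (q1, q2) in enumerate(pairs):
        if len(simple_tokens(q1)) < min_tokens or len(simple_tokens(q2)) < min_tokens:
            short.append(i)
    return short


# Made-up sample pairs, not taken from the Quora data.
sample = [
    ("How do I learn Python?", "What is the best way to learn Python?"),
    (".", "What is this?"),
    ("Why?", "?"),
]

print(find_short_pairs(sample))  # [1, 2]
```

Running the same check over the question pairs in train.tsv, with the real tokenizer swapped in, would show whether any pair reaches the model with no tokens at all.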

nlx-group / replicating-bogdanova-et-al.-2015-duplicate-question-detection
