
stanfordnlp / treelstm

878 stars · 49 watchers · 240 forks · 71 KB

Tree-structured Long Short-Term Memory networks (http://arxiv.org/abs/1503.00075)

License: GNU General Public License v2.0

Shell 0.48% Lua 69.11% Java 12.24% Python 18.16%

treelstm's People

Contributors

jowagner · kaishengtai


treelstm's Issues

Using the trained model for new relatedness scores?

As per the instructions in the README, I ran the relatedness code for 10 epochs, which produced a new trained_models folder containing the trained model. I am not sure how to use this trained model with my own two sentences to compute a relatedness score between them.

Unable to convert data/glove/glove.840B.300d.txt to Torch serialized format

Error occurs in convert-wordvecs.lua at

vecs[{i, j}] = tonumber(tokens[j + 1])


The conversion fails because the data/glove/glove.840B.300d.txt file contains non-UTF-8 and non-ASCII characters. Has anyone else faced this issue with the data/glove/glove.840B.300d.txt file?

I changed the for loop to skip a line (neither converting it nor writing it to the vocab) when the second token is not a number. (The problem is that tonumber returns nil when the token is not numeric.)

for i = 1, count do
 repeat
  xlua.progress(i, count)
  local tokens = stringx.split(file:read())
  -- skip malformed lines: tonumber returns nil when the second
  -- token is not numeric (e.g. the word itself contains spaces)
  if tonumber(tokens[2]) == nil then break end  -- 'break' exits the repeat, acting as a 'continue'
  local word = tokens[1]
  vocab:write(word .. '\n')
  for j = 1, dim do
   vecs[{i, j}] = tonumber(tokens[j + 1])
  end
 until true
end

The above fix solves the issue, but I would like to know if this is the correct solution for the problem.
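For reference, the same skip-malformed-lines logic can be sketched standalone in Python (not part of this repo; glove.840B.300d.txt is known to contain keys with embedded spaces, which break naive whitespace splitting):

```python
def read_glove(path, dim):
    """Read GloVe vectors, skipping lines whose fields don't parse.

    A line is skipped when it doesn't have exactly dim + 1 fields
    (the key contained spaces) or when a vector component is not
    numeric -- the same condition the Lua fix checks via tonumber.
    """
    words, vecs = [], []
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            tokens = line.rstrip("\n").split(" ")
            if len(tokens) != dim + 1:
                continue  # malformed: key contained spaces
            try:
                vec = [float(t) for t in tokens[1:]]
            except ValueError:
                continue  # non-numeric component
            words.append(tokens[0])
            vecs.append(vec)
    return words, vecs
```

Unlike the Lua fix, this variant keeps words and vectors aligned, since skipped lines produce neither a vocab entry nor a vector row.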

Segmentation fault when running the constituency tree-lstm

Similar to Issue #8, I encounter "Segmentation fault (core dumped)" at a random epoch when using the constituency Tree-LSTM.
I'm running several trials on the SICK dataset with the command provided, without any changes to the code:
th relatedness/main.lua --model constituency
How can I solve this issue?

Here is a sample of the console output:

--------------------------------------------------------------------------------	
Constituency Tree LSTM for Semantic Relatedness	
--------------------------------------------------------------------------------	
loading word embeddings	
unk count = 5	
loading datasets	
num train = 4500
num dev   = 500
num test  = 4927
--------------------------------------------------------------------------------	
model configuration	
--------------------------------------------------------------------------------	
max epochs = 10
num params                = 241655
num compositional params  = 226350
word vector dim           = 300
Tree-LSTM memory dim      = 150
regularization strength   = 1.00e-04
minibatch size            = 25
learning rate             = 5.00e-02
word vector learning rate = 0.00e+00
parse tree type           = constituency
sim module hidden dim     = 50
--------------------------------------------------------------------------------	
Training model	
--------------------------------------------------------------------------------	
-- epoch 1
 [======================================== 4500/4500 ==================================>]  Tot: 9m11s | Step: 122ms     
-- finished epoch in 551.54s
 [======================================== 500/500 ====================================>]  Tot: 20s578ms | Step: 45ms   
-- dev score: 0.6270
-- epoch 2
Segmentation fault (core dumped)========== 4151/4500 ===========================>.......]  ETA: 41s839ms | Step: 119ms 

Segmentation fault (core dumped) in Dependency Tree LSTM for Semantic Relatedness

Hi,

I am having the following error on a dataset containing only two classes (two score values: 1 and 2). Also, the segmentation fault occurs at different epochs.

I have made the following changes to handle two classes:

  1. self.num_classes = 2 and torch.range(1, 2):dot(output:exp()) in TreeLSTM.lua.
  2. dataset.labels[i] = (sim_file:readDouble() - 1) in read_data.lua.

I am running on Ubuntu 14.04.4 LTS and using LuaJIT 2.1.0-beta1.

Running error

I'd like to ask: when I run your program, the following error occurs:
"treelstm-master/util/Vocab.lua:19: attempt to index local 'file' (a nil value)"

When I looked into the file directory, there is no file named "glove.840B.300d.th"; there is only "glove.840B.300d.txt.gz".
Thanks.

Stanford Sentiment Treebank

Under the Stanford Sentiment Treebank directory there are several files, which are explained clearly in the README.txt file. I have my own dataset and would like to train on it using the Stanford Sentiment model (RNTN), but I am wondering how to generate the "STree.txt" file from my dataset. The data in "STree.txt" has the form:

40|39|38|37|36|34|33|32|32|31|30|30|29|28|26|25|24|23|22|22|27|23|24|25|26|27|28|29|41|31|35|33|34|35|36|37|38|39|40|41|0
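For context, my reading of the SST convention (not documented in this repo): each "|"-separated field is the 1-indexed parent of node i, with leaf tokens listed first and internal nodes after, and 0 marking the root's parent. A small sketch that decodes such a line:

```python
def parse_stree_line(line):
    """Decode one STree.txt line into a parent array, children map, and root.

    Field i (1-indexed) holds the parent of node i; the root has parent 0.
    """
    parents = [int(p) for p in line.strip().split("|")]
    children = {}
    root = None
    for i, p in enumerate(parents, start=1):
        if p == 0:
            root = i          # node whose parent is 0 is the root
        else:
            children.setdefault(p, []).append(i)
    return parents, children, root
```

Generating such a file for a new dataset would require parsing each sentence into a binary tree first and then emitting the parent index of every node in this order.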

mini-batch in forward computing?

I'm focusing on Tree-LSTM. My question is:

  1. Can the Torch implementation of Tree-LSTM do mini-batching during the forward pass?
  2. If not, has anyone seen a mini-batch implementation of Tree-LSTM in Theano/TensorFlow/PyTorch/MXNet/DyNet/TF-Fold or any other framework, or even C++?
  3. For a "dynamic" framework like Torch, would mini-batching speed up the computation?

I'm new to Torch, but from the code in sentiment/TreeLSTMSentiment.lua:

local loss = 0
for j = 1, batch_size do
  local idx = indices[i + j - 1]
  local sent = dataset.sents[idx]
  local tree = dataset.trees[idx]

  local inputs = self.emb:forward(sent)
  local _, tree_loss = self.treelstm:forward(tree, inputs)
  loss = loss + tree_loss
  local input_grad = self.treelstm:backward(tree, inputs, {zeros, zeros})
  self.emb:backward(sent, input_grad)

I think the forward and backward passes are computed one sample at a time, and the gradients are accumulated over a mini-batch before updating. So it is not mini-batch forward computation, but mini-batch updating.
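The accumulate-then-update pattern described above can be sketched in miniature (plain Python with a scalar linear model; illustrative only, not the repo's code):

```python
def train_minibatch(w, samples, lr=0.1):
    """One mini-batch update by gradient accumulation.

    Mirrors the Torch snippet: each sample's forward and backward pass
    runs individually; gradients are summed over the batch, and a single
    parameter update is applied at the end.
    """
    grad = 0.0
    loss = 0.0
    for x, y in samples:
        pred = w * x                  # per-sample forward pass
        err = pred - y
        loss += err * err             # accumulate squared-error loss
        grad += 2 * err * x           # accumulate gradient (backward pass)
    w -= lr * grad / len(samples)     # one update per mini-batch
    return w, loss / len(samples)
```

The trees never share a forward pass, so the per-sample loop cost is unchanged; only the update frequency reflects the batch size.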

@kaishengtai Would you help?

Can this be configured to run on GPU?

I have tried looking this up, but it seems we need to mark each tensor used here as a GPU tensor. Is there another way to run the entire code on the GPU?

in semantic relatedness task

In the semantic relatedness task, when I choose the dependency model with 10 epochs, 300K training pairs, and 70K development pairs, it fails with "not enough memory".

CPU is 24 core and 32 GB ram with 12 GB GPU

TIP: Can this problem be solved by reducing the batch size? If so, what batch size does this code currently use?

the result is "nan" in semantic relatedness task

While training the model, the dev score is nan, and the final score is nan. I opened the prediction file; the results are all nan. So I checked the data, and it looks fine. I have no idea why the result is nan.

sentiment classification task

Dear sir, I found that the sentiment classification task uses the file "STree.txt" during the preprocessing stage. Is it necessary, and how can I generate this file for my own dataset?

Fixed an Error: ./scripts/download.py tries to download a non-existent file

This looks like a really cool project! I'm testing it for potential use for my CS221 final project :)

The current version of download.py in the scripts folder tries to download:
/data/glove.840B.300d.txt.gz
which does not exist. It appears this old .gz file has been superseded by this file: /data/glove.840B.300d.zip.

I found and fixed the error in the Python script, changing the extraction method from gzip to the unzip() function; I tested it locally and it appears to work properly. I tried to push the changes, but the repo didn't grant me permission to push the fix.
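The changed step amounts to something like the following sketch (hypothetical helper names and URL handling; the actual scripts/download.py may structure this differently):

```python
import os
import urllib.request
import zipfile

def download(url, dirpath):
    """Fetch an archive into dirpath; glove.840B.300d.zip replaces the old .txt.gz."""
    os.makedirs(dirpath, exist_ok=True)
    filepath = os.path.join(dirpath, url.split("/")[-1])
    urllib.request.urlretrieve(url, filepath)
    return filepath

def unzip(zip_path, dirpath):
    """Extract a .zip archive -- the replacement for the old gunzip step."""
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dirpath)
```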

FYI if you grant me permission to push the changes, let me know and I will push the fix asap.
Thanks!
Erick

The cpu usage very high when run the program.

Hello, I want to use this model to predict the semantic relatedness of Chinese sentence pairs.
I didn't change the model; I only changed the program to accept 200-dimensional Chinese word vectors and Chinese dependency trees. But when I run "th relatedness/main.lua", CPU usage reaches almost 1700% on a 24-core server, and the application runs very slowly.

My machine has a GPU, so I tried to make the model run on the GPU with just:
require('cutorch')
require('cunn')
moving the model to the GPU with :cuda()
moving input and output data to the GPU with :cuda()

but it always gives errors, and I am new to Torch. Could you please give me some suggestions?

Thanks.

Leaf module in paper?

Hi!

Could you point to where the leaf module implemented in the BinaryTreeLSTM object is described in the paper? I can't seem to find it, but it seems like a non-trivial part of the model.

How did "STree.txt" come into being ?

Under the Stanford Sentiment Treebank directory there are several files, which are explained clearly in the README.txt file. I have my own dataset and would like to train on it using the Stanford Sentiment model (RNTN), but I am wondering how to generate the "STree.txt" file from my dataset. The data in "STree.txt" has the form:

40|39|38|37|36|34|33|32|32|31|30|30|29|28|26|25|24|23|22|22|27|23|24|25|26|27|28|29|41|31|35|33|34|35|36|37|38|39|40|41|0

duplicated bias in the forget gate of ChildSumTreeLSTM.lua

Hello,

At lines 40 and 41 of models/ChildSumTreeLSTM.lua, the forget gate output is composed by summing the outputs of nn.TemporalConvolution and nn.Linear. Since both modules have bias terms, an additional bias term is added, differing from the formulation in the ACL 2015 paper.

I wonder whether I missed some part that resolves this, whether it was intended, or whether it is a bug.
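For illustration (plain Python, not the repo's Torch code): summing two affine maps, each with its own bias, is equivalent to a single map whose bias is b_x + b_h, so the duplicated bias adds a redundant free parameter rather than changing the model family:

```python
def affine(W, x, b):
    """y = W . x + b for plain lists (a single output unit for simplicity)."""
    return sum(w * xi for w, xi in zip(W, x)) + b

def gate_preactivation(W_x, x, b_x, U_h, h, b_h):
    """Forget-gate-style sum of two biased affine maps:
        f_pre = (W_x . x + b_x) + (U_h . h + b_h)
    Identical to a single map with merged bias b_x + b_h.
    """
    return affine(W_x, x, b_x) + affine(U_h, h, b_h)
```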

Thanks.
