
scpn's People

Contributors

jwieting, miyyer


scpn's Issues

Can I use the same model to train on a Spanish-language dataset?

Hi,

Thanks for sharing the model. Great work.

I want to get variations of Spanish-language text, similar to what the model does for English. Can I do this using the same model?

Also, do you know of any Spanish text corpus for training text variation?

Thanks.

The command you ran to parse the ParaNMT dataset

Hi, thanks for the paper and the interesting topic. I would like to apply your pre-trained model to my own data using the command given:

java -Xmx12g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -threads 1 -annotators tokenize,ssplit,pos,parse -ssplit.eolonly -filelist filenames.txt -outputFormat text -parse.model edu/stanford/nlp/models/srparser/englishSR.ser.gz -outputDirectory /outputdir/

I do not have a CS background. Could anyone explain this command? Which software do I need to run it, and how do I fill in the paths correctly? Thanks.
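For anyone unfamiliar with the toolchain: this is a command-line invocation of Stanford CoreNLP, a Java library, so you need a Java runtime and the CoreNLP jars on the classpath (that is what -cp "*" does; the englishSR shift-reduce parser model is, as far as I know, distributed as a separate models jar). The -filelist flag points to a text file listing the input files, and -ssplit.eolonly treats each input line as one sentence. As a hedged sketch, here is one hypothetical way to prepare the inputs and launch the command from Python (all paths are placeholders, not from the repo):

    import subprocess

    # Write the sentences to parse, one per line (matching -ssplit.eolonly).
    with open("/outputdir/input.txt", "w") as f:
        f.write("the quick brown fox jumps over the lazy dog .\n")

    # -filelist expects a file listing the input files, one path per line.
    with open("/outputdir/filenames.txt", "w") as f:
        f.write("/outputdir/input.txt\n")

    # cwd must be the directory containing the CoreNLP jars, so that
    # -cp "*" puts all of them on the Java classpath.
    subprocess.run(
        ["java", "-Xmx12g", "-cp", "*",
         "edu.stanford.nlp.pipeline.StanfordCoreNLP",
         "-threads", "1",
         "-annotators", "tokenize,ssplit,pos,parse",
         "-ssplit.eolonly",
         "-filelist", "/outputdir/filenames.txt",
         "-outputFormat", "text",
         "-parse.model", "edu/stanford/nlp/models/srparser/englishSR.ser.gz",
         "-outputDirectory", "/outputdir/"],
        cwd="/path/to/stanford-corenlp", check=True)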

Some questions about trans_embs

Hello, thank you so much for sharing the code; I have learned a lot from it. I do have some questions about trans_embs, though.

  1. In train_scpn.py, SCPN's parameter len_parse_voc is 103, which means the parse vocabulary doesn't include the token 'EOP'. But during SCPN training, indexify_transformations() is called to get valid transformation instances, and that function calls deleaf(), which appends 'EOP' to the end of the parse. Since 'EOP' is not in the parse vocabulary, this causes errors when converting parse tags into indices (a hypothetical workaround is sketched after this list).
  2. The parses generated by ParseNet may contain the token 'EOP', but trans_embs in SCPN has shape (103, 56), so the embedding table doesn't include 'EOP'. This causes errors when running generate_paraphrases.py.
  3. SCPN and ParseNet use different trans_embs; what would happen if they shared the same trans_embs?
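For points 1 and 2, a hypothetical workaround (not a verified fix; the name label_voc follows the repo's scripts, but the surrounding code is a sketch) would be to grow the parse vocabulary and the embedding table by one entry:

    # Toy stand-in for the parse-tag vocabulary; in the repo this has 103 entries.
    label_voc = {'NN': 0, 'VB': 1}  # ... plus the remaining tags

    # Add 'EOP' so tags emitted by deleaf() can always be indexed.
    if 'EOP' not in label_voc:
        label_voc['EOP'] = len(label_voc)  # new index, growing the vocab by one

    # The transformation embedding table then needs a matching extra row,
    # e.g. nn.Embedding(num_embeddings=len(label_voc), embedding_dim=56)
    # instead of the current (103, 56) table.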

Using the Supplied Training Data for Training

Has anyone had any success using the supplied training data for training (with Python 2.7 and PyTorch 0.3.1)?

It appears to be untrainable on my machine: there is an infinite loop in the minibatch enumeration.

z = indexify_transformations(in_p, out_p, label_voc, args)
if z == None:
    continue

The above code always produces a z of None, so the loop never makes progress.

@miyyer @jwieting
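For what it's worth, one way to narrow this down is to log why instances are rejected before skipping them. A debugging sketch, not a verified fix (it assumes in_p and out_p are whitespace-separated tag strings, which may not match the repo exactly):

    z = indexify_transformations(in_p, out_p, label_voc, args)
    if z is None:
        # If every instance is rejected, the tag set and label_voc probably
        # don't match (see the 'EOP' vocabulary issue above).
        missing = [t for t in in_p.split() + out_p.split() if t not in label_voc]
        print('rejected instance, unknown tags: %s' % missing)
        continue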

Running time

Hello,
I used your model to augment my dataset for sentiment analysis. It performs well on small datasets, but it is time-consuming (about 4 seconds per item), so a dataset of 500,000 items takes roughly three weeks to complete.

Do you have any idea how I can accelerate the execution?

Question about copy mechanism implementation

Thank you for sharing your code!
I have one question about the copy mechanism implementation.

As far as I can see, you calculate the final word distribution as:

    (1 - p_copy) * log(word_dist_from_decoder) + p_copy * log(word_dist_by_copy)

I think the logarithm should be taken after the interpolation, not inside it.
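For concreteness, the usual pointer-style mixture interpolates the probabilities first and takes the logarithm afterwards. A minimal PyTorch sketch contrasting the two (tensor values are illustrative):

    import torch

    p_copy = torch.tensor(0.3)
    word_dist_from_decoder = torch.tensor([0.7, 0.2, 0.1])  # decoder softmax
    word_dist_by_copy = torch.tensor([0.1, 0.8, 0.1])       # copy attention

    # As implemented: interpolating log-probabilities.
    mixed_logs = ((1 - p_copy) * torch.log(word_dist_from_decoder)
                  + p_copy * torch.log(word_dist_by_copy))

    # Suggested: interpolate the distributions, then take the log.
    log_mixed = torch.log((1 - p_copy) * word_dist_from_decoder
                          + p_copy * word_dist_by_copy)

    print(mixed_logs, log_mixed)  # the two differ, so the order matters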

RuntimeError: cuda runtime error (59) / Ranking output sentences

Hi @miyyer,

I ran demo.sh with PyTorch 0.4.0 in both Python 2.7 and 3.6 (GPU usage is about 1.4 GB), and both give me the error:

/pytorch/aten/src/THC/THCTensorIndex.cu:360: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2, IndexIsMajor = true]: block: [113,0,0], thread: [94,0,0] Assertion ``srcIndex < srcSelectDimSize`` failed.

(the same assertion is repeated for thread [95,0,0])

THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCTensorCopy.c line=70 error=59 : device-side assert triggered
beam search OOM 2 1.59588813782
Traceback (most recent call last):
  File "generate_paraphrases.py", line 187, in <module>
    encode_data(out_file=args.out_file)
  File "generate_paraphrases.py", line 70, in encode_data
    torch_sent = Variable(torch.from_numpy(np.array(seg_sent, dtype='int32')).long().cuda())
RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/aten/src/THC/generic/THCTensorCopy.c:20

I have tried a few times, but it always returns this error, specifically at THCTensorCopy.c:20. I hope you can help me with this.
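Error 59 with an "srcIndex < srcSelectDimSize" assertion usually means some index fed to an embedding lookup (or index_select) is out of range, and because CUDA kernels run asynchronously, the reported line is often not the real culprit. A generic debugging sketch, not specific to this repo:

    import os
    # Make kernel launches synchronous so the traceback points at the real op.
    # Must be set before CUDA is initialized.
    os.environ['CUDA_LAUNCH_BLOCKING'] = '1'

    import torch

    def check_indices(indices, vocab_size):
        """Sanity-check indices against the embedding size before .cuda()."""
        bad = indices[(indices < 0) | (indices >= vocab_size)]
        if bad.numel() > 0:
            raise ValueError('out-of-range indices: %s' % bad.tolist())

    check_indices(torch.tensor([3, 105, 7]), vocab_size=103)  # raises: 105 is out of range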

Approximately how much time is required to train the parse_generator and scpn models individually?

Hi Guys,

It's been 10 days since training of the scpn model started, and in that time it has completed only 4 of the 15 epochs with DEFAULT settings.

Machine configuration:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.54                 Driver Version: 396.54                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 000075E1:00:00.0 Off |                    0 |
| N/A   67C    P0    89W / 149W |   9896MiB / 11441MiB |     84%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     47372      C   xyzz/python2.7                                 1440MiB |
|    0     82095      C   scpn/python                                      8443MiB |
+-----------------------------------------------------------------------------+

Using the above system, can you tell me approximately how long it will take to train the scpn model (DEFAULT settings)?
Also, since the parse_generator model is tied to scpn, can you tell me how long that model (DEFAULT settings) will take to train?

Thanks.

How should I preprocess the data?

If I just want to train the SCPN model, I only need to preprocess the para-nmt dataset. But what if I want to use SCPN to generate syntactically adversarial examples for a downstream task? Should I preprocess (for example, tokenize and apply BPE to) the para-nmt dataset together with the downstream task's dataset? How did you preprocess the SST and SICK data? @miyyer @jwieting Thank you very much!
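If the goal is consistency, the downstream task's sentences generally need to pass through the same tokenizer and the same BPE merge operations that produced the model's vocabulary. A hedged sketch using the subword-nmt package (the codes-file path is a placeholder, not the repo's actual file):

    import codecs
    from subword_nmt.apply_bpe import BPE

    # Load the *same* merge operations used to preprocess the para-nmt data.
    with codecs.open('bpe.codes', encoding='utf-8') as codes:  # placeholder path
        bpe = BPE(codes)

    # Segment a downstream-task sentence the same way as the training data.
    print(bpe.process_line('the quick brown fox jumps over the lazy dog'))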

An issue with the implementation of the forward function in train_scpn.py

Line 241 in train_scpn.py is:

    copy_probs = copy_probs.view(-1)

I think it should be:

    copy_probs = copy_probs.transpose(0, 1).contiguous().view(-1)

because decoder_states and decoder_copy_dists are transposed:

    decoder_states = decoder_states.transpose(0, 1).contiguous().view(-1, self.d_hid * 2)        # line 234
    decoder_copy_dists = decoder_copy_dists.transpose(0, 1).contiguous().view(-1, self.len_voc)  # line 238
@miyyer @jwieting
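The suggested fix is easy to sanity-check: view(-1) flattens a tensor in its current layout, so skipping the transpose reads the elements in a different order than the transposed decoder_states and decoder_copy_dists. A minimal demonstration:

    import torch

    x = torch.arange(6).view(2, 3)  # pretend dims are (time, batch)
    print(x.view(-1))                                # tensor([0, 1, 2, 3, 4, 5])
    print(x.transpose(0, 1).contiguous().view(-1))   # tensor([0, 3, 1, 4, 2, 5])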

training from scratch

When run, train_scpn.py just freezes at 1123 MB of GPU usage, plus the ~15 GB of memory for the dataset. It prints nothing and does nothing. Has anyone seen this issue?

Python 2.7, latest PyTorch.

Questions about training time consuming

I am trying to train the scpn model, but the training data is very large. I use one GPU with a batch size of 64; every batch takes 1.6 seconds to train, and there are 439,586 batches. I tried to train with two GPUs but failed. Could you tell me how you sped up the training process? Thank you so much. @miyyer @jwieting
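For the two-GPU attempt, the simplest route in PyTorch of that era is nn.DataParallel, which replicates the module and splits each batch across the visible GPUs. A generic sketch, not the repo's actual training code (the model below is a stand-in for SCPN):

    import torch
    import torch.nn as nn

    model = nn.Linear(512, 512)  # stand-in for the SCPN model
    if torch.cuda.device_count() > 1:
        model = nn.DataParallel(model)  # splits each batch across GPUs
    model = model.cuda()

    out = model(torch.randn(64, 512).cuda())  # batch of 64, divided among GPUs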

issue in demo.sh

Thanks for sharing the code.

While executing the loop below, I am facing two issues.

    # loop over sentences and transform them
    for d_idx, ex in enumerate(inrdr):
        stime = time.time()

1. Exception: cuda runtime error: device-side assert triggered at /pytorch/aten/src/THC/THCTensorCopy.cpp:20,
   raised at line 3 of the snippet below:

    # add EOS
    seg_sent.append(pp_vocab['EOS'])
    torch_sent = Variable(torch.from_numpy(np.array(seg_sent, dtype='int32')).long().cuda())
    torch_sent_len = torch.from_numpy(np.array([len(seg_sent)], dtype='int32')).long().cuda()

2. Exception: KeyError: tensor(101, device='cuda:0'),
   raised at line 2 of the snippet below:

    for b_idx in beam_dict:
        prob, _, _, seq = beam_dict[b_idx][0]
        gen_parse = ' '.join([rev_label_voc[z] for z in seqs[b_idx]])
        gen_sent = ' '.join([rev_pp_vocab[w] for w in seq[:-1]])

When I tried to debug, I found the following:

    seqs contains [[tensor(101, device='cuda:0'), tensor(1, device='cuda:0'), ...]]
    seq contains [tensor(5, device='cuda:0'), tensor(448, device='cuda:0'), ...]
    rev_label_voc contains {0: 'NN', 1: 'VB', ...}
    rev_pp_vocab contains {'is': 229, 'am': 43, ...}

As you can see from these variables and their values, it is failing because of a KeyError.

Any suggestion as to why this is happening?

@miyyer @jwieting
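A plausible cause, judging from the debug output: under PyTorch 0.4, indexing a tensor yields 0-dim tensors rather than Python ints, and tensor(101, device='cuda:0') does not hash equal to the int 101, hence the KeyError. A hedged workaround (variable names follow the snippet above; note also that the debug output suggests rev_pp_vocab may be word-to-index rather than index-to-word, which would cause a KeyError of its own):

    # Convert 0-dim tensors to plain Python ints before using them as dict keys.
    gen_parse = ' '.join([rev_label_voc[int(z)] for z in seqs[b_idx]])
    gen_sent = ' '.join([rev_pp_vocab[int(w)] for w in seq[:-1]])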
