nyu-mll / jiant-v1-legacy
The jiant toolkit for general-purpose text understanding models
License: MIT License
Issue by epavlick
Tuesday Jun 26, 2018 at 18:43 GMT
Originally opened as nyu-mll/jiant#34
Issue by sleepinyourhat
Tuesday Jun 26, 2018 at 18:21 GMT
Originally opened as nyu-mll/jiant#32
Issue by W4ngatang
Monday Jun 25, 2018 at 05:13 GMT
Originally opened as nyu-mll/jiant#14
Use ELMoTokenEmbedder instead of ELMo() so people don't need to modify their source code
Issue by iftenney
Wednesday Jun 27, 2018 at 03:35 GMT
Originally opened as nyu-mll/jiant#45
Add comments to replace old argparse docstrings
iftenney included the following code: https://github.com/nyu-mll/jiant/pull/45/commits
Issue by roma-patel
Tuesday Jun 26, 2018 at 10:29 GMT
Originally opened as nyu-mll/jiant#25
roma-patel included the following code: https://github.com/nyu-mll/jiant/pull/25/commits
Issue by sleepinyourhat
Tuesday Jun 26, 2018 at 20:13 GMT
Originally opened as nyu-mll/jiant#37
Low priority, but should find a fix eventually.
Issue by sleepinyourhat
Monday Jun 18, 2018 at 22:11 GMT
Originally opened as nyu-mll/jiant#2
Fix this:
.../src/trainer.py:258: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.
Issue by sleepinyourhat
Friday Jun 22, 2018 at 19:06 GMT
Originally opened as nyu-mll/jiant#9
Epoch is used in a few places to mean number of validations done so far. This is pretty weird. Rename and/or add some documentation?
Issue by W4ngatang
Monday Jun 25, 2018 at 05:15 GMT
Originally opened as nyu-mll/jiant#16
^ so people know how to use things
Issue by iftenney
Wednesday Jun 27, 2018 at 01:12 GMT
Originally opened as nyu-mll/jiant#42
Desiderata:
Issue by iftenney
Monday Jun 25, 2018 at 17:50 GMT
Originally opened as nyu-mll/jiant#23
More cleanup to play nice with GCP & remote execution
iftenney included the following code: https://github.com/nyu-mll/jiant/pull/23/commits
Issue by epavlick
Tuesday Jun 26, 2018 at 13:32 GMT
Originally opened as nyu-mll/jiant#26
We need a way to take a trained model and run it on small probing test sets, assuming the probing test sets are in the form of premise/hypothesis (p/h) pairs.
Issue by W4ngatang
Monday Jun 25, 2018 at 17:16 GMT
Originally opened as nyu-mll/jiant#21
There are a few transformer parameters (projection dimension, feedforward dimension) that we've been avoiding by setting them all to the same value, in addition to training hyperparameters. We should probably go through what literature there is and find sensible defaults.
Issue by sleepinyourhat
Tuesday Jun 26, 2018 at 20:08 GMT
Originally opened as nyu-mll/jiant#36
sleepinyourhat included the following code: https://github.com/nyu-mll/jiant/pull/36/commits
Issue by sleepinyourhat
Friday Jun 22, 2018 at 15:53 GMT
Originally opened as nyu-mll/jiant#4
This probably wastes time with QQP, which has a huge test set.
Issue by epavlick
Tuesday Jun 26, 2018 at 14:42 GMT
Originally opened as nyu-mll/jiant#27
…on eval task
epavlick included the following code: https://github.com/nyu-mll/jiant/pull/27/commits
Issue by pitrack
Wednesday Jun 20, 2018 at 17:20 GMT
Originally opened as nyu-mll/jiant#3
@W4ngatang Ideally you would check out this branch + download fastText and confirm that the instructions are clear, and that the code still runs (with/without fastText). If you want fastText to be default, then some flags may need to get changed.
I'm not getting high accuracy on the demo task.
I also changed some things in the config file so that I can use my paths. You'll need to update your paths too if you checkout this branch/if this gets merged.
pitrack included the following code: https://github.com/nyu-mll/jiant/pull/3/commits
Issue by epavlick
Tuesday Jun 26, 2018 at 21:45 GMT
Originally opened as nyu-mll/jiant#39
Refactoring main
epavlick included the following code: https://github.com/nyu-mll/jiant/pull/39/commits
Issue by sleepinyourhat
Tuesday Jun 26, 2018 at 15:15 GMT
Originally opened as nyu-mll/jiant#28
Currently we start overwriting checkpoints once we switch from pretraining to target task training.
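One possible way to avoid the overwrite is to include the training phase in the checkpoint filename. The sketch below is hypothetical: `checkpoint_path`, the phase names, and the filename pattern are illustrative assumptions, not jiant's actual naming scheme.

```python
from pathlib import Path


def checkpoint_path(run_dir, phase, val_pass):
    """Build a checkpoint path that is unique per training phase.

    `phase` and the filename pattern are assumptions for illustration.
    """
    if phase not in ("pretrain", "target_train"):
        raise ValueError(f"unknown phase: {phase!r}")
    return Path(run_dir) / f"model_state_{phase}_val_{val_pass}.th"


# The same validation pass number no longer collides across phases.
pretrain_ckpt = checkpoint_path("runs/demo", "pretrain", 3)
target_ckpt = checkpoint_path("runs/demo", "target_train", 3)
print(pretrain_ckpt.name)  # model_state_pretrain_val_3.th
print(target_ckpt.name)    # model_state_target_train_val_3.th
```

With phase-specific names, target task training can write its own checkpoints without touching the ones saved during pretraining.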
Issue by sleepinyourhat
Tuesday Jun 26, 2018 at 18:14 GMT
Originally opened as nyu-mll/jiant#29
Issue by iftenney
Wednesday Jun 27, 2018 at 15:50 GMT
Originally opened as nyu-mll/jiant#48
We had two that did the same thing
iftenney included the following code: https://github.com/nyu-mll/jiant/pull/48/commits
Issue by W4ngatang
Monday Jun 25, 2018 at 17:19 GMT
Originally opened as nyu-mll/jiant#22
Using the standard PyTorch RNN won't work for multi-layer bidirectional language models because the directions are aggregated between layers. See the ELMo LSTM for an example of how to do this (and maybe an easy workaround?).
Issue by sleepinyourhat
Wednesday Jun 27, 2018 at 16:15 GMT
Originally opened as nyu-mll/jiant#50
Fixes #47
sleepinyourhat included the following code: https://github.com/nyu-mll/jiant/pull/50/commits
Issue by sleepinyourhat
Monday Jun 25, 2018 at 21:25 GMT
Originally opened as nyu-mll/jiant#24
sleepinyourhat included the following code: https://github.com/nyu-mll/jiant/pull/24/commits
Issue by W4ngatang
Monday Jun 25, 2018 at 05:22 GMT
Originally opened as nyu-mll/jiant#18
We want to allow for more fine-grained control between training parameters on the main task we're training on and auxiliary tasks we're evaluating on.
If we're not training on multiple tasks, it would probably make sense to switch to a deterministic trainer that validates after passing through the entire training set (as opposed to a fixed number of batches).
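A minimal sketch of that switch, assuming a hypothetical helper `batches_per_validation` (not part of jiant): with a fixed interval it behaves as in the multi-task setting, and without one it validates after a single full pass over the training set.

```python
import math


def batches_per_validation(n_train_examples, batch_size, fixed_interval=None):
    """Return how many batches to train on between validation passes.

    With a fixed interval (the multi-task setting), use it as given.
    Otherwise validate after one full pass over the training set.
    """
    if fixed_interval is not None:
        return fixed_interval
    return math.ceil(n_train_examples / batch_size)


print(batches_per_validation(1000, 32))                      # 32
print(batches_per_validation(1000, 32, fixed_interval=500))  # 500
```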
Issue by sleepinyourhat
Saturday Jun 23, 2018 at 13:28 GMT
Originally opened as nyu-mll/jiant#12
bash ./demo.sh -w none
[...]
Traceback (most recent call last):
  File "../src/main.py", line 236, in <module>
    sys.exit(main(sys.argv[1:]))
  File "../src/main.py", line 186, in main
    args.shared_optimizer, args.load_model)
  File "/Users/Bowman/Drive/JSALT/jiant/src/trainer.py", line 273, in train
    output_dict = self._forward(batch, task=task, for_training=True)
  File "/Users/Bowman/Drive/JSALT/jiant/src/trainer.py", line 504, in _forward
    return self._model.forward(task, tensor_batch) # , **tensor_batch)
  File "/Users/Bowman/Drive/JSALT/jiant/src/models.py", line 291, in forward
    out = self._single_classification_forward(batch, task)
  File "/Users/Bowman/Drive/JSALT/jiant/src/models.py", line 309, in _single_classification_forward
    sent_embs, sent_mask = self.sent_encoder(batch['input1'])
  File "/Users/Bowman/anaconda3/envs/jiant/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/Bowman/Drive/JSALT/jiant/src/modules.py", line 75, in forward
    sent_embs = self._highway_layer(self._text_field_embedder(sent))
  File "/Users/Bowman/anaconda3/envs/jiant/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/Bowman/anaconda3/envs/jiant/lib/python3.6/site-packages/allennlp/modules/text_field_embedders/basic_text_field_embedder.py", line 63, in forward
    raise ConfigurationError(message)
allennlp.common.checks.ConfigurationError: "Mismatched token keys: dict_keys(['chars']) and dict_keys(['words', 'chars'])"
Issue by sleepinyourhat
Saturday Jun 23, 2018 at 12:48 GMT
Originally opened as nyu-mll/jiant#11
Issue by epavlick
Tuesday Jun 26, 2018 at 22:54 GMT
Originally opened as nyu-mll/jiant#40
Issue by W4ngatang
Monday Jun 25, 2018 at 05:28 GMT
Originally opened as nyu-mll/jiant#20
For downstream tasks' components, allow for a learned linear projection of the core RNN's hidden states before any pooling
Issue by sleepinyourhat
Monday Jun 18, 2018 at 22:10 GMT
Originally opened as nyu-mll/jiant#1
See if this is worth worrying about:
06/18 06:07:17 PM: Your label namespace was 'idxs'. We recommend you use a namespace ending with 'labels' or 'tags', so we don't add UNK and PAD tokens by default to your vocabulary. See documentation for non_padded_namespaces parameter in Vocabulary.
Issue by iftenney
Sunday Jun 24, 2018 at 20:38 GMT
Originally opened as nyu-mll/jiant#13
Fix paths to play nicer with GCP config.
iftenney included the following code: https://github.com/nyu-mll/jiant/pull/13/commits
Issue by najoungkim
Tuesday Jun 26, 2018 at 18:53 GMT
Originally opened as nyu-mll/jiant#35
Just checked that it runs with GPU--can this be merged?
najoungkim included the following code: https://github.com/nyu-mll/jiant/pull/35/commits
Issue by W4ngatang
Monday Jun 25, 2018 at 05:14 GMT
Originally opened as nyu-mll/jiant#15
Insert learnable layer-scaling parameters, to be learned once the LSTM weights are frozen (for eval tasks), when training on LM.
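One reading of "layer scaling parameters" is an ELMo-style scalar mix: a softmax-normalized weight per layer plus a global scale, applied to the frozen layer outputs. The pure-Python sketch below only illustrates the arithmetic under that assumption; `scalar_mix` is a hypothetical name, and in a real model the weights and `gamma` would be trainable parameters rather than plain floats.

```python
import math


def scalar_mix(layers, weights, gamma=1.0):
    """Combine per-layer vectors as gamma * sum_k softmax(weights)[k] * layers[k]."""
    if len(layers) != len(weights):
        raise ValueError("need one weight per layer")
    max_w = max(weights)  # subtract the max for numerical stability
    exps = [math.exp(w - max_w) for w in weights]
    total = sum(exps)
    probs = [e / total for e in exps]
    dim = len(layers[0])
    return [gamma * sum(p * layer[i] for p, layer in zip(probs, layers))
            for i in range(dim)]


# Two frozen "layers" for one token; equal weights give their average.
mixed = scalar_mix([[1.0, 2.0], [3.0, 4.0]], weights=[0.0, 0.0])
print(mixed)  # [2.0, 3.0]
```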
Issue by sleepinyourhat
Friday Jun 22, 2018 at 16:58 GMT
Originally opened as nyu-mll/jiant#6
Now possible to train on one dataset and later test on another, while still using pretrained embeddings for words that only appear in the second dataset. Just rebuild the vocabulary (-v) when launching the test run.
sleepinyourhat included the following code: https://github.com/nyu-mll/jiant/pull/6/commits
Issue by W4ngatang
Monday Jun 25, 2018 at 05:25 GMT
Originally opened as nyu-mll/jiant#19
Currently, training task-specific components for eval-only tasks will delete (since we're starting training without loading a state) or overwrite (since the checkpoints will share the same names) the model checkpoints that were created during the main training loop.
Issue by iftenney
Wednesday Jun 27, 2018 at 03:58 GMT
Originally opened as nyu-mll/jiant#46
Also remove some dead code
iftenney included the following code: https://github.com/nyu-mll/jiant/pull/46/commits
Issue by sleepinyourhat
Wednesday Jun 27, 2018 at 01:42 GMT
Originally opened as nyu-mll/jiant#44
Should fix #38, to the extent that it was a real problem.
sleepinyourhat included the following code: https://github.com/nyu-mll/jiant/pull/44/commits
Issue by sleepinyourhat
Friday Jun 22, 2018 at 16:37 GMT
Originally opened as nyu-mll/jiant#5
If LOAD_MODEL is 0, all checkpoints should be deleted. Otherwise, you can wind up in an odd setup (easy to hit with demo experiments):
Issue by W4ngatang
Friday Jun 22, 2018 at 17:14 GMT
Originally opened as nyu-mll/jiant#7
The current code to save and load optimizer and scheduler states is clunky and/or broken. Find a more PyTorch-onic way to do it.
Issue by iftenney
Wednesday Jun 27, 2018 at 16:05 GMT
Originally opened as nyu-mll/jiant#49
Training on MultiNLI w/ELMo takes 65G of RAM, which is a lot - this will get painful with larger datasets. See if we can stream from disk instead of loading the whole set into memory.
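As a rough illustration of streaming from disk, the sketch below reads a tab-separated file lazily with generators, so only one batch is held in memory at a time. The function names and the file layout are assumptions for illustration, and this does not address ELMo-specific indexing or shuffling.

```python
import os
import tempfile


def stream_examples(path):
    """Yield one tab-separated example at a time instead of loading the file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line:
                yield line.split("\t")


def batched(examples, batch_size):
    """Group a lazy stream of examples into lists of at most batch_size."""
    batch = []
    for example in examples:
        batch.append(example)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


# Tiny made-up file standing in for a large training set.
with tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False, encoding="utf-8") as tmp:
    tmp.write("a premise\ta hypothesis\tentailment\n" * 5)

batches = list(batched(stream_examples(tmp.name), batch_size=2))
os.remove(tmp.name)
print([len(b) for b in batches])  # [2, 2, 1]
```

In training, the call to `list(...)` would be dropped so that batches are consumed one at a time.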
Issue by sleepinyourhat
Tuesday Jun 26, 2018 at 18:16 GMT
Originally opened as nyu-mll/jiant#30
Issue by sleepinyourhat
Tuesday Jun 26, 2018 at 18:22 GMT
Originally opened as nyu-mll/jiant#33
Unassigned. Can start after the main big experiment.
Issue by pitrack
Saturday Jun 23, 2018 at 04:03 GMT
Originally opened as nyu-mll/jiant#10
Mostly code taken out from allennlp and modified with a function from the subsequent mask part of The Annotated Transformer. Ideally this would just be a flag in the allennlp library.
Confirmed that this runs. Did not confirm that it is correct (there's a chance that it is off by one, but I don't think so). It also seems sensitive to hyperparameters (not doing better than chance with the default settings).
pitrack included the following code: https://github.com/nyu-mll/jiant/pull/10/commits
Issue by tommccoy1
Tuesday Jun 26, 2018 at 18:21 GMT
Originally opened as nyu-mll/jiant#31
Added handling for multiple DisSent corpora
tommccoy1 included the following code: https://github.com/nyu-mll/jiant/pull/31/commits
Issue by W4ngatang
Monday Jun 25, 2018 at 05:17 GMT
Originally opened as nyu-mll/jiant#17
Specifically for translation and denoising tasks. The AllenNLP RNN decoder should have beam search already implemented, but you might need to poke around for validation metrics such as BLEU.
Issue by iftenney
Wednesday Jun 27, 2018 at 01:15 GMT
Originally opened as nyu-mll/jiant#43
iftenney included the following code: https://github.com/nyu-mll/jiant/pull/43/commits
Issue by sleepinyourhat
Friday Jun 22, 2018 at 17:23 GMT
Originally opened as nyu-mll/jiant#8
Fix for #5
sleepinyourhat included the following code: https://github.com/nyu-mll/jiant/pull/8/commits
Issue by sleepinyourhat
Wednesday Jun 27, 2018 at 14:30 GMT
Originally opened as nyu-mll/jiant#47
06/27 10:18:45 AM: Beginning training. Stopping metric: sts-b_corr
Traceback (most recent call last):
  File "main.py", line 185, in <module>
    sys.exit(main(sys.argv[1:]))
  File "main.py", line 162, in main
    args.shared_optimizer, load_model=False, phase="eval")
  File "/Users/Bowman/Drive/JSALT/jiant/src/trainer.py", line 275, in train
    output_dict = self._forward(batch, task=task, for_training=True)
  File "/Users/Bowman/Drive/JSALT/jiant/src/trainer.py", line 521, in _forward
    return self._model.forward(task, tensor_batch) # , **tensor_batch)
  File "/Users/Bowman/Drive/JSALT/jiant/src/models.py", line 355, in forward
    out = self._pair_regression_forward(batch, task)
  File "/Users/Bowman/Drive/JSALT/jiant/src/models.py", line 433, in _pair_regression_forward
    logits = classifier(s1, s2, s1_mask, s2_mask)
  File "/Users/Bowman/anaconda3/envs/jiant/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
TypeError: forward() takes 2 positional arguments but 5 were given
@hyinghui, @W4ngatang, anyone else who's seen this code—could you take a look?
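For anyone reading the traceback: this kind of TypeError appears when a module whose forward accepts a single input is called with four. The snippet below is a minimal reproduction with a hypothetical stand-in class, not the actual jiant classifier, so it says nothing about which side of the call should change.

```python
class SingleInputClassifier:
    """Hypothetical stand-in for a module whose forward takes one input."""

    def forward(self, pair_emb):
        return pair_emb

    def __call__(self, *inputs):
        # Mirrors how a module call passes its positional arguments to forward.
        return self.forward(*inputs)


classifier = SingleInputClassifier()
try:
    # Four inputs plus self makes five positional arguments.
    classifier("s1", "s2", "s1_mask", "s2_mask")
except TypeError as err:
    message = str(err)
    print(message)
```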
Issue by EdouardGrave
Tuesday Jun 26, 2018 at 23:35 GMT
Originally opened as nyu-mll/jiant#41
This PR adds the data loader and model for MT pre-training tasks. The decoder is a slightly modified version of the SimpleSeq2Seq class from AllenNLP.
EdouardGrave included the following code: https://github.com/nyu-mll/jiant/pull/41/commits
Issue by sleepinyourhat
Tuesday Jun 26, 2018 at 20:56 GMT
Originally opened as nyu-mll/jiant#38
The 'best checkpoint' handling logic has a loop nesting issue that means that all models are evaluated on whatever checkpoint was best for the last model to be trained. @epavlick - take a look/let me know when you've pushed your related changes.
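The intended behavior, as I read it, is that each task gets its own best checkpoint. A minimal sketch with a hypothetical helper and made-up scores (neither comes from the jiant code):

```python
def best_checkpoint_per_task(val_scores):
    """Map each task to the validation pass where that task scored highest.

    `val_scores` maps task name -> list of scores, one per validation pass.
    """
    best = {}
    for task, scores in val_scores.items():
        # Selected inside the per-task loop, so tasks do not share a result.
        best[task] = max(range(len(scores)), key=scores.__getitem__)
    return best


# Made-up scores for illustration only.
scores = {"task_a": [0.61, 0.70, 0.66], "task_b": [0.80, 0.78, 0.85]}
print(best_checkpoint_per_task(scores))  # {'task_a': 1, 'task_b': 2}
```

Keeping the selection inside the per-task loop is what prevents every model from being evaluated on the last task's best checkpoint.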