nyu-mll / jiant-v1-legacy

The jiant toolkit for general-purpose text understanding models

License: MIT License

Dockerfile 0.07% Python 31.20% Shell 1.77% Jsonnet 0.13% Jupyter Notebook 66.84%

jiant-v1-legacy's Issues

[CLOSED] refactor main so all blocks of code are opt-in

Issue by epavlick
Tuesday Jun 26, 2018 at 18:43 GMT
Originally opened as nyu-mll/jiant#34

[CLOSED] Double check that early stopping strategy is sane for single-task GLUE training

Issue by sleepinyourhat
Tuesday Jun 26, 2018 at 18:21 GMT
Originally opened as nyu-mll/jiant#32

[CLOSED] Switch to AllenNLP ELMoTokenEmbedder

Issue by W4ngatang
Monday Jun 25, 2018 at 05:13 GMT
Originally opened as nyu-mll/jiant#14

Use ELMoTokenEmbedder instead of ELMo() so people don't need to modify their source code

[CLOSED] Config

Issue by iftenney
Wednesday Jun 27, 2018 at 03:35 GMT
Originally opened as nyu-mll/jiant#45

Add comments to replace old argparse docstrings

iftenney included the following code: https://github.com/nyu-mll/jiant/pull/45/commits

[CLOSED] pretrained cnns, grounded tasks plus cnn encoder

Issue by roma-patel
Tuesday Jun 26, 2018 at 10:29 GMT
Originally opened as nyu-mll/jiant#25

roma-patel included the following code: https://github.com/nyu-mll/jiant/pull/25/commits

[CLOSED] Not possible to add a new eval task when loading an old pretrained model.

Issue by sleepinyourhat
Tuesday Jun 26, 2018 at 20:13 GMT
Originally opened as nyu-mll/jiant#37

Low priority, but should find a fix eventually.

[CLOSED] PyTorch clip warning

Issue by sleepinyourhat
Monday Jun 18, 2018 at 22:11 GMT
Originally opened as nyu-mll/jiant#2

Fix this:

.../src/trainer.py:258: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.

[CLOSED] Clarify `epoch`

Issue by sleepinyourhat
Friday Jun 22, 2018 at 19:06 GMT
Originally opened as nyu-mll/jiant#9

Epoch is used in a few places to mean number of validations done so far. This is pretty weird. Rename and/or add some documentation?

[CLOSED] writeup options and trainer setup in README

Issue by W4ngatang
Monday Jun 25, 2018 at 05:15 GMT
Originally opened as nyu-mll/jiant#16

^ so people know how to use things

[CLOSED] Config file system to replace multitude of flags

Issue by iftenney
Wednesday Jun 27, 2018 at 01:12 GMT
Originally opened as nyu-mll/jiant#42

Desiderata:

Configuration by files, with some sort of templating and inheritance
Command-line overrides
Save a copy of params to the run dir, so models can be re-loaded later
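
The three desiderata above can be sketched with the Python standard library alone. This is only an illustration: the `include` key, the `key=value` override syntax, and the `params.json` filename are assumptions made for the example, not the format jiant ended up using.

```python
import json
import os
import tempfile


def load_config(path):
    """Load a JSON config, first applying any parent named by a hypothetical "include" key."""
    with open(path) as f:
        params = json.load(f)
    parent = params.pop("include", None)
    if parent is None:
        return params
    # Resolve the parent relative to the child file, then let the child override it.
    merged = load_config(os.path.join(os.path.dirname(path), parent))
    merged.update(params)
    return merged


def apply_overrides(params, overrides):
    """Apply command-line style overrides such as ["lr=0.01", "max_epochs=5"]."""
    result = dict(params)
    for item in overrides:
        key, _, raw = item.partition("=")
        try:
            result[key] = json.loads(raw)  # numbers, booleans, quoted strings
        except json.JSONDecodeError:
            result[key] = raw  # fall back to a bare string
    return result


def save_params(params, run_dir):
    """Write the final parameters into the run directory so the run can be re-loaded later."""
    os.makedirs(run_dir, exist_ok=True)
    out_path = os.path.join(run_dir, "params.json")
    with open(out_path, "w") as f:
        json.dump(params, f, indent=2, sort_keys=True)
    return out_path


def demo():
    """Build a tiny base/child config pair in a temp dir and run the full pipeline."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "defaults.json"), "w") as f:
            json.dump({"lr": 0.001, "max_epochs": 10, "encoder": "rnn"}, f)
        with open(os.path.join(tmp, "demo.json"), "w") as f:
            json.dump({"include": "defaults.json", "max_epochs": 2}, f)
        params = load_config(os.path.join(tmp, "demo.json"))
        params = apply_overrides(params, ["lr=0.01", "encoder=transformer"])
        saved = save_params(params, os.path.join(tmp, "run"))
        with open(saved) as f:
            return json.load(f)
```

Child values win over inherited ones, and command-line overrides win over both, so the copy saved in the run directory reflects exactly what the run used.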

[CLOSED] Absolute paths & scripts for GCP

Issue by iftenney
Monday Jun 25, 2018 at 17:50 GMT
Originally opened as nyu-mll/jiant#23

More cleanup to play nice with GCP & remote execution

Absolute paths relative to script location, so that script can be run from any directory
Args list as array with expansion for readability, cleaner diffs
GCP convenience scripts

iftenney included the following code: https://github.com/nyu-mll/jiant/pull/23/commits

[CLOSED] set up command line invocation for probing with NLI-style data

Issue by epavlick
Tuesday Jun 26, 2018 at 13:32 GMT
Originally opened as nyu-mll/jiant#26

We need a way to take a trained model and run it on small probing test sets, assuming the probing test sets are in the form of premise/hypothesis (p/h) pairs.

[CLOSED] find sensible transformer params

Issue by W4ngatang
Monday Jun 25, 2018 at 17:16 GMT
Originally opened as nyu-mll/jiant#21

There are a few transformer parameters (projection dimension, feedforward dimension) that we've been avoiding by setting them all to the same value, in addition to the training hyperparameters. We should probably go through what literature there is and find sensible defaults.

[CLOSED] Separate checkpoint names for main training phase and eval phase

Issue by sleepinyourhat
Tuesday Jun 26, 2018 at 20:08 GMT
Originally opened as nyu-mll/jiant#36

#28

sleepinyourhat included the following code: https://github.com/nyu-mll/jiant/pull/36/commits

[CLOSED] Don't load the test set unless you need it.

Issue by sleepinyourhat
Friday Jun 22, 2018 at 15:53 GMT
Originally opened as nyu-mll/jiant#4

This probably wastes time with QQP, which has a huge test set.

[CLOSED] Flag to control whether or not loaded/pretrained model is re-trained …

Issue by epavlick
Tuesday Jun 26, 2018 at 14:42 GMT
Originally opened as nyu-mll/jiant#27

…on eval task

epavlick included the following code: https://github.com/nyu-mll/jiant/pull/27/commits

[CLOSED] fastText embeddings

Issue by pitrack
Wednesday Jun 20, 2018 at 17:20 GMT
Originally opened as nyu-mll/jiant#3

@W4ngatang Ideally you would check out this branch + download fastText and confirm that the instructions are clear, and that the code still runs (with/without fastText). If you want fastText to be default, then some flags may need to get changed.

I'm not getting high accuracy on the demo task.

I also changed some things in the config file so that I can use my paths. You'll need to update your paths too if you checkout this branch/if this gets merged.

pitrack included the following code: https://github.com/nyu-mll/jiant/pull/3/commits

[CLOSED] Probing infra

Issue by epavlick
Tuesday Jun 26, 2018 at 21:45 GMT
Originally opened as nyu-mll/jiant#39

Refactoring main

epavlick included the following code: https://github.com/nyu-mll/jiant/pull/39/commits

[CLOSED] Make checkpointing during target-task training less wonky.

Issue by sleepinyourhat
Tuesday Jun 26, 2018 at 15:15 GMT
Originally opened as nyu-mll/jiant#28

Currently we start overwriting checkpoints once we switch from pretraining to target task training.

[CLOSED] Add a projection layer before max pooling

Issue by sleepinyourhat
Tuesday Jun 26, 2018 at 18:14 GMT
Originally opened as nyu-mll/jiant#29

@W4ngatang ?

[CLOSED] De-dupe path scripts

Issue by iftenney
Wednesday Jun 27, 2018 at 15:50 GMT
Originally opened as nyu-mll/jiant#48

We had two that did the same thing

iftenney included the following code: https://github.com/nyu-mll/jiant/pull/48/commits

[CLOSED] bidirectional rnn language model

Issue by W4ngatang
Monday Jun 25, 2018 at 17:19 GMT
Originally opened as nyu-mll/jiant#22

Using the standard PyTorch RNN won't work for multi-layer bidirectional language models, because the two directions are aggregated between layers. See the ELMo LSTM for an example of how to do this (and maybe an easy workaround?).

[CLOSED] Fix/simplify regression code.

Issue by sleepinyourhat
Wednesday Jun 27, 2018 at 16:15 GMT
Originally opened as nyu-mll/jiant#50

Fixes #47

sleepinyourhat included the following code: https://github.com/nyu-mll/jiant/pull/50/commits

[CLOSED] Don't automatically delete checkpoints, even when it might be useful.

Issue by sleepinyourhat
Monday Jun 25, 2018 at 21:25 GMT
Originally opened as nyu-mll/jiant#24

#19

sleepinyourhat included the following code: https://github.com/nyu-mll/jiant/pull/24/commits

[CLOSED] separate training parameters for train tasks and eval tasks

Issue by W4ngatang
Monday Jun 25, 2018 at 05:22 GMT
Originally opened as nyu-mll/jiant#18

We want to allow for more fine-grained control between training parameters on the main task we're training on and auxiliary tasks we're evaluating on.

If we're not training on multiple tasks, it would probably make sense to switch to a deterministic trainer that validates after passing through the entire training set (as opposed to a fixed number of batches).

[CLOSED] Crashing when trying to use only char embs in demo.sh

Issue by sleepinyourhat
Saturday Jun 23, 2018 at 13:28 GMT
Originally opened as nyu-mll/jiant#12

bash ./demo.sh -w none
[...]
Traceback (most recent call last):
  File "../src/main.py", line 236, in <module>
    sys.exit(main(sys.argv[1:]))
  File "../src/main.py", line 186, in main
    args.shared_optimizer, args.load_model)
  File "/Users/Bowman/Drive/JSALT/jiant/src/trainer.py", line 273, in train
    output_dict = self._forward(batch, task=task, for_training=True)
  File "/Users/Bowman/Drive/JSALT/jiant/src/trainer.py", line 504, in _forward
    return self._model.forward(task, tensor_batch) # , **tensor_batch)
  File "/Users/Bowman/Drive/JSALT/jiant/src/models.py", line 291, in forward
    out = self._single_classification_forward(batch, task)
  File "/Users/Bowman/Drive/JSALT/jiant/src/models.py", line 309, in _single_classification_forward
    sent_embs, sent_mask = self.sent_encoder(batch['input1'])
  File "/Users/Bowman/anaconda3/envs/jiant/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/Bowman/Drive/JSALT/jiant/src/modules.py", line 75, in forward
    sent_embs = self._highway_layer(self._text_field_embedder(sent))
  File "/Users/Bowman/anaconda3/envs/jiant/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/Bowman/anaconda3/envs/jiant/lib/python3.6/site-packages/allennlp/modules/text_field_embedders/basic_text_field_embedder.py", line 63, in forward
    raise ConfigurationError(message)
allennlp.common.checks.ConfigurationError: "Mismatched token keys: dict_keys(['chars']) and dict_keys(['words', 'chars'])"

[CLOSED] Flag-guard import pdb

Issue by sleepinyourhat
Saturday Jun 23, 2018 at 12:48 GMT
Originally opened as nyu-mll/jiant#11

[CLOSED] include options for specifying classifier to use for probing tasks

Issue by epavlick
Tuesday Jun 26, 2018 at 22:54 GMT
Originally opened as nyu-mll/jiant#40

[CLOSED] implement pre-pooling linear projection

Issue by W4ngatang
Monday Jun 25, 2018 at 05:28 GMT
Originally opened as nyu-mll/jiant#20

For downstream tasks' components, allow for a learned linear projection of the core RNN's hidden states before any pooling

[CLOSED] AllenNLP vocabulary warning.

Issue by sleepinyourhat
Monday Jun 18, 2018 at 22:10 GMT
Originally opened as nyu-mll/jiant#1

See if this is worth worrying about:

06/18 06:07:17 PM: Your label namespace was 'idxs'. We recommend you use a namespace ending with 'labels' or 'tags', so we don't add UNK and PAD tokens by default to your vocabulary. See documentation for non_padded_namespaces parameter in Vocabulary.

[CLOSED] Fix paths & allow for read-only data_dir

Issue by iftenney
Sunday Jun 24, 2018 at 20:38 GMT
Originally opened as nyu-mll/jiant#13

Fix paths to play nicer with GCP config.

Fix bug where run_dir was prepended to log.log twice
Use exp_dir/prepreproc as scratch directory for data pre-preprocessing, so that glue_data can be readonly
Make user_config.sh optional, since GCP instances will already have the required environment vars set.

iftenney included the following code: https://github.com/nyu-mll/jiant/pull/13/commits

[CLOSED] JOCI pretraining task

Issue by najoungkim
Tuesday Jun 26, 2018 at 18:53 GMT
Originally opened as nyu-mll/jiant#35

Just checked that it runs with GPU--can this be merged?

najoungkim included the following code: https://github.com/nyu-mll/jiant/pull/35/commits

[CLOSED] Learn linear combinations of core LSTM weights

Issue by W4ngatang
Monday Jun 25, 2018 at 05:14 GMT
Originally opened as nyu-mll/jiant#15

insert learnable layer scaling parameters to be learned once LSTM weights are frozen (for eval tasks) when training on LM
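
One common form of such layer scaling is a softmax-normalized weighted sum of the per-layer outputs, scaled by a global factor. The sketch below shows only the arithmetic in plain Python; the softmax normalization, the `gamma` factor, and the function names are assumptions for illustration, and in a real model `scalars` and `gamma` would be trainable parameters while the LSTM weights stay frozen.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_layers(layer_outputs, scalars, gamma=1.0):
    """Combine per-layer vectors as gamma * sum_k softmax(scalars)[k] * layer_outputs[k]."""
    weights = softmax(scalars)
    dim = len(layer_outputs[0])
    return [
        gamma * sum(w * layer[i] for w, layer in zip(weights, layer_outputs))
        for i in range(dim)
    ]
```

With all scalars equal, the mix reduces to a plain average of the layers, which gives a neutral starting point before any task-specific learning.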

[CLOSED] Can now rebuild vocabulary before loading model.

Issue by sleepinyourhat
Friday Jun 22, 2018 at 16:58 GMT
Originally opened as nyu-mll/jiant#6

Now possible to train on one dataset and later test on another, while still using pretrained embeddings for words that only appear in the second dataset. Just rebuild the vocabulary (-v) when launching the test run.

sleepinyourhat included the following code: https://github.com/nyu-mll/jiant/pull/6/commits

[CLOSED] training eval-only tasks will erase model checkpoints created during training

Issue by W4ngatang
Monday Jun 25, 2018 at 05:25 GMT
Originally opened as nyu-mll/jiant#19

Currently, training task-specific components for eval-only tasks will delete (since we're starting training without loading a state) or overwrite (since the checkpoints will share the same names) the model checkpoints that were created during the main training loop.
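
The overwrite half of this problem goes away if the training phase is part of the checkpoint filename. A minimal sketch, assuming hypothetical phase labels `main` and `eval` and a `.th` extension (none of these names are taken from the jiant code):

```python
import os


def checkpoint_path(run_dir, phase, epoch, task=None):
    """Build a checkpoint filename that encodes the training phase (and task, if any).

    The naming scheme is illustrative only; "main" and "eval" are assumed phase labels.
    """
    parts = ["model_state", phase]
    if task is not None:
        parts.append(task)
    parts.append("epoch_%d" % epoch)
    return os.path.join(run_dir, "_".join(parts) + ".th")
```

Because the phase (and optionally the task) is in the name, epoch 3 of eval-task training can no longer collide with epoch 3 of the main training loop.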

[CLOSED] Update readme for new configs

Issue by iftenney
Wednesday Jun 27, 2018 at 03:58 GMT
Originally opened as nyu-mll/jiant#46

Also remove some dead code

iftenney included the following code: https://github.com/nyu-mll/jiant/pull/46/commits

[CLOSED] Best checkpoint logic changes.

Issue by sleepinyourhat
Wednesday Jun 27, 2018 at 01:42 GMT
Originally opened as nyu-mll/jiant#44

Should fix #38, to the extent that it was a real problem.

sleepinyourhat included the following code: https://github.com/nyu-mll/jiant/pull/44/commits

[CLOSED] Delete checkpoints when starting training from scratch

Issue by sleepinyourhat
Friday Jun 22, 2018 at 16:37 GMT
Originally opened as nyu-mll/jiant#5

If LOAD_MODEL is 0, all checkpoints should be deleted. Otherwise, you can wind up in an odd setup (easy to hit with demo experiments):

Train model for 10 epochs with settings S.
Save model checkpoints for epochs 1–10.
Train new model for 5 epochs with settings S', and LOAD_MODEL=0.
Save model checkpoints for epoch 1–5, overwriting old checkpoints.
Try to continue training the new model with LOAD_MODEL=1. Wind up using settings S' but loading the highest-numbered epoch checkpoint, which was created with settings S. Crash.
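
The scenario above is avoided if stale checkpoints are removed whenever training starts without loading a model. The following is a sketch of that cleanup step only; the `*.th` pattern and the function name are assumptions, not the project's actual code.

```python
import glob
import os
import tempfile


def clear_stale_checkpoints(run_dir, load_model, pattern="*.th"):
    """Delete old checkpoint files unless the caller intends to resume from them.

    Returns the sorted basenames of the deleted files. "*.th" is an assumed extension.
    """
    if load_model:
        return []
    removed = []
    for path in sorted(glob.glob(os.path.join(run_dir, pattern))):
        os.remove(path)
        removed.append(os.path.basename(path))
    return removed


def demo():
    """Create three fake checkpoints, then call the cleanup with and without load_model."""
    with tempfile.TemporaryDirectory() as run_dir:
        for epoch in range(1, 4):
            open(os.path.join(run_dir, "model_epoch_%d.th" % epoch), "w").close()
        kept = clear_stale_checkpoints(run_dir, load_model=True)
        removed = clear_stale_checkpoints(run_dir, load_model=False)
        return kept, removed, os.listdir(run_dir)
```

With `load_model=True` nothing is touched, so resuming still works; with `load_model=False` the run directory starts clean and a later resume can only find checkpoints written with the current settings.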

[CLOSED] Save optimizer and LR scheduler state

Issue by W4ngatang
Friday Jun 22, 2018 at 17:14 GMT
Originally opened as nyu-mll/jiant#7

The current code to save and load optimizer and scheduler states is clunky and/or broken. Find a more pytorch-onic way to do it.

[CLOSED] Stream data from disk instead of loading entirely to RAM

Issue by iftenney
Wednesday Jun 27, 2018 at 16:05 GMT
Originally opened as nyu-mll/jiant#49

Training on MultiNLI w/ELMo takes 65G of RAM, which is a lot - this will get painful with larger datasets. See if we can stream from disk instead of loading the whole set into memory.
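
The general idea of streaming can be sketched with Python generators, which hold only the current batch in memory. The JSON-lines file format and the function names here are assumptions chosen to keep the example self-contained; they are not how jiant stores its data.

```python
import json
import os
import tempfile


def stream_examples(path):
    """Yield one parsed example at a time from a JSON-lines file."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def batches(examples, batch_size):
    """Group a (possibly lazy) iterable of examples into lists of at most batch_size."""
    batch = []
    for example in examples:
        batch.append(example)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def demo():
    """Write five examples to disk, then read them back in batches of two."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "train.jsonl")
        with open(path, "w") as f:
            for i in range(5):
                f.write(json.dumps({"id": i}) + "\n")
        return [[ex["id"] for ex in batch] for batch in batches(stream_examples(path), 2)]
```

The trade-off is that shuffling across the whole dataset is no longer free; a streaming reader typically shuffles within a buffer or relies on pre-shuffled files.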

[CLOSED] Use ELMo character handling in all experiments, set as default

Issue by sleepinyourhat
Tuesday Jun 26, 2018 at 18:16 GMT
Originally opened as nyu-mll/jiant#30

[CLOSED] Train a GLUE MTL run for use in analysis fine-tuning.

Issue by sleepinyourhat
Tuesday Jun 26, 2018 at 18:22 GMT
Originally opened as nyu-mll/jiant#33

Unassigned. Can start after the main big experiment.

[CLOSED] Masked transformer

Issue by pitrack
Saturday Jun 23, 2018 at 04:03 GMT
Originally opened as nyu-mll/jiant#10

Mostly code taken out from allennlp and modified with a function from the subsequent mask part of The Annotated Transformer.

Ideally this would just be a flag in the allennlp library.

Confirmed that this runs. Did not confirm that it is correct (there's a chance that it is off-by-one, but I don't think so). Also seems sensitive to hyperparameters (not doing better than chance with the default settings).
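
For reference, the shape of a subsequent (causal) mask can be written in a few lines of plain Python. This sketch keeps the diagonal, so each position can attend to itself and to earlier positions; whether the diagonal belongs in the mask is exactly where an off-by-one error would show up. The function below is an illustration, not the code from the pull request.

```python
def subsequent_mask(size):
    """Return a size x size boolean mask where mask[i][j] is True iff position i may attend to j.

    Position i sees positions 0..i (itself included) and nothing later.
    """
    return [[j <= i for j in range(size)] for i in range(size)]
```

Printing the mask for a short sequence and checking the first and last rows by hand is a quick way to verify which convention a given implementation follows.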

pitrack included the following code: https://github.com/nyu-mll/jiant/pull/10/commits

[CLOSED] Dis sent

Issue by tommccoy1
Tuesday Jun 26, 2018 at 18:21 GMT
Originally opened as nyu-mll/jiant#31

Added handling for multiple DisSent corpora

tommccoy1 included the following code: https://github.com/nyu-mll/jiant/pull/31/commits

[CLOSED] modify AllenNLP RNN decoder to fit in framework (sequence generation tasks)

Issue by W4ngatang
Monday Jun 25, 2018 at 05:17 GMT
Originally opened as nyu-mll/jiant#17

Specifically for translation, denoising tasks. The AllenNLP RNN decoder should have beam search already implemented, but you might need to poke around for validation metrics such as BLEU.

[CLOSED] Configuration files using pyhocon

Issue by iftenney
Wednesday Jun 27, 2018 at 01:15 GMT
Originally opened as nyu-mll/jiant#43

iftenney included the following code: https://github.com/nyu-mll/jiant/pull/43/commits

[CLOSED] Delete checkpoints when reinitializing.

Issue by sleepinyourhat
Friday Jun 22, 2018 at 17:23 GMT
Originally opened as nyu-mll/jiant#8

Fix for #5

sleepinyourhat included the following code: https://github.com/nyu-mll/jiant/pull/8/commits

[CLOSED] STS-B classifier is broken

Issue by sleepinyourhat
Wednesday Jun 27, 2018 at 14:30 GMT
Originally opened as nyu-mll/jiant#47

06/27 10:18:45 AM: Beginning training. Stopping metric: sts-b_corr
Traceback (most recent call last):
  File "main.py", line 185, in <module>
    sys.exit(main(sys.argv[1:]))
  File "main.py", line 162, in main
    args.shared_optimizer, load_model=False, phase="eval")
  File "/Users/Bowman/Drive/JSALT/jiant/src/trainer.py", line 275, in train
    output_dict = self._forward(batch, task=task, for_training=True)
  File "/Users/Bowman/Drive/JSALT/jiant/src/trainer.py", line 521, in _forward
    return self._model.forward(task, tensor_batch) # , **tensor_batch)
  File "/Users/Bowman/Drive/JSALT/jiant/src/models.py", line 355, in forward
    out = self._pair_regression_forward(batch, task)
  File "/Users/Bowman/Drive/JSALT/jiant/src/models.py", line 433, in _pair_regression_forward
    logits = classifier(s1, s2, s1_mask, s2_mask)
  File "/Users/Bowman/anaconda3/envs/jiant/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
TypeError: forward() takes 2 positional arguments but 5 were given

@hyinghui, @W4ngatang, anyone else who's seen this code—could you take a look?
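
The error message itself narrows things down: a `forward()` that takes 2 positional arguments accepts `self` plus one input, while the call site passes four inputs (`s1, s2, s1_mask, s2_mask`). The mismatch can be reproduced without PyTorch; the class below is a hypothetical stand-in, not the actual STS-B classifier.

```python
class SingleInputRegressor:
    """Hypothetical stand-in for a module whose forward() accepts one input."""

    def forward(self, pair_emb):
        return pair_emb

    def __call__(self, *inputs, **kwargs):
        # Mirrors how the traceback's __call__ passes all arguments on to forward().
        return self.forward(*inputs, **kwargs)


def reproduce():
    """Call the one-input module with four inputs and return the resulting error text."""
    classifier = SingleInputRegressor()
    try:
        classifier("s1", "s2", "s1_mask", "s2_mask")
    except TypeError as err:
        return str(err)
    return None
```

This suggests the regression path is paired with a classifier module built for a single pre-combined input, while the call site assumes a pair classifier that takes both sentences and their masks.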

[CLOSED] Add MT pretraining task

Issue by EdouardGrave
Tuesday Jun 26, 2018 at 23:35 GMT
Originally opened as nyu-mll/jiant#41

This PR adds the data loader and model for MT pre-training tasks. The decoder is a slightly modified version of the SimpleSeq2Seq class from AllenNLP.

EdouardGrave included the following code: https://github.com/nyu-mll/jiant/pull/41/commits

[CLOSED] Training for multiple tasks in the eval phase is broken.

Issue by sleepinyourhat
Tuesday Jun 26, 2018 at 20:56 GMT
Originally opened as nyu-mll/jiant#38

The 'best checkpoint' handling logic has a loop nesting issue that means that all models are evaluated on whatever checkpoint was best for the last model to be trained. @epavlick - take a look/let me know when you've pushed your related changes.
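
A loop nesting issue of this kind can be illustrated as follows. The sketch is a guess at the general shape of the bug based on the description above, with stand-in training and evaluation functions; it is not the trainer's actual code.

```python
def evaluate_buggy(tasks, train_fn, eval_fn):
    """Evaluation sits outside the training loop, so every task sees the last 'best' checkpoint."""
    best = None
    for task in tasks:
        best = train_fn(task)
    return {task: eval_fn(task, best) for task in tasks}


def evaluate_fixed(tasks, train_fn, eval_fn):
    """Each task is evaluated on the best checkpoint from its own training run."""
    results = {}
    for task in tasks:
        best = train_fn(task)
        results[task] = eval_fn(task, best)
    return results


def demo():
    """Run both variants with stand-ins that simply report which checkpoint was used."""
    tasks = ["rte", "sts-b"]

    def train_fn(task):
        return "best_" + task  # stand-in for training; returns a checkpoint name

    def eval_fn(task, checkpoint):
        return checkpoint  # stand-in for evaluation; reports the checkpoint used

    return evaluate_buggy(tasks, train_fn, eval_fn), evaluate_fixed(tasks, train_fn, eval_fn)
```

In the buggy variant both tasks report the checkpoint of the last task trained; in the fixed variant each task reports its own.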