
nyu-mll / jiant


jiant is an NLP toolkit.

Home Page: https://jiant.info

License: MIT License

Python 98.74% Shell 1.26%
bert multitask-learning nlp sentence-representation transfer-learning transformers

jiant's People

Contributors

anhad13, berlinchen7, bordias, dependabot[bot], edouardgrave, epavlick, haokunliu, hyinghui, iftenney, iwontbecreative, jeswan, kelina, najoungkim, narsil, njjiang, pappagari, phu-pmh, pitrack, pyeres, roma-patel, sheng-fu, shuningjin, sleepinyourhat, tommccoy1, w4ngatang, wh629, woollysocks, yukatherin, yzpang, zphang


jiant's Issues

train_tasks = glue, error

The config file <defaults2.conf> is attached (delete the ".txt" suffix to use it):
defaults2.conf.txt

This works fine with train_tasks = "sst,mrpc,rte".
With train_tasks = glue, it fails with the error below:

Traceback (most recent call last):
  File "src/main.py", line 207, in <module>
    main(sys.argv[1:])
  File "src/main.py", line 156, in main
    args.shared_optimizer, args.load_model, phase="main")
  File "/home/sa_112949933820817028848/jiant/src/trainer.py", line 333, in train
    output_dict = self._forward(batch, task=task, for_training=True)
  File "/home/sa_112949933820817028848/jiant/src/trainer.py", line 604, in _forward
    return self._model.forward(task, tensor_batch) # , **tensor_batch)
  File "/home/sa_112949933820817028848/jiant/src/models.py", line 351, in forward
    out = self._single_sentence_forward(batch, task)
  File "/home/sa_112949933820817028848/jiant/src/models.py", line 385, in _single_sentence_forward
    task.scorer1(labels, preds.data.cpu().numpy())
TypeError: __call__() takes 2 positional arguments but 3 were given
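
For reference, this kind of signature mismatch looks like the sketch below. It is only illustrative, assuming scorer1 for one of the GLUE tasks is a single-value metric such as allennlp's Average while classification tasks use metrics like CategoricalAccuracy; the actual scorer types in jiant may differ.

    from allennlp.training.metrics import Average, CategoricalAccuracy
    import torch

    avg = Average()
    avg(0.5)              # fine: Average.__call__ accepts a single value
    # avg(labels, preds)  # would raise: __call__() takes 2 positional arguments but 3 were given

    acc = CategoricalAccuracy()
    logits = torch.randn(4, 2)
    labels = torch.tensor([0, 1, 1, 0])
    acc(logits, labels)   # classification metrics expect (predictions, gold_labels)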

wikiedits pretraining task

A modified seq2seq task that feeds the sentence encoding plus a pointer to the insertion location into an MLP before decoding, in order to predict spans inserted at arbitrary points in the sentence. A sketch of the idea is below.
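
A minimal PyTorch sketch of one way to wire this up (module and dimension names are made up for illustration, not jiant code):

    import torch
    import torch.nn as nn

    class InsertionBridge(nn.Module):
        """Hypothetical sketch: combine a sentence encoding with an embedding of
        the insertion position and map the result through an MLP to produce the
        initial state for the span decoder."""
        def __init__(self, enc_dim, max_positions, hidden_dim, dec_dim):
            super().__init__()
            self.pos_emb = nn.Embedding(max_positions, enc_dim)
            self.mlp = nn.Sequential(
                nn.Linear(2 * enc_dim, hidden_dim),
                nn.Tanh(),
                nn.Linear(hidden_dim, dec_dim),
            )

        def forward(self, sent_enc, insert_pos):
            # sent_enc: (batch, enc_dim); insert_pos: (batch,) indices into the sentence
            pointer = self.pos_emb(insert_pos)
            return self.mlp(torch.cat([sent_enc, pointer], dim=-1))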

Final reported number in eval phase may not match best reported number

The cola_mcc in the bottom (eval-phase) row shouldn't be lower than the best value reported during validation. This run trains on mnli-fiction and evaluates on cola.

/nfs/jsalt/exp/sam-gpu1b-4/tuning_fine_tuning/cola_st/log.log

***** VALIDATION RESULTS *****
cola_mcc, 17, cola_loss: 0.65820, macro_avg: 0.27127, micro_avg: 0.27127, cola_mcc: 0.27127, cola_accuracy: 0.66059
micro_avg, 17, cola_loss: 0.65820, macro_avg: 0.27127, micro_avg: 0.27127, cola_mcc: 0.27127, cola_accuracy: 0.66059
macro_avg, 17, cola_loss: 0.65820, macro_avg: 0.27127, micro_avg: 0.27127, cola_mcc: 0.27127, cola_accuracy: 0.66059
Loaded model state from /nfs/jsalt/exp/sam-gpu1b-4/tuning_fine_tuning/cola_st/model_state_eval_best.th
Evaluating...
micro_accuracy: 0.382, macro_accuracy: 0.351, mnli-fiction_accuracy: 0.452, cola_mcc: 0.250, cola_accuracy: 0.711

Worry about storage

Preprocessed copies of some datasets (LM, MT) are very large. Not a problem to store a few copies, but we'll use up our NFS volume very quickly if every run results in a ~70G copy of the data.

Proposal: implement a global preprocessing directory that stores a single copy, which the trainer falls back to if a local copy is not found (a sketch follows). exp_dir already provides a mechanism for this, but it's difficult to share one directory across multiple workers.
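
A minimal sketch of the proposed lookup; the shared path and function name are placeholders, not existing jiant options:

    import os

    def find_preproc_dir(task_name, local_dir, global_dir="/nfs/jsalt/share/preproc"):
        """Prefer a run-local preprocessed copy; fall back to a shared read-only copy."""
        local_path = os.path.join(local_dir, task_name)
        if os.path.isdir(local_path):
            return local_path
        shared_path = os.path.join(global_dir, task_name)
        if os.path.isdir(shared_path):
            return shared_path
        return None  # caller preprocesses from scratch into local_dir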

bidirectional rnn language model

The standard PyTorch RNN won't work for multi-layer bidirectional language models because the directions are aggregated between layers. See the ELMo LSTM for an example of how to handle this (and maybe an easy workaround); a rough sketch is below.
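
A minimal sketch of the workaround, assuming we simply run two separate multi-layer LSTMs (this is the ELMo-style idea, not jiant code):

    import torch
    import torch.nn as nn

    class BiLMEncoder(nn.Module):
        """Run separate multi-layer LSTMs over the forward and reversed sequence so
        the two directions never mix between layers, unlike nn.LSTM(bidirectional=True)."""
        def __init__(self, input_dim, hidden_dim, num_layers):
            super().__init__()
            self.fwd = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
            self.bwd = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)

        def forward(self, embs):
            fwd_out, _ = self.fwd(embs)
            bwd_out, _ = self.bwd(torch.flip(embs, dims=[1]))
            bwd_out = torch.flip(bwd_out, dims=[1])
            # Predict token t+1 from fwd_out[:, t] and token t-1 from bwd_out[:, t];
            # keeping the streams separate means neither direction sees its target token.
            return fwd_out, bwd_out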

Write a shell script to extract a spreadsheet line from a log file

Desired output:
$ extract_results.sh /nfs/jsalt/exp/sam-*/tuning_*/*/log.log
Ellie 7/2/2018 mnli N Y N 0.2 58.8 59.0 2.8 82.7 71.1 76.4 53.1 51.1 62.2 77.3 56 43.7 81.5 79.5 gs://jsalt-models/ellie_runs/take2_base_transformer_lowlr
Ellie 7/2/2018 mnli N Y N 0.2 57.8 60.5 0 83.6 71.3 76.8 56.9 54.6 61.4 76.8 56 56.3 81.8 79.6 gs://jsalt-models/ellie_runs/take2_base_transformer_lowwarm
...

The results can then be pasted into the spreadsheet directly. Ideally it should also be able to extract results from runs that have finished some but not all GLUE tasks.
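
A rough Python sketch of the log-parsing half (the log-line format is inferred from the samples in this issue; the run-metadata columns in the desired output are not handled here):

    import re
    import sys

    def extract_results(log_path):
        """Collect metric values from a jiant log; later lines overwrite earlier ones,
        so the final eval summary wins when it is present."""
        metrics = {}
        with open(log_path) as f:
            for line in f:
                if "_accuracy:" in line or "_mcc:" in line:
                    for name, value in re.findall(r"(\S+): ([\d.]+)", line):
                        metrics[name] = value
        return metrics

    if __name__ == "__main__":
        for path in sys.argv[1:]:
            results = extract_results(path)
            print(path, "\t".join(results.get(k, "") for k in sorted(results)))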

My model is too big.

350M+ parameters with ELMo and attention, many of them in places where they don't seem very useful.
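
A quick diagnostic sketch for finding where the parameters actually live (standard PyTorch, not jiant-specific):

    def report_param_counts(model):
        """Print the parameter count of the model and of each top-level submodule."""
        total = sum(p.numel() for p in model.parameters())
        print(f"total: {total / 1e6:.1f}M")
        for name, module in model.named_children():
            count = sum(p.numel() for p in module.parameters())
            print(f"{name}: {count / 1e6:.1f}M")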

PyTorch clip warning

Fix this:

.../src/trainer.py:258: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.
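
The fix is mechanical: switch to the in-place-named replacement. A minimal, self-contained example (the model and threshold here are stand-ins, not the trainer's):

    import torch
    import torch.nn as nn

    model = nn.Linear(4, 2)                   # stand-in for the real model
    loss = model(torch.randn(3, 4)).sum()
    loss.backward()

    # Old (deprecated): torch.nn.utils.clip_grad_norm(model.parameters(), 5.0)
    torch.nn.utils.clip_grad_norm_(model.parameters(), 5.0)  # drop-in replacement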

Checkpoints are huge.

100MB for the demo setup. Are we saving anything we shouldn't, like the ELMo char layer?
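
A diagnostic sketch for seeing what dominates a checkpoint, assuming it is a flat state dict saved with torch.save (jiant's actual checkpoint layout may differ):

    import torch

    def report_checkpoint_sizes(path):
        """Approximate bytes per top-level parameter prefix in a saved state dict,
        to spot unexpectedly large components (e.g. the ELMo char CNN)."""
        state = torch.load(path, map_location="cpu")
        sizes = {}
        for key, tensor in state.items():
            prefix = key.split(".")[0]
            sizes[prefix] = sizes.get(prefix, 0) + tensor.numel() * tensor.element_size()
        for prefix, nbytes in sorted(sizes.items(), key=lambda kv: -kv[1]):
            print(f"{prefix}: {nbytes / 1e6:.1f} MB")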

find sensible transformer params

There are a few transformer parameters we've been avoiding (projection dimension, feedforward dimension) by setting them all to the same value, in addition to the training hyperparameters. We should go through the available literature and find sensible defaults.
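
One possible starting point (an assumption, not a jiant default) is the "base" configuration from Vaswani et al. (2017), expressed here as a plain dict:

    transformer_defaults = {
        "d_model": 512,        # model / embedding dimension
        "d_proj": 64,          # per-head projection dimension (d_model / n_heads)
        "d_ff": 2048,          # feedforward (inner) dimension
        "n_heads": 8,
        "n_layers": 6,
        "dropout": 0.1,
        "warmup_steps": 4000,  # for the Noam learning-rate schedule
    }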

Runs can fail when started in quick succession with shared exp_name. Race condition?

One recent example:

07/02 05:17:40 PM: Fatal error in main():
Traceback (most recent call last):
  File "main.py", line 207, in <module>
    main(sys.argv[1:])
  File "main.py", line 105, in main
    train_tasks, eval_tasks, vocab, word_embs = build_tasks(args)
  File "/home/sbowman/jiant/src/preprocess.py", line 170, in build_tasks
    load_pkl=bool(not args.reload_tasks))
  File "/home/sbowman/jiant/src/preprocess.py", line 298, in get_tasks
    os.mkdir(task_scratch_path)
FileExistsError: [Errno 17] File exists: '/misc/vlgscratch4/BowmanGroup/sbowman/exp//main-random/SST-2/'

Solved by restarting the later runs.
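
This looks like a race between workers calling os.mkdir on the same scratch path. A minimal fix sketch (making directory creation idempotent instead of restarting runs; the path below is just an example):

    import os

    task_scratch_path = "/tmp/exp/main-random/SST-2"  # example path
    # os.mkdir(task_scratch_path) raises FileExistsError if another worker wins the race;
    # exist_ok makes directory creation idempotent.
    os.makedirs(task_scratch_path, exist_ok=True)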

separate training parameters for train tasks and eval tasks

We want to allow more fine-grained control over the training parameters used for the main tasks we're training on versus the auxiliary tasks we're evaluating on.

If we're not training on multiple tasks, it would probably make sense to switch to a deterministic trainer that validates after passing through the entire training set (as opposed to a fixed number of batches).

Scheduler params

Two issues (a sketch of the second follows the list):

  • when using the Transformer, we use the NoamScheduler, but it should take a step on every update, not on every validation check
  • when using ReduceLROnPlateau, we assume mode 'max', but when training some tasks on their own we want to minimize the metric (e.g. perplexity)
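
A minimal sketch of the second point (the optimizer and metric values are placeholders):

    import torch

    params = [torch.nn.Parameter(torch.zeros(1))]
    optimizer = torch.optim.SGD(params, lr=0.1)

    # For metrics we want to minimize (e.g. perplexity), mode must be 'min';
    # with the default 'max' the scheduler treats rising perplexity as progress.
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="min", factor=0.5, patience=2)

    for val_perplexity in [12.0, 11.5, 11.6, 11.7]:
        scheduler.step(val_perplexity)  # called once per validation check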

Test mode

A feature that could save us a lot of grief later today: set up a flag to load no more than ~1000 (or some other fixed number of) examples from each data file. This will let people make sure it's possible to train the standard model on their tasks without first waiting through potentially slow preprocessing. A sketch is below.
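
A minimal sketch of the idea, assuming a TSV loader (the function name and flag are hypothetical, not existing jiant options):

    def load_tsv_truncated(path, max_examples=1000):
        """Stop reading each data file after max_examples rows so preprocessing and a
        smoke-test training run finish quickly; pass None to load everything."""
        rows = []
        with open(path) as f:
            for i, line in enumerate(f):
                if max_examples is not None and i >= max_examples:
                    break
                rows.append(line.rstrip("\n").split("\t"))
        return rows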

Worry about speed

Not top priority, but the largest model gets about 150 steps per minute, so a large training run (500k steps) could take two or three days. If anyone has spare bandwidth, do some CPU profiling and make sure we're not wasting time on anything. If you're very bored, try some GPU profiling too, though I doubt there's much to optimize there.
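
A starting point for the CPU profiling, using the standard-library profiler (the profiled function here is a stand-in for one training/validation interval):

    import cProfile
    import pstats

    def run_training_interval():
        # Stand-in for one validation interval of trainer.train(); replace with the real call.
        sum(i * i for i in range(10**6))

    cProfile.run("run_training_interval()", "profile.out")
    stats = pstats.Stats("profile.out")
    stats.sort_stats("cumulative").print_stats(30)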

STS-B classifier is broken

06/27 10:18:45 AM: Beginning training. Stopping metric: sts-b_corr
Traceback (most recent call last):
  File "main.py", line 185, in <module>
    sys.exit(main(sys.argv[1:]))
  File "main.py", line 162, in main
    args.shared_optimizer, load_model=False, phase="eval")
  File "/Users/Bowman/Drive/JSALT/jiant/src/trainer.py", line 275, in train
    output_dict = self._forward(batch, task=task, for_training=True)
  File "/Users/Bowman/Drive/JSALT/jiant/src/trainer.py", line 521, in _forward
    return self._model.forward(task, tensor_batch) # , **tensor_batch)
  File "/Users/Bowman/Drive/JSALT/jiant/src/models.py", line 355, in forward
    out = self._pair_regression_forward(batch, task)
  File "/Users/Bowman/Drive/JSALT/jiant/src/models.py", line 433, in _pair_regression_forward
    logits = classifier(s1, s2, s1_mask, s2_mask)
  File "/Users/Bowman/anaconda3/envs/jiant/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
TypeError: forward() takes 2 positional arguments but 5 were given

@hyinghui, @W4ngatang, anyone else who's seen this code—could you take a look?
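
For illustration only (this is not the jiant code, and the right fix may be at the call site instead): the error means the regression head's forward takes a single argument while _pair_regression_forward calls it with four. A head whose forward matches that call would look something like this:

    import torch
    import torch.nn as nn

    class PairRegressionHead(nn.Module):
        """Hypothetical head whose signature matches classifier(s1, s2, s1_mask, s2_mask)."""
        def __init__(self, d_inp):
            super().__init__()
            self.proj = nn.Linear(4 * d_inp, 1)

        def forward(self, s1, s2, s1_mask, s2_mask):
            # Mean-pool inside the head so the call site does not need to change.
            p1 = (s1 * s1_mask.unsqueeze(-1)).sum(1) / s1_mask.sum(1, keepdim=True)
            p2 = (s2 * s2_mask.unsqueeze(-1)).sum(1) / s2_mask.sum(1, keepdim=True)
            return self.proj(torch.cat([p1, p2, torch.abs(p1 - p2), p1 * p2], dim=-1))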

TensorboardX support?

This would be useful for monitoring and debugging models while training. Not sure how much work it is; a minimal integration is sketched below.
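
A minimal sketch with tensorboardX (the log directory, tags, and values are placeholders):

    from tensorboardX import SummaryWriter

    writer = SummaryWriter(log_dir="/nfs/jsalt/exp/some_run/tensorboard")  # placeholder path
    # Inside the validation loop, log whatever metrics we already compute:
    writer.add_scalar("cola/mcc", 0.271, global_step=17)
    writer.add_scalar("cola/loss", 0.658, global_step=17)
    writer.close()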

Delete checkpoints when starting training from scratch

If LOAD_MODEL is 0, all existing checkpoints should be deleted (a sketch of the cleanup follows the list below). Otherwise, you can wind up in an odd state (easy to hit with demo experiments):

  1. Train model for 10 epochs with settings S.
  2. Save model checkpoints for epochs 1–10.
  3. Train new model for 5 epochs with settings S', and LOAD_MODEL=0.
  4. Save model checkpoints for epoch 1–5, overwriting old checkpoints.
  5. Try to continue training the new model with LOAD_MODEL=1. Wind up using settings S' but loading the highest-numbered epoch checkpoint, which was created with settings S. Crash.
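
A sketch of the proposed cleanup; the file patterns are assumptions based on the model_state_eval_best.th naming seen in these logs, not a definitive list:

    import glob
    import os

    def clear_checkpoints(run_dir):
        """Wipe stale checkpoints when a run starts with LOAD_MODEL=0 so later
        resumes cannot pick up checkpoints written under old settings."""
        for pattern in ("model_state_*.th", "training_state_*.th", "metric_state_*.th"):
            for path in glob.glob(os.path.join(run_dir, pattern)):
                os.remove(path)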

Parameters for sequence generation, LM tasks won't fine-tune.

This is not really an issue if we're only fine-tuning on GLUE tasks, but:

At fine-tuning time, we find the parameters to fine-tune by looking up the model attribute "%s_mdl" % task.name, but for several task types the attributes to be trained are stored under different names, e.g. "%s_decoder".

Suggested solution: wrap the task-specific components in an nn.Module so their parameters are easy to collect (see the sketch below).
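
A sketch of the suggested wrapper (the class and attribute names are illustrative, not jiant's):

    import torch.nn as nn

    class TaskHead(nn.Module):
        """Keep all task-specific pieces inside one wrapper module so
        getattr(model, "%s_mdl" % task.name) always returns something whose
        .parameters() covers everything to fine-tune, decoder or classifier alike."""
        def __init__(self, **components):
            super().__init__()
            self.components = nn.ModuleDict(components)

        def forward(self, name, *args, **kwargs):
            return self.components[name](*args, **kwargs)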

Crashing when trying to use only char embs in demo.sh

bash ./demo.sh -w none
[...]
Traceback (most recent call last):
  File "../src/main.py", line 236, in <module>
    sys.exit(main(sys.argv[1:]))
  File "../src/main.py", line 186, in main
    args.shared_optimizer, args.load_model)
  File "/Users/Bowman/Drive/JSALT/jiant/src/trainer.py", line 273, in train
    output_dict = self._forward(batch, task=task, for_training=True)
  File "/Users/Bowman/Drive/JSALT/jiant/src/trainer.py", line 504, in _forward
    return self._model.forward(task, tensor_batch) # , **tensor_batch)
  File "/Users/Bowman/Drive/JSALT/jiant/src/models.py", line 291, in forward
    out = self._single_classification_forward(batch, task)
  File "/Users/Bowman/Drive/JSALT/jiant/src/models.py", line 309, in _single_classification_forward
    sent_embs, sent_mask = self.sent_encoder(batch['input1'])
  File "/Users/Bowman/anaconda3/envs/jiant/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/Bowman/Drive/JSALT/jiant/src/modules.py", line 75, in forward
    sent_embs = self._highway_layer(self._text_field_embedder(sent))
  File "/Users/Bowman/anaconda3/envs/jiant/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/Bowman/anaconda3/envs/jiant/lib/python3.6/site-packages/allennlp/modules/text_field_embedders/basic_text_field_embedder.py", line 63, in forward
    raise ConfigurationError(message)
allennlp.common.checks.ConfigurationError: "Mismatched token keys: dict_keys(['chars']) and dict_keys(['words', 'chars'])"

Clarify `epoch`

"Epoch" is used in a few places to mean the number of validations done so far, which is confusing. Rename it and/or add some documentation?

AllenNLP vocabulary warning.

See if this is worth worrying about:

06/18 06:07:17 PM: Your label namespace was 'idxs'. We recommend you use a namespace ending with 'labels' or 'tags', so we don't add UNK and PAD tokens by default to your vocabulary. See documentation for non_padded_namespaces parameter in Vocabulary.
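
If we do act on it, the warning points at the non_padded_namespaces parameter. A minimal sketch, assuming we don't want padding/UNK tokens added to the 'idxs' namespace:

    from allennlp.data import Vocabulary

    # Keep AllenNLP's defaults and additionally mark 'idxs' as non-padded so no
    # @@PADDING@@ / @@UNKNOWN@@ tokens are added to that namespace.
    vocab = Vocabulary(non_padded_namespaces=("*tags", "*labels", "idxs"))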
