Comments (3)
Since a very recent version of PyTorch, the gradient buffers are only initialized when needed, so param.grad can be None.
I found a workaround by adding a check before clipping the gradients in nnsum/trainer/labels_mle_trainer.py:
if hasattr(param.grad, 'data'):
nnsum/nnsum/trainer/labels_mle_trainer.py
Lines 187 to 188 in 4660dc2
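For context, a minimal sketch of what the guarded clipping could look like (the function name, loop structure, and clip value are illustrative assumptions, not the exact code in labels_mle_trainer.py):
def clip_gradients(model, clip_value=5.0):
    for param in model.parameters():
        # In recent PyTorch, .grad stays None until a gradient has been
        # accumulated, so guard before touching .data.
        if param.grad is None:  # i.e. not hasattr(param.grad, 'data')
            continue
        param.grad.data.clamp_(-clip_value, clip_value)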
from nnsum.
Hello,
I'd like to test on the CNN-DailyMail dataset, but I cannot download it, so I used the Reddit dataset to test the code instead. I ran the following command:
python script_bin/train_model.py \
--trainer --train-inputs ../dataset/summarization/nnsum/reddit/reddit/inputs/train \
--train-labels ../dataset/summarization/nnsum/reddit/reddit/abs_lables/train \
--valid-inputs ../dataset/summarization/nnsum/reddit/reddit/inputs/valid \
--valid-labels ../dataset/summarization/nnsum/reddit/reddit/abs_lables/valid \
--valid-refs ../dataset/summarization/nnsum/reddit/reddit/human-abstracts/valid \
--weighted \
--gpu 0 \
--model ../dataset/summarization/nnsum/reddit/model \
--results ../dataset/summarization/nnsum/reddit/val_score \
--seed 12345678 \
--emb --pretrained-embeddings ../dataset/embedding/eng_word_embedding.word2vec.vec \
--enc cnn \
--ext s2s --bidirectional
However, a problem with the references occurred:
{'train_inputs': PosixPath('../dataset/summarization/nnsum/reddit/reddit/inputs/train'), 'train_labels': PosixPath('../dataset/summarization/nnsum/reddit/reddit/abs_lables/train'), 'valid_inputs': PosixPath('../dataset/summarization/nnsum/reddit/reddit/inputs/valid'), 'valid_labels': PosixPath('../dataset/summarization/nnsum/reddit/reddit/abs_lables/valid'), 'valid_refs': PosixPath('../dataset/summarization/nnsum/reddit/reddit/human-abstracts/valid'), 'seed': 12345678, 'epochs': 50, 'batch_size': 32, 'gpu': 0, 'teacher_forcing': 25, 'sentence_limit': 50, 'weighted': True, 'loader_workers': 8, 'raml_samples': 25, 'raml_temp': 0.05, 'summary_length': 100, 'remove_stopwords': False, 'shuffle_sents': False, 'model': PosixPath('../dataset/summarization/nnsum/reddit/model'), 'results': PosixPath('../dataset/summarization/nnsum/reddit/val_score')}
{'embedding_size': 200, 'pretrained_embeddings': '../dataset/embedding/eng_word_embedding.word2vec.vec', 'top_k': None, 'at_least': 1, 'word_dropout': 0.0, 'embedding_dropout': 0.25, 'update_rule': 'fix-all', 'filter_pretrained': False}
{'dropout': 0.25, 'filter_windows': [1, 2, 3, 4, 5, 6], 'feature_maps': [25, 25, 50, 50, 50, 50], 'OPT': 'cnn'}
{'hidden_size': 300, 'bidirectional': True, 'rnn_dropout': 0.25, 'num_layers': 1, 'cell': 'gru', 'mlp_layers': [100], 'mlp_dropouts': [0.25], 'OPT': 's2s'}
Initializing vocabulary and embeddings.
INFO:root: Reading pretrained embeddings from ../dataset/embedding/eng_word_embedding.word2vec.vec
INFO:root: Read 559185 embeddings of size 200
INFO:root: EmbeddingContext(
(embeddings): Embedding(559185, 200, padding_idx=0)
)
Loading training data.
Loading validation data.
Traceback (most recent call last):
File "script_bin/train_model.py", line 79, in <module>
main()
File "script_bin/train_model.py", line 48, in main
sentence_limit=args["trainer"]["sentence_limit"])
File "/home/constant/anaconda3/lib/python3.7/site-packages/nnsum-1.0-py3.7.egg/nnsum/data/summarization_dataset.py", line 30, in __init__
File "/home/constant/anaconda3/lib/python3.7/site-packages/nnsum-1.0-py3.7.egg/nnsum/data/summarization_dataset.py", line 59, in _collect_references
Exception: No references found for example id: 12he9h.32
I looked at the data files; there are 12he9h.32.a.txt and 12he9h.32.d.txt in the path reddit/human-abstracts/valid, so I do not know where the problem is.
Thanks for any help! @kedz
from nnsum.
@tlifcen I encountered the same problem.
When you run python script_bin/train_model.py, make sure you are using the correct paths.
The relative paths are resolved from your current working directory (in my case, the python_script directory).
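A minimal way to check where those relative paths actually resolve to (an illustrative snippet, not part of nnsum):
import pathlib
# If any of these print MISSING, the command is being run from the wrong directory.
for p in ["../dataset/summarization/nnsum/reddit/reddit/inputs/train",
          "../dataset/summarization/nnsum/reddit/reddit/human-abstracts/valid"]:
    path = pathlib.Path(p).resolve()
    print(path, "exists" if path.exists() else "MISSING")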
from nnsum.
Related Issues (13)
- Add link to Arxiv paper HOT 1
- ERROR:ignite.engine.engine.Engine:Engine run is terminating due to exception: Loss must have at least one example before it can be computed HOT 1
- About result on CNN/DM(non-anonymized)result using SummaRuRNN? HOT 12
- Test problem HOT 8
- Pytorch Lightning as a back-end
- custom data
- Problem downloading PubMed dataset
- What Format should be the training data? json or bin files?
- how to use the rouge_papier to test lead algorithm
- ERROR:ignite.engine.engine.Engine:Current run is terminating due to exception: 'NoneType' object has no attribute 'data'. ERROR:ignite.engine.engine.Engine:Engine run is terminating due to exception: 'NoneType' object has no attribute 'data'. HOT 1
- ERROR:ignite.engine.engine.Engine:Engine run is terminating due to exception: 'rouge'. HOT 2
- ImportError: cannot import name '_to_hours_mins_secs' HOT 1