
Comments (3)

avineshpvs commented on July 20, 2024

Since a recent version of PyTorch, gradient buffers are initialized only when needed.
I found a workaround by adding a check before clipping the gradients in nnsum/trainer/labels_mle_trainer.py:

for param in model.parameters():
    # Parameters never touched by backward() have no gradient buffer yet.
    if hasattr(param.grad, 'data'):
        param.grad.data.clamp_(-grad_clip, grad_clip)
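As a self-contained sketch of the guarded clipping (the toy nn.Linear model and the grad_clip value here are stand-ins for the nnsum trainer's setup, not its actual code):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # toy stand-in for the nnsum model
grad_clip = 0.5

# One backward pass so gradient buffers exist for the parameters used.
loss = model(torch.randn(3, 4)).sum()
loss.backward()

# Guarded clipping: parameters never touched by backward() have grad None,
# so clamping unconditionally would raise an AttributeError.
for param in model.parameters():
    if param.grad is not None:
        param.grad.data.clamp_(-grad_clip, grad_clip)
```

Recent PyTorch also ships torch.nn.utils.clip_grad_value_(model.parameters(), grad_clip), which already skips parameters whose gradient is None.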

from nnsum.

tlifcen commented on July 20, 2024

Hello,
I'd like to test the cnn-dailymail dataset, but I cannot download it, so I used the Reddit dataset instead. I ran the following command:

python script_bin/train_model.py \
    --trainer --train-inputs  ../dataset/summarization/nnsum/reddit/reddit/inputs/train \
              --train-labels ../dataset/summarization/nnsum/reddit/reddit/abs_lables/train \
              --valid-inputs ../dataset/summarization/nnsum/reddit/reddit/inputs/valid \
              --valid-labels ../dataset/summarization/nnsum/reddit/reddit/abs_lables/valid \
              --valid-refs ../dataset/summarization/nnsum/reddit/reddit/human-abstracts/valid \
              --weighted \
              --gpu 0 \
              --model ../dataset/summarization/nnsum/reddit/model \
              --results ../dataset/summarization/nnsum/reddit/val_score \
              --seed 12345678 \
    --emb --pretrained-embeddings ../dataset/embedding/eng_word_embedding.word2vec.vec \
    --enc cnn \
    --ext s2s --bidirectional

However, a reference-lookup error occurred:

{'train_inputs': PosixPath('../dataset/summarization/nnsum/reddit/reddit/inputs/train'), 'train_labels': PosixPath('../dataset/summarization/nnsum/reddit/reddit/abs_lables/train'), 'valid_inputs': PosixPath('../dataset/summarization/nnsum/reddit/reddit/inputs/valid'), 'valid_labels': PosixPath('../dataset/summarization/nnsum/reddit/reddit/abs_lables/valid'), 'valid_refs': PosixPath('../dataset/summarization/nnsum/reddit/reddit/human-abstracts/valid'), 'seed': 12345678, 'epochs': 50, 'batch_size': 32, 'gpu': 0, 'teacher_forcing': 25, 'sentence_limit': 50, 'weighted': True, 'loader_workers': 8, 'raml_samples': 25, 'raml_temp': 0.05, 'summary_length': 100, 'remove_stopwords': False, 'shuffle_sents': False, 'model': PosixPath('../dataset/summarization/nnsum/reddit/model'), 'results': PosixPath('../dataset/summarization/nnsum/reddit/val_score')}

{'embedding_size': 200, 'pretrained_embeddings': '../dataset/embedding/eng_word_embedding.word2vec.vec', 'top_k': None, 'at_least': 1, 'word_dropout': 0.0, 'embedding_dropout': 0.25, 'update_rule': 'fix-all', 'filter_pretrained': False}

{'dropout': 0.25, 'filter_windows': [1, 2, 3, 4, 5, 6], 'feature_maps': [25, 25, 50, 50, 50, 50], 'OPT': 'cnn'}

{'hidden_size': 300, 'bidirectional': True, 'rnn_dropout': 0.25, 'num_layers': 1, 'cell': 'gru', 'mlp_layers': [100], 'mlp_dropouts': [0.25], 'OPT': 's2s'}
Initializing vocabulary and embeddings.
INFO:root: Reading pretrained embeddings from ../dataset/embedding/eng_word_embedding.word2vec.vec
INFO:root: Read 559185 embeddings of size 200
INFO:root: EmbeddingContext(
  (embeddings): Embedding(559185, 200, padding_idx=0)
)
Loading training data.
Loading validation data.
Traceback (most recent call last):
  File "script_bin/train_model.py", line 79, in <module>
    main()
  File "script_bin/train_model.py", line 48, in main
    sentence_limit=args["trainer"]["sentence_limit"])
  File "/home/constant/anaconda3/lib/python3.7/site-packages/nnsum-1.0-py3.7.egg/nnsum/data/summarization_dataset.py", line 30, in __init__
  File "/home/constant/anaconda3/lib/python3.7/site-packages/nnsum-1.0-py3.7.egg/nnsum/data/summarization_dataset.py", line 59, in _collect_references
Exception: No references found for example id: 12he9h.32

I looked at the data files; both 12he9h.32.a.txt and 12he9h.32.d.txt are present in reddit/human-abstracts/valid, so I do not know where the problem is.
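For debugging, a minimal stand-alone check of reference lookup by example id (the glob pattern, the find_references helper, and the directory layout are assumptions for illustration, not nnsum's actual code):

```python
import tempfile
from pathlib import Path

def find_references(ref_dir, example_id):
    """Return reference files whose names start with '<example_id>.'."""
    return sorted(Path(ref_dir).glob(example_id + ".*"))

# Hypothetical layout mirroring reddit/human-abstracts/valid
ref_dir = Path(tempfile.mkdtemp())
(ref_dir / "12he9h.32.a.txt").touch()
(ref_dir / "12he9h.32.d.txt").touch()

refs = find_references(ref_dir, "12he9h.32")
print([p.name for p in refs])  # both reference files should match
```

If a check like this finds the files but nnsum still raises the exception, the mismatch is likely in how the dataset loader derives the example id (note the id itself contains a dot).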

Thanks for any help! @kedz


alexookah commented on July 20, 2024

@tlifcen I encountered the same problem.
When you run python script_bin/train_model.py, make sure the relative paths in the command are correct.

They assume your current working directory is the one containing the Python script.
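A small helper to verify the paths before launching training (check_paths is a hypothetical name, and the listed paths just echo two entries from the command above):

```python
from pathlib import Path

def check_paths(paths):
    """Return the paths that do not exist relative to the current directory."""
    return [p for p in paths if not Path(p).exists()]

missing = check_paths([
    "../dataset/summarization/nnsum/reddit/reddit/inputs/train",
    "../dataset/summarization/nnsum/reddit/reddit/inputs/valid",
])
for p in missing:
    print("missing:", p)
```

Running this from the wrong working directory reports every path as missing, which is exactly the symptom behind the "No references found" style of error.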

