bran,patverga

How many epochs did you train for on the CDR corpus?

Question regarding the biaffine score

Hi, I am not sure how to read the equation for the biaffine score:
a_ij = (E_head L)e_tail + (E_head l_b)

What is e_tail and l_b? What are the dimensions of each of these quantities? I assume E_head is the head embedding.
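A minimal sketch of one way to read the equation. It assumes (none of this is confirmed in the thread) that e_head and e_tail are d-dimensional vectors for the head and tail positions, that L is a d x R x d tensor with one d x d slice per relation type, and that l_b is a d x R head-only bias projection. Under those assumptions, each (head, tail) pair receives R scores, one per relation. The function name and all shapes are illustrative.

```python
import numpy as np

def biaffine_scores(E_head, E_tail, L, l_b):
    """Score every (head i, tail j) pair for every relation r.

    Assumed shapes (illustrative, not taken from the bran code):
      E_head: (n, d), E_tail: (n, d), L: (d, R, d), l_b: (d, R)
    Returns an array of shape (n, R, n).
    """
    # (E_head L): contract the head embedding dimension -> (n, R, d)
    head_L = np.einsum("id,drk->irk", E_head, L)
    # (E_head L) e_tail for every tail j -> (n, R, n)
    bilinear = np.einsum("irk,jk->irj", head_L, E_tail)
    # (E_head l_b): head-only bias -> (n, R), broadcast over tails
    bias = E_head @ l_b
    return bilinear + bias[:, :, None]

rng = np.random.default_rng(0)
n, d, R = 4, 8, 3
A = biaffine_scores(rng.normal(size=(n, d)), rng.normal(size=(n, d)),
                    rng.normal(size=(d, R, d)), rng.normal(size=(d, R)))
print(A.shape)  # (4, 3, 4)
```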

Why use the add operator here?

Line 471 in 32378da

result += tf.expand_dims(self.ep_dist_batch, 3)

I am trying to use this code, but I think this line should use a multiply operator, as described in Section 2.4 of the paper. Thank you!
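To make the difference concrete, here is a small NumPy sketch (np.expand_dims adds a broadcastable axis in the same way as tf.expand_dims). The shapes, names, and values are assumptions for illustration and are not read from the bran source. It shows only that adding and multiplying the broadcast term give different results, and that the two coincide when the scores are treated as log-values: adding log(m) to a log-score equals multiplying the raw score by m. Whether that is the intent of the line in question is not stated in this thread.

```python
import numpy as np

# Hypothetical shapes: scores are (batch, n, n, relations),
# the per-pair term is (batch, n, n).
scores = np.full((1, 2, 2, 3), 2.0)
pair_term = np.array([[[0.5, 1.0],
                       [1.0, 0.5]]])

# New trailing axis of size 1, so the term broadcasts over relations.
expanded = np.expand_dims(pair_term, 3)   # shape (1, 2, 2, 1)

added = scores + expanded
multiplied = scores * expanded

# In log space, addition corresponds to multiplication of the raw values.
log_added = np.log(scores) + np.log(expanded)
assert np.allclose(np.exp(log_added), multiplied)
```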

Questions regarding CDR results and noise_classifier

Hello.

I have run the bran code on the CDR dataset several times, and now I have a few questions.

I found that the results printed to STDOUT come from the ner_eval and relation_eval functions.
I also found that the "Best" score in the message "Eval decreased for %d epochs out of %d max epochs. Best: %2.2f" is, most of the time, the result on the development set.

For some reason, when I run the code on the CDR dataset (with no additional data), the F-score on the test set is 60% on average. I suspect my configuration is the problem. I also get the odd STDOUT logs below, and I have been stuck here with no improvement for days. Could you give me some help, please?

Couldnt get noise_classifier/token_embeddings
Couldnt get noise_classifier/w_1
Couldnt get noise_classifier/b_1
Couldnt get noise_classifier/pos_encoding
Couldnt get noise_classifier/nuW
Couldnt get noise_classifier/num_blocks_0/multihead_attention/dense/bias
Couldnt get noise_classifier/num_blocks_0/multihead_attention/dense_1/kernel
Couldnt get noise_classifier/num_blocks_0/multihead_attention/dense_1/bias
Couldnt get noise_classifier/num_blocks_0/multihead_attention/dense_2/kernel
Couldnt get noise_classifier/num_blocks_0/multihead_attention/dense_2/bias
Couldnt get noise_classifier/num_blocks_0/multihead_attention/ln/Variable
Couldnt get noise_classifier/num_blocks_0/multihead_attention/ln/Variable_1
Couldnt get noise_classifier/num_blocks_0/multihead_attention/conv1d/kernel
Couldnt get noise_classifier/num_blocks_0/multihead_attention/conv1d/bias
Couldnt get noise_classifier/num_blocks_0/multihead_attention/conv1d_1/kernel
Couldnt get noise_classifier/num_blocks_0/multihead_attention/conv1d_1/bias
Couldnt get noise_classifier/num_blocks_0/multihead_attention/conv1d_2/kernel
Couldnt get noise_classifier/num_blocks_0/multihead_attention/conv1d_2/bias
Couldnt get noise_classifier/num_blocks_0/multihead_attention_1/ln/Variable
Couldnt get noise_classifier/num_blocks_0/multihead_attention_1/ln/Variable_1
Couldnt get noise_classifier/num_blocks_1/multihead_attention/dense/kernel
Couldnt get noise_classifier/num_blocks_1/multihead_attention/dense/bias
Couldnt get noise_classifier/num_blocks_1/multihead_attention/dense_1/kernel
Couldnt get noise_classifier/num_blocks_1/multihead_attention/dense_1/bias
Couldnt get noise_classifier/num_blocks_1/multihead_attention/dense_2/kernel
Couldnt get noise_classifier/num_blocks_1/multihead_attention/dense_2/bias
Couldnt get noise_classifier/num_blocks_1/multihead_attention/ln/Variable
Couldnt get noise_classifier/num_blocks_1/multihead_attention/ln/Variable_1
Couldnt get noise_classifier/num_blocks_1/multihead_attention/conv1d/kernel
Couldnt get noise_classifier/num_blocks_1/multihead_attention/conv1d/bias
Couldnt get noise_classifier/num_blocks_1/multihead_attention/conv1d_1/kernel
Couldnt get noise_classifier/num_blocks_1/multihead_attention/conv1d_1/bias
Couldnt get noise_classifier/num_blocks_1/multihead_attention/conv1d_2/kernel
Couldnt get noise_classifier/num_blocks_1/multihead_attention/conv1d_2/bias
Couldnt get noise_classifier/num_blocks_1/multihead_attention_1/ln/Variable
Couldnt get noise_classifier/num_blocks_1/multihead_attention_1/ln/Variable_1
Couldnt get noise_classifier/text/dense/kernel
Couldnt get noise_classifier/text/dense/bias
Couldnt get noise_classifier/text/dense_1/kernel
Couldnt get noise_classifier/text/dense_1/bias
Couldnt get noise_classifier/text/dense_2/kernel
Couldnt get noise_classifier/text/dense_2/bias
Couldnt get noise_classifier/text/dense_3/kernel
Couldnt get noise_classifier/text/dense_3/bias
Couldnt get noise_classifier/text/Bilinear/Weights
Couldnt get noise_classifier/num_blocks_0_1/multihead_attention/ln/Variable
Couldnt get noise_classifier/num_blocks_0_1/multihead_attention/ln/Variable_1
Couldnt get noise_classifier/num_blocks_0_1/multihead_attention_1/ln/Variable
Couldnt get noise_classifier/num_blocks_0_1/multihead_attention_1/ln/Variable_1
Couldnt get noise_classifier/num_blocks_1_1/multihead_attention/ln/Variable
Couldnt get noise_classifier/num_blocks_1_1/multihead_attention/ln/Variable_1
Couldnt get noise_classifier/num_blocks_1_1/multihead_attention_1/ln/Variable
Couldnt get noise_classifier/num_blocks_1_1/multihead_attention_1/ln/Variable_1
Couldnt get noise_classifier/score_sentence/ner_w_1
Couldnt get noise_classifier/score_sentence/ner_b_1

By the way, thanks for the great work.

Error while reloading checkpoint

Log:
Traceback (most recent call last):
  File "utils/checkpoint_converter.py", line 21, in <module>
    dump_bert_model(checkpoint, output)
  File "utils/checkpoint_converter.py", line 11, in dump_bert_model
    model = SequenceTaggingTransformer.load_from_checkpoint(checkpoint)
  File "/home/levi/levi/anaconda3_server1/envs/torch_nlp/lib/python3.7/site-packages/pytorch_lightning/core/saving.py", line 153, in load_from_checkpoint
    model = cls._load_model_state(checkpoint, *args, strict=strict, **kwargs)
  File "/home/levi/levi/anaconda3_server1/envs/torch_nlp/lib/python3.7/site-packages/pytorch_lightning/core/saving.py", line 192, in _load_model_state
    model.load_state_dict(checkpoint['state_dict'], strict=strict)
  File "/home/levi/levi/anaconda3_server1/envs/torch_nlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1045, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for SequenceTaggingTransformer:
	Unexpected key(s) in state_dict: "model.bert.embeddings.position_ids".
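The message says the checkpoint contains a key, "model.bert.embeddings.position_ids", that the freshly built model does not declare. One common workaround is to drop that key before loading. The sketch below uses a plain dict in place of a real state_dict so that it runs without PyTorch, and drop_unexpected_keys is a hypothetical helper name. The likely cause (a checkpoint saved under a different library version than the one loading it) is an assumption, not something the log confirms.

```python
def drop_unexpected_keys(state_dict, unexpected):
    """Return a copy of state_dict without the listed keys."""
    unexpected = set(unexpected)
    return {k: v for k, v in state_dict.items() if k not in unexpected}

# Stand-in for checkpoint["state_dict"]; real values would be tensors.
state_dict = {
    "model.bert.embeddings.position_ids": [0, 1, 2],
    "model.bert.embeddings.word_embeddings.weight": [[0.1, 0.2]],
}
cleaned = drop_unexpected_keys(
    state_dict, ["model.bert.embeddings.position_ids"]
)
print(sorted(cleaned))
```

With a real model, the cleaned dict would then be passed to model.load_state_dict(cleaned). The traceback also shows that load_from_checkpoint forwards a strict argument to load_state_dict, so passing strict=False is another option, at the cost of silently ignoring any other mismatched keys.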

All candidate pair scores?

Hi,
I was trying to understand the code. I found that you feed an abstract into the model separately for each candidate pair in that abstract. However, the paper says the abstract is encoded once for all candidate pairs.

Am I missing something?

Thanks

Failed to generate the CTD dataset

[lsong10@bhg0031 bran]$ ./extract.sh
Downloading Pubtator dump
--2019-03-31 21:09:22-- ftp://ftp.ncbi.nlm.nih.gov/pub/lu/PubTator/bioconcepts2pubtator_offsets.gz
=> ‘/home/lsong10/ws/exp.dep_forest/bran/data/ctd/bioconcepts2pubtator_offsets.gz’
Resolving ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)... 130.14.250.13, 2607:f220:41e:250::7
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|130.14.250.13|:21... failed: Connection refused.
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|2607:f220:41e:250::7|:21... failed: Network is unreachable.
Converting data from pubtator to tsv format
usage: process_CDR_data.py [-h] -i INPUT_FILE -d OUTPUT_DIR -f
OUTPUT_FILE_SUFFIX [-s MAX_SEQ] [-a FULL_ABSTRACT]
[-p PUBMED_FILTER] [-r RELATIONS]
[-w WORD_PIECE_CODES] [-t SHARDS]
[-x EXPORT_ALL_EPS] [-n EXPORT_NEGATIVES]
[-e ENCODING] [-m MAX_DISTANCE]
process_CDR_data.py: error: argument -a/--full_abstract: expected one argument
split: extra operand ‘up’
Try 'split --help' for more information.
map relations to smaller set
awk: cmd. line:1: fatal: cannot open file `positive_0_genia' for reading (No such file or directory)
seperate data into train dev test
positive train 50 500
positive dev 50 500
positive test 50 500
negative train 50 500
awk: cmd. line:1: fatal: cannot open file `negative_0_genia' for reading (No such file or directory)
negative dev 50 500
awk: cmd. line:1: fatal: cannot open file `negative_0_genia' for reading (No such file or directory)
negative test 50 500
awk: cmd. line:1: fatal: cannot open file `negative_0_genia' for reading (No such file or directory)

Where is the lambda parameter?

Hello! I found this in your paper:
“To trade off the two objectives, we penalize the named entity updates with a hyperparameter λ.”
But I could not find it in the code. Does it go by another name that I overlooked? What is the default setting for this hyperparameter?
Thanks!

Why is the number of documents in "pubmed_split_lengths.txt" inconsistent with the number mentioned in the paper?

The number of documents in "pubmed_split_lengths.txt" is 69771, but the paper mentions 68400.
Why are they inconsistent?

Error while converting processed data to tensorflow protobufs

Hi all,
When I tried running
python ${CDR_IE_ROOT}/src/processing/labled_tsv_to_tfrecords.py --text_in_files ${processed_dir}/*tive_*CDR* --out_dir ${proto_dir} --max_len ${max_len} --num_threads 10 --multiple_mentions --tsv_format --min_count ${min_count}

I got this error:
print('\n'.join(sorted(["%s : %s" % (str(k), str(v)) for k, v in FLAGS.__dict__['__flags'].iteritems()])))
KeyError: '__flags'

Can anyone please help? Thanks!
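The failing line reaches into a private attribute of FLAGS and calls iteritems(), which exists only on Python 2 dicts. A hedged sketch of a more tolerant replacement follows. It assumes a TensorFlow release whose flags are backed by absl, where FLAGS.flag_values_dict() returns a plain name-to-value dict; please check that your installed version provides it. format_flags is a hypothetical helper, and the dict below only stands in for the real flag values.

```python
def format_flags(flag_values):
    """Render a name -> value mapping as sorted 'name : value' lines."""
    # .items() works on both Python 2 and 3, unlike .iteritems().
    return "\n".join(
        sorted("%s : %s" % (k, v) for k, v in flag_values.items())
    )

# Stand-in for FLAGS.flag_values_dict() (assumed available);
# the values are illustrative, not the script's defaults.
example_flags = {"max_len": 500, "num_threads": 10, "tsv_format": True}
print(format_flags(example_flags))
```

In the script, the original print statement would then become print(format_flags(FLAGS.flag_values_dict())), under the same assumption.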

Why not outputs += inputs ?

bran/src/models/transformer.py

Line 269 in 3bdb65f

inputs += outputs
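For what it is worth, elementwise addition is commutative, so "inputs += outputs" and "outputs += inputs" compute the same sum. They differ only in which name ends up holding it, and therefore in which name the following code must read. A small NumPy sketch of that point, with illustrative values; the code around line 269 is not shown in this thread, so the example cannot settle which name is used afterwards.

```python
import numpy as np

inputs = np.array([1.0, 2.0, 3.0])
outputs = np.array([0.5, 0.5, 0.5])

# Variant written in the repository: the sum is bound to the "inputs" side.
a_inputs = inputs.copy()
a_inputs += outputs

# Variant proposed in the issue title: the sum is bound to the "outputs" side.
b_outputs = outputs.copy()
b_outputs += inputs

# Same residual sum either way; only the holder differs.
assert np.array_equal(a_inputs, b_outputs)
```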

Question regarding the "Bi-affine Pairwise Scores"

Hello,

I read the paper and found the "Bi-affine Pairwise Scores" concept interesting. However, your code does not seem to use the "Bi-affine Pairwise Scores" equation described in the paper.
I think the base classifier class, "ClassifierModel" in classifier_models.py, does not have this "Bi-affine Pairwise Scores" feature.
If I am mistaken, could you tell me where to look?
