bert-ls's Introduction

jqiang.github.io

Jipeng Qiang

bert-ls's Issues

Bug in convert_whole_word_to_feature

ind = 0
for pos in mask_position:
    true_word = true_word + tokens[pos]
    if ind == 0:
        tokens[pos] = '[MASK]'
    else:
        del tokens[pos]
        del input_type_ids[pos]
    ind = ind + 1

This code in convert_whole_word_to_feature is problematic because the positions of the remaining tokens change after each del. Instead of deleting forward, I would delete backward to avoid this problem. I confirmed the error with the following tokenized sentence:
['His', 'stories', 'g', '##lit', '##tered', 'with', 'color', ';']

count = 0
mask_position_length = len(mask_position)
while count in range(mask_position_length):
    index = mask_position_length - 1 - count
    pos = mask_position[index]
    if index == 0:
        tokens[pos] = '[MASK]'
    else:
        del tokens[pos]
        del input_type_ids[pos]
    count += 1
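For what it's worth, an equivalent and arguably more idiomatic fix is to iterate over the mask positions in reverse with `reversed(enumerate(...))`, so each `del` only touches indices that have already been handled. A sketch, assuming mask_position is sorted ascending as the tokenizer produces it (the helper name `mask_whole_word` is mine, not from the repo):

```python
# Walk the sub-word positions from last to first: deleting at a high
# index never shifts the lower indices that are still to be processed.
def mask_whole_word(tokens, input_type_ids, mask_position):
    for i, pos in reversed(list(enumerate(mask_position))):
        if i == 0:
            tokens[pos] = '[MASK]'   # keep one slot for the mask token
        else:
            del tokens[pos]          # drop the remaining sub-word pieces
            del input_type_ids[pos]
    return tokens, input_type_ids

tokens = ['His', 'stories', 'g', '##lit', '##tered', 'with', 'color', ';']
type_ids = [0] * len(tokens)
mask_whole_word(tokens, type_ids, [2, 3, 4])
print(tokens)  # ['His', 'stories', '[MASK]', 'with', 'color', ';']
```

This also keeps input_type_ids the same length as tokens, which the forward-deleting original silently breaks.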

Cheers!

May I ask whether you did an ablation study?

In the paper, you suggested four types of features. Did you do any ablation study, e.g. removing one or two of these features, to see if the model still gives similar performance?

Files request

Dear colleagues,
thank you for sharing your code.

May I also ask you to share the files referenced in the code and mentioned in the report but not present in the repo (if possible)? Or the scripts that prepare them. I mean word_frequency_wiki.txt and the CBT dictionary.

Thank you in advance.

How to use other models ?

Which files should be changed to load a different pretrained BERT model with PyTorch, or different fastText embeddings?

Dependencies unknown

This repository does not have a requirements.txt, which makes reproduction much harder because all dependencies have to be installed manually. The only way to discover that a dependency is missing is to hit a runtime error.
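As a starting point, the two packages that appear in the tracebacks elsewhere in these issues could seed a requirements.txt. This is a guess from the visible imports only; versions and any further dependencies (e.g. for the fastText embeddings) are untested assumptions:

```text
# Guessed from the imports visible in the error logs in these issues.
torch
pytorch-pretrained-bert
```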

`NoneType` object has no attribute `to`

I've just downloaded the repo, installed all the dependencies and tried to run ./run_LSBert1.sh. This is the error I got:

AttributeError: 'NoneType' object has no attribute 'to' 
Full log:
(XXXXXXX) XXXXX@XXXXXX:XXXXXX$ ./run_LSBert1.sh                                                                                             
INFO:__main__:device: cpu n_gpu: 0, distributed training: False, 16-bits training: False                    
INFO:pytorch_pretrained_bert.file_utils:https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-vocab.txt not found in cache, downloading to /tmp/tmppn72yug9                         
100%|█████████████████████████████████████████████████████████████| 231508/231508 [00:03<00:00, 61222.48B/s]INFO:pytorch_pretrained_bert.file_utils:copying /tmp/tmppn72yug9 to cache at /XXXXXXXXXXXX/.cache/torch/pytorch_pretrained_bert/b3a6b2c6d7ea2ffa06d0e7577c1e88b94fad470ae0f060a4ffef3fe0bdf86730.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084                                                                   
INFO:pytorch_pretrained_bert.file_utils:creating metadata file for /XXXXXXXXXXXX/.cache/torch/pytorch_pretrained_bert/b3a6b2c6d7ea2ffa06d0e7577c1e88b94fad470ae0f060a4ffef3fe0bdf86730.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084                                                                             
INFO:pytorch_pretrained_bert.file_utils:removing temp file /tmp/tmppn72yug9                                 
INFO:pytorch_pretrained_bert.tokenization:loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-vocab.txt from cache at /XXXXXXXXXXXX/.cache/torch/pytorch_pretrained_bert/b3a6b2c6d7ea2ffa06d0e7577c1e88b94fad470ae0f060a4ffef3fe0bdf86730.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084                                                                        
ERROR:pytorch_pretrained_bert.modeling:Couldn't reach server at '/home/qiang/Desktop/pytorch-pretrained-BERT/bert-large-uncased-whole-word-masking-pytorch_model.bin' to download pretrained weights.                   
Traceback (most recent call last):                                                                          
  File "/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/LSBert1.py", line 951, in <module>            
    main()                                                                                                  
  File "/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/LSBert1.py", line 822, in main                
    model.to(device)                                                                                        
AttributeError: 'NoneType' object has no attribute 'to'  

Referring to local files

'bert-large-uncased-whole-word-masking': "/home/qiang/Desktop/pytorch-pretrained-BERT/bert-large-uncased-whole-word-masking-pytorch_model.bin",

'bert-large-uncased-whole-word-masking': "/home/qiang/Desktop/pytorch-pretrained-BERT/bert-large-uncased-whole-word-masking-config.json",

Why not replace those with https://s3.amazonaws.com/models.huggingface.co/bert/... links?
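If I read the pytorch_pretrained_bert layout correctly, these hard-coded paths live in archive-map dicts in the modeling code. A hedged sketch of the suggested change, where the URLs are inferred from the naming pattern of the vocab URL in the log above and should be verified before use:

```python
# Hypothetical patch: point the archive maps at the public S3 bucket
# instead of a local path on the author's machine. The exact URLs are
# assumptions based on the vocab URL seen in the log.
S3 = "https://s3.amazonaws.com/models.huggingface.co/bert"

PRETRAINED_MODEL_ARCHIVE_MAP = {
    'bert-large-uncased-whole-word-masking':
        S3 + "/bert-large-uncased-whole-word-masking-pytorch_model.bin",
}
PRETRAINED_CONFIG_ARCHIVE_MAP = {
    'bert-large-uncased-whole-word-masking':
        S3 + "/bert-large-uncased-whole-word-masking-config.json",
}
```

That way from_pretrained would download and cache the weights automatically, as it already does for the vocabulary file.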

cosine similarity ? sentence loss ?

  1. In the paper, it is claimed that the cosine similarity is computed by concatenating the first 4 layers from BERT, but in the code it is computed from fastText word embeddings. Why?

  2. To compute the proposal score, why do you compute the masked loss over all words in a sentence? Is this the same as defined in your paper?

Now I know where the difference comes from: I had read an incomplete version of this paper.

mask words not in list ?

The following error occurs when running with the dataset lex_mturk.txt:

Traceback (most recent call last):
  File "LS_Bert.py", line 953, in <module>
    main()
  File "LS_Bert.py", line 867, in main
    mask_index = words.index(mask_words[i])
ValueError: 'companies' is not in list
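The mismatch likely comes from tokenization differences (case or attached punctuation) between the dataset's target word and the re-split sentence. A defensive lookup, sketched under that assumption (the names `words` and `find_mask_index` follow the traceback; the helper itself is mine):

```python
# Hedged sketch: fall back to a case- and punctuation-insensitive search,
# and return None (so the caller can skip the sentence) instead of crashing.
def find_mask_index(words, target):
    try:
        return words.index(target)
    except ValueError:
        strip_chars = '.,;!?"\''
        lowered = [w.lower().strip(strip_chars) for w in words]
        t = target.lower().strip(strip_chars)
        return lowered.index(t) if t in lowered else None

print(find_mask_index(['The', 'Companies,', 'merged'], 'companies'))  # 1
print(find_mask_index(['no', 'match', 'here'], 'companies'))          # None
```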

Issue tested

Thanks for your reply! Sorry, I should have described the problem in more detail.
Here is the result of running your code.

mask_position = [2,3,4]
tokens = ['His', 'stories', 'g', '##lit', '##tered', 'with', 'color', ';']

ind = 0
for pos in mask_position:
    if (ind == 0):
        tokens[pos] = '[MASK]'
    else:
        del tokens[pos]
    ind = ind + 1

print(tokens)

['His', 'stories', '[MASK]', '##tered', 'color', ';']
I don't think this is what you wanted to get, is it?
I thought the tokens should look like this after running:

mask_position = [2,3,4]
tokens = ['His', 'stories', 'g', '##lit', '##tered', 'with', 'color', ';']

count = 0
mask_position_length = len(mask_position)
while count in range(mask_position_length):
    index = mask_position_length - 1 - count
    pos = mask_position[index]
    if index == 0:
        tokens[pos] = '[MASK]'
    else:
        del tokens[pos]
    count += 1
print(tokens)

['His', 'stories', '[MASK]', 'with', 'color', ';']

Thank you !

Originally posted by @TheoSeo93 in #2 (comment)
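Both loops above mutate the lists in place with manual index bookkeeping. For readers following along, the same transformation can be written with slicing, under the same assumption that mask_position is a contiguous ascending run of sub-word indices (the function name is mine, not from the repo):

```python
def collapse_to_mask(tokens, mask_position):
    # Replace the whole sub-word span [first, last] with a single [MASK].
    first, last = mask_position[0], mask_position[-1]
    return tokens[:first] + ['[MASK]'] + tokens[last + 1:]

tokens = ['His', 'stories', 'g', '##lit', '##tered', 'with', 'color', ';']
print(collapse_to_mask(tokens, [2, 3, 4]))
# ['His', 'stories', '[MASK]', 'with', 'color', ';']
```

Returning a new list also sidesteps the forward-vs-backward deletion question entirely.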
