
bert_on_stilts's People

Contributors

abeljim, clmnt, davidefiocco, donglixp, elyase, fdecayed, hzhwcmhf, joedumoulin, julien-c, kkadowa, ksurya, liangtaiwan, likejazz, llidev, lukovnikov, matej-svejda, patrick-s-h-lewis, rodgzilla, sam-writer, sodre, tholor, thomwolf, tnlin, trault14, victorsanh, weiyumou, wlhgtc, wrran, xiaoda99, zphang

bert_on_stilts's Issues

How to read the results in Table 1?

I have some questions about interpreting the results in Table 1.

Here's Table 1 for reference:

[Table 1 from the paper, attached as an image]

My questions apply to all models but for simplicity I will just ask about BERT.

My question is:

How do you identify which intermediate task is used for "BERT on STILTs" in the test-set scores? For example, on CoLA the best model on the development set is plain BERT, with a score of 62.1, and this is reflected in "BERT, Best of Each". So, on the test set, which model is used for "BERT on STILTs"?

I'm guessing it has to be the best model among the ones that used an intermediate task, i.e. BERT→MNLI, since its score is 59.8. So for CoLA, BERT→MNLI scored 59.8 on the development set and 62.1 on the test set. Is this the correct way to read the table?

Training error ---> SyntaxError: invalid syntax

--task_name $TASK \
--do_train --do_val --do_test --do_val_history \
--do_save \
--do_lower_case \
--bert_model bert-large-uncased \
--bert_load_path $PRETRAINED_MODEL_PATH \
--bert_load_mode model_only \
--bert_save_mode model_all \
--train_batch_size 24 \
--learning_rate 2e-5 \
--output_dir $OUTPUT_PATH

File "glue/train.py", line 224
args.output_dir, f"all_state___epoch{epoch:04d}___batch{step:06d}.p"
^
SyntaxError: invalid syntax
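
For what it's worth, the line the parser rejects is an f-string, which is only supported on Python 3.6 and newer; on older interpreters the file fails at parse time with exactly this "SyntaxError: invalid syntax". A minimal sketch of the version check and of an equivalent str.format spelling (the epoch/step values below are made up, not taken from a real run):

import sys

# f-strings (PEP 498) are only parsed on Python 3.6+; on Python 2.7 or 3.5 the
# line in glue/train.py fails at parse time with exactly this SyntaxError.
assert sys.version_info >= (3, 6), "run the training script with Python 3.6+"

epoch, step = 3, 1500  # stand-ins for the training loop's counters
name_fstring = f"all_state___epoch{epoch:04d}___batch{step:06d}.p"
name_format = "all_state___epoch{:04d}___batch{:06d}.p".format(epoch, step)
assert name_fstring == name_format  # both give all_state___epoch0003___batch001500.p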

Available pre-trained model?

Hi,

I was hoping to compare this approach with my own sentence embedding method.

Sorry if this is mentioned somewhere (I couldn't find it), but is the "best" pretrained model from the paper freely available? I mean the one fine-tuned on MNLI only, not additionally fine-tuned on the GLUE target tasks.

It would be super helpful for comparison if the model weights were on https://huggingface.co/models! (I don't see anything when searching for "STILTS".)
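
In case the weights ever get shared as a plain PyTorch checkpoint rather than a Hub entry, loading them for comparison could look roughly like the sketch below (the checkpoint filename is hypothetical, and key names may need remapping depending on how the state dict was saved):

import torch
from transformers import BertModel, BertTokenizer

CHECKPOINT = "bert_large_mnli_stilts.p"  # hypothetical filename

tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")
model = BertModel.from_pretrained("bert-large-uncased")

# strict=False tolerates extra keys such as an MNLI classification head.
state_dict = torch.load(CHECKPOINT, map_location="cpu")
result = model.load_state_dict(state_dict, strict=False)
print("missing keys:", result.missing_keys)
print("unexpected keys:", result.unexpected_keys)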

Quality of Adapters

Hello! I am interested in whether you were able to get similar results to your paper using adapters rather than whole model tuning.

It seems like adapters might be less effective at retaining the information learned from an intermediate task compared to tuning the whole BERT model.

There also doesn't seem to be much documentation on adapters in this code base; do you have any pointers to a good example in the code?
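
For reference, "adapters" here usually means the Houlsby et al. bottleneck design, where small residual layers are trained while BERT itself stays frozen. A generic sketch (not this repository's actual implementation; the sizes are illustrative):

import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Down-project, nonlinearity, up-project, added back residually."""

    def __init__(self, hidden_size=1024, adapter_size=64):
        super().__init__()
        self.down = nn.Linear(hidden_size, adapter_size)
        self.act = nn.GELU()
        self.up = nn.Linear(adapter_size, hidden_size)

    def forward(self, hidden_states):
        # Only the adapter parameters (plus, typically, layer norms) are
        # trained; the surrounding BERT weights stay frozen.
        return hidden_states + self.up(self.act(self.down(hidden_states)))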

bert_load_mode?

Hi, this is really amazing work. Would you mind explaining these bert_load_mode options? Thanks a lot! My guess at what they might mean is sketched after the list.

  • model_only
  • state_model_only
  • state_all
  • state_full_model
  • full_model_only
  • state_adapter
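
My guess, purely from the names (this is not the repository's actual code, and the dictionary keys below are hypothetical), is that the modes distinguish a bare model state dict from a saved training state that also carries the optimizer, with "state_adapter" presumably loading adapter weights only:

import torch

def load_checkpoint(path, bert_load_mode, model, optimizer=None):
    # Guessed semantics for discussion only; not glue/train.py's real logic.
    blob = torch.load(path, map_location="cpu")
    if bert_load_mode in ("model_only", "full_model_only"):
        model.load_state_dict(blob)  # file is a bare model state dict
    elif bert_load_mode in ("state_model_only", "state_full_model"):
        model.load_state_dict(blob["model"])  # training state, model part only
    elif bert_load_mode == "state_all":
        model.load_state_dict(blob["model"])
        if optimizer is not None:
            optimizer.load_state_dict(blob["optimizer"])  # optimizer too
    elif bert_load_mode == "state_adapter":
        model.load_state_dict(blob["adapter"], strict=False)  # adapter weights only?
    else:
        raise ValueError("unsupported bert_load_mode: " + bert_load_mode)
    return model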

Biomedical uses?

[Adding as a place marker / bookmark.]

Awesome work! :-D

I'm interested in biomedical uses of contextual language models; I would appreciate a heads-up if anyone applies this or similar approaches in that domain (I am aware of BioBERT: https://arxiv.org/pdf/1901.08746.pdf).
