
bert_on_stilts's People

Contributors

abeljim, clmnt, davidefiocco, donglixp, elyase, fdecayed, hzhwcmhf, joedumoulin, julien-c, kkadowa, ksurya, liangtaiwan, likejazz, llidev, lukovnikov, matej-svejda, patrick-s-h-lewis, rodgzilla, sam-writer, sodre, tholor, thomwolf, tnlin, trault14, victorsanh, weiyumou, wlhgtc, wrran, xiaoda99, zphang

bert_on_stilts's Issues

How to read the results in Table 1?

I have some questions about interpreting the results in Table 1.

Here's Table 1 for reference:

[Table 1 from the paper, attached as an image]

My questions apply to all models but for simplicity I will just ask about BERT.

My question is:

How do you identify which intermediate task is used for "BERT on STILTs" in the test-set scores? For example, on CoLA the best model on the development set is plain BERT, with a score of 62.1, and this is reflected in "BERT, Best of Each". So, on the test set, which model is used for "BERT on STILTs"?

I'm guessing it has to be the best model among the ones that used an intermediate task, i.e. BERT→MNLI, since its score is 59.8. So for CoLA, BERT→MNLI scored 59.8 on the development set and 62.1 on the test set. Is this the correct way to read the table?

Training error ---> SyntaxError: invalid syntax

--task_name $TASK \
--do_train --do_val --do_test --do_val_history \
--do_save \
--do_lower_case \
--bert_model bert-large-uncased \
--bert_load_path $PRETRAINED_MODEL_PATH \
--bert_load_mode model_only \
--bert_save_mode model_all \
--train_batch_size 24 \
--learning_rate 2e-5 \
--output_dir $OUTPUT_PATH

File "glue/train.py", line 224
args.output_dir, f"all_state___epoch{epoch:04d}___batch{step:06d}.p"
^
SyntaxError: invalid syntax
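
For what it's worth, the line the parser rejects is an f-string, which is only supported on Python 3.6 and newer; on older interpreters the file fails at parse time with exactly this "SyntaxError: invalid syntax". A minimal sketch of the version check and of an equivalent str.format spelling (the epoch/step values below are made up, not taken from a real run):

import sys

# f-strings (PEP 498) are only parsed on Python 3.6+; on Python 2.7 or 3.5 the
# line in glue/train.py fails at parse time with exactly this SyntaxError.
assert sys.version_info >= (3, 6), "run the training script with Python 3.6+"

epoch, step = 3, 1500  # stand-ins for the training loop's counters
name_fstring = f"all_state___epoch{epoch:04d}___batch{step:06d}.p"
name_format = "all_state___epoch{:04d}___batch{:06d}.p".format(epoch, step)
assert name_fstring == name_format  # both give all_state___epoch0003___batch001500.p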

Available pre-trained model?

Hi,

I was hoping to compare this approach with my own sentence embedding method.

Sorry if this is mentioned somewhere (I couldn't find it), but is the "best" pretrained model from the paper freely available? I mean the one fine-tuned on MNLI only, not additionally fine-tuned on the GLUE target tasks.

It would be super helpful for comparison if the model weights were on https://huggingface.co/models! (I don't see anything when searching for "STILTS".)
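
In case the weights ever get shared as a plain PyTorch checkpoint rather than a Hub entry, loading them for comparison could look roughly like the sketch below (the checkpoint filename is hypothetical, and key names may need remapping depending on how the state dict was saved):

import torch
from transformers import BertModel, BertTokenizer

CHECKPOINT = "bert_large_mnli_stilts.p"  # hypothetical filename

tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")
model = BertModel.from_pretrained("bert-large-uncased")

# strict=False tolerates extra keys such as an MNLI classification head.
state_dict = torch.load(CHECKPOINT, map_location="cpu")
result = model.load_state_dict(state_dict, strict=False)
print("missing keys:", result.missing_keys)
print("unexpected keys:", result.unexpected_keys)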

Quality of Adapters

Hello! I am interested in whether you were able to get similar results to your paper using adapters rather than whole model tuning.

It seems like adapters might be less effective at retaining the information learned from an intermediate task compared to tuning the whole BERT model.

There also doesn't seem to be much documentation on adapters in this code base; do you have any pointers to a good example in the code?
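
For reference, "adapters" here usually means the Houlsby et al. bottleneck design, where small residual layers are trained while BERT itself stays frozen. A generic sketch (not this repository's actual implementation; the sizes are illustrative):

import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Down-project, nonlinearity, up-project, added back residually."""

    def __init__(self, hidden_size=1024, adapter_size=64):
        super().__init__()
        self.down = nn.Linear(hidden_size, adapter_size)
        self.act = nn.GELU()
        self.up = nn.Linear(adapter_size, hidden_size)

    def forward(self, hidden_states):
        # Only the adapter parameters (plus, typically, layer norms) are
        # trained; the surrounding BERT weights stay frozen.
        return hidden_states + self.up(self.act(self.down(hidden_states)))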

bert_load_mode?

Hi, this is really amazing work. Would you mind explaining these bert_load_mode options? Thanks a lot! My guess at what they might mean is sketched after the list.

  • model_only
  • state_model_only
  • state_all
  • state_full_model
  • full_model_only
  • state_adapter
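
My guess, purely from the names (this is not the repository's actual code, and the dictionary keys below are hypothetical), is that the modes distinguish a bare model state dict from a saved training state that also carries the optimizer, with "state_adapter" presumably loading adapter weights only:

import torch

def load_checkpoint(path, bert_load_mode, model, optimizer=None):
    # Guessed semantics for discussion only; not glue/train.py's real logic.
    blob = torch.load(path, map_location="cpu")
    if bert_load_mode in ("model_only", "full_model_only"):
        model.load_state_dict(blob)  # file is a bare model state dict
    elif bert_load_mode in ("state_model_only", "state_full_model"):
        model.load_state_dict(blob["model"])  # training state, model part only
    elif bert_load_mode == "state_all":
        model.load_state_dict(blob["model"])
        if optimizer is not None:
            optimizer.load_state_dict(blob["optimizer"])  # optimizer too
    elif bert_load_mode == "state_adapter":
        model.load_state_dict(blob["adapter"], strict=False)  # adapter weights only?
    else:
        raise ValueError("unsupported bert_load_mode: " + bert_load_mode)
    return model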

Biomedical uses?

[Adding as a place marker / bookmark.]

Awesome work! :-D

I'm interested in biomedical uses of contextual language models; I would appreciate a heads-up if anyone applies this or similar approaches in that domain (I am aware of BioBERT: https://arxiv.org/pdf/1901.08746.pdf).
