Using Google's pretrained language model, BERT, I aim to train a smaller bidirectional LSTM model with knowledge distillation. The Kaggle Toxic Comment Classification dataset is used for this task.
liangzongchang / knowledge-distillation-using-bert
This project is forked from adi218/knowledge-distillation-using-bert.
License: MIT
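
For reference, knowledge distillation here means training the small BiLSTM student to match the BERT teacher's output logits in addition to fitting the ground-truth labels. The sketch below is a minimal, hypothetical illustration in PyTorch, not code taken from this repository: the `BiLSTMStudent` class, the `distillation_loss` function, the MSE logit-matching term, and the `alpha` weighting are all assumptions for exposition. Since Toxic Comment classification is multi-label, the hard-label term uses binary cross-entropy rather than a softmax loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BiLSTMStudent(nn.Module):
    """Small bidirectional-LSTM student classifier (hypothetical sizes)."""

    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, num_labels=6):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_labels)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)      # (batch, seq, embed_dim)
        _, (hidden, _) = self.lstm(embedded)      # hidden: (2, batch, hidden_dim)
        # Concatenate the final forward and backward hidden states.
        final = torch.cat([hidden[-2], hidden[-1]], dim=-1)
        return self.classifier(final)             # raw logits, one per label


def distillation_loss(student_logits, teacher_logits, labels, alpha=0.5):
    """Blend a soft (teacher-matching) term with the hard-label term.

    Toxic Comment is multi-label, so the teacher's logits are matched
    with MSE and the ground-truth labels (floats in {0, 1}) are fit
    with binary cross-entropy. `alpha` balances the two terms.
    """
    soft = F.mse_loss(student_logits, teacher_logits)
    hard = F.binary_cross_entropy_with_logits(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

In a training loop of this shape, the teacher logits would come from a fine-tuned BERT run in evaluation mode with gradients disabled (`with torch.no_grad(): ...`), so only the student's parameters are updated.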