snunlp / kr-bert

199.0 199.0 25.0 23.79 MB

KoRean based BERT pre-trained models (KR-BERT) for TensorFlow and PyTorch

Python 100.00%

nlp

kr-bert's People

ybhwang ilya-palachev haconedu meh9184 sorensenjs htw5295 kimsijin33 boraseo560 zinzinbin laplacekorea aki6022 sanajlee ocean-joo zechuncao hhosu107 bkeit sungwon-chae

kr-bert's Issues

Model weights for MLM fine-tuning

Thanks for making the sequence classification models publicly accessible!

We wonder whether you still have the model weights for the MLM head, which we would need for MLM training.
Specifically, we mean the following part of the network, as printed when the model is loaded with the transformers library:

(cls): BertOnlyMLMHead(
  (predictions): BertLMPredictionHead(
    (transform): BertPredictionHeadTransform(
      (dense): Linear(in_features=768, out_features=768, bias=True)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
    )
    (decoder): Linear(in_features=768, out_features=16424, bias=True)
  )
)
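
One way to see whether a given checkpoint still carries this head is to inspect its parameter names. The sketch below is only an illustration, not something taken from the KR-BERT repository: it assumes the checkpoint follows the usual Hugging Face `transformers` naming, where MLM head parameters live under the `cls.predictions.` prefix (matching the `(cls)` and `(predictions)` nesting printed above). The helper names and the example key lists are hypothetical; in practice the key list would likely come from the state dict of the loaded checkpoint.

```python
# Assumption: MLM head parameters are stored under this prefix, as in the
# Hugging Face `transformers` BERT classes. Not confirmed for KR-BERT files.
MLM_HEAD_PREFIX = "cls.predictions."


def find_mlm_head_keys(state_dict_keys):
    """Return the sorted parameter names that belong to the MLM head."""
    return sorted(k for k in state_dict_keys if k.startswith(MLM_HEAD_PREFIX))


def has_mlm_head(state_dict_keys):
    """True if at least one MLM head parameter is present."""
    return len(find_mlm_head_keys(state_dict_keys)) > 0


# Hypothetical key lists, for illustration only.
keys_with_head = [
    "bert.embeddings.word_embeddings.weight",
    "cls.predictions.transform.dense.weight",
    "cls.predictions.transform.dense.bias",
    "cls.predictions.transform.LayerNorm.weight",
    "cls.predictions.transform.LayerNorm.bias",
    "cls.predictions.decoder.weight",
    "cls.predictions.decoder.bias",
]
keys_without_head = [
    "bert.embeddings.word_embeddings.weight",
    "bert.pooler.dense.weight",
    "classifier.weight",
]

if __name__ == "__main__":
    print(has_mlm_head(keys_with_head))
    print(has_mlm_head(keys_without_head))
```

If no key with that prefix is found, the library would typically initialize the head randomly when the checkpoint is loaded into an MLM model, which is why the original weights matter for continued MLM training.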

Is there any published paper describing your work?

Hello,

First of all, thank you for publishing this repository. Is there a published paper describing your work, for example in a journal or in conference proceedings? It would help a lot in understanding your work.

Thanks in advance!

Detailed question about your paper

I read your paper, and it is good to see that less data is needed to train the model.

I would like to know some pretraining details, such as the number of layers, the input length, and whether additional training objectives like SOP were used.
I tried to find this information in your paper but could not.
Please let me know.
Thanks.
