Pretraining Language Models [PLM]

RNN-LM: Recurrent Neural Network Language Model
BoW: Bag of Words
CBoW: Continuous Bag of Words
FM: Factorization Machine
LBL: Log-Bilinear model
GloVe: Global Vectors for Word Representation
CoVe: Contextualized Word Vectors
ELMo: Embeddings from Language Models
AWD-LSTM: ASGD Weight-Dropped LSTM
ULMFiT: Universal Language Model Fine-tuning
STLR: Slanted triangular learning rates
GLU: Gradual layer unfreezing
GPT: Generative Pre-Training
GELU: Gaussian Error Linear Unit
CST: Contiguous sequence of tokens
BERT: Bidirectional Encoder Representations from Transformers 
MLM: Masked language model
NSP: Next sentence prediction
TSPE: Token, Segment, Position Embeddings
BPE: Byte Pair Encoding (see the merge-learning sketch after this list)
XLNet: Transformer-XL Net
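Several entries above name concrete procedures rather than models; as one worked example, here is a minimal sketch of the BPE merge-learning loop. The function name `learn_bpe` and the toy corpus are illustrative assumptions, not taken from any library.

```python
# Minimal BPE sketch: repeatedly merge the most frequent adjacent symbol pair.
# `learn_bpe` and the toy corpus are illustrative, not from a real tokenizer.
from collections import Counter

def learn_bpe(words, num_merges):
    # Start with each word as a tuple of characters, weighted by frequency.
    vocab = Counter({tuple(w): c for w, c in Counter(words).items()})
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across the weighted vocabulary.
        pairs = Counter()
        for symbols, count in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += count
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the winning merge everywhere it occurs.
        new_vocab = Counter()
        for symbols, count in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += count
        vocab = new_vocab
    return merges

# Prints the learned merge operations in order.
print(learn_bpe(["low", "low", "lower", "newest", "newest", "widest"], 5))
```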


ARLM: Autoregressive language modelling
AELM: Autoencoding language modelling
ERNIE: Enhanced Representation through kNowledge IntEgration
BLM+PLM+ELM: Basic-level masking + Phrase-level masking + Entity-level masking (of named entities)
TDPE: Token, Dialogue, Position Embeddings
TSPTE: Token, Sentence, Position, Task Embeddings
THU-ERNIE: Enhanced Language RepresentatioN with Informative Entities
dEA: denoising entity auto-encoder
UniLM: Unified pre-trained Language Model
MT-DNN: Multi-Task Deep Neural Network
SAN: Stochastic Answer Network
XLM: Cross-lingual Language Model
TLPE: Token, Language, Position Embeddings
AELM+ARLM: Autoencoding language modelling + autoregressive language modelling
PLM: Permutation Language Model (see the sketch after this list)
NADE: Neural Autoregressive Distribution Estimation
SG-Net: Syntax-Guided Network
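The PLM entry above is XLNet's permutation objective; the toy sketch below shows how one sampled factorization order determines which positions each token may condition on. No model is trained here, and the helper name `permutation_contexts` is made up for illustration.

```python
# Toy sketch of permutation language modelling (XLNet-style): sample one
# factorization order and list, per target position, the visible positions.
# `permutation_contexts` is an illustrative name, not a library function.
import random

def permutation_contexts(seq_len, seed=0):
    rng = random.Random(seed)
    order = list(range(seq_len))
    rng.shuffle(order)  # one sampled factorization order z
    contexts = {}
    for step, pos in enumerate(order):
        # x[pos] is predicted from the tokens that precede it in z, so over
        # many sampled orders the model sees bidirectional context.
        contexts[pos] = sorted(order[:step])
    return order, contexts

order, contexts = permutation_contexts(5)
print("factorization order:", order)
for pos in range(5):
    print(f"predict x[{pos}] given x at positions {contexts[pos]}")
```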


SGSA: Syntax-guided self-attention
DOI Mask: Dependency-of-interest mask
SBO: Span boundary objective
RoBERTa: A Robustly Optimized BERT Pretraining Approach
MASS: Masked Sequence to Sequence pre-training for language generation
FEP: Factorized embedding parameterization
SOP: Sentence-order prediction
CLPS: Cross-layer parameter sharing
KD: Knowledge Distillation
T5: Text-to-Text Transfer Transformer
C4: Colossal Clean Crawled Corpus
ELECTRA: Efficiently Learning an Encoder that Classifies Token Replacements Accurately
RTD: Replaced token detection (see the sketch after this list)
ML-MLM: Multi-lingual masked language model
BART: Bidirectional and Auto-Regressive Transformers
ANT: Arbitrary noise transformations
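RTD above is ELECTRA's pretraining objective; this toy sketch shows how a corrupted sequence and per-token original/replaced labels are built. Random vocabulary draws stand in for ELECTRA's small generator network, and `make_rtd_example` is a hypothetical helper name.

```python
# Toy replaced-token-detection (RTD) example builder. Random replacements
# stand in for ELECTRA's MLM generator; `make_rtd_example` is hypothetical.
import random

def make_rtd_example(tokens, vocab, replace_prob=0.15, seed=0):
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < replace_prob:
            repl = rng.choice(vocab)
            corrupted.append(repl)
            # As in ELECTRA, a replacement that happens to equal the real
            # token is still labelled "original" (0).
            labels.append(0 if repl == tok else 1)
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels

tokens = ["the", "chef", "cooked", "the", "meal"]
vocab = ["the", "chef", "cooked", "meal", "ate", "dog", "ran"]
corrupted, labels = make_rtd_example(tokens, vocab, replace_prob=0.4)
print(corrupted)
print(labels)  # 1 marks a replaced position the discriminator must detect
```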