tianjianjiang / megatron-deepspeed
This project is forked from bigscience-workshop/megatron-deepspeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
License: Other