hULMonA: tHe first Universal Language MOdel iN Arabic

Paper: https://www.aclweb.org/anthology/W19-4608

Introduction

Recent state-of-the-art models in NLP (e.g., BERT, GPT, ULMFiT) use transfer learning: they pre-train a language model on a large corpus and then fine-tune it on a downstream task. We developed the first Arabic-specific universal language model, hULMonA, which can be fine-tuned for almost any Arabic text classification task. We evaluated hULMonA on sentiment analysis and achieved state-of-the-art results on 4 Arabic datasets. hULMonA consists of three main stages:

1. General domain hULMonA pretraining

To capture the various properties of the Arabic language, we train AWD-LSTM, a near-state-of-the-art language model, on the entire Arabic Wikipedia.

This step is time-consuming, but it needs to be done only once. We publish our pre-trained model, which is available in the models directory. To check the implementation details, or to pre-train your own LM, see build_arabic_language_model.ipynb.
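As a rough illustration of what language model pre-training optimizes, the sketch below builds next-token training pairs from a token stream and computes perplexity from the probabilities a model assigns to the true next tokens. It is not taken from build_arabic_language_model.ipynb; the helper names (next_token_pairs, perplexity) and the toy token ids are assumptions made for this example only.

```python
import math


def next_token_pairs(token_ids, seq_len):
    """Split a token stream into (input, target) windows.

    Each target window is the input window shifted one position to the
    right, so the model learns to predict the next token at every step.
    """
    pairs = []
    for start in range(0, len(token_ids) - seq_len, seq_len):
        inputs = token_ids[start:start + seq_len]
        targets = token_ids[start + 1:start + seq_len + 1]
        pairs.append((inputs, targets))
    return pairs


def perplexity(target_probs):
    """Perplexity: exp of the mean negative log-probability of the true tokens."""
    nll = -sum(math.log(p) for p in target_probs) / len(target_probs)
    return math.exp(nll)


# Toy token ids standing in for a numericalized Wikipedia dump (hypothetical data).
stream = list(range(10))
pairs = next_token_pairs(stream, seq_len=3)

# A model that assigns probability 0.25 to every true next token
# is as uncertain as a uniform choice among 4 tokens.
uniform_ppl = perplexity([0.25] * 4)
```

Lower perplexity on held-out text means the language model predicts the next token more confidently, which is the usual way to monitor this stage.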

2. Target task hULMonA fine-tuning

The target task data (e.g., Twitter) will likely come from a different distribution than the general-domain data (Wikipedia). Therefore, fine-tuning the pretrained general-domain LM on the target task data is necessary for the LM to adapt to the new textual properties (e.g., dialects).

To fine-tune the pre-trained hULMonA on your own dataset, please check fine_tune_LM.ipynb
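One simple way to see the distribution gap between general-domain and target-task text is to measure how many target tokens are missing from the pre-trained vocabulary. The sketch below is only an illustration under assumptions: oov_rate is a hypothetical helper, and the two tiny token lists (Modern Standard Arabic versus Levantine dialect) are toy data, not taken from the hULMonA vocabulary or datasets.

```python
def oov_rate(pretrained_vocab, target_tokens):
    """Fraction of target-task tokens that the pre-trained vocabulary has never seen."""
    vocab = set(pretrained_vocab)
    unseen = [tok for tok in target_tokens if tok not in vocab]
    return len(unseen) / len(target_tokens)


# Toy vocabulary in the style of Wikipedia text (Modern Standard Arabic).
general_vocab = ["ماذا", "تريد", "الآن"]

# Toy tweet tokens: two dialectal words plus two words shared with the vocabulary.
tweet_tokens = ["شو", "بدك", "الآن", "تريد"]

rate = oov_rate(general_vocab, tweet_tokens)
```

A high rate on real data would suggest that the target text differs substantially from Wikipedia, which is the situation the fine-tuning stage is meant to address.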

3. Target task classification

Finally, for downstream task classification, we augment the fine-tuned hULMonA with two fully connected layers with ReLU and Softmax activations, respectively. Implementation details can be found here: fine_tune_LM.ipynb
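The shape of such a classification head can be sketched in plain Python as a forward pass through two fully connected layers, the first followed by ReLU and the second by Softmax. This is a minimal sketch, not the code in fine_tune_LM.ipynb: the layer sizes, the random weights, and the feature vector standing in for the language model output are all assumptions chosen for illustration.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: one output value per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in x]


def softmax(x):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialization only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def classify(features, layer1, layer2):
    """Dense -> ReLU -> Dense -> Softmax, returning class probabilities."""
    hidden = relu(dense(features, *layer1))
    return softmax(dense(hidden, *layer2))


rng = random.Random(0)
layer1 = init_layer(n_in=8, n_out=4, rng=rng)  # hypothetical hidden size
layer2 = init_layer(n_in=4, n_out=3, rng=rng)  # hypothetical 3 sentiment classes

# Stand-in for the feature vector produced by the fine-tuned language model.
features = [rng.uniform(-1, 1) for _ in range(8)]
probs = classify(features, layer1, layer2)
```

The Softmax output is a probability distribution over the classes, so the predicted label is the index of the largest entry in probs.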

How do I cite hULMonA?

Please cite this paper:

@inproceedings{eljundi2019hulmona,
  title={hULMonA: The Universal Language Model in Arabic},
  author={ElJundi, Obeida and Antoun, Wissam and El Droubi, Nour and Hajj, Hazem and El-Hajj, Wassim and Shaban, Khaled},
  booktitle={Proceedings of the Fourth Arabic Natural Language Processing Workshop},
  pages={68--77},
  year={2019}
}

Contact information

For help, issues, or personal communication related to using hULMonA, please contact Obeida ElJundi ([email protected]), Wissam Antoun ([email protected]), or Nour El Droubi ([email protected]).
