Arabic Dialect Identification

With the rise of social media, Arabic dialects have begun to appear in written form. Automatically determining the dialect of an Arabic text remains a major challenge for researchers. This project uses both a deep learning model and a classical machine learning model to identify the dialect of a text written in Arabic.

Introduction

Computationally identifying the language of a given text is a cornerstone of many important NLP applications, such as machine translation and social media analysis. Since dialects can be regarded as closely related languages, dialect identification can be treated as a special (and more difficult) case of the language identification problem.
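To make the "special case of language identification" framing concrete, here is a minimal sketch of one common baseline for this kind of task: a Naive Bayes classifier over character trigrams. It is not the model used in this project, whose architecture is not described here. The class name `NgramNaiveBayes`, the country codes `EG` and `LB`, and the four toy sentences are all assumptions made for illustration and are not drawn from the project's dataset.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return overlapping character n-grams, padding with one space per side."""
    padded = f" {text.strip()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramNaiveBayes:
    """Multinomial Naive Bayes over character n-grams with add-one smoothing."""

    def fit(self, texts, labels):
        self.ngram_counts = defaultdict(Counter)  # label -> n-gram counts
        self.doc_counts = Counter(labels)         # label -> number of texts
        for text, label in zip(texts, labels):
            self.ngram_counts[label].update(char_ngrams(text))
        # Vocabulary = every n-gram seen under any label.
        self.vocab = set()
        for counts in self.ngram_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.ngram_counts.items():
            # Log prior plus smoothed log likelihood of each n-gram.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for gram in char_ngrams(text):
                score += math.log((counts[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (illustrative only, NOT taken from the project's dataset).
# "EG" and "LB" are hypothetical country-level labels.
train_texts = [
    "ازيك عامل ايه",
    "انا عايز اروح دلوقتي",
    "كيفك شو اخبارك",
    "انا بدي روح هلق",
]
train_labels = ["EG", "EG", "LB", "LB"]

model = NgramNaiveBayes().fit(train_texts, train_labels)
print(model.predict("عايز"))
print(model.predict("بدي"))
```

Character n-grams are a natural fit here because closely related dialects share most of their vocabulary and often differ in short, frequent function words and affixes. A real system would need far more data per country than this toy example, and would be evaluated on held-out tweets.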

The dataset consists of tweets covering a wide range of country-level Arabic dialects from 18 different countries in the Middle East and North Africa region. Our method for building this dataset relies on applying multiple filters to identify users who belong to different countries.

Dataset Link

Trained Model Link

To run this project

Download the dataset from the link above.
Download the trained model from the link above if you do not want to train from scratch.
Install the libraries listed in the requirements file.
Run app.py in the Deployment folder.

Run app.py

Open a command line, change to the project directory, then type flask run.

Repository: dinaabdalla2018 / arabic-dialect-identification