
Nuanced Arabic Dialect Identification Shared Task Series (NADI)

This repository lists information relevant to the Nuanced Arabic Dialect Identification Shared Task Series (NADI).


NADI 2021: The Second Nuanced Arabic Dialect Identification Shared Task

We present the findings and results of the Second Nuanced Arabic Dialect Identification Shared Task (NADI 2021). This Shared Task includes four subtasks: country-level Modern Standard Arabic (MSA) identification (Subtask 1.1), country-level dialect identification (Subtask 1.2), province-level MSA identification (Subtask 2.1), and province-level sub-dialect identification (Subtask 2.2). The data for the shared task cover a total of 100 provinces from 21 Arab countries and were collected from the Twitter domain. A total of 53 teams from 23 countries registered to participate in the tasks, reflecting the interest of the community in this area. We received 16 submissions for Subtask 1.1 from five teams, 27 submissions for Subtask 1.2 from eight teams, 12 submissions for Subtask 2.1 from four teams, and 13 submissions for Subtask 2.2 from four teams.

Official website

Access the official website via this link.

Download the data

To download the data, you need to fill out the registration form. link

Sub-tasks

(1) Subtask 1 (Country Level)

  • Subtask 1.1: Country-level MSA identification: A total of 21,000 tweets, covering 21 Arab countries. CODALAB link
  • Subtask 1.2: Country-level dialectal Arabic (DA) identification: A total of 21,000 tweets, covering 21 Arab countries. CODALAB link

(2) Subtask 2 (Province level)

  • Subtask 2.1: Province-level MSA identification: A total of 21,000 tweets, covering 100 provinces. CODALAB link
  • Subtask 2.2: Province-level DA identification: A total of 21,000 tweets, covering 100 provinces. CODALAB link

Please cite NADI 2021 as follows:

@inproceedings{mageed:2021:nadi,
    author = {Abdul-Mageed, Muhammad and Zhang, Chiyu and Elmadany, AbdelRahim and Bouamor, Houda and Habash, Nizar}, 
    title = {{NADI 2021: The Second Nuanced Arabic Dialect Identification Shared Task}},
    booktitle = {Proceedings of the Sixth {A}rabic Natural Language Processing Workshop (WANLP 2021)},
    year = {2021},
}


NADI 2020: The First Nuanced Arabic Dialect Identification Shared Task

We present the results and findings of the First Nuanced Arabic Dialect Identification Shared Task (NADI). This Shared Task includes two subtasks: country-level dialect identification (Subtask 1) and province-level sub-dialect identification (Subtask 2). The data for the shared task cover a total of 100 provinces from 21 Arab countries and were collected from the Twitter domain. As such, NADI is the first shared task to target naturally occurring fine-grained dialectal text at the sub-country level. A total of 61 teams from 25 countries registered to participate in the tasks, reflecting the interest of the community in this area. We received 47 submissions for Subtask 1 from 18 teams and 9 submissions for Subtask 2 from 9 teams.

Official website

Access the official website via this link.

Download the data

To download the data, you need to fill out the registration form. link

Sub-tasks

  • Subtask 1: Country-level dialect identification: A total of 21,000 tweets, covering all 21 Arab countries. This is a new dataset created for this shared task. CODALAB link
  • Subtask 2: Province-level dialect identification: A total of 21,000 tweets, covering 100 provinces from all 21 Arab countries. This is the same dataset as in Subtask 1, but with province labels. CODALAB link

Please cite NADI 2020 as follows:


@inproceedings{mageed:2020:nadi,
  title={{NADI 2020: The First Nuanced Arabic Dialect Identification Shared Task}},
  author={Abdul-Mageed, Muhammad and Zhang, Chiyu and Bouamor, Houda and Habash, Nizar},
  booktitle={Proceedings of the Fifth Arabic Natural Language Processing Workshop},
  pages={97--110},
  year={2020}
}
