
Comments (10)

NirantK commented on April 28, 2024

Hey @sebastianruder @Hrant-Khachatrian,

We maintain datasets and tools for several languages (e.g. Korean, Arabic) at awesome-nlp.

To avoid duplication of effort, we can either:

  • Add the State of the Art results there (we have a research section, which we'll remove shortly and instead point here)
  • Or we can move the entire non-English content to this repository and reorganize as you would like
  • Other options?

Let's avoid duplication either way. What do you think?

from nlp-progress.

sebastianruder commented on April 28, 2024

Hey, I totally agree. The most important thing for me is not to duplicate effort. I've been following awesome-nlp and really like the collection of tools, particularly across languages.
For me personally, knowing about a dataset without links to papers or results on that dataset, however, hasn't been that useful.
For that reason, what would make the most sense for me would be to add non-English datasets and results to this repo. I'd love to add you as collaborators/maintainers to this repo if that doesn't seem like too much additional work.


sebastianruder commented on April 28, 2024

Yes! Including other languages is definitely on the road map!

I'm not sure what the best way to include them is at the moment. We could break out the results first per task and then per language. Breaking everything out per language first might make it hard to find things. Maybe with some better visualization (which we're working on at the moment), including other languages would be easier. Do you have any suggestions?


NirantK commented on April 28, 2024

Sure, happy to contribute.

We don't maintain results/papers for the libraries and datasets at awesome-nlp yet.

I will start by adding different language datasets, and we can add results as and when we find them. Does that sound good?

How do we handle libraries or tools?

In parallel, I'll add a link to nlpprogress.com to every language that is migrated here.


PiotrCzapla commented on April 28, 2024

Guys, we have results and datasets for Polish NLP, covering NER and language modeling. Where should we post them? How about we add a headline to each English page (English being the most advanced language) with links to the other languages?
I will propose a change in a second.


sebastianruder commented on April 28, 2024

Let's discuss the best format for adding other languages here. @PiotrCzapla summarized this well in #105:
Either we have a file for each task linking to the task in each language:

  • language_modeling.md
    • ar_language_modeling.md
    • pl_language_modeling.md

Or we'd have a file for each language linking to each task in the language:

  • ar.md
    • language modeling
    • sentiment analysis

I think this mainly depends on what is the preferred way that people will look up tasks / results. Personally, I think the preferred setting is to look for one's own language and then at the tasks in that language. As long as we don't have too many tasks and languages yet, we could thus simply extend the main README.md file with additional languages and the corresponding tasks, e.g.

  • Arabic
    • Language modeling
    • Sentiment analysis
    • etc.
  • English
    • Language modeling
    • Sentiment analysis
    • etc.

What do you think?
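
The README layout proposed above (languages first, tasks nested under each) can be sketched as a small generator. This is only an illustration under stated assumptions: `TASKS_BY_LANGUAGE`, `render_index`, and `invert` are hypothetical names, the data simply mirrors the example list above, and nothing here reflects how the repository actually builds its pages.

```python
from typing import Dict, List

# Hypothetical data: language -> tasks. It mirrors the example list above
# and is not the repository's actual content.
TASKS_BY_LANGUAGE: Dict[str, List[str]] = {
    "English": ["Language modeling", "Sentiment analysis"],
    "Arabic": ["Language modeling", "Sentiment analysis"],
}


def render_index(groups: Dict[str, List[str]]) -> str:
    """Render a two-level Markdown list: sorted top-level keys, then their items."""
    lines: List[str] = []
    for key in sorted(groups):
        lines.append(f"- {key}")
        for item in groups[key]:
            lines.append(f"  - {item}")
    return "\n".join(lines)


def invert(groups: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Regroup the same data the other way round (task -> languages)."""
    inverted: Dict[str, List[str]] = {}
    for key, items in groups.items():
        for item in items:
            inverted.setdefault(item, []).append(key)
    return inverted


if __name__ == "__main__":
    # Language-first index, as in the proposal above.
    print(render_index(TASKS_BY_LANGUAGE))
    # Task-first index, the alternative layout.
    print(render_index(invert(TASKS_BY_LANGUAGE)))
```

Because both layouts come from the same mapping, switching between the language-first and task-first views later would not require restructuring the underlying data.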


NirantK commented on April 28, 2024

This sounds good to me!

Since most libraries cover POS tagging, tokenization, etc. in several languages, I will start there over the coming weekend.


PiotrCzapla commented on April 28, 2024


sebastianruder commented on April 28, 2024

Yep, good point. Let's keep English on the top.
@NirantK, given your great work on awesome-nlp and other NLP projects, would you like me to add you as a collaborator to the repo?


NirantK commented on April 28, 2024

Sure, happy to help.

Even then, I'll start by raising a few PRs so that we can work out any accidentally left-out details.

