Comments (10)

sebastianruder commented on April 29, 2024

Ok. So as things stand now, I think it'll be more beneficial to the community to have things in the more readable Markdown format to facilitate reading and contributing. We can think again about converting to YAML if there's a more immediate need in the future.

from nlp-progress.

NirantK commented on April 29, 2024

Thinking out loud:

Assuming that markdown tables can be parsed with something like fsm, we can probably use markdown tables + git logs for plotting and trend spotting.

We could also set up a bot that periodically (say, every 2 weeks) dumps the Markdown data into a more machine-readable _data folder for such usage.
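
As an illustration of the table-parsing idea above, here is a minimal sketch in Python. It is not code from this thread or from the repository: the function name `parse_markdown_table` and the sample rows are hypothetical, and it assumes simple pipe-delimited tables with no escaped `|` characters inside cells.

```python
import re

# A separator cell looks like ---, :---, ---: or :---:
SEPARATOR_CELL = re.compile(r"^:?-+:?$")


def parse_markdown_table(text):
    """Parse one pipe-delimited Markdown table into a list of dicts.

    Assumes the first table row is the header and that cells contain
    no escaped pipes. Lines that do not start with '|' are ignored.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        # Drop the outer pipes, then split the remaining cells.
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header/body separator row (| --- | --- |).
        if all(SEPARATOR_CELL.match(cell) for cell in cells):
            continue
        rows.append(cells)
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


# Hypothetical sample data, for illustration only.
SAMPLE = """
| Model | F1 | Paper |
| --- | --- | --- |
| Model A | 91.2 | Paper A |
| Model B | 89.5 | Paper B |
"""

records = parse_markdown_table(SAMPLE)
```

Each record is a plain dict keyed by the header cells, so a periodic job could serialize the list to whatever format the _data folder uses. Real tables with inconsistent formatting would need extra handling.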

stared commented on April 29, 2024

@NirantK
It is nowhere near that simple. Converting the Markdown tables to YAML took a lot of manual labour on my part, even with some automation: the tables come in various formats, contain formatting mistakes, etc.

Also, for converting tables to YAML I wrote this script:
https://gist.github.com/stared/ec29b1e8d3c99a6288dcc20d77affc93

It requires some manual inspection, as:

  • there is some inconsistency between table formats
  • there is some misformatting (e.g. no closing |)
  • I manually check whether to use the &Author2018 and <<: *Author2018 mappings (YAML anchors and merge keys)

NirantK commented on April 29, 2024

Thanks for sharing that script @stared ! Some neat hacks there.

I am hoping that if we enforced a markdown table linter of some sort, this would be slightly less tedious to do. I definitely don't claim that it is simple.
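
To make the linter idea concrete, here is a minimal sketch in Python of the kind of checks it might run. It is an assumption about what such a linter could look like, not an existing tool: the function name `lint_markdown_table` and the sample table are hypothetical, and it only covers two of the problems mentioned in this thread (a missing closing | and an inconsistent column count).

```python
def lint_markdown_table(text):
    """Return a list of (line_number, message) problems found in tables.

    Checks only two things: every table row ends with a closing pipe,
    and every row has the same number of columns as the table's first row.
    Escaped pipes inside cells are not handled.
    """
    problems = []
    expected_columns = None
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line.startswith("|"):
            # A non-table line ends the current table.
            expected_columns = None
            continue
        if len(line) == 1 or not line.endswith("|"):
            problems.append((number, "missing closing |"))
            continue
        # N columns are delimited by N + 1 pipes.
        columns = line.count("|") - 1
        if expected_columns is None:
            expected_columns = columns
        elif columns != expected_columns:
            problems.append(
                (number, f"expected {expected_columns} columns, found {columns}")
            )
    return problems


# Hypothetical malformed table, for illustration only.
BAD = """\
| Model | F1 |
| --- | --- |
| Model A | 91.2
| Model B | 89.5 | extra |
"""

issues = lint_markdown_table(BAD)
```

A check like this could run in continuous integration on every pull request, so malformed rows are caught before they are merged.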

To focus on the issue at hand, I am simply asking whether the loss in reader (and contributor) ease of access is worth the gain from visualizations.

sebastianruder commented on April 29, 2024

Yep, a table linter or better enforcement of style guidelines is something we'd definitely want to do.

So far, I haven't really seen any visualizations that added much value beyond what the tables provide. The progress visualizations at AI metrics are nice, but I don't think they're that helpful if a task doesn't have a clear metric of human performance.
@stared, do you have any thoughts regarding a "killer visualization" that would clearly warrant using YAML files?

NirantK commented on April 29, 2024

Hey @stared - just following up :)

stared commented on April 29, 2024

OK, I know it is a matter of taste. Personally, I find YAML files easier to edit than Markdown tables, and less error-prone (and certainly simpler than Markdown tables plus an enforcing linter). I admit that others may have different opinions, depending on their background.

As for killer features:

  • visualization (all markdown scraping will be clunky)
  • possibility to add OTHER data (e.g. comments, other fields when they become necessary)
  • possibility of copying entries (previously there was redundancy, and with it errors)

For contributions, I think that the tricky part is to inform where is the
(can be done easily, by adding an automatic link [edit entry in filename]).

For viewing changes: by pushing to one's own repo, one can see the result online.

When it comes to visualizations, it is true that for many areas (especially those with only 4 entries or so) they do not provide that much additional information.

sebastianruder commented on April 29, 2024

While I really like the idea of separating the presentation from the data and storing the data in a dedicated format, the benefits at this point to me seem to be overshadowed by the additional burden placed on the contributor (who might not have used YAML before) and on the reader (who won't be able to view the tables on GitHub).

As at this point the objective should be to get more data (for more tasks and languages) in this repo, these two disadvantages to me outweigh the potential upsides of using YAML.

NirantK commented on April 29, 2024

@sebastianruder should I go ahead and refactor the Hindi and Korean pages to use Markdown?

sebastianruder commented on April 29, 2024

Yes, let's do that. Thanks!
