I'd like to discuss here the pros and cons of using YAML going forward or whether we s

YAML - pros and cons about nlp-progress HOT 10 CLOSED

sebastianruder commented on April 29, 2024 1

YAML - pros and cons

from nlp-progress.

Comments (10)

sebastianruder commented on April 29, 2024 2

Ok. So as things stand now, I think it'll be more beneficial to the community to have things in the more readable Markdown format to facilitate reading and contributing. We can think again about converting to YAML if there's a more immediate need in the future.

NirantK commented on April 29, 2024

Thinking out loud:

Assuming that markdown tables can be parsed with something like fsm, we can probably use markdown tables + git logs for plotting and trend spotting.

We could also set up a bot that periodically (say, every 2 weeks) dumps the Markdown data into a more machine-readable _data folder for such usage.
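As a rough illustration of the parsing step mentioned above, here is a minimal sketch in plain Python that turns a pipe-delimited Markdown table into a list of records. The function name, the column names, and the sample rows are all hypothetical, and the sketch assumes a well-formed table with a header row, a separator row, and no escaped pipes inside cells.

```python
def parse_markdown_table(text: str) -> list[dict[str, str]]:
    """Parse a pipe-delimited Markdown table into a list of row dicts.

    Assumes the first table line is the header and the second is the
    |---|---| separator; escaped pipes inside cells are not handled.
    """
    rows = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip anything that is not a table row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        rows.append(cells)
    if len(rows) < 2:
        return []
    header, body = rows[0], rows[2:]  # rows[1] is the separator row
    return [dict(zip(header, row)) for row in body]


# Hypothetical table; the models and scores are placeholders, not real results.
TABLE = """
| Model | Score | Paper |
| --- | --- | --- |
| Example model A | 90.1 | Author et al. (2018) |
| Example model B | 85.3 | Other et al. (2017) |
"""

records = parse_markdown_table(TABLE)
for record in records:
    print(record["Model"], record["Score"])
```

Real tables in the repository would likely need extra handling for links, footnotes, and inconsistent column layouts.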

stared commented on April 29, 2024

@NirantK
It is nowhere near that simple. Turning the Markdown tables into YAML files required a lot of manual labour on my part (even with some automation): various formats, some formatting mistakes, etc.

Also, for converting tables to YAML I wrote this script:
https://gist.github.com/stared/ec29b1e8d3c99a6288dcc20d77affc93

It requires some manual inspection, as:

there is some inconsistency with table formats
there is some misformatting (e.g. no closing |)
I manually check whether to use the &Author2018 and <<: *Author2018 mappings
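For readers who have not used them, &Author2018 is a YAML anchor that names an entry, and <<: *Author2018 is a YAML 1.1 merge key that copies that entry's fields into another one. The sketch below shows what such a pair might look like and mimics the merge with plain Python dicts. The field names and values are hypothetical placeholders, not the repository's actual schema or results.

```python
# Hypothetical YAML entries: the second reuses the first via a merge key
# and overrides a single field.
YAML_SNIPPET = """\
- &Author2018
  model: Example model
  authors: Author et al.
  year: 2018
  score: 90.1
- <<: *Author2018
  score: 85.3
"""


def merge_entry(base: dict, overrides: dict) -> dict:
    """Mimic merge-key semantics: explicitly set keys win over merged ones."""
    return {**base, **overrides}


base = {
    "model": "Example model",
    "authors": "Author et al.",
    "year": 2018,
    "score": 90.1,
}
reused = merge_entry(base, {"score": 85.3})
print(reused)
```

The manual check is then about deciding when two table rows really describe the same paper and can share one anchored entry.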

NirantK commented on April 29, 2024

Thanks for sharing that script @stared ! Some neat hacks there.

I am hoping that if we enforced a markdown table linter of some sort, this would be slightly less tedious to do. I definitely don't claim that it is simple.
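A table linter of this kind could start very small. The sketch below, with a hypothetical function name and hypothetical sample input, only checks the two problems mentioned earlier in this thread: rows without a closing | and rows whose cell count differs from the header. It is an assumption about what such a linter might check, not an existing tool.

```python
def lint_markdown_table(text: str) -> list[str]:
    """Return a list of human-readable problems found in Markdown tables."""
    problems = []
    expected = None  # cell count of the current table's header row
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line.startswith("|"):
            expected = None  # a non-table line ends the current table
            continue
        if not line.endswith("|"):
            problems.append(f"line {lineno}: missing closing |")
        cells = line.strip("|").split("|")
        if expected is None:
            expected = len(cells)  # first row of a table sets the width
        elif len(cells) != expected:
            problems.append(
                f"line {lineno}: expected {expected} cells, found {len(cells)}"
            )
    return problems


# Hypothetical input with one extra cell and one missing closing pipe.
SAMPLE = "| Model | Score |\n| --- | --- |\n| A | 1 | extra |\n| B | 2\n"
for problem in lint_markdown_table(SAMPLE):
    print(problem)
```

Run as a pre-merge check, something like this would catch the formatting mistakes before they reach the pages, though it does nothing about semantic inconsistencies between tables.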

To focus on the issue at hand, I am simply asking if the loss in reader (and contributor) ease of access is worth the gain from visualizations?

sebastianruder commented on April 29, 2024

Yep, a table linter or better enforcement of style guidelines is something we'd definitely want to do.

So far, I haven't really seen any visualizations that added much value beyond what the tables provide. The progress visualizations at AI metrics are nice, but I don't think they're that helpful if a task doesn't have a clear metric of human performance.
@stared, do you have any thoughts regarding a "killer visualization" that would clearly warrant using YAML files?

NirantK commented on April 29, 2024

Hey @stared - just following up :)

stared commented on April 29, 2024

OK, I know it is a matter of taste. Personally, I find YAML files easier to edit than Markdown tables, and less error-prone (and certainly simpler than Markdown tables plus an enforced linter). I admit that others may have different opinions, depending on their background.

As for killer features:

visualization (all markdown scraping will be clunky)
possibility to add OTHER data (e.g. comments, other fields when they become necessary)
possibility of copying entries (before there was redundancy and there were errors)

For contributions, I think that the tricky part is to inform where is the
(can be done easily, by adding an automatic link [edit entry in filename]).

For viewing changes - by pushing to one's own repos, one can see it online.

When it comes to visualizations: true, for many areas (especially those with only 4 entries or so) they do not provide that much additional information.

sebastianruder commented on April 29, 2024

While I really like the idea of separating the presentation from the data and storing the data in a dedicated format, the benefits at this point to me seem to be overshadowed by the additional burden placed on the contributor (who might not have used YAML before) and on the reader (who won't be able to view the tables on GitHub).

As at this point the objective should be to get more data (for more tasks and languages) in this repo, these two disadvantages to me outweigh the potential upsides of using YAML.

NirantK commented on April 29, 2024

@sebastianruder should I go ahead and refactor the Hindi and Korean pages to use Markdown?

sebastianruder commented on April 29, 2024

Yes, let's do that. Thanks!
