Comments (16)
Hey,
I think that's a great idea! If we could include an up-to-date picture of the overall progress (in terms of new results) across all tasks, that would be awesome!
For the graph, I guess the main thing we want to track is new results that are added. As a PR can include multiple commits, it might get confusing if we ask people to include "new result" in a commit. So maybe asking to include "new result" in the title of the PR might make more sense?
from nlp-progress.
Yep, good idea. I've created one here. I'll link to this more prominently.
Yeah, that's right.
Also, Markdown doesn't provide dynamic charts; otherwise we could have had something like a "time period" over which the results are added. :(
Maybe we can move to gh-pages.
https://nirmalsinghania2008.github.io/NLPprogress/
Looks good in my opinion.
So, if we add an instruction asking people to include "new result" in the title of a PR whenever the PR contains a new result, do you think you could create a chart based on the PRs?
Yep, I think so.
👍 I've just added a note for people to include "new result" in the title (bd91d2a) and will point future PRs to that.
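As a rough illustration of how a chart could be derived from PR titles, here is a minimal Python sketch. It assumes the merged PRs are already available as (title, merge date) pairs; the sample titles and dates and the function name `new_results_per_month` are made up for the example and are not taken from the repository.

```python
from collections import Counter
from datetime import date

# Hypothetical merged PRs as (title, merge date) pairs.
# Real data would come from the repository's PR history.
MERGED_PRS = [
    ("New result: task A", date(2018, 6, 25)),
    ("Fix typo in README", date(2018, 6, 26)),
    ("Add new result for task B", date(2018, 7, 2)),
    ("new result - task C", date(2018, 7, 9)),
]


def new_results_per_month(prs, marker="new result"):
    """Count PRs whose title contains the marker (case-insensitive), per YYYY-MM."""
    counts = Counter()
    for title, merged_on in prs:
        if marker in title.lower():
            counts[merged_on.strftime("%Y-%m")] += 1
    # Sort by month so the result can be plotted left to right.
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Crude text chart: one '#' per PR that added a new result.
    for month, n in new_results_per_month(MERGED_PRS).items():
        print(month, "#" * n)
```

Matching on the title rather than on commit messages keeps the count at one per PR, even when a PR contains several commits.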
Plots may be very important: compare the Performance section in Keras vs PyTorch with plain tables.
And see also Measuring the Progress of AI Research by EFF, which you may already know.
Recreating plots with each addition is time-consuming. At the same time, if instead of MD files, the data were stored in some semantic files (e.g. YAML, for human readability), it would be easy to create HTML plots (as in: I am happy to create such). Jekyll has support for YAML, which I already (ab)use (for lists of projects, conferences, etc.).
With that we can create both a table (maybe even sortable) and plots. Plus, separating semantic (pun kind of intended) data from its presentation is a plus.
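To make the "table from data" idea concrete, here is a minimal Python sketch that renders structured entries as a Markdown table sorted by score. The entries, the field names (`paper`, `year`, `score`) and the function `to_markdown_table` are assumptions for illustration; in practice the entries would be loaded from a YAML data file rather than written inline.

```python
# Hypothetical entries; in practice these would be loaded from a YAML data file.
ENTRIES = [
    {"paper": "Model A", "year": 2017, "score": 75.2},
    {"paper": "Model B", "year": 2018, "score": 78.9},
    {"paper": "Model C", "year": 2016, "score": 71.0},
]


def to_markdown_table(entries, sort_key="score"):
    """Render entries as a Markdown table, highest value of sort_key first."""
    rows = sorted(entries, key=lambda e: e[sort_key], reverse=True)
    lines = ["| Paper | Year | Score |", "| --- | --- | --- |"]
    for e in rows:
        lines.append(f"| {e['paper']} | {e['year']} | {e['score']} |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_markdown_table(ENTRIES))
```

Because the table is generated, changing the sort order (for example to `sort_key="year"`) does not require touching the data.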
Hi Piotr, thanks a lot for contributing to the discussion here. I had a look at EFF initially, but opted to keep things simple to begin with.
I agree that having plots / graphs is a good idea. Given that we're not currently breaking things out by exact date of publication, my main concern is how much value they'll add vs. the increased complexity, particularly for the rarer tasks. That doesn't mean that I don't want plots; I just want to make sure we get them right. :)
If you have time, would you mind taking one task and creating a plot for it, which we can use as a reference for further discussion?
Sebastian, sure. I would be happy to give it a try this week.
I understand that EFF-style would be harder to maintain and collaborate on. For YAML, it won't add much complexity or confusion. Instead of tables, there will be something like:
- paper: "Blablisation of tokens"
  year: 2018
  score: 78.9
  link: https://dx.doi.org/1234.1234.1234
  implementation: https://github.com/aaa/bbb
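For instance, a progress plot could be fed by a "best score so far" series computed from such entries. The Python sketch below is only an illustration: the first entry mirrors the placeholder above, the other two entries and the function name `best_score_by_year` are hypothetical, and it assumes a higher score is better.

```python
# Entries in the shape of the example above, written inline for self-containment.
# The two "Hypothetical" entries are invented purely for this illustration.
ENTRIES = [
    {"paper": "Blablisation of tokens", "year": 2018, "score": 78.9},
    {"paper": "Hypothetical baseline", "year": 2016, "score": 70.1},
    {"paper": "Hypothetical follow-up", "year": 2017, "score": 69.5},
]


def best_score_by_year(entries):
    """Return (year, best score so far) pairs in chronological order.

    Assumes a higher score is better.
    """
    best_in_year = {}
    for e in entries:
        year = e["year"]
        best_in_year[year] = max(e["score"], best_in_year.get(year, float("-inf")))

    series = []
    best = float("-inf")
    for year in sorted(best_in_year):
        # A year with no improvement keeps the previous best.
        best = max(best, best_in_year[year])
        series.append((year, best))
    return series


if __name__ == "__main__":
    for year, score in best_score_by_year(ENTRIES):
        print(year, score)
```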
However, how do I build the site locally? (While I am familiar with Jekyll, the file structure is somewhat strangely flat.) Running jekyll serve does not generate any index.html file. Is it some GitHub-only style?
You mean the site at http://nlpprogress.com/?
That's just GitHub Pages. In your fork, you can just go to Settings/GitHub Pages and then switch it on.
I see (it works via GitHub Pages after deleting CNAME, but due to some installation hell I cannot set it up locally). I am not sure this kind of Jekyll project supports data (cf. the more generic structure of a Jekyll project, as in https://github.com/stared/stared.github.io).
If I come up with something, I will let you know (most likely via PR :)).
Yeah, I'm aware of Jekyll. I wanted to keep it as simple as possible for now to make maintenance and collaboration easy.
I'm open to using Jekyll, if we can still make it easy for people to contribute. Would love to see what you come up with. :)
@sebastianruder I added Pull Request #64 (right now mostly for viewing; if it works, we can decide on the exact form of entries and I am happy to rewrite quite a few things).
As a side note, the project may benefit from some structure, and from splitting index.md (automatically listing all pages) from README.md (right now there is some UX overlap between the GitHub files and the website). See: https://github.com/stared/NLP-progress/tree/feat-restructure
I didn't include these changes, as they may be controversial and are not necessary to introduce plots. Let me know if it is also something you would like to consider (IMHO a much cleaner structure).
Thanks! I agree that a restructuring makes sense. Automatically listing the table of contents is also a good idea.
Great, so I will try to incorporate that as well.