Comments (10)
I think it would be best if the documentUri were the actual Wikipedia page URL (if you
paste it into the browser, it should work). The collectionId could be the base URL of
the particular language version of Wikipedia.
Example:
baseUri: http://en.wikipedia.org/
documentUri: http://en.wikipedia.org/w/index.php?title=Main_Page&oldid=433914749
I think the URI should always contain the revision, not only when reading texts from
the RevisionMachine.
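A minimal sketch of how such URIs could be assembled, following the example above. The class and method names are hypothetical and not part of DKPro Core or JWPL; it also assumes that the language prefix ("en") is already known and that spaces in titles are written as underscores.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not an existing DKPro Core or JWPL class.
final class WikipediaUriSketch {

    // Base URI of one language edition, e.g. "http://en.wikipedia.org/".
    static String baseUri(String languagePrefix) {
        return "http://" + languagePrefix + ".wikipedia.org/";
    }

    // Document URI that always carries the revision id, as proposed above.
    static String documentUri(String languagePrefix, String title, long revisionId) {
        // Assumption: titles use underscores instead of spaces; the rest is URL-encoded.
        String encodedTitle = URLEncoder.encode(title.replace(' ', '_'), StandardCharsets.UTF_8);
        return baseUri(languagePrefix) + "w/index.php?title=" + encodedTitle + "&oldid=" + revisionId;
    }

    public static void main(String[] args) {
        // Prints http://en.wikipedia.org/w/index.php?title=Main_Page&oldid=433914749
        System.out.println(documentUri("en", "Main Page", 433914749L));
    }
}
```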
Original issue reported on code.google.com by richard.eckart
on 2011-08-30 18:27:16
from dkpro-core.
Nice idea.
I see one major issue:
We would then need to translate the language stored in the JWPL database (english,
simple_english, german, etc.) into the prefix for the URL (en, de, etc.).
I am not aware of any such translation table. We would have to build and maintain it
on our own :/
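For illustration, a hand-maintained table could start out roughly like this. The class name and the lookup method are hypothetical; only the three language names mentioned above are covered, and "simple" is the prefix of the Simple English Wikipedia.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup table, not an existing JWPL or DKPro Core class.
final class LanguagePrefixSketch {

    // JWPL language name -> Wikipedia URL prefix; deliberately incomplete.
    private static final Map<String, String> PREFIX_BY_LANGUAGE = Map.of(
            "english", "en",
            "simple_english", "simple",
            "german", "de");

    // Returns an empty Optional for languages that have no mapping yet.
    static Optional<String> prefixFor(String jwplLanguage) {
        String key = jwplLanguage.toLowerCase(Locale.ROOT);
        return Optional.ofNullable(PREFIX_BY_LANGUAGE.get(key));
    }

    public static void main(String[] args) {
        System.out.println(prefixFor("german").orElse("(no mapping)"));
        System.out.println(prefixFor("klingon").orElse("(no mapping)"));
    }
}
```

Returning an Optional keeps the "no mapping yet" case explicit, so a caller can decide whether to skip the URI or fail.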
Original issue reported on code.google.com by torsten.zesch
on 2011-09-14 07:52:01
We could use this list to map from a Wikipedia language String to the wiki:
http://meta.wikimedia.org/wiki/List_of_Wikipedias
We "just" have to parse the page :)
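A rough sketch of the parsing step, under a strong assumption: the table from that page has already been exported to tab-separated rows of the form "language name, TAB, wiki prefix". The real page layout is not modelled here, and the class name and the name normalization (lowercase, whitespace to underscores, to resemble names such as simple_english) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical parser for pre-extracted rows; it does not fetch or parse the wiki page itself.
final class WikipediaListSketch {

    // Turns rows like "Simple English<TAB>simple" into {"simple_english" -> "simple"}.
    static Map<String, String> parseRows(List<String> rows) {
        Map<String, String> prefixByLanguage = new LinkedHashMap<>();
        for (String row : rows) {
            String[] columns = row.split("\t");
            if (columns.length < 2) {
                continue; // skip rows without both columns
            }
            // Assumption: JWPL-style names are lowercase with underscores.
            String language = columns[0].trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", "_");
            String prefix = columns[1].trim();
            if (!language.isEmpty() && !prefix.isEmpty()) {
                prefixByLanguage.put(language, prefix);
            }
        }
        return prefixByLanguage;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("English\ten", "Simple English\tsimple", "German\tde");
        System.out.println(parseRows(rows));
    }
}
```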
Original issue reported on code.google.com by oliver.ferschke
on 2011-09-14 08:50:59
Here are the current language statistics (the source for the list on the static wiki page
mentioned before):
http://s23.org/wikistats/wikipedias_wiki.php
Original issue reported on code.google.com by oliver.ferschke
on 2011-09-14 08:53:36
(No text was entered with this change)
Original issue reported on code.google.com by richard.eckart
on 2012-02-08 22:51:53
- Labels added: Milestone-1.4.0
(No text was entered with this change)
Original issue reported on code.google.com by richard.eckart
on 2012-06-12 09:21:59
(No text was entered with this change)
Original issue reported on code.google.com by richard.eckart
on 2012-07-19 09:38:57
(No text was entered with this change)
Original issue reported on code.google.com by richard.eckart
on 2012-10-13 18:31:40
- Labels added: DKPro-ASL
(No text was entered with this change)
Original issue reported on code.google.com by richard.eckart
on 2012-10-13 18:33:40
- Labels added: Milestone-1.5.0
- Labels removed: Milestone-1.4.0
I only provided a few language mappings.
Additional ones can be added as needed.
Original issue reported on code.google.com by torsten.zesch
on 2013-01-20 15:05:59