Comments (6)
Dear Ryan,
Thank you for opening this issue. HeidelTime is available as
- a UIMA component, and as
- a standalone version, which can be used without UIMA.
In both cases, HeidelTime requires some preprocessing, namely, sentence
splitting, tokenization, and part-of-speech tagging. For all languages except
Arabic and Vietnamese, we use the TreeTagger for these tasks.
As explained in the readme of the UIMA version and in the Manual of the
standalone version, you have to download the TreeTagger and its modules for the
languages you want to process. In the standalone version, you then have to set
the path to the TreeTagger in the config.props.
Please have a look in the Manual for more details.
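For illustration, here is a minimal sketch of checking the TreeTagger path before starting HeidelTime. It is not part of HeidelTime itself. The key name `treeTaggerHome` and the sample path `/opt/treetagger` are assumptions; please check your own config.props and the Manual for the actual key.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

class TreeTaggerPathCheck {
    // Assumed key name; verify it against your own config.props.
    static final String KEY = "treeTaggerHome";

    // Reads the configured TreeTagger directory from a properties source.
    // Returns null if the key is missing.
    static String readTreeTaggerHome(Reader configSource) throws IOException {
        Properties props = new Properties();
        props.load(configSource);
        String value = props.getProperty(KEY);
        return value == null ? null : value.trim();
    }

    // True only if the value is set and points to an existing directory.
    static boolean isUsable(String home) {
        return home != null && !home.isEmpty() && Files.isDirectory(Paths.get(home));
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical config content; in practice, pass a FileReader for config.props.
        String sample = "treeTaggerHome = /opt/treetagger\n";
        String home = readTreeTaggerHome(new StringReader(sample));
        System.out.println(home + " usable: " + isUsable(home));
    }
}
```

Running such a check first makes a wrong or missing path visible before any text is processed.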
Thanks,
Jannik
Original comment by [email protected]
on 21 Jun 2013 at 7:51
- Changed state: Done
from heideltime.
OK, thanks. My goal is to use HeidelTime from Hadoop. If it's all in Java, this
will be easy.
Original comment by [email protected]
on 21 Jun 2013 at 4:32
OK, I'm still really confused here. I can't use Perl for what I am doing. Is
HeidelTime Java or not?
Original comment by [email protected]
on 24 Jun 2013 at 10:41
Dear Ryan,
I would kindly refer you to the Manual, where you can find a description of how
to run the standalone version from the command line. Make sure that you have the
TreeTagger installed and the path to the TreeTagger set correctly in the
config.props. As already mentioned, everything is explained in the Manual.
If you run into specific problems, we are happy to help, but then we need to
know what you are actually trying to do in more detail.
Thanks,
Jannik
Original comment by [email protected]
on 26 Jun 2013 at 12:33
My goal is to deploy HeidelTime on a Hadoop cluster. Currently, I search for
dates with regex. I'd like to improve on that.
The TreeTagger dependency is where I am stuck. It runs fine on my laptop, but,
because it's not Java, it's difficult (impossible?) to install/run TreeTagger
on every node in my cluster.
Can I somehow remove TreeTagger and still get OK results? Perhaps there is a
Java library out there that I can use in TreeTaggerWrapper.java instead?
Original comment by [email protected]
on 26 Jun 2013 at 7:05
Hi Ryan,
You can use the Stanford POS Tagger instead of the TreeTagger. However, this
currently does not work with the standalone version. We will add a parameter to
the standalone version to select which POS tagger should be used, but this is
not implemented yet.
You could replace the TreeTaggerWrapper with the Stanford POS Wrapper in the
source code of the standalone version. What you should keep in mind is that
HeidelTime requires sentence information. Without sentence information, you
won't get any results. Without token and POS information, you can still get
results, but they will probably be worse.
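To illustrate the sentence requirement, here is a minimal pure-Java sketch that splits text into sentences using `java.text.BreakIterator` from the JDK. This is not how HeidelTime or either wrapper does it; the class name `SentenceSplitSketch` and the sample text are hypothetical, and the results would still have to be turned into the sentence annotations HeidelTime expects.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class SentenceSplitSketch {
    // Splits text into trimmed sentences using the JDK's rule-based iterator.
    static List<String> split(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        List<String> sentences = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Hypothetical input; each sentence is printed on its own line.
        for (String s : split("It rained on Monday. We left on 21 June 2013.")) {
            System.out.println(s);
        }
    }
}
```

Because this uses only the JDK, it needs no native installation on the cluster nodes, although its accuracy is likely lower than that of a trained tagger.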
Thanks,
Jannik
Original comment by [email protected]
on 28 Jun 2013 at 9:30