
Comments (6)

GoogleCodeExporter commented on June 30, 2024
Dear Ryan,

Thank you for opening this issue. HeidelTime is available as
- a UIMA component, and as
- a standalone version, which can be used without UIMA.

In both cases, HeidelTime requires some preprocessing, namely, sentence 
splitting, tokenization, and part-of-speech tagging. For all languages except 
Arabic and Vietnamese, we use the TreeTagger for these tasks.

As explained in the readme of the UIMA version and in the Manual of the 
standalone version, you have to download the TreeTagger and its modules for the 
languages you want to process. In the standalone version, you then have to set 
the path to the TreeTagger in the config.props. 

Please have a look at the Manual for more details.
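
If the standalone version is to run on several machines, it may help to verify on each one that the TreeTagger path in config.props is set and points at a real directory. The following is only a rough sketch using the JDK's `java.util.Properties`; the key name `treeTaggerHome` is an assumption here (use whatever key your config.props actually contains), and the class `TreeTaggerConfigCheck` is hypothetical, not part of HeidelTime.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper, not part of HeidelTime.
class TreeTaggerConfigCheck {

    // Assumed key name; check it against your own config.props.
    static final String ASSUMED_KEY = "treeTaggerHome";

    /** Reads the configured directory, or returns null if the key is missing or blank. */
    public static Path readTreeTaggerHome(Reader configSource, String key) throws IOException {
        Properties props = new Properties();
        props.load(configSource);
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            return null;
        }
        return Paths.get(value.trim());
    }

    /** True only if a path was configured and it is an existing directory. */
    public static boolean isUsable(Path home) {
        return home != null && Files.isDirectory(home);
    }

    public static void main(String[] args) throws IOException {
        // Path to the config file; defaults to config.props in the working directory.
        String configPath = args.length > 0 ? args[0] : "config.props";
        try (Reader reader = Files.newBufferedReader(Paths.get(configPath))) {
            Path home = readTreeTaggerHome(reader, ASSUMED_KEY);
            System.out.println(isUsable(home)
                    ? "TreeTagger directory found: " + home
                    : "TreeTagger directory missing or not set: " + home);
        }
    }
}
```

This only checks that the directory exists; it does not confirm that the TreeTagger binaries or language modules inside it are complete.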

Thanks,
Jannik

Original comment by [email protected] on 21 Jun 2013 at 7:51

  • Changed state: Done

from heideltime.

GoogleCodeExporter commented on June 30, 2024
Ok, thanks. My goal is to use heideltime from Hadoop. If it's all in Java, this 
will be easy.

Original comment by [email protected] on 21 Jun 2013 at 4:32

from heideltime.

GoogleCodeExporter commented on June 30, 2024
Ok, I'm still real confused here. I can't use perl for what I am doing. Is 
Heideltime Java or not?

Original comment by [email protected] on 24 Jun 2013 at 10:41

from heideltime.

GoogleCodeExporter commented on June 30, 2024
Dear Ryan,

I would kindly refer you to the Manual, where you can find a description of how to 
run the standalone version from the command line. Make sure that you have the 
TreeTagger installed and the path to the TreeTagger set correctly in the 
config.props. As already mentioned, everything is explained in the Manual.

If you run into specific problems, we are happy to help, but then we need to 
know what you are actually trying to do in more detail.

Thanks,
Jannik

Original comment by [email protected] on 26 Jun 2013 at 12:33

from heideltime.

GoogleCodeExporter commented on June 30, 2024
My goal is to deploy Heideltime on a Hadoop cluster. Currently, I search for 
dates with regex. I'd like to improve on that.

The TreeTagger dependency is where I am stuck. It runs fine on my laptop, but 
because it's not Java, it's difficult (impossible?) to install and run TreeTagger 
on every node in my cluster.

Can I somehow remove TreeTagger and still get ok results? Perhaps there is a 
Java library out there that I can use in TreeTaggerWrapper.java instead?

Original comment by [email protected] on 26 Jun 2013 at 7:05

from heideltime.

GoogleCodeExporter commented on June 30, 2024
Hi Ryan,

You can use the Stanford POS Tagger instead of the TreeTagger, although 
currently not with the standalone version. We will add a parameter to the 
standalone version to select which POS tagger should be used. However, this is 
not implemented yet. 

You could replace the TreeTaggerWrapper with the Stanford POS Wrapper in the 
source code of the standalone version. What you should keep in mind is that 
HeidelTime requires sentence information. Without sentence information, you 
won't get any results. Without token and POS information, you can still get 
results, but they will probably be worse.
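
Since sentence information is the hard requirement, here is a minimal sketch of producing sentence boundaries in pure Java with the JDK's `java.text.BreakIterator`. It is not HeidelTime's own preprocessing and will likely be less accurate than a dedicated tagger; the class `SentenceSplitSketch` is hypothetical, and feeding the resulting sentences into HeidelTime as annotations is not shown.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of HeidelTime.
class SentenceSplitSketch {

    /** Splits text into trimmed, non-empty sentences using the JDK's locale-aware BreakIterator. */
    public static List<String> splitSentences(String text, Locale locale) {
        BreakIterator boundaries = BreakIterator.getSentenceInstance(locale);
        boundaries.setText(text);
        List<String> sentences = new ArrayList<>();
        int start = boundaries.first();
        // Walk consecutive boundaries; each (start, end) pair delimits one sentence.
        for (int end = boundaries.next(); end != BreakIterator.DONE;
                start = end, end = boundaries.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "The meeting was moved to 26 June 2013. It starts at noon.";
        for (String sentence : splitSentences(text, Locale.ENGLISH)) {
            System.out.println(sentence);
        }
    }
}
```

Because this needs nothing beyond the JDK, it can run on any node that runs Java, which is the constraint on a Hadoop cluster.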

Thanks,
Jannik


Original comment by [email protected] on 28 Jun 2013 at 9:30

from heideltime.
