Can there be some more ways of gaining knowledge please? At least do

Documentation and Knowledge base is so incredible poor

dlwh commented on September 23, 2024

Documentation and Knowledge base is so incredible poor

Comments (4)

dlwh commented on September 23, 2024

Sorry about that. I agree the documentation is pretty shoddy. What would you like to be able to do?

Did you look at https://github.com/dlwh/epic-demo ?

MarcusSjolin commented on September 23, 2024

My biggest problem I guess is to know what I can combine, what goes where and how things integrate with each other.

I'd like to know how I implement a simple feature to use when going through a text?

I'd like to know how to use multiple custom ones?

I've seen the Epic demos, and they all work

What are these representing?
preprocess?

Do something with the data before running something on it, but what can be achieved here?

slab?

A data source that you can do something with?

models?

Reference to a set of features that can pick out certain things in a text? (Are the pre-built ones language feature detectors?)

parser?

Something that goes through the text to work out what is necessary?

trees?

A representation of what words are, like noun and after that there's a verb etc?

sequences?

Segment data to pick up if it is a set of two words or one?

Some of these concepts, I think it would be much easier to get started if they can be explained. Why they are there, and what I can do with them. If I'm looking for a certain feature, where should I look?

Might be a lot to answer, but I do think you got something useful here and I'd like to see it being developed further!

/Marcus

dlwh commented on September 23, 2024

Thanks. That is helpful.

At the moment, the internals of Epic (making features, etc) are kind of
targeted at people with a good bit of NLP ML expertise. Really some of the
external bits are too. I would like to make it more friendly, but it's a
long way from that, obviously.

On Sun, Dec 21, 2014 at 3:16 PM, Marcus Sjölin wrote:

My biggest problem I guess is to know what I can combine, what goes where
and how things integrate with each other.

I'd like to know how I implement a simple feature to use when going
through a text?

I'm not sure what you mean here?

I'd like to know how to use multiple custom ones?

Featurizers in Epic can be added together with the "+" operator to create
composite featurizers.
"Featurizers" turn a sentence into a set of features. I think you might
have a misconception about what I mean by features (which I think is the
standard ML terminology): a feature is a property of (part of) an input data
point (like a sentence) that can be used to predict the appropriate output.
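
To make the "+" idea concrete, here is a minimal, self-contained sketch of composable featurizers. The names (`WordFeaturizer`, `wordIdentity`, `capitalized`) are hypothetical and are not Epic's actual classes; the sketch only assumes that a composite featurizer emits the union of its parts' features.

```scala
// Hypothetical sketch: these names are NOT Epic's actual classes.
object FeaturizerSketch {
  // A feature is a named property of a word in its sentence.
  type Feature = String

  trait WordFeaturizer { self =>
    // Features that fire for the word at position `pos` in `words`.
    def featuresFor(words: IndexedSeq[String], pos: Int): Set[Feature]

    // `a + b` builds a featurizer that emits the union of both feature sets.
    def +(other: WordFeaturizer): WordFeaturizer = new WordFeaturizer {
      def featuresFor(words: IndexedSeq[String], pos: Int): Set[Feature] =
        self.featuresFor(words, pos) ++ other.featuresFor(words, pos)
    }
  }

  // Fires on the (lower-cased) identity of the word.
  val wordIdentity: WordFeaturizer = new WordFeaturizer {
    def featuresFor(words: IndexedSeq[String], pos: Int): Set[Feature] =
      Set("word=" + words(pos).toLowerCase)
  }

  // Fires when the word starts with an upper-case letter.
  val capitalized: WordFeaturizer = new WordFeaturizer {
    def featuresFor(words: IndexedSeq[String], pos: Int): Set[Feature] =
      if (words(pos).headOption.exists(_.isUpper)) Set("capitalized") else Set.empty
  }

  // Two custom featurizers used together.
  val combined: WordFeaturizer = wordIdentity + capitalized
}
```

Calling `FeaturizerSketch.combined.featuresFor(Vector("Paris", "is", "nice"), 0)` yields both the word-identity feature and the capitalization feature for "Paris".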

I've seen the Epic demos, and they all work

What are these representing?
preprocess?

Do something with the data before running something on it, but what
can be achieved here?

preprocess can:

- Segment sentences:
      val segmenter = MLSentenceSegmenter.bundled().get
      segmenter.segment(text)
- Tokenize sentences into words and punctuation:
      epic.preprocess.tokenize(sentence)
- Do both at once (epic.preprocess.preprocess), as demonstrated in the demo.
- Extract content from arbitrary files or URLs using Apache Tika
  (epic.extractText(url)).
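
The shape of what preprocessing produces (a sequence of sentences, each a sequence of tokens) can be sketched without the library. The regex-based `segment` and `tokenize` below are naive stand-ins written for illustration, not Epic's implementations, which use a trained segmenter as in the calls above.

```scala
// Toy stand-ins for illustration only; not Epic's implementation.
object PreprocessSketch {
  // Naive sentence splitter: break after '.', '!' or '?' when followed by whitespace.
  def segment(text: String): IndexedSeq[String] =
    text.split("(?<=[.!?])\\s+").map(_.trim).filter(_.nonEmpty).toIndexedSeq

  // Naive tokenizer: runs of ASCII letters, digits, or underscores are words;
  // any other non-space character becomes its own token.
  def tokenize(sentence: String): IndexedSeq[String] =
    "\\w+|[^\\w\\s]".r.findAllIn(sentence).toIndexedSeq

  // Both steps at once: text in, sentences of tokens out.
  def preprocess(text: String): IndexedSeq[IndexedSeq[String]] =
    segment(text).map(tokenize)
}
```

For example, `PreprocessSketch.preprocess("Hello world. How are you?")` gives two token sequences, one per sentence, with punctuation as separate tokens.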

slab?

A data source that you can do something with?

Slabs hold annotations (parse trees, named entities, etc) for a text in a
uniform way. We're actually reworking them, so don't put a lot of effort
into learning them.

models?

Reference to a set of features that can pick out certain things in a
text? (Are the pre-built ones language feature detectors?)

Something like that. Models refer to the result of a machine learning
algorithm, with a featurizer, some weights, and a dynamic program which can
build structures over a text. (I overload terminology and sometimes
use "model" to mean everything except the weights.)
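
As a rough illustration of how those three pieces fit together, the sketch below tags each word with a label using a featurizer, a weight table, and a Viterbi-style dynamic program. Everything in it is an assumption made for the example: the two-label tag set, the hand-set weights (a learning algorithm would normally produce them), and all names. It is not how Epic's models are implemented.

```scala
// Hypothetical sketch: featurizer + weights + dynamic program, not Epic's code.
object ModelSketch {
  val labels: Vector[String] = Vector("N", "V")

  // Featurizer: features of giving word `pos` the label `label` after label `prev`.
  def features(words: IndexedSeq[String], pos: Int, prev: String, label: String): Seq[String] =
    Seq(s"word=${words(pos)},label=$label", s"prev=$prev,label=$label")

  // Weights: hand-set for the example; unknown features score 0.
  val weights: Map[String, Double] = Map(
    "word=dogs,label=N" -> 2.0,
    "word=bark,label=V" -> 2.0,
    "word=bark,label=N" -> 1.0,
    "prev=N,label=V" -> 1.0
  ).withDefaultValue(0.0)

  // Score of one labeling decision: the sum of the weights of its features.
  def score(words: IndexedSeq[String], pos: Int, prev: String, label: String): Double =
    features(words, pos, prev, label).map(weights).sum

  // Best partial labeling ending in some label; `path` is stored in reverse order.
  private case class Hyp(score: Double, path: List[String])

  // Dynamic program (Viterbi): for each position keep, per label, the best
  // labeling of the words so far that ends in that label.
  def bestSequence(words: IndexedSeq[String]): List[String] =
    if (words.isEmpty) Nil
    else {
      var best: Map[String, Hyp] =
        labels.map(l => (l, Hyp(score(words, 0, "<s>", l), List(l)))).toMap
      for (i <- 1 until words.length) {
        best = labels.map { l =>
          val candidates = labels.map { p =>
            Hyp(best(p).score + score(words, i, p, l), l :: best(p).path)
          }
          (l, candidates.maxBy(_.score))
        }.toMap
      }
      best.values.maxBy(_.score).path.reverse
    }
}
```

With these toy weights, `ModelSketch.bestSequence(Vector("dogs", "bark"))` labels "dogs" as N and "bark" as V, because the transition feature rewards a verb following a noun.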

parser?

Something that goes through the text to work out what is necessary?

Parsers produce parse trees, as below.

trees?

A representation of what words are, like noun and after that there's
a verb etc?

That and how the words are related to one another: what are the noun
phrases in a sentence, what verb has what object, etc.
http://en.wikipedia.org/wiki/Parse_tree

If you didn't know what these were going in, they will probably not be
useful to you. I'm working in the background on a format that's more
useful to laymen, but it will be some time.
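
For readers new to parse trees, a minimal sketch of the data structure may help. The `Node` and `Leaf` types and the example sentence are hypothetical and are not Epic's tree classes; the labels follow common Penn Treebank conventions (S, NP, VP, and so on).

```scala
// Hypothetical sketch of a parse tree; not Epic's tree classes.
object TreeSketch {
  sealed trait Tree
  case class Node(label: String, children: List[Tree]) extends Tree
  case class Leaf(word: String) extends Tree

  // All words under a tree, left to right.
  def words(t: Tree): List[String] = t match {
    case Leaf(w)     => List(w)
    case Node(_, cs) => cs.flatMap(words)
  }

  // Every phrase with the given label, rendered as its words.
  def phrases(t: Tree, label: String): List[String] = t match {
    case Leaf(_) => Nil
    case n @ Node(l, cs) =>
      val here = if (l == label) List(words(n).mkString(" ")) else Nil
      here ++ cs.flatMap(phrases(_, label))
  }

  // (S (NP (DT the) (NN dog)) (VP (VBD chased) (NP (DT a) (NN cat))))
  val example: Tree =
    Node("S", List(
      Node("NP", List(
        Node("DT", List(Leaf("the"))),
        Node("NN", List(Leaf("dog"))))),
      Node("VP", List(
        Node("VBD", List(Leaf("chased"))),
        Node("NP", List(
          Node("DT", List(Leaf("a"))),
          Node("NN", List(Leaf("cat")))))))))
}
```

Here `TreeSketch.phrases(TreeSketch.example, "NP")` picks out the two noun phrases, and the VP node shows which verb goes with which object.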

sequences?

Segment data to pick up if it is a set of two words or one?

There are two kinds of predictions we have under sequences: something that
assigns a label to every word (e.g. part of speech tags like noun, verb,
etc), and those that assign a label to disjoint contiguous sequences of
words (e.g. which phrases are people, places, or things.)
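
The difference between the two kinds of output can be sketched as plain data. The `Segment` type, the label names, and the run-collapsing helper below are assumptions for illustration, not Epic's API; real systems often use a scheme such as BIO so that two adjacent segments with the same label stay separate.

```scala
// Hypothetical sketch of the two output shapes; not Epic's API.
object SequenceSketch {
  // A labeled span of words; `end` is exclusive.
  case class Segment(label: String, begin: Int, end: Int)

  // Collapse runs of identical per-word labels into segments.
  // "O" marks words that are outside any segment.
  def toSegments(labels: IndexedSeq[String]): List[Segment] = {
    val out = scala.collection.mutable.ListBuffer.empty[Segment]
    var start = 0
    while (start < labels.length) {
      var end = start + 1
      while (end < labels.length && labels(end) == labels(start)) end += 1
      if (labels(start) != "O") out += Segment(labels(start), start, end)
      start = end
    }
    out.toList
  }

  // Render each segment as "LABEL: words".
  def render(words: IndexedSeq[String], segments: List[Segment]): List[String] =
    segments.map(s => s.label + ": " + words.slice(s.begin, s.end).mkString(" "))

  val words: Vector[String] = Vector("Ada", "Lovelace", "visited", "New", "York")
  // Tagging: exactly one label per word (part-of-speech tags here).
  val posTags: Vector[String] = Vector("NNP", "NNP", "VBD", "NNP", "NNP")
  // Segmentation: labels over disjoint contiguous runs of words.
  val entityLabels: Vector[String] = Vector("PER", "PER", "O", "LOC", "LOC")
}
```

Tagging keeps one label per word, while `SequenceSketch.toSegments(SequenceSketch.entityLabels)` returns one person span and one location span and skips "visited".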

MarcusSjolin commented on September 23, 2024

Thanks! That was really helpful, I think these answers were what I needed to grasp how things are connected. I now see more clearly how the process from input to output should be formed and what I can use in between. Thanks a lot!

Good going with the library as well, there seems to be a lot of work put into this.

/Marcus
