talnupf / dkpro-core

This project forked from dkpro/dkpro-core

0.0 0.0 0.0 83.61 MB

Collection of software components for natural language processing (NLP) based on the Apache UIMA framework.

Home Page: https://dkpro.github.io/dkpro-core

License: Other

Java 98.70% HTML 0.87% Groovy 0.30% Rich Text Format 0.13%

dkpro-core's Issues

Modify the UDPipe segmenter to add split contractions

The UDPipe segmenter does not annotate split contractions because of the fix from #4. Implement the usual strategy for annotating contractions in UIMA, that is, annotate the new tokens contiguously within the span of the contraction.

DoR: #4
DoD: The segmenter annotates the components of each contraction contiguously. If the components do not fit, it throws an exception.
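
The contiguous layout described above could look roughly like the following sketch. It is not the segmenter's actual code and uses no UIMA or UDPipe types: `ContractionSpans`, `Span` and `split` are hypothetical names, and the rule for "fits" (every component must receive at least one character of the contraction's span) is an assumption, since the issue does not define it.

```java
import java.util.ArrayList;
import java.util.List;

final class ContractionSpans {

    /** Half-open character span [begin, end) for one component of a contraction. */
    static final class Span {
        final int begin;
        final int end;
        final String form;

        Span(int begin, int end, String form) {
            this.begin = begin;
            this.end = end;
            this.form = form;
        }
    }

    /**
     * Lays the components out back to back inside [begin, end).
     * Assumption: "fits" means every component can receive at least one character.
     */
    static List<Span> split(int begin, int end, List<String> components) {
        int n = components.size();
        if (n == 0 || end - begin < n) {
            throw new IllegalArgumentException(
                    n + " component(s) do not fit in span [" + begin + ", " + end + ")");
        }
        List<Span> spans = new ArrayList<>();
        int cursor = begin;
        for (int i = 0; i < n; i++) {
            String form = components.get(i);
            int width;
            if (i == n - 1) {
                // The last component takes whatever is left, so the spans cover the contraction.
                width = end - cursor;
            } else {
                // Prefer the component's own length, but keep one character for each later component.
                int maxWidth = (end - cursor) - (n - 1 - i);
                width = Math.max(1, Math.min(form.length(), maxWidth));
            }
            spans.add(new Span(cursor, cursor + width, form));
            cursor += width;
        }
        return spans;
    }
}
```

For example, a contraction covering characters 10 to 13 with the components "de" and "el" yields the spans [10, 12) and [12, 13), while two components in a one-character span raise an `IllegalArgumentException`.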

Put order in our DKPro changes

Submit issues for changes to be integrated into DKPro
Create an internal version with the changes on top of a stable version while the changes are not yet integrated into DKPro
Create independent modules for changes that will not be integrated into DKPro

brat exporter

Create a brat exporter (distinct from the brat writer)
that is intended for representing data

Create the class and arrange the code so that the brat writer remains unchanged (or needs only minimal changes)
Change the parameters to allow different improvements:
- select the feature to represent
- same color for each feature
- color depending on a numerical value (between min (lighter) and max (darker))
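
One possible reading of the last item is a linear interpolation between a light and a dark color. The sketch below is only an illustration under that assumption: `ValueShading`, `shade` and the `#rrggbb` output format are hypothetical and not part of the existing brat writer.

```java
final class ValueShading {

    /**
     * Maps value in [min, max] to a "#rrggbb" string, interpolating from
     * lightRgb (at min) to darkRgb (at max). Values outside the range are clamped.
     */
    static String shade(double value, double min, double max, int lightRgb, int darkRgb) {
        if (max <= min) {
            throw new IllegalArgumentException("max must be greater than min");
        }
        double t = (value - min) / (max - min);
        t = Math.max(0.0, Math.min(1.0, t));
        int r = mix((lightRgb >> 16) & 0xFF, (darkRgb >> 16) & 0xFF, t);
        int g = mix((lightRgb >> 8) & 0xFF, (darkRgb >> 8) & 0xFF, t);
        int b = mix(lightRgb & 0xFF, darkRgb & 0xFF, t);
        return String.format("#%02x%02x%02x", r, g, b);
    }

    /** Linear interpolation of a single 0..255 color channel. */
    private static int mix(int from, int to, double t) {
        return (int) Math.round(from + (to - from) * t);
    }
}
```

With white (`0xFFFFFF`) as the light color and navy (`0x000080`) as the dark one, the minimum value maps to `#ffffff` and the maximum to `#000080`.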

Adapt UDPipe Segmenter to UDPipe 2

In UDPipe 2, the words/tokens are split into two different lists: the words list holds the single tokens, including the tokens that result from separating contractions but not the contractions themselves, which are stored in the multiwordTokens list.

The 1.12.x version of the UDPipe Segmenter does not seem to take this into account, and this seems to be causing intermittent errors (inexplicably, it sometimes seems to work). Adapt the UDPipeSegmenter to the list layout described above.

DoR: Already ready.
DoD: UDPipeSegmenter works with any contractions in the text.
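
The two-list layout described above can be recombined along the following lines. This is a sketch with stand-in types, not the UDPipe binding itself: `Word`, `MultiwordToken`, `SurfaceToken` and `merge` are hypothetical, and the `idFirst`/`idLast` range on a multiword token is assumed to follow the CoNLL-U convention (a range such as 2-3 covering words 2 and 3).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SurfaceTokenMerger {

    /** Stand-in for an entry of the words list. */
    static final class Word {
        final int id;
        final String form;
        Word(int id, String form) { this.id = id; this.form = form; }
    }

    /** Stand-in for an entry of the multiwordTokens list; covers words idFirst..idLast. */
    static final class MultiwordToken {
        final int idFirst;
        final int idLast;
        final String form;
        MultiwordToken(int idFirst, int idLast, String form) {
            this.idFirst = idFirst; this.idLast = idLast; this.form = form;
        }
    }

    /** A token as it appears in the text; components is empty for plain tokens. */
    static final class SurfaceToken {
        final String form;
        final List<String> components;
        SurfaceToken(String form, List<String> components) {
            this.form = form; this.components = components;
        }
    }

    static List<SurfaceToken> merge(List<Word> words, List<MultiwordToken> multiwordTokens) {
        // Index contractions by the id of their first component word.
        Map<Integer, MultiwordToken> byFirstId = new HashMap<>();
        for (MultiwordToken mwt : multiwordTokens) {
            byFirstId.put(mwt.idFirst, mwt);
        }
        List<SurfaceToken> result = new ArrayList<>();
        int i = 0;
        while (i < words.size()) {
            Word word = words.get(i);
            MultiwordToken mwt = byFirstId.get(word.id);
            if (mwt == null) {
                result.add(new SurfaceToken(word.form, Collections.<String>emptyList()));
                i++;
                continue;
            }
            // Consume every word covered by the contraction and attach it as a component.
            List<String> components = new ArrayList<>();
            while (i < words.size() && words.get(i).id <= mwt.idLast) {
                components.add(words.get(i).form);
                i++;
            }
            result.add(new SurfaceToken(mwt.form, components));
        }
        return result;
    }
}
```

For the words "Vengo", "de", "el", "mercado" (ids 1 to 4) and a multiword token "del" covering ids 2 to 3, the result is three surface tokens, with "del" carrying the components "de" and "el".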
