hist-pl,kawu

Website: mark words recognized by Morfeusz

Extension: button images

Images should be shown instead of the "W tył" (back) and "W przód" (forward) buttons.

Website: show tooltip definition

When a mouse hovers over a historical word, a tooltip with a definition (or an equivalent) of the word should appear.

Add entries from Linde

Add new entries on the basis of the Linde dictionary, but only those entries that can also be found in historical texts.

Problem: we should think about the potential problem of entry duplication. When adding an entry from the Linde dictionary, our dictionary may already contain an entry representing the same lexeme. We should be able to identify such a situation and merge the two entries.

Sentence-level segmentation

The system doesn't perform any sentence-level segmentation right now. The tools assume that the text is already divided into sentences.

Are we going to use SRX rules in the future?

Extension: pop-up window on top

The pop-up window in which the description of the searched phrase is shown should always stay on top of the current window. Is that possible?

Add contexts of form occurrences

After #25 is closed, we should work on adding contexts of form occurrences on the basis of the collection of historical documents.

Solution 1

We have a preliminary implementation of the BaseX API. We can use it to easily perform modifications on the LMF version of the dictionary.

Question: do we need the binary version of the dictionary here? If so, does a modification introduced in the LMF version need to be immediately visible in the binary version? If it does, this solution is completely impractical, since it is not possible to update the binary version on-the-fly (yet).

Otherwise, if there is no need to use the binary version (or at least to update it on-the-fly), we can use the BaseX-based solution.

Solution 2

There is no need to update the dictionary (either binary or LMF) on-the-fly. We can keep generated contexts in a key-value store (using e.g. http://hackage.haskell.org/package/cassy), and only at the end perform the update as a single pass over the LMF dictionary.
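A minimal sketch of this batching idea, written in Python for brevity. A plain in-memory dict stands in for the key-value store, and the entry IDs, the `contexts` field, and both function names are hypothetical, not part of hist-pl.

```python
from collections import defaultdict


def collect_contexts(occurrences):
    """Group (entry_id, context) pairs by entry ID, keeping input order."""
    store = defaultdict(list)
    for entry_id, context in occurrences:
        store[entry_id].append(context)
    return dict(store)


def apply_contexts(entries, store):
    """Single pass over the dictionary: attach collected contexts to each entry."""
    return [
        {**entry, "contexts": entry.get("contexts", []) + store.get(entry["id"], [])}
        for entry in entries
    ]


# Hypothetical data: contexts found while scanning the historical documents.
occurrences = [("e1", "ctx a"), ("e2", "ctx b"), ("e1", "ctx c")]
entries = [{"id": "e1"}, {"id": "e2"}, {"id": "e3"}]
updated = apply_contexts(entries, collect_contexts(occurrences))
```

The dictionary itself is touched only once, in `apply_contexts`, so neither the LMF nor the binary version has to support incremental updates.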

Extension: proposed functionality

The extension should provide two kinds of functionality:

Looking up an individual word
Annotating a phrase

The easiest way to do that is to provide a single page which, depending on the type of argument, presents either a description of a word or a sentence with marked historical forms. The extension can then be based on Dictionary Tooltip.

Website: disable horizontal textarea resize

Website: problem with fonts

[Testing in Firefox on Windows] In an entry description, occurrence contexts are shown in a bigger font than headers! That's strange and should be corrected.

Website: analyse contexts

It would be useful if contexts were automatically analysed. A user would then be able to look at definitions and follow links to get detailed information about individual words.

Similar: add info about installation process

Extension: localize prev/next button tooltips

Website UI: viewing long chunks of text

The current analysis UI is not very well adapted to long texts. There are several problems, among others:

Part of the text is hidden,
The Znakuj button has to be clicked multiple times,
Focus goes to the top of the page when writing something at the bottom.

Word-level segmentation ambiguities

It would be nice to implement some word-level segmentation rules. The question is how such segmentation should work given the structure of the historical dictionary. For example, "chciałabyś" is a single word in the dictionary, while in Morfeusz it consists of three segments. Perhaps, then, there won't be any ambiguities in the word-level historical segmentation?

Use span instead of inline div

Website: local links

Add support for links that modify only the form parameter of the current request.
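A rough illustration of what such a link amounts to, in Python: take the current request URL, replace only the form parameter, and keep everything else. The example URL, the `lang` parameter, and the helper name `with_form` are made up for the sketch.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_form(url, new_form, param="form"):
    """Return `url` with only the `param` query parameter replaced."""
    parts = urlsplit(url)
    # Keep every other parameter as it is; drop the old value of `param`.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, new_form))
    return urlunsplit(parts._replace(query=urlencode(query)))


link = with_form("http://example.org/lex?lang=pl&form=abc", "xyz")
```

Note that the replaced parameter is moved to the end of the query string, which is harmless for ordinary query parsing.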

Website/extension: add transcription functionality

Extension: add description

The extension should have a more informative description (which is shown, e.g., when installing the extension in Firefox).

Improve presentation of the analysed text

In particular, due to the HTML formatting, spaces are not shown properly right now.

Build (doc path <-> doc ID) correspondence

In order to make references (in the form of the @SourceID attribute) we need a concise way of identifying documents. That's why we need a file with the (doc path <-> doc ID) correspondence, which will allow us to use IDs as references.
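One possible shape for this correspondence, sketched in Python. The numbering scheme (sorted paths, IDs starting at 1) and the example paths are assumptions made for illustration; the issue does not specify how IDs are assigned.

```python
def build_correspondence(paths):
    """Assign an ID to each distinct document path and return both directions."""
    # Sorting makes the assignment reproducible for the same set of paths.
    path_to_id = {path: i for i, path in enumerate(sorted(set(paths)), start=1)}
    id_to_path = {doc_id: path for path, doc_id in path_to_id.items()}
    return path_to_id, id_to_path


# Hypothetical document paths.
path_to_id, id_to_path = build_correspondence(
    ["texts/b.xml", "texts/a.xml", "texts/b.xml"]
)
```

With IDs derived from sorted paths, adding a new document can renumber existing ones, so a real correspondence file would more likely be append-only.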

Binary dictionary: missing element types

LMF element types missing in the binary representation of the dictionary:

Equivalent (no such entries in LMF version at the moment)
Statement (no such entries in LMF version at the moment)
Sense Relation (no such entries in LMF version at the moment)

Lexicon: uncover lower-level IO exceptions

Low-level IO exceptions should be accessible to the user. Right now, only generalized descriptions of individual exceptions are shown, for example:

load: failed to open entry with the Key {path = "z", uid = 1} key

Website: handle long chunks of text

The website cannot handle long chunks of text, most likely due to lazy IO -- the program opens too many file handles or something like that. The web handler shows, for example:

A web handler threw an exception. Details:
user error (load: failed to open entry with the Key {path = "z", uid = 1} key)

LMF dictionary: strange word forms

It is not directly related to the binary dictionary, but the problem is conspicuous: there are many word forms in the LMF version of the dictionary which look like this:

<feat att="writtenForm" val="potrzebno&amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;gt;"/>
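The value above looks like an entity that has been escaped many times over. Assuming that is what happened, a small Python sketch for recovering the intended form is to unescape until the text stops changing. The helper name and the pass limit are arbitrary choices for the sketch.

```python
import html


def unescape_fully(text, max_passes=100):
    """Apply HTML/XML entity unescaping until a fixed point is reached."""
    for _ in range(max_passes):
        decoded = html.unescape(text)
        if decoded == text:
            return text
        text = decoded
    return text


# Shortened version of the value from the dictionary.
form = unescape_fully("potrzebno&amp;amp;amp;amp;amp;gt;")
```

This also collapses forms that were legitimately escaped twice, so it is better suited to inspecting the damage than to an unattended clean-up.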

Extension: doesn't show definitions

The title attribute doesn't work in the embedded browser, apparently.

Website: href encoding

While there should be no forms with special characters (e.g. &), the site should support them anyway.
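For illustration, a Python sketch of the two encodings involved: percent-encoding the form inside the URL, then HTML-escaping both the attribute value and the link text. The `/lex` path, the `form` parameter name, and the function name are assumptions.

```python
from html import escape
from urllib.parse import quote


def form_link(base_url, form):
    """Build an <a> element whose href carries `form` safely as a query value."""
    # safe="" makes quote() encode every reserved character, including "&" and "/".
    href = f"{base_url}?form={quote(form, safe='')}"
    return f'<a href="{escape(href, quote=True)}">{escape(form)}</a>'


anchor = form_link("/lex", "a&b")
```

Here `anchor` is `<a href="/lex?form=a%26b">a&amp;b</a>`: the ampersand is percent-encoded inside the URL and entity-escaped in the visible text.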

Poser: doesn't write XML header in the output file

Extension: add configuration

The user should be able to change the address of the web service.

Handle capitalization when looking up a form

Extension: add icon with link to configuration

Website: improve entry presentation

Add custom subpage for lexical entry presentation.

Website: add page with dictionary index

One of the website pages should provide an alphabetical index that allows viewing the entire dictionary entry by entry.

Binary dictionary: lookup by ID

It should be possible to look up an entry by its identifier (stored as the id attribute of the LexicalEntry element). The reason: we want to be able to follow pointers (which have the form of identifiers) occurring in some dictionary elements, e.g. in Related Forms.
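The idea can be sketched in Python as an index from identifier to entry, plus a helper that follows pointers. Entries are modelled as plain dicts, and the `related` field and the function names are hypothetical stand-ins for the real LMF structures.

```python
def build_id_index(entries):
    """Map each entry's id to the entry; duplicate ids are reported as errors."""
    index = {}
    for entry in entries:
        if entry["id"] in index:
            raise ValueError(f"duplicate entry id: {entry['id']}")
        index[entry["id"]] = entry
    return index


def resolve_related(entry, index):
    """Follow the entry's pointers, skipping ids that are not in the index."""
    return [index[i] for i in entry.get("related", []) if i in index]


# Hypothetical entries: e1 points at e2 and at a missing entry.
entries = [
    {"id": "e1", "lemma": "a", "related": ["e2", "missing"]},
    {"id": "e2", "lemma": "b"},
]
index = build_id_index(entries)
related = resolve_related(entries[0], index)
```

Skipping dangling pointers silently is a choice made for the sketch; reporting them may be more useful when validating the dictionary.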