The korp-frontend from spraakbanken

Comparisons use the wrong saved searches

Save several different searches (more than two).
Do a comparison between two of them.

Result: If you're unlucky it won't use the two you selected.

Search words are not highlighted in KWIC when "in order" is disabled

In current dev version:
https://spraakbanken.gu.se/korplabb/#?stats_reduce=word&cqp=%5B%5D&corpus=romi,romii&in_order=false&lang=en&page=0&search=word%7Ckatt

Escaping of quotes should be done by doubling them

Currently quotes are escaped by prefixing them with backslash. This doesn't always work, and the following query will lead to a crash:

[word = "\""] [word = "och"]

Instead escaping should be done by doubling the quote characters:

[word = """"] [word = "och"]
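A minimal sketch of the proposed escaping. The helper names `quoteCqpValue` and `wordToken` are hypothetical and not taken from the Korp code base. Only quote characters are handled here; escaping of regular expression characters is left out.

```typescript
// Hypothetical helper, not taken from the Korp code base.
// Wraps a raw value in double quotes and escapes embedded quotes by doubling them.
function quoteCqpValue(value: string): string {
    return '"' + value.replace(/"/g, '""') + '"';
}

// Builds a single token expression such as [word = "och"].
function wordToken(value: string): string {
    return `[word = ${quoteCqpValue(value)}]`;
}

const escapedQuery = [wordToken('"'), wordToken("och")].join(" ");
console.log(escapedQuery); // [word = """"] [word = "och"]
```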

Trend graph width wrong if opened in background tab

The trend graph gets a default width of 400px if opened in a background tab (a tab within the Korp interface, not a background browser tab).
This only occurs if the tab is inactive at the time the graph loads.

Trend diagram shows "no data" for periods with no hits

Currently the trend diagram greys out periods that are covered by the selected corpora but have no hits, and shows "we have no data for this period" for them.

The problem seems to be that Korp omits corpora with no hits from the trend diagram query.

Example

GP 2001 has no hits for this query. Open the trend diagram and note that 2001 is greyed out due to GP2001 not being part of the parameters to the backend.

This is probably fixed by simply including all corpora in the query.

Map can be opened with no lines selected

You can open a map tab even when there are no lines selected in the statistics table. This should not be possible.

Auto-selecting a new KWIC tab broken sometimes

For example: clicking a link in statistics to open a new KWIC.

Before closing any tabs, everything works as expected. After closing a tab and then opening a new tab, the new tab is not selected and the content of the current tab is replaced with nothing.

Trend diagram x-axis labels are incorrect for 1900 and earlier

The label 1900 is missing completely, and in its place it says 1890. For 1890 it says 1880, and so on. Everything before the year 1910 is offset by 10 years.

This is unfortunately a bug in the library we use to draw the diagram: shutterstock/rickshaw#606

Port code from coffeescript to javascript

This is the first step towards modernising tooling, with the goal of ending up using TypeScript to improve refactoring support.

Select first keyword in KWIC after searching or switching page

Always select the first keyword in the KWIC after doing a search or navigating between KWIC pages. Currently this is only done for the first search.

Wrong dependency head when Swedish is the parallel language

There is a problem with the highlighting of dependency head, when the Swedish corpus is the second language. E.g. here, the word "sågspån" has highlighted the word "ser" as head, but it should be "spillt":

This is not a problem when Swedish is the first language. Here is the same example, and "spillt" is correctly marked as the head:

I've tested on the corpus "ASPAC svenska-engelska", but the problem is everywhere in that corpus, so I don't think it's a corpus problem but a Korp bug.

Case sensitivity preserved after changing annotation

In extended search, select a case insensitive word search.
Switch to lemgram instead.
The case sensitivity switch disappears but the search is still case insensitive. Use the advanced tab to confirm.

Date format in trend diagram popup should reflect date granularity

The date format of the popup should reflect the current granularity. For example, when the granularity is set to "month", the popup should read "February 2018", not "2018-02-01 00:00:00".
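One way this could be done is with `Intl.DateTimeFormat`, picking the format options from the granularity. This is only a sketch: the granularity names and the function `formatForGranularity` are assumptions, and Korp's actual values may differ.

```typescript
// Granularity names are assumptions; Korp's actual values may differ.
type Granularity = "year" | "month" | "day";

const formatOptions: Record<Granularity, Intl.DateTimeFormatOptions> = {
    year: { year: "numeric" },
    month: { year: "numeric", month: "long" },
    day: { year: "numeric", month: "long", day: "numeric" },
};

// Formats a point in time with only as much detail as the granularity carries.
function formatForGranularity(date: Date, granularity: Granularity, locale: string = "en-US"): string {
    const options: Intl.DateTimeFormatOptions = { ...formatOptions[granularity], timeZone: "UTC" };
    return new Intl.DateTimeFormat(locale, options).format(date);
}

const february = new Date(Date.UTC(2018, 1, 1));
console.log(formatForGranularity(february, "month")); // February 2018
```

Passing the current UI language as `locale` would also localize the month name.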

Word links in parallel corpora don't work

When selecting a word in the KWIC for parallel corpora with word linking, the corresponding word(s) in the other language should be highlighted. This has stopped working.

Example

KWIC download is broken in parallel mode

Related words function only works sometimes

It seems like the related words always pop up for this specific search:

https://spraakbanken.gu.se/korp/#?stats_reduce=word&cqp=%5B%5D&corpus=suc3&search=lemgram%7Chund%5C.%5C.nn%5C.1&page=0

But not for the standard selection of corpora:

https://spraakbanken.gu.se/korp/#?stats_reduce=word&cqp=[]&page=0&search=lemgram|hund\.\.nn\.1

The search must be a lemgram search for this to work. Also check that it works regardless of login status.

Add heatmap to the map view

A heatmap mode would be very useful in our map view.

There are several plugins available for this:
https://leafletjs.com/plugins.html#heatmaps

Stats download: add info about the search to output TSV or CSV

The order in the KWIC is off

Context view doesn't highlight sentences containing hits

The context view should (as it previously did) show sentences containing the keywords (i.e. the sentences from the KWIC) in black and grey out the rest. Currently everything is greyed out except for the actual keywords.

Extra data column in KWIC for sentence structural attributes

JSON button in statistics tab doesn't update for new searches

Perform a search.
Go to the statistics tab.
Perform a new (different) search.
Click the JSON-button.
Result: You get the JSON for your previous search instead of the current one.

Lemgram suggestions temporarily broken after searching

Complete a simple search.
Start typing something new in the search field. Lemgram suggestions do not work! 😞

Clicking anywhere else and then refocusing the input field enables lemgram suggestions again.

Downloading from secondary KWIC tabs is broken

Downloading results is broken on the secondary KWIC tabs you can open from word pictures etc.

Trend diagram table export button does not work in Firefox

Pressing the “Export” button in the trend diagram table view seems to have no effect when using Firefox (version 66.0.5 on Linux, reportedly also versions on Windows and Mac). It works correctly in Chrome and reportedly Safari.

This bug affects at least Korp 7.0.0 at Språkbanken and Korp 5.0.10 at the Language Bank of Finland. I haven’t tested whether it has been fixed in the development version.

(The bug was first reported by Tommi Jauhiainen of FIN-CLARIN.)

Add relative hits to map view

Currently we only use absolute hits, meaning that places from which we have a lot of material become over-represented in the map view, with big circles even when the search word is relatively unusual there. We should let the user switch between absolute and relative hits in the map view, and use the relative_to_struct parameter to get the relative frequencies from the backend. The relative view should possibly be the default one.

Example:
/count?...&group_by_struct=text__geoauthorhome&relative_to_struct=text__geoauthorhome
The relative numbers in this result are different from the same query without relative_to_struct=text__geoauthorhome.
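A rough sketch of how the frontend could toggle between the two modes when building the /count request. The function name `withMapFrequencyParams` is hypothetical, and the remaining request parameters (elided above) are simply passed through unchanged.

```typescript
// Hypothetical helper: adds or removes relative_to_struct depending on the chosen mode.
// The other /count parameters are assumed to be prepared elsewhere and are copied as-is.
function withMapFrequencyParams(base: URLSearchParams, struct: string, relative: boolean): URLSearchParams {
    const params = new URLSearchParams(base.toString());
    params.set("group_by_struct", struct);
    if (relative) {
        params.set("relative_to_struct", struct);
    } else {
        params.delete("relative_to_struct");
    }
    return params;
}

const relativeParams = withMapFrequencyParams(new URLSearchParams(), "text__geoauthorhome", true);
console.log(relativeParams.toString());
// group_by_struct=text__geoauthorhome&relative_to_struct=text__geoauthorhome
```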

Alphabetic sorting of statistics columns

Currently they are sorted by internal corpus names, making "Norstedtsromaner" come before "Bonniersromaner". This is confusing to the user. Alphabetic sorting based on the display names would be better.
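A minimal sketch of sorting by display name instead of internal name. The `CorpusColumn` shape and the ids used below are assumptions for illustration only.

```typescript
// Assumed shape of a statistics column; not taken from the Korp code base.
interface CorpusColumn {
    id: string;    // internal corpus name
    title: string; // display name shown to the user
}

// Returns a new array sorted by display name, using locale-aware comparison.
function sortByTitle(columns: CorpusColumn[], locale: string = "sv"): CorpusColumn[] {
    return columns.slice().sort((a, b) => a.title.localeCompare(b.title, locale));
}

const statsColumns: CorpusColumn[] = [
    { id: "romi", title: "Norstedtsromaner" }, // ids assumed for illustration
    { id: "romii", title: "Bonniersromaner" },
];
console.log(sortByTitle(statsColumns).map((c) => c.title).join(", "));
// Bonniersromaner, Norstedtsromaner
```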

Remove `command`-argument to backend

Remove the `command` argument to the backend since it is no longer needed.

Add support for "not in order" in extended search

Currently only available in simple search.

Add text length data to ASU corpus

In ASU it is important to be able to get information about text lengths, since it is central to be able to compare frequencies in text units of varying length, and therefore to be able to compute relative values based on the number of words in selected parts of the texts. Here the number of actual words is a more relevant measure than Korp's token count, which also includes punctuation. The transcription in ASU has many markers for various syntactic delimiters, pauses, pause fillers and code-switching markers, and these vary in number between text units. Including them in the text length can be considerably misleading, not least in comparisons across stages of learning, which is often relevant in ASU. There is therefore a need to get the number of real words in the texts.

Lemgram and sense links in sidebar only work once

Perform a search.
Click on a lemgram or sense in the sidebar to search for it.
Try clicking on another lemgram or sense.

Result: Nothing happens.

"Go to page" input field doesn't work on word picture KWICs

Open a KWIC from a word picture.
Try using the "Go to page" input field to navigate to another page.

Result: Page indicator updates, but page doesn't actually change.

Time graph in corpus selector breaks when no corpus has time data

Example: https://spraakbanken.gu.se/korplabb/?mode=siberian_german
Opening the corpus selector results in an error in the console.

Statistics columns can not be resized, making content unreadable

When enough corpora are selected it becomes impossible to resize the columns of the statistics table, rendering the table useless in many cases as you can't read the content.

Example search where you can't see the whole content for the "word" column.

KWIC downloading broken

KWIC downloading is partially broken.

Downloading KWIC from the context view results in an empty file.
Downloading KWIC when "in order" is disabled also results in an empty file, or, when selecting "one token per row", a crash.
TSV uses spaces instead of tabs.

Export all KWIC hits to CSV/TSV, not just the current page

Dependency tree is broken

Opening the dependency tree just yields an empty popup. No errors in log.

Overflow/scroll is broken for "About Korp" modal

Open "About Korp" window.
Change height of browser window to something smaller than the contents of the window.
😱

Folder name in corpus URL parameter broken

E.g. https://spraakbanken.gu.se/korp/#?corpus=fisk should select the "Finlandssvenska texter" group, but currently results in a broken GUI.

Hovering over lines in trend diagram is partially broken

Hovering over a line shows a popup with details. When more than one line is visible, you have to hover far above the lines to get the popup to show.

Short KWIC rows cut off corpus names

Corpus name headers in the KWIC get cut off when all the rows are short:

https://spraakbanken.gu.se/korplabb/#?stats_reduce=word&cqp=%5Bword%20%3D%20%22och%22%20%26%20lbound(sentence)%20%26%20rbound(sentence)%5D&corpus=drama&search_tab=1&search=cqp

Also when the hit is the first word of the sentence:

https://spraakbanken.gu.se/korp/#?lang=en&stats_reduce=word&cqp=%5Bword%20%3D%20%22Han%22%20%26%20lbound(sentence)%5D&corpus=romi&search_tab=1&search=cqp

Search history is broken

Selecting an old search from the list does nothing.
The code seems to be looking for "http://", but the world is using https://.
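A possible fix is to parse the stored link instead of matching on a fixed scheme. The sketch below assumes that history entries are stored as full URLs with the search state after `#?`; the function name `searchStateFromUrl` is hypothetical.

```typescript
// Returns the search state after "#?" regardless of whether the stored link uses http or https.
// Assumes history entries are stored as complete URLs.
function searchStateFromUrl(storedUrl: string): URLSearchParams {
    const hash = new URL(storedUrl).hash; // e.g. "#?corpus=suc3&search=word%7Ckatt"
    return new URLSearchParams(hash.replace(/^#\??/, ""));
}

const oldEntry = searchStateFromUrl("http://spraakbanken.gu.se/korp/#?corpus=suc3&search=word%7Ckatt");
const newEntry = searchStateFromUrl("https://spraakbanken.gu.se/korp/#?corpus=suc3&search=word%7Ckatt");
console.log(oldEntry.get("search"), newEntry.get("search")); // word|katt word|katt
```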

Linking to corpus folders does not work

corpus=xyz is supposed to select all corpora under the folder with the id xyz. This does not currently work.
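The intended behaviour could be sketched as follows. The `Folder` shape, the folder contents and the function names are assumptions for illustration, and only top-level folder ids are looked up here.

```typescript
// Assumed folder structure; not taken from the Korp configuration format.
interface Folder {
    corpora: string[];
    subfolders?: Record<string, Folder>;
}

// Collects all corpus ids in a folder, including those in nested subfolders.
function corporaInFolder(folder: Folder): string[] {
    const result = folder.corpora.slice();
    const subfolders = folder.subfolders || {};
    for (const name of Object.keys(subfolders)) {
        result.push(...corporaInFolder(subfolders[name]));
    }
    return result;
}

// Expands folder ids in a comma-separated corpus parameter; other ids are kept as corpus ids.
function expandCorpusParam(param: string, folders: Record<string, Folder>): string[] {
    const ids: string[] = [];
    for (const id of param.split(",")) {
        const expanded = folders.hasOwnProperty(id) ? corporaInFolder(folders[id]) : [id];
        for (const corpus of expanded) {
            if (ids.indexOf(corpus) === -1) ids.push(corpus); // skip duplicates
        }
    }
    return ids;
}

// Illustrative data only.
const exampleFolders: Record<string, Folder> = {
    xyz: { corpora: ["a"], subfolders: { nested: { corpora: ["b", "c"] } } },
};
console.log(expandCorpusParam("xyz,suc3", exampleFolders)); // [ 'a', 'b', 'c', 'suc3' ]
```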

"Compile based on" empty after changing corpus

Select a corpus with text attributes.
"Compile based on" one of these text attributes.
Select a new corpus that does not have this attribute.
Deselect the first corpus.
Perform a search.

Result: "Compile based on" is empty, and the statistics tab crashes.

Menu opens outside of window

This applies to the current dev branch.

Open the menu in the top right.
Close it.
Change the size of the browser window.
Open the menu again.

Result: Menu is positioned outside of the browser window.

Trend diagram table representation: add links to KWIC

It would be useful if you could get the KWIC of the hits for a certain period of time also from the table representation of the trend diagram, as well as from the graph representations. Each cell of the table would then need to be (or contain) a link to a search for the hits for a value (row) in a period of time (column).

This was wished for by a user of the Korp at the Language Bank of Finland. He found it difficult to choose an exact period of time in the graph representations.

Extra data column in KWIC for match word attribute

ASU corpus use case:

It would be valuable if the tag of each hit word were shown in a column in the concordance (as in ITG). This is useful when the concordance contains hit words with different tags.

Secondary KWIC broken in parallel mode

Opening KWIC from statistics does not work.

The corpus parameter to the backend is set to:
EUROPARL-EN,ASPACSVEN-EN
instead of the correct:
EUROPARL-EN|EUROPARL-SV,ASPACSVEN-EN|ASPACSVEN-SV
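A sketch of how the parameter could be assembled, assuming the frontend knows which corpora each selected corpus is linked to. The `linkedCorpora` table and the function name `parallelCorpusParam` are hypothetical.

```typescript
// Maps each corpus to the corpora it is linked to. Contents are illustrative.
const linkedCorpora: Record<string, string[]> = {
    "EUROPARL-EN": ["EUROPARL-SV"],
    "ASPACSVEN-EN": ["ASPACSVEN-SV"],
};

// Joins each selected corpus with its linked corpora using "|", and the groups using ",".
function parallelCorpusParam(selected: string[], links: Record<string, string[]>): string {
    return selected
        .map((corpus) => [corpus].concat(links[corpus] || []).join("|"))
        .join(",");
}

console.log(parallelCorpusParam(["EUROPARL-EN", "ASPACSVEN-EN"], linkedCorpora));
// EUROPARL-EN|EUROPARL-SV,ASPACSVEN-EN|ASPACSVEN-SV
```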

JSON button doesn't properly URL encode CQP queries

Perform the following search: [word = "national.*" & word != "nationaliteter" %c]
Click on the JSON button.

Result: You get an error. The query string can't be parsed due to the CQP query not being properly encoded.
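The `&` and `%` characters in this query are what break the query string. A minimal sketch of encoding each parameter with `encodeURIComponent` before building the link; the base URL and the function name `buildQueryUrl` are placeholders, not the real backend address or Korp code.

```typescript
// Builds a URL where every key and value is percent-encoded.
// The base URL passed in below is a placeholder.
function buildQueryUrl(baseUrl: string, params: Record<string, string>): string {
    const query = Object.keys(params)
        .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(params[key])}`)
        .join("&");
    return `${baseUrl}?${query}`;
}

const cqp = '[word = "national.*" & word != "nationaliteter" %c]';
const jsonUrl = buildQueryUrl("https://example.org/query", { cqp: cqp });

// The encoded URL decodes back to the original query.
console.log(new URL(jsonUrl).searchParams.get("cqp") === cqp); // true
```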

Trend diagram table export options not correctly localized initially

In the trend diagram table export, the values in the selection lists for the type of frequencies and file format are not correctly localized initially: they always show Swedish texts at first, regardless of the UI language selection:

Changing the UI language while the trend diagram tab is open resets the values to correctly localized ones. And in fact, even the Swedish texts shown after changing the UI language are different from the values shown at first: the texts are at first “Relativa tal” and “CSV (kommaseparerade värden)”, but after changing the UI language, they become “Relativa frekvenser” and ”CSV (semikolonseparerade värden)”:

The initial, non-localized texts seem to be shown always after opening a new trend diagram tab: changing the UI language while a tab is being shown does not affect tabs opened after the language change.

This bug affects at least Korp 7.0.0 at Språkbanken and Korp 5.0.10 at the Language Bank of Finland. I haven’t tested if it has been fixed in the development version.

(The bug was first reported by Tommi Jauhiainen of FIN-CLARIN.)
