ConedaKOR – store.manage.retrieve.

License: GNU Affero General Public License v3.0

Ruby 19.13% JavaScript 69.82% CoffeeScript 0.32% CSS 5.76% Shell 0.39% HTML 0.65% Gherkin 3.19% Python 0.14% SCSS 0.56% EJS 0.02% Dockerfile 0.02%

digital-humanities graph-database digital-asset-management research art-history

kor's Introduction

ConedaKOR

ConedaKOR is a web-based application which allows you to store arbitrary documents and interconnect them with relationships. You can build huge semantic networks for an unlimited number of domains. It integrates a sophisticated ontology management tool with an easy-to-use media database.

Features

Instead of filling countless lists with your metadata, shape it as entities within a graph ... never repeat yourself!
Add relationships between your entities
A carefully designed user interface
Upload any kind of media (pictures, video, spreadsheets, …), also many at a time
Images and videos are automatically converted for playback on the web
Define which kind of entities can be related by what relations
Put your entities in one or many groups and share them with other users
A fine-grained permission system with user groups and entity collections
Easy extension of the schema for every kind of entity: Add fields for all entities of a specific kind or occasionally add data to arbitrary entities
Tagging with autocomplete and sensible permissions
Full text search through all your metadata
A rich API facilitating additional frontends and data harvesting
Excel import and export
Deliver one-click zip downloads to your users
Identify isolated entities
Merge entities to further normalize your data
External authentication (for example LDAP) by simple shell scripts
Easy identifier management and resolution
Many configurable aspects (welcome page, terms of use, help, primary relations, brand, …)
Access data via an OAI-PMH interface
Vagrant dev environment
Good unit and integration test coverage
Checked for security problems with Brakeman
Support for using Erlangen CRM and similar standards as the basis for your ontology (including a convenient OWL import tool)

Documentation

See the documentation overview for help with usage, installation and development.

Changelog

We keep it updated at CHANGELOG.md

License

See file COPYING

kor's Issues

OAI PMH Identify: export repository uuid

it would be useful if OAI PMH's Identify request would not only return the repository's name or its URL but also a uuid for this repository

Overhaul configuration system

This is now a mess of several config files being managed partly by ops directly and partly via the web interface. We should:

have a configs table within the db
make every configuration option a record with a key, a value and a default value

When a fully JavaScript-based frontend is ready, we can even tie in the database configuration via a small dedicated Rack app. This could give a PHP-style experience where you can often set the database details directly via the web interface.
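
A minimal sketch of the key/value/default idea, using an in-memory stand-in instead of a real database table. The names `ConfigRecord`, `ConfigStore` and the two example options are hypothetical and only illustrate the fallback to the default value:

```ruby
# Hypothetical in-memory stand-in for a "configs" table: each record has
# a key, a value and a default value. All names are illustrative only.
ConfigRecord = Struct.new(:key, :value, :default)

class ConfigStore
  def initialize(records)
    @records = records.to_h { |record| [record.key, record] }
  end

  # Return the stored value, falling back to the default when unset
  def get(key)
    record = @records.fetch(key) { raise KeyError, "unknown option: #{key}" }
    record.value.nil? ? record.default : record.value
  end

  # Overwrite the value of an existing option
  def set(key, value)
    @records.fetch(key).value = value
  end
end

store = ConfigStore.new([
  ConfigRecord.new('max_upload_mb', nil, 5),
  ConfigRecord.new('welcome_title', 'Hello', 'Welcome')
])
```

Keeping the default next to the value means an option can be reset by clearing its value, without the application needing a second source of defaults.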

Entity type not considered in new inline relationship functionality

starting point: The user is on a specific entity and is trying to add a new relationship with the new functionality in v2.0.
steps taken: The user selects a possible relationship from the dropdown menu and starts typing the target entity. Results are being shown.
problem: The results displayed don't take into account the possible target entity types, although each relationship has clearly defined entity types on both ends.

enhancing new relationship functionality by dynamically filtering relationship types

starting point: The user is on a specific entity and is trying to add a new relationship with the new functionality in v2.0.
steps taken: The user selects a target entity from one of the three possible tabs (search, last visited, last created), before choosing a relationship type.
problem: Now the dropdown menu is still showing all relationships and is not dynamically reducing them to the ones possible for the two entities chosen.
reasoning: this addition would help databases with large amounts of relationship types, as the irrelevant and unusable relationship types would not even be shown as an option.
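
The filtering could look roughly like this. It is a sketch under the assumption that every relationship type lists the entity kinds allowed on each end; `RelationType`, `relation_types_for` and the sample data are hypothetical and not taken from KOR's code:

```ruby
# Hypothetical model: a relationship type names the kinds allowed on each end
RelationType = Struct.new(:name, :from_kinds, :to_kinds)

# Keep only the relationship types usable between the two chosen kinds
def relation_types_for(types, from_kind, to_kind)
  types.select do |type|
    type.from_kinds.include?(from_kind) && type.to_kinds.include?(to_kind)
  end
end

types = [
  RelationType.new('has created', ['person'], ['work']),
  RelationType.new('is located in', ['work', 'institution'], ['city'])
]

# Only 'has created' remains for a person related to a work
usable = relation_types_for(types, 'person', 'work')
```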

are all relationships provided?

there are 528337 relationship records in the provider's database (Frankfurt)
harvesting from the relationship repo only gets me 264577 records + 4078 invalid records
the invalid records are due to missing from or to entities, which is normal given that not all entities could be imported previously either
so there are still 259682 records missing
do you have an idea why the harvest is not complete?

GetRecord response 403 Forbidden

trying to access OAI-PMH interface with verb GetRecord triggers a 403 Forbidden response even though logged in

Write installer to allow basic configuration via web interface

This could be done by modifying config.ru à la:

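# Application root, resolved relative to this config.ru
root = File.expand_path('..', __FILE__)

# Placeholder setup app; a real installer would render a configuration form
setup = proc do |env|
  [200, {'content-type' => 'text/html'}, ["setup"]]
end

wrapper = proc do |env|
  if File.exist?("#{root}/config/database.yml")
    # Load Rails lazily, only once the database configuration is present
    require "#{root}/config/environment" unless defined?(Kor::Application)
    Kor::Application.call(env)
  else
    setup.call(env)
  end
end

run wrapper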

Uploaded images are not converted

They are converted only after editing and saving the entity again.

It seems that this is only happening when using the multi file upload. Currently, files are transmitted with a content type "application/octet-stream" which leads paperclip to believe that it shouldn't be doing any image-related processing.
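
One possible workaround, sketched here without any paperclip API, is to replace the generic content type with one guessed from the file name before processing. The helper `effective_content_type` and the extension table are hypothetical:

```ruby
# Illustrative mapping only; a real application would probably inspect the
# file contents rather than trust the extension.
EXTENSION_TYPES = {
  '.jpg' => 'image/jpeg', '.jpeg' => 'image/jpeg',
  '.png' => 'image/png', '.gif' => 'image/gif'
}.freeze

# Replace a generic content type with one guessed from the file name;
# any other reported type is passed through unchanged.
def effective_content_type(filename, reported_type)
  return reported_type unless reported_type == 'application/octet-stream'

  EXTENSION_TYPES.fetch(File.extname(filename).downcase, reported_type)
end
```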

local ids of kinds not exported via oaipmh

in order to map the local from and to ids of relations, the local id of kinds is necessary. These are not exported via oaipmh's ListRecords

Add easy editor for relationships

Now, the workflow is to "mark" an entity, then navigate to another and then create the link between them. This should be done by an inline editor on the entity page.

are all entity records provided?

I harvest only 129494 (+290 invalid) entity records of 261504 (Frankfurt database)
in particular I don't harvest any medium entities (130906 of 261504)
are all medium entities provided via ListRecords of entity repo? There are 130906 medium entities and 814 non-medium entities missing.
with a smaller database (easydb dfk database) all records are harvested as expected.
Do you have an idea why the harvest from the big database is incomplete?

remove the session panel

afaik version 2.0 offers only the new way to link entities (btw: a smart function, but a little confusing in terms of UI and usability). anyway, why do we still need the "session" section?

Some links are rewritten the wrong way

There is JavaScript in place that should change old /entities/1234 links to the newer /blaze#/entities/1234 version. However, it changes them to /blaze/1234
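
The intended rule, written in Ruby purely for illustration (the actual code is JavaScript and is not shown here). This assumes the new form is /blaze#/entities/1234 and that only paths consisting of /entities/ plus a numeric id count as old links; `rewrite_entity_link` is a hypothetical name:

```ruby
# Rewrite legacy entity paths to the hash-based form; any other path is
# returned unchanged. The anchored pattern is an assumption about which
# links count as "old".
def rewrite_entity_link(path)
  path.sub(%r{\A/entities/(\d+)\z}, '/blaze#/entities/\1')
end
```

The point is that the whole `/entities/<id>` segment has to be carried over behind the `#`, not just the id.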

description of kinds is not exported

cf. title

selection by kind doesn't work on the clipboard

OAI PMH repo for relations

currently information concerning relations can only be derived via instance data represented by relationships; it would be nice to be able to request relations directly, so that one could deal more easily with changes at the graph's schema level

collections have no uuid

how shall they be differentiated when there are collections with the same name in different instances?

More user friendly management of background jobs

A few improvements to the admin interface:

the web interface should display a list of pending/running jobs
the web interface should guess if a background-job is running (some running jobs or no jobs at all)
KOR should be configurable to either manage the processes by hand (to satisfy more specific scenarios as having the processes running on another host, etc) or to manage them by web interface (default): start, stop, amount, etc
a default process should be started in an initializer if there is none

Spawning and detaching processes in ruby:

pid = Process.spawn('bin/delayed_job -n 2 run')
Process.detach pid

There should also be a public read-only action that allows querying the status of background jobs (pending/running or not)
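
A rough sketch of what such a status action could compute, including the guess described above (a worker is assumed to be running if some jobs are running or there are no jobs at all). The `Job` struct with a `locked_at` timestamp is a hypothetical stand-in for the real job records:

```ruby
# Hypothetical job records: locked_at is set while a worker processes a job
Job = Struct.new(:id, :locked_at)

# Summarize the queue for a read-only status action
def job_status(jobs)
  running = jobs.count { |job| !job.locked_at.nil? }
  {
    pending: jobs.size - running,
    running: running,
    # Guess: a worker is up if something is running or nothing is queued
    worker_likely_up: running > 0 || jobs.empty?
  }
end
```

With only pending jobs and nothing running, `worker_likely_up` is false, which is the case the admin interface would want to flag.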

OAI PMH repo for Kinds

currently information concerning kinds can only be derived via instance data represented by entities; it would be nice to be able to request kinds directly, so that one could deal more easily with changes at the schema level

Entities are not pushed to elasticsearch when they are the result of a merge

provide an additional metadata format

We are now trying to embed a KOR format within DC. This is very ugly so we will add a second metadata format in the OAI-PMH responses. The DC version will just contain very basic information (title, uuid, type, etc).

In „new entries“ a (work-)title is displayed above a medium

a) Why?
b) It isn't linked

Overhaul error handling

Errors result in an entry within the database that can later be inspected. This is generally good but the catching code has to throw more meaningful and consistent error messages.

collection id missing in GetList response

the records in the response to an OAI-PMH GetList request don't contain the record's collection id

The upload size is supposedly limited to 5 MB, but only 0.5 MB is allowed and the error message is misleading

Saving configuration leads to an UndefinedConversionError

Use localstorage for clipboard and ditch activerecord-session_store

The actual entity is not visible in the section „session“.

entity validation errors when working with Frankfurt database

there are entity datasets in the Frankfurt database that are not valid according to the current validation criteria. Errors are mostly due to already taken names or invalid distinct_names, but not exclusively. If you need more detailed information let me know. I logged the errors and corresponding entities.
consecutive spaces in name and distinct_name fields are already taken care of; these validation errors do not appear in my logged errors file

reset-admin-account

kor subcommand reset-admin-account doesn't work

oaipmh xml doc not valid

I get the following errors when validating the ListRecords-XML doc for repo .../api/oai_pmh/kinds.xml against the OAI-PMH schema:

Element '{http://www.openarchives.org/OAI/2.0/}request', attribute 'verb': [facet 'enumeration'] The value 'ListRecord' is not an element of the set {'Identify', 'ListMetadataFormats', 'ListSets', 'GetRecord', 'ListIdentifiers', 'ListRecords'}.
Element '{http://www.openarchives.org/OAI/2.0/}request', attribute 'verb': 'ListRecord' is not a valid value of the atomic type '{http://www.openarchives.org/OAI/2.0/}verbType'.
Element '{http://www.openarchives.org/OAI/2.0/oai_dc/}dc': No matching global element declaration available, but demanded by the strict wildcard. (this error is reported ten times)
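
For the first two errors, the verb echoed in the request element has to be one of the six values listed in the error message. A small check along these lines would catch 'ListRecord' before the response is written; `OAI_PMH_VERBS` and `valid_verb?` are hypothetical names:

```ruby
# The six verbs enumerated in the validation error above
OAI_PMH_VERBS = %w[
  Identify ListMetadataFormats ListSets GetRecord ListIdentifiers ListRecords
].freeze

# True only for an exact, case-sensitive match with one of the verbs
def valid_verb?(verb)
  OAI_PMH_VERBS.include?(verb)
end
```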

Keep feature "accept legal terms"?

New users have to accept a set of usage terms defined by the admin. Since we have the possibility of a guest account, that feature doesn't make sense (in that situation). Should we keep it? If yes: how should we handle the guest user?

unique identifiers for collections, fields and generators

when updating a record, that has previously been imported via oaipmh, I am not able to retrieve a particular collection, field or generator, since they don't have uuids
collection, field and generators only have a local id, i.e. the id in the database they came from. One could either create a new column in each of the tables in the harvesting database to store the local id from the provider database or a column for a uuid, in order to match these records between provider and harvester. The second solution seems more advantageous to me, since it would allow more effective retrieval of the records
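
To illustrate the second solution: if every collection carried a uuid, the harvester could keep its own local ids and still find the matching record on update. The `Collection` struct and the sample data below are hypothetical:

```ruby
require 'securerandom'

# Hypothetical collection records with a local id, a uuid and a name
Collection = Struct.new(:local_id, :uuid, :name)

# Provider side: the uuid is assigned once and exported with the record
provider = [Collection.new(7, SecureRandom.uuid, 'Default')]

# Harvester side: own local ids, but the provider's uuid is kept
harvested = provider.each_with_index.map do |collection, index|
  Collection.new(100 + index, collection.uuid, collection.name)
end
by_uuid = harvested.to_h { |collection| [collection.uuid, collection] }

# On update, the provider's uuid finds the harvested record directly
match = by_uuid[provider.first.uuid]
```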

the uuid for medium kind in different kor instances is not the same

cf. title

Missing translation on statistics page

See /tools/statistics

new relationship functionality lets the user create impossible relationships

starting point: The user is on a specific entity and is trying to add a new relationship with the new functionality in v2.0.
steps taken: The user selects a possible relationship from the dropdown menu, selects a target entity and adds the relation.
problem: The relation is accepted, even if the relationship type is not meant for the two entity types used.
example from the screenshot: person is being connected to a city with relation "wird dargestellt von".
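
A server-side check could reject such relationships regardless of what the editor offers. This is only a sketch: `RelationRule`, `InvalidRelationship` and `validate_relationship!` are hypothetical names, and the allowed kinds for "wird dargestellt von" (person to work) are an assumption made for the example:

```ruby
# Hypothetical names throughout; KOR's actual models may differ
RelationRule = Struct.new(:name, :from_kinds, :to_kinds)

class InvalidRelationship < StandardError; end

# Return true if both kinds are permitted for the relation, raise otherwise
def validate_relationship!(rule, from_kind, to_kind)
  return true if rule.from_kinds.include?(from_kind) && rule.to_kinds.include?(to_kind)

  raise InvalidRelationship,
        "'#{rule.name}' does not allow #{from_kind} -> #{to_kind}"
end

# Assumed rule: only a person may be related to a work with this relation
rule = RelationRule.new('wird dargestellt von', ['person'], ['work'])
```

With this rule, relating a person to a city raises `InvalidRelationship` instead of silently creating the relationship.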

enhancing new relationship functionality by allowing several target entities at once

Change: Allowing the user to select more than one target entity, when using the new relationship functionality. This could enhance the workflow, especially if the user is working on a particular batch of items, which all belong together. With "last created", he could easily create several relations at once.

Example: Selected entity "Leonardo da Vinci", selecting relation "has created" and then selecting works "Mona Lisa" and "Last supper".

Reasoning: research assistants often upload images in batches, and then continue on to add relations.
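
In terms of data, the change amounts to creating one relationship per selected target. A minimal sketch with a hypothetical `Relationship` struct and helper `relate_to_all`:

```ruby
# Hypothetical relationship record: source entity, relation name, target
Relationship = Struct.new(:from, :relation, :to)

# Build one relationship per selected target entity
def relate_to_all(from, relation, targets)
  targets.map { |target| Relationship.new(from, relation, target) }
end

created = relate_to_all(
  'Leonardo da Vinci', 'has created', ['Mona Lisa', 'Last supper']
)
```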

Write an XSD schema for the XML generated for the OAI-PMH api

... once we defined it.

OAI PMH ListRecords: export relations uuid

as with entities it would be practical to get the relation's uuid when requesting relationship records

Replace javascript video player with html5 video tags

That also provides a solution for audio.

Use avconv instead of ffmpeg if available

bulk relating and no entities selected -> 404

new relationship editor: reduce amount of results for last visited

starting point: open relationship editor on any given entity
steps to reproduce: click on "last created"
problem: currently 12 objects are shown, which causes unnecessary scrolling to create the new relationship. Would it make sense to reduce the objects to 9, meaning a 3x3 matrix? Or to allow the user to define the amount of objects or to allow the user to flip through several pages of these?
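
The paging variant could be as simple as slicing the list into pages of 9. A sketch with placeholder entity names; the page size is the 3x3 matrix suggested above:

```ruby
# Placeholder list standing in for the 12 objects currently shown
recent = (1..12).map { |i| "entity #{i}" }

# Split into pages of at most 9 entries (a 3x3 matrix per page)
pages = recent.each_slice(9).to_a
```

The first page then holds 9 objects and the second the remaining 3, so no scrolling is needed to reach the controls below the matrix.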

Use ActiveJob instead of delayed_job directly

Change column type to 'text' for description fields

Currently, kinds, relations and credentials have their description fields set to a string with length limited to 255 characters. This should be changed to 'text'.

align expert search behavior to simple search

starting point: select expert search.
steps to reproduce: select an entity type -> additional fields for this selected type appear.
enhancement: as soon as a type is selected, display search results for that type. The same is already the case for the simple search. This would provide the user an immediate response to his selection, without going to the search button first. From there on the user can enter search terms and continue to narrow his search results.