epimorphics / elda Goto Github PK

Epimorphics implementation of the Linked Data API

License: Other

HTML 0.47% CSS 1.29% JavaScript 33.22% XSLT 27.61% Java 37.36% Shell 0.03%

elda's Issues

csv.xsl not rendering shortnames?

The csv XSLT renderer seems to not always render a data field that has a shortname.

uri templates with ?foo=BAR don't have full support for varieties of BAR

A uriTemplate may specify ?foo=BAR meaning that it matches the path only if the query parameter foo is supplied and its value "matches" BAR.

If BAR starts with "{", then it is assumed to end with "}" and the intervening characters name a variable that is bound to the value of the query variable. Otherwise, the query variable must be string-equal to the value BAR.

This is not consistent with Elda's matching behaviour elsewhere, which would allow BAR to be eg `{alpha}-{beta}` and successfully match that against the value in ?foo=ajax-bottles, binding alpha to ajax and beta to bottles.
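To make the requested behaviour concrete, here is a minimal sketch of matching a query value against a template with several variables. It is not Elda's matcher: the class name `QueryValueTemplate` and the method `match` are hypothetical, and the sketch assumes each variable matches one or more characters, non-greedily.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Elda.
class QueryValueTemplate {
    private static final Pattern VAR = Pattern.compile("\\{([^{}]+)\\}");

    /** Returns the variable bindings if value matches template, or null if it does not. */
    static Map<String, String> match(String template, String value) {
        List<String> names = new ArrayList<>();
        StringBuilder regex = new StringBuilder();
        Matcher m = VAR.matcher(template);
        int last = 0;
        while (m.find()) {
            // Literal text between variables must match exactly.
            regex.append(Pattern.quote(template.substring(last, m.start())));
            // Assumption: a variable matches one or more characters, as few as possible.
            regex.append("(.+?)");
            names.add(m.group(1));
            last = m.end();
        }
        regex.append(Pattern.quote(template.substring(last)));

        Matcher vm = Pattern.compile(regex.toString()).matcher(value);
        if (!vm.matches()) return null;
        Map<String, String> bindings = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            bindings.put(names.get(i), vm.group(i + 1));
        }
        return bindings;
    }

    public static void main(String[] args) {
        System.out.println(match("{alpha}-{beta}", "ajax-bottles")); // {alpha=ajax, beta=bottles}
        System.out.println(match("BAR", "BAR"));                     // {}
        System.out.println(match("BAR", "other"));                   // null
    }
}
```

A template with no variables degenerates to the current string-equality test, so the existing ?foo=BAR behaviour would be a special case of this.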

elda (1.2.8) silently ignores _sort URI parameter on endpoints implemented using api:select selectors

elda (1.2.8) silently ignores the _sort URI parameter on endpoints implemented using api:select selectors. It should probably 404 and reject the use of the parameter. There are likely to be other _parms that it should also reject rather than silently ignore.

inappropriate log.warn in BytesOutTimed

BytesOutTimed (which records the time and bytes taken for outputting rendering bytes) attempts to catch exceptions (strictly, Throwables) and provide a log message. However
the message is unhelpful and, if headers have already been written, the error cannot be reflected in the response status for this rendering. The supplementary data is written to System.err, which is not an appropriate output stream when running under eg Tomcat. 

This should be addressed with an overall look at the response protocol so that we
expect everything to render to its bytes before any response is constructed.
(We can't do much about errors that happen as precomputed bytes are streamed to the client anyway.)

Labelled Describes can generate enormous queries

See also Issue 9, but this is more specific.

Elda implements a labelled describe by doing a DESCRIBE query, taking the result
model, and generating a new query containing (O rdfs:label ?L) for every resource object O in the DESCRIBE result model. If the DESCRIBE happens to have many triples (as it may in the bathing-water data-cubes) then the result is an enormous query which fails to fit in buffers and breaks Elda.

There are multiple possibilities for fixing this. In no particular order,

* if the DESCRIBE result is "too big", don't run the label query at all.
* or only run the label query for the first N elements
* or only run the label query for a random N elements

(These all have the disadvantage that they lie to the end client)

* Run multiple sub-queries, N at a time

(There will be a lot of sub-queries)

* Set a limit to the number of elements asked for: if exceeded return a 
  status 500.

(The limit is an arbitrary value OR has to have a configuration option.)

* Implement DESCRIBE with (?item ?P ?V) with ?item fetched according to
  the SELECT query. Then the labelled describe is 

    ?item ?P ?V OPTIONAL {?V rdfs:label ?L}

(Misleading because this will not follow BNODES, which DESCRIBE is
expected [although not required] to do)

* As above, but for all ?Vs that are bnodes, do another query.

(Have to put a limit on the query depth; also, may be many queries.)

* Supply another viewer family called, say, AllProperties, which does
  the non-bnode (?item ?P ?V) trick, and advise developers to use this
  instead of the describe viewers especially in cases where there can
  be very many results. (Note that the response may itself be very large.)

(Not portable unless it can be incorporated into the spec.)
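As an illustration of the "multiple sub-queries, N at a time" option, the sketch below splits the resource objects into batches and builds one label query per batch. It is not Elda's query generator: `LabelQueryBatcher` and `labelQueries` are hypothetical names, the CONSTRUCT shape is one possible choice, and the use of VALUES assumes a SPARQL 1.1 endpoint.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Elda.
class LabelQueryBatcher {
    /** Builds one label query per batch of at most batchSize resource URIs. */
    static List<String> labelQueries(List<String> resourceUris, int batchSize) {
        if (batchSize < 1) throw new IllegalArgumentException("batchSize must be positive");
        List<String> queries = new ArrayList<>();
        for (int start = 0; start < resourceUris.size(); start += batchSize) {
            int end = Math.min(start + batchSize, resourceUris.size());
            StringBuilder q = new StringBuilder();
            q.append("PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n");
            q.append("CONSTRUCT { ?o rdfs:label ?l } WHERE {\n");
            // Assumption: the endpoint supports SPARQL 1.1 VALUES.
            q.append("  VALUES ?o {");
            for (String uri : resourceUris.subList(start, end)) {
                q.append(" <").append(uri).append(">");
            }
            q.append(" }\n  ?o rdfs:label ?l\n}");
            queries.add(q.toString());
        }
        return queries;
    }

    public static void main(String[] args) {
        List<String> uris = Arrays.asList(
            "http://example.org/a", "http://example.org/b", "http://example.org/c");
        for (String query : labelQueries(uris, 2)) {
            System.out.println(query);
            System.out.println("----");
        }
    }
}
```

Each query stays bounded in size, at the cost noted above: a large DESCRIBE result still means many round trips to the endpoint.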

api:base handling in generating redirection URI (and others?) is confused.

api:base handling in Elda fails to distinguish between the use of api:base as a prefix which needs to be prepended to an api uriPath, and the use of a base in URI.resolve(URI base) calculations that resolve a relative URI against a base URI (and are not necessarily concatenations).

The problem is that api uriPaths are held with a leading '/', which makes them (as relative URIs) relative to the server root rather than the root of the servlet context.

Basically elda does not correctly handle all cases:

api:base "/"; (seems to work)
api:base "http://environment.data.gov.uk/"; (unknown)
api:base "http://environment.data.gov.uk"; (unknown)
api:base "/environment/"; (fails - api:base eliminated from resulting URI)
api:base "/environment"; (fails - api:base eliminated from resulting URI)

The problem is a combination of Jersey behaviour for seeOther responses, which treats all relative URIs as relative to the servlet root (even with a leading '/'), and URI.resolve(base) behaviour, which performs RFC-compliant resolution of a relative URI with respect to the base.

This needs to be sorted with some urgency because elda will fail in some scenarios.
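The difference between resolving and prepending can be seen with java.net.URI alone. The sketch below is only a demonstration of the JDK behaviour, not Elda code; `BaseResolution`, `resolve` and `prepend` are hypothetical names, and `/doc/thing` is a made-up uriPath.

```java
import java.net.URI;

// Demonstration only; these helpers are hypothetical and not part of Elda.
class BaseResolution {
    /** RFC-style resolution of path against base, as java.net.URI performs it. */
    static String resolve(String base, String path) {
        return URI.create(base).resolve(path).toString();
    }

    /** Plain prefixing: base + uriPath with exactly one '/' between them. */
    static String prepend(String base, String uriPath) {
        String b = base.endsWith("/") ? base.substring(0, base.length() - 1) : base;
        String p = uriPath.startsWith("/") ? uriPath : "/" + uriPath;
        return b + p;
    }

    public static void main(String[] args) {
        // A leading '/' on the path replaces the whole base path.
        System.out.println(resolve("/environment/", "/doc/thing")); // /doc/thing
        // Without the leading '/', the path is appended to the base directory.
        System.out.println(resolve("/environment/", "doc/thing"));  // /environment/doc/thing
        // A base without a trailing '/' loses its last segment.
        System.out.println(resolve("/environment", "doc/thing"));   // /doc/thing
        // Prefixing keeps the base in every case.
        System.out.println(prepend("/environment", "/doc/thing"));  // /environment/doc/thing
        System.out.println(prepend("/environment/", "/doc/thing")); // /environment/doc/thing
    }
}
```

The first and third cases correspond to the two failing api:base values listed above: the base is eliminated because resolution, not concatenation, is being applied to a uriPath that starts with '/'.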

Translation to Portuguese Language

Hi,

I work at the Brazilian Government. We have been using Elda for opening up our federal budget data. The configuration is not finished yet. The main "endpoint" will be http://orcamento.dados.gov.br/id/item-de-despesa .

We are interested in contributing a Portuguese translation for Elda.
Is there a proper way to do that? Has it been set up on a translation platform, like Transifex?

Cheers,
Nitai

bad date literals break JSON rendering

If a model contains xsd:date or xsd:dateTime literals whose lexical form is illegal,
the JSON rendering will throw an exception when it tries to render the date into
a Javascript-style date string. This happens because it uses getValue() on the
bad literal.

It would be better if it rendered the bad literal somehow and continued.
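A minimal sketch of "render the bad literal somehow and continue" follows. It is not the Elda JSON renderer: it uses java.time instead of the RDF library's literal API so that it is self-contained, `SafeDateRendering` and `render` are hypothetical names, and the output date format is chosen only for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical helper, not part of Elda.
class SafeDateRendering {
    // Assumption: an illustrative display format, not the one the LDA spec requires.
    private static final DateTimeFormatter DISPLAY =
        DateTimeFormatter.ofPattern("EEE, dd MMM yyyy", Locale.ENGLISH);

    /** Formats a zoneless xsd:date lexical form, or returns it unchanged if it is illegal. */
    static String render(String lexicalForm) {
        try {
            return DISPLAY.format(LocalDate.parse(lexicalForm));
        } catch (DateTimeParseException e) {
            // Bad literal: fall back to the lexical form instead of failing the whole rendering.
            return lexicalForm;
        }
    }

    public static void main(String[] args) {
        System.out.println(render("2012-01-15")); // a formatted date
        System.out.println(render("2012-13-45")); // 2012-13-45 (unchanged)
    }
}
```

The same pattern would apply around the getValue() call: catch the failure for that one literal and emit its lexical form.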

a lint/eyeball for configs would be useful

Mistakes like forgetting to declare the type of a property in an LDA config cause mysterious run-time failures (404s, 500s, or worse) in Elda (and maybe other implementations of the LDA).

A lint-like functionality for reporting suspicious configs would be useful -- maybe an Eyeball plugin.

is the JSON renderer's 'unhandled datatype' log message necessary?

When the JSON renderer renders a literal with a datatype which is not one of those given in the spec, it reports (once only per rendering) that this has happened
using a WARN log. It then correctly renders the literal as its lexical form.

The warning message is provided in case the datatype is wrong, but this is not
Elda's responsibility. Perhaps it (the message) should be dispensed with.

opaque message and code in analysis of api variables in config

VariableExtractor, which does the analysis of api:variable clauses
in a config, makes heavy weather out of working out what the type
of the variable is. The code should be clearer and there's probably
no need for the DEBUG (was WARN) diagnostic.

Cannot override contextPath with jetty-maven-plugin 7.x

The solution initially proposed for issue #127 produces the following error:

[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Error configuring: org.mortbay.jetty:jetty-maven-plugin. Reason: ERROR: Cannot override read-only parameter: contextPath in goal: jetty:deploy-war

Therefore it could be fixed in two ways:

1) remove contextPath usage with 7.x versions of the plugin
2) keep using org.mortbay.jetty:maven-jetty-plugin:6.1.25

Elda silently ignores URI based filters on List Endpoints that use an api:select selector.

The api:select selector lets an api developer provide a whole SPARQL select query. Elda will manage the LIMIT and OFFSET paging - but it does not/cannot inject URI based filter elements into the query.

Currently elda (1.2.8) seems to silently ignore URI based filters at endpoints implemented using api:select selectors.

This will be counterintuitive to an api user expecting a filter to have an effect. Elda should probably 404 and respond with a message that says that the corresponding URI parameters are not supported on the given endpoint.

Require way to terminate Elda standalone on Windows

^Cing the Elda standalone server on Windows doesn't shut it down properly.

Hygiene: too much non-restlet code in restlets

Some of the restlets contain Elda code that doesn't depend on the restlet environment. It could usefully be extracted to places that don't need a Jersey dependency. (This has already been done for the Router, Config, and Stats restlets.)

Feature Request: Per API entry page

Imagine a server with 5 running configs (all in separate URI spaces):
/api/config1/functions*
/api/config2/functions*
/api/config3/functions*
/api/config4/functions*
/api/config5/functions*

I would like to put an 'entry page' in place for each running config. If a client doesn't call a 'function' provided by the config (ie requests /api/config1/), they would get a static page describing the API (completely customisable by us, or just plain HTML, take your pick).

Is this possible?

Suggest support for inverse properties

It is sometimes useful to be able to specify inverse properties when selecting or viewing an item without those inverses being named and present in an API config or vocabulary.

A natural approach would be to have inv-P=V to translate to ?something P ?item. (V will map to a resource, not a literal.) The extension to property chains isn't immediate since inv-P.Q=V has at least three different readings depending whether it is P, Q, or P.Q that is to be inverted.

does stylesheet render datatypes or language tags?

Do the XSLT stylesheets render datatypes or language tags on literals if the XML renderer is configured to generate them?

Allow CSS customization

Currently I think there is no way to customize the style of the generated markup.

Would it be possible to make this configurable?

SpecManager is dead

Elda has no real use for its embedded SpecManager (which "manages" configs as they are created and destroyed) (although it is used by Vlad).

We should probably remove that code at a suitable opportunity.

Elda does not forward 500 problems from the SPARQL endpoint

If the SPARQL endpoint reports a problem, Elda has nothing that can
handle it (not at the top level nor in the Source query handlers).

This makes generating good error reports challenging.

JSON rendering of typed literals of structured properties goes round the houses

When a structured property with a data-typed value is rendered as JSON,
the name-shortening is done in a different place to that provided by
the pre-computed shortnames for the model. This behaviour was not
expected and a WARN log message was generated.

The issue is not a user or developer problem but an internal Elda
problem. The log message has been changed to a DEBUG. It should be
possible to eliminate this indirection completely.

uses of RuntimeException need winnowing

There are several places where Elda throws a RuntimeException: these should be inspected with an eye to finding a more appropriate exception to throw, eg an EldaException or a WrappedException or even an Error.

Old TODOs to review and make into issues if necessary.

o Configurable templates for the HTML render
  In particular current hardwired style sheet address is broken

o Switch to SELECT-based views with no separate describe phase unless _view=all

o More efficient page sequencing. Optionally (?) open stream as unbounded (if tdb/memory),
  keep around for a time out and reuse if possible. Needs to tie in with rebuilt caching system.

o If visit base API should get a summary of endpoints and names, see
  http://services.data.gov.uk/education/api

o Review varProps hack.

o Eager caching of whole result sets so paging through doesn't rerun the whole query with quadratic costs

o Improve text search, put textmatch query first?

templates, base uri, and deployments

Hi Guys,

We have an external app server, which proxies content from internal hosts for our external users. Unfortunately Elda makes use of root URL paths, not relative ones (like /elda/ instead of ./), which makes it sensitive to where it's deployed. Worse than that, we can't proxy our services, as they 'swap' URI paths in the external network.

Any chance of working out how to make it more 'relative' path aware? 

Our servers look like:
external-appserv/geo-data -> proxy -> int-apps1/elda-geo/
external-appserv/petro-data -> proxy -> int-apps2/elda-petro/

which really messes with no relative paths!

Elda doesn't use templating for its non-XSLT rendering

Elda's built-in Java renderer for HTML is all code, no templating. Templating would make it easier to produce custom renderers when the overheads of rendering to XML and then XSLTing the results are too high.

Wrong way of reading the log4j.properties file

Using ELDA (1.2.6-SNAPSHOT) on my webapp, I'm getting the following error launching it:

16:58:12,066 com.epimorphics.lda.routing.Loader:88 INFO : Starting Elda 1.2.6-SNAPSHOT
log4j:ERROR Could not read configuration file [/home/sergio/projects/foo/src/main/webapp/log4j.properties].
java.io.FileNotFoundException: /home/sergio/projects/foo/src/main/webapp/log4j.properties (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:120)
    at java.io.FileInputStream.<init>(FileInputStream.java:79)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:316)
    at org.apache.log4j.PropertyConfigurator.configure(PropertyConfigurator.java:342)
    at com.epimorphics.lda.routing.Loader.configureLog4J(Loader.java:148)
    at com.epimorphics.lda.routing.Loader.init(Loader.java:89)
    at javax.servlet.GenericServlet.init(GenericServlet.java:241)

The library should not look the file up at such a path. IMO it should load it from the classpath, adding something like this to com.epimorphics.lda.routing.Loader.configureLog4J() at line 149:

PropertyConfigurator.configure(Thread.currentThread().getContextClassLoader().getResource(file).getFile());

Feature suggestion: add semantic site maps

From the semantic-web mailing list:

"""
I start to believe that tools such as Elda and Pubby should publish in addition sitemap/void descriptions of the exposed SPARQL endpoints. And even, the robots.txt when they are launched in a standalone fashion.

I mean, these tools should be active players in data/SPARQL endpoint discovery in the Web, not only passive data publishers.

Cheers,

-- fjlopez
"""

I think that's a good suggestion

/api/meta/ in internet explorer 9

Open up elda-1.2.5/api/meta/ in Internet Explorer 9.
It doesn't send a content type and consequently doesn't get anything back from Elda.

Is Elda IE9 compatible? Or is IE9 a supported browser platform for it?

Regards,
T

bnodes do not merge across view query parts

bnodes delivered by different queries of a multi-query view do not merge.

Without SPARQL endpoint help, or a special property of the data under view, they can't be merged.

Possibly we can define alternative single-query views (hence, no merging required) that sufficiently approximate what the multi-query views request.

Use Elda for Google Refine reconciliation

It would be very handy if Elda could be configured to provide a Google-Refine reconciliation API easily. Would make it very straightforward to use local datasets as reconciliation vocabularies with Refine.

silence on &param= with no value

What steps will reproduce the problem?

have a query parameter bound to no string, .../segment?foo=

What is the expected output? What do you see instead?

expected: Expect an error 400 or silently ignored (pick one)

actual: no error, but there is DEBUG log output (which is not much
help to the end user)

Too much handwork doing a release

Doing an Elda release currently requires hand-stitching documentation links (for quickstart, index, and advanced in docs, and on the Elda front page). It's easy
to forget to do this. Also the upload of the standalone jar is done by hand. This
should be fixed so that only release:(prepare, perform) is needed.

Minimal API without Jersey/Jetty

Is it possible to get a minimal API implementation that basically contains lda.jar and json-rdfq.jar with no jersey/jetty dependencies? I'd like to use the API programmatically and there's no clean way to decouple the middleware from the api as it stands.

View query display incomplete

An Elda view may require up to three queries to construct (given the selection already made):

* a query for the property chains, if any
* a describe, if requested
* labels for all items fetched, if view is a labelled-describe (which may also have chains)

However, the stylesheeted box displays only one query. At the very least, that
there are other queries involved should be signalled.

bindings "not substituted" may do unnecessary logging.

When a substitution "...{spoo}..." is being evaluated in Bindings,
and there is no value for the variable `spoo`, Bindings makes
a log entry noting the fact. (And {spoo} is replaced by {spoo},
ie, unbound variable portions of the string do not change).

If this is legal, there's no need for the debug message, and if
it's not legal, then it should generate some warning message,
eg HTTP status 400.
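For reference, the substitution behaviour described here can be sketched as below. This is not the Bindings class itself: `TemplateExpander` and `expand` are hypothetical names, and collecting the unbound names in a list is one way to let the caller decide between staying silent and answering with a 400.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not Elda's Bindings class.
class TemplateExpander {
    private static final Pattern VAR = Pattern.compile("\\{([^{}]*)\\}");

    /**
     * Replaces each {name} that has a binding; leaves unbound {name} untouched
     * and records its name in the unbound list supplied by the caller.
     */
    static String expand(String template, Map<String, String> bindings, List<String> unbound) {
        Matcher m = VAR.matcher(template);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(template, last, m.start());
            String value = bindings.get(m.group(1));
            if (value == null) {
                unbound.add(m.group(1));
                out.append(m.group()); // {spoo} stays as {spoo}
            } else {
                out.append(value);
            }
            last = m.end();
        }
        out.append(template, last, template.length());
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> bindings = new HashMap<>();
        bindings.put("area", "north");
        List<String> unbound = new ArrayList<>();
        System.out.println(expand("/doc/{area}/{spoo}", bindings, unbound)); // /doc/north/{spoo}
        System.out.println(unbound);                                         // [spoo]
    }
}
```

With the unbound names returned to the caller, the logging question becomes a policy decision made in one place rather than inside the substitution loop.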

[The messages have just been changed to DEBUG from WARN]

Need a mechanism so that a copy can work on a staging server.

There is a need to be able to deploy Elda on a staging server and show it to people, e.g. users/customers, and have it work. Elda currently returns URLs as they are in the triple store, which point to the master server, not the staging server. There is a need to be able to rewrite these URLs so they point to the staging server.

For testing purposes this can largely be done by manipulating the host table of the testing machine.  But we can't expect clients to mess with their host tables when we are showing them facilities installed on a staging server.

One solution is for Elda to rewrite the URLs before sending them back to the client.

It should be possible to pick up the rewrite rules from the environment of the server, without having to mess with the Elda configuration.  This way exactly the same code/data can be deployed on the staging server and a release server and it will behave as required.
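A sketch of what such a rewrite could look like is given below. Everything in it is an assumption made for illustration: the environment variable name `ELDA_URI_REWRITE`, the `from=>to` rule syntax, and the `StagingRewriter` class do not exist in Elda.

```java
import java.util.function.UnaryOperator;

// Hypothetical sketch; neither the class nor the rule syntax exists in Elda.
class StagingRewriter {
    /**
     * Parses a rule of the form "from=>to" into a prefix rewriter.
     * A null or blank rule yields the identity function, so a release
     * server with no rule in its environment behaves exactly as before.
     */
    static UnaryOperator<String> fromRule(String rule) {
        if (rule == null || rule.trim().isEmpty()) return UnaryOperator.identity();
        String[] parts = rule.split("=>", 2);
        if (parts.length != 2) throw new IllegalArgumentException("expected from=>to, got: " + rule);
        String from = parts[0].trim();
        String to = parts[1].trim();
        return uri -> uri.startsWith(from) ? to + uri.substring(from.length()) : uri;
    }

    public static void main(String[] args) {
        // Assumption: the rule is read from a (hypothetical) environment variable.
        UnaryOperator<String> fromEnv = fromRule(System.getenv("ELDA_URI_REWRITE"));
        System.out.println(fromEnv.apply("http://master.example.org/id/thing"));

        UnaryOperator<String> staging =
            fromRule("http://master.example.org/ => http://staging.example.org/");
        System.out.println(staging.apply("http://master.example.org/id/thing"));
        // http://staging.example.org/id/thing
    }
}
```

Because the rule comes from the server environment, the same war and the same Elda configuration can be deployed unchanged on both staging and release servers.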

clearing the cache uses GET but should use POST.

Trying to clear the Elda cache using POST eg with

  curl -v -s -XPOST http://localhost:8080/wherever/control/clear-cache

does not work; it needs to be GET, which is inappropriate for a
state-changing operation.

Fix: allow POST as well, remove GET at a suitable moment.

Request: Configuration syntax check, and Container overrides check.

1. Developing configuration files.
How do I check that they are syntactically correct? or even parseable by Elda/LDA?

Can you extract this into a checker class? If this was able to be exported into a standalone config checker - that would be great.

----

2. Container Over-ride check

Currently, if I have two configuration files which re-declare each other's api space, things break:

./config1.ttl
this:endpoint isA elda:Api
...

./config2.ttl
this:endpoint isA elda:Api
...

I would like a tool or the engine to detect colliding config files, and provide feedback to the admin to alert them to this.

ie: config2.ttl redefines content in config1.ttl or something similar.
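The collision check in point 2 can be sketched independently of how configs are parsed. The code below assumes that some earlier step has already extracted, for each config file, the names it declares (API URIs or uri templates); `ConfigCollisionCheck` and `collisions` are hypothetical names, not an existing Elda tool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical checker, not part of Elda.
class ConfigCollisionCheck {
    /**
     * Input: config file name -> names that file declares.
     * Output: each name declared by more than one file -> the files declaring it.
     */
    static Map<String, List<String>> collisions(Map<String, List<String>> declaredByFile) {
        Map<String, List<String>> filesByName = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : declaredByFile.entrySet()) {
            for (String name : e.getValue()) {
                filesByName.computeIfAbsent(name, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        // Keep only the names that appear in at least two files.
        filesByName.values().removeIf(files -> files.size() < 2);
        return filesByName;
    }

    public static void main(String[] args) {
        Map<String, List<String>> declared = new TreeMap<>();
        declared.put("config1.ttl", Arrays.asList("this:endpoint", "this:other"));
        declared.put("config2.ttl", Arrays.asList("this:endpoint"));
        System.out.println(collisions(declared)); // {this:endpoint=[config1.ttl, config2.ttl]}
    }
}
```

A report like this could be printed at startup or by a standalone checker, naming both files so the admin knows which one to fix.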

it's too hard to set log levels when using the standalone jar

Setting log levels when using the standalone jar requires unpacking the jar and poking at the right log4j.properties. It should be documented and easier, eg with a command-line flag.

Elda's responses do not have a last-modified date

Elda's responses (specifically its successful responses with actual data) don't have last-modified dates to help with caching.

Need mechanism to specify a 404 page.

There needs to be a means for an application of Elda to specify a 404 page.

Elda caching needs investigation / adjustment / documentation

* it's not clear how much benefit the cache brings in a multi-client context.

* the drop-all nature of the cache may provoke load peaks. Integrating an
  LRU policy should be investigated.

* we don't have good numbers for how much room cache entries take. That makes
  it hard to size the cache (or the application memory).

* view cache units, currently APIResultSets, may be too big. We should consider
  caching at eg the Resource x View level. May not be useful if too many (small)
  queries generated instead.
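For the LRU suggestion, the JDK already provides most of what is needed. The sketch below is a generic access-ordered LinkedHashMap, not Elda's cache; `LruCache` is a hypothetical name, and the bound is on the number of entries, which says nothing about how much memory each entry takes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Generic sketch, not Elda's cache implementation.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, least recently used first
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion: evict one entry instead of dropping everything.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("query-a", "results-a");
        cache.put("query-b", "results-b");
        cache.get("query-a");              // query-a is now the most recently used
        cache.put("query-c", "results-c"); // evicts query-b
        System.out.println(cache.keySet()); // [query-a, query-c]
    }
}
```

This class is not thread-safe; in a multi-client server it would need wrapping, for example with Collections.synchronizedMap, which is also where the multi-client benefit question would have to be measured.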

Elda does not handle bnodes as items

If a selected ?item is a bnode, some Elda views will crash. If they do not, generated XML may be missing some (all) properties.

-- clarify if bnodes-as-items are to spec or not
-- decide whether Elda will support bnodes-as-items

inconsistency in shortname treatment

Elda can report "illegal shortname ignored", but in fact the name
is not ignored -- there is inconsistency in the checking code.

Also the stringent restrictions on shortnames that allow them to
be all of element names, Javascript names, and query variable names
are not required for non-property names. In particular allowing
classes to have shortnames that are the same as their local names
allows convenient shorthands in config construction.

(a) remove the inconsistency in Elda's checking (collapse NameMap
and Context, eventually); (b) relax the constraints on non-property
shortnames.

_distance isn't bound

when using geo properties with _distance, there is no accessible variable binding to use in ORDER BY.

Statistics display media types but not format names.

When the statistics page shows rendering-dependent statistics, it uses the media type names not the format names.

The format name is not always available. The renderer could be modified to know its format name, but conceivably the same renderer might be used for two different format names.

Maybe we should use the name if available and the media type if not. Maybe we should special-case text/html.

Some timelike values still forced to have timezones

(Reported by Stuart)


Although the json-rdf encoder has been fixed to properly handle zoneless 
xsd:dateTime, there are other potentially timezone-bearing XSD 
datatypes that have not been similarly fixed (eg. xsd:date, xsd:time, 
xsd:gYearMonth, xsd:gYear, xsd:gMonthDay, xsd:gDay, xsd:gMonth are all allowed 
an "... optional following time zone qualifier..." [1]).

Also xsd:date values are being serialised as time-zoned time instants. I don't 
know how json-rdf serialises other xsd: date/time related datatypes, but other 
than xsd:dateTime (and the newer xsd:dateTimeStamp, which does require an 
explicit timezone to be given) and xsd:time (a value space of recurring 
instants), the other datatypes correspond to intervals (recurring in some cases) 
rather than instants (recurring or not).

xsd:date values are generated by the IntervalServer eg:

     http://reference.data.gov.uk/doc/month/2012-01.json

There the xsd:date values for the scovo:min and scovo:max properties are expanded into time instants 
(the values are correct in the .ttl and .rdf variants).

[1] http://www.w3.org/TR/xmlschema11-2/#built-in-datatypes
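Whether a given lexical form actually carries a timezone can be checked with the JDK's javax.xml.datatype classes, which cover all of the datatypes listed above. The sketch below only demonstrates that check; `ZonelessValues` and `hasTimezone` are hypothetical names, and nothing here is taken from the json-rdf encoder.

```java
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeConstants;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.XMLGregorianCalendar;

// Demonstration only; not part of the json-rdf encoder.
class ZonelessValues {
    /** True if the XSD date/time lexical form has an explicit timezone qualifier. */
    static boolean hasTimezone(String lexicalForm) {
        try {
            XMLGregorianCalendar cal =
                DatatypeFactory.newInstance().newXMLGregorianCalendar(lexicalForm);
            return cal.getTimezone() != DatatypeConstants.FIELD_UNDEFINED;
        } catch (DatatypeConfigurationException e) {
            throw new IllegalStateException("no DatatypeFactory available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hasTimezone("2012-01-01"));                // false (xsd:date)
        System.out.println(hasTimezone("2012-01-01Z"));               // true  (xsd:date)
        System.out.println(hasTimezone("2012-01"));                   // false (xsd:gYearMonth)
        System.out.println(hasTimezone("2012-01-01T00:00:00+01:00")); // true  (xsd:dateTime)
    }
}
```

An encoder that consults this before formatting could leave zoneless values zoneless instead of forcing one onto them.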

Need to define variable length lists in views

There is currently no mechanism that allows a view to include an arbitrary-length RDF list. This is essential functionality for my current application.

I am currently working around it by including in my view the property paths:

  foo.first,foo.rest.first,foo.rest.rest.first,...

This only supports lists up to a maximum length and is horribly cumbersome if one wishes to define a set of property paths from the elements of the list to be included in the view.

I would like to be able to express:

  foo.rest*.first

which would do what I want quite nicely.

I noticed that I can put an api:template property on a view that specifies the construct part of the query. This mechanism only allows one to specify a view using graph patterns that can appear in both the select and construct parts of the query. This does not include, for example, variable-length property paths. If it were possible to specify the construct clause and select clause independently, that would I think be a sufficient, though inelegant, way of addressing my need for arbitrary-length lists in views.
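Until something like foo.rest*.first is available, the fixed-depth workaround above can at least be generated rather than typed. The sketch below is a hypothetical helper (`ListChains`, `chains`), not an Elda feature; it assumes `first` and `rest` are the shortnames for rdf:first and rdf:rest, as in the chains shown above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for generating the workaround chains; not an Elda feature.
class ListChains {
    /** Returns property.first, property.rest.first, ... for lists of up to maxLength elements. */
    static List<String> chains(String property, int maxLength) {
        List<String> result = new ArrayList<>();
        StringBuilder prefix = new StringBuilder(property);
        for (int i = 0; i < maxLength; i++) {
            result.add(prefix + ".first");
            prefix.append(".rest");
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", chains("foo", 3)));
        // foo.first,foo.rest.first,foo.rest.rest.first
    }
}
```

This still caps the list length and does nothing for paths leading on from the list elements, so it is a stopgap rather than an answer to the request.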

Suppress decorators in HTML representation of list endpoint generated by api:select

If a list endpoint uses api:select then adding filters to the request IRI doesn't work.  However decorators such as "more like this" appear in the HTML representation generated by the stylesheet provided.  These links do not work - the filters they specify are ignored.

Best not to display these decorators in these circumstances.
