cats-uri's People

Contributors

rossabaker, valencik

Forkers

isomarcte

cats-uri's Issues

idea: URIs using opaque types and match types

I thought it worth suggesting one way of building a URI library that has the following advantages:

  1. it can abstract over multiple implementations
  2. no efficiency loss (no extra object creations)
  3. build on existing work
  4. once implementations exist for the existing URI libraries, one can build one's own with new features, such as the interning discussed in #6

I have used this concept in a much larger library, the scala3 branch of banana-rdf. The key interface is RDF, which gives the main types for RDF, and then Ops, which gives the key operations on each type; these are specified in the operations package.

These then get implemented for the various frameworks, such as Apache Jena or IBM Eclipse Rdf4J or even a JavaScript project such as Tim Berners-Lee's rdflib.js...

One can then write code independent of the implementation and switch between implementations by just changing the ops. (One can hard code that for a particular project to avoid the abstraction). For the abstract view see the test cases such as GraphTest.scala.

As I was writing some code to sign http headers, I actually started on this project for HTTP frameworks, abstracting between Akka and Http4s Messages. See httpSig's Http.scala with an Http4s implementation and an Akka implementation. I did not spend that much time on that as I was just trying to reduce duplication of code in tests and elsewhere.

URIs, URLs and URNs seem very similar to those use cases. Just as for RDF and HTTP, all implementations build on the same standards, so it is not surprising that there is a common abstraction.
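To make the idea concrete, here is a minimal Scala 3 sketch of such an abstraction for URIs, assuming a much smaller interface than banana-rdf's RDF/Ops pair. The names `UriLib`, `JavaNetLib`, `PlainLib` and `schemeOf` are hypothetical and exist only for illustration. The abstract type member plays the role of the opaque type: generic code never sees the representation, and each backend uses its native value directly, so no wrapper object is created.

```scala
import java.net.URI as JUri

// Hypothetical interface (not taken from banana-rdf or cats-uri).
trait UriLib:
  type Uri // each backend plugs in its own representation
  def parse(s: String): Either[String, Uri]
  def scheme(u: Uri): Option[String]
  def render(u: Uri): String

// Backend 1: delegates to java.net.URI; `Uri` is only an alias, so nothing is wrapped.
object JavaNetLib extends UriLib:
  type Uri = JUri
  def parse(s: String): Either[String, Uri] =
    try Right(new JUri(s))
    catch case e: java.net.URISyntaxException => Left(e.getMessage)
  def scheme(u: Uri): Option[String] = Option(u.getScheme)
  def render(u: Uri): String = u.toString

// Backend 2: a minimally parsed string, in the spirit of lightweight RDF URIs.
object PlainLib extends UriLib:
  type Uri = String
  def parse(s: String): Either[String, Uri] = Right(s) // no validation at all
  def scheme(u: Uri): Option[String] =
    val i = u.indexOf(':')
    val ok = i > 0 && u.charAt(0).isLetter &&
      u.take(i).forall(c => c.isLetterOrDigit || "+-.".indexOf(c) >= 0)
    if ok then Some(u.take(i)) else None
  def render(u: Uri): String = u

// Code written once against the interface; the backend is just an argument.
def schemeOf(lib: UriLib)(s: String): Option[String] =
  lib.parse(s).toOption.flatMap(lib.scheme)
```

Calling `schemeOf(JavaNetLib)` or `schemeOf(PlainLib)` runs the same code against either backend. A shared test suite written in this style is what would let a later pure Scala implementation be checked against the existing ones.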

One could thus start by implementing it for

  • the current Http4s URI class (and add a URN class to it using |)
  • lemonlabs URI scheme
  • JS URI schemes
  • the Java URI scheme
  • the various banana-rdf light weight versions of URIs...
  • and indeed if a framework on which http4s bases itself has its own URI scheme, one could implement that, to avoid URI transformations...

and once tests are written to cover those, one can then write a pure Scala implementation and be confident that it captures the URI abstractions faithfully.

Some requirements for a new URI class

Having tried lemonlabs Uri in banana-rdf and used the http4s Uri, I have some ideas about key elements that would improve the http4s Uri.

  1. Http4s Uris are currently actually URLs, and should be named as such. Urls can have paths, be relativised, absolutized and be used for requests using something like an http framework. URNs cannot. We have type URI = URL | URN

  2. Http4s should take on board the typing lemonlabs uses, which clearly distinguishes absolute Urls, relative Urls (including path-absolute relative Urls, i.e. those starting with /) and scheme-relative Urls. That would allow Http4s to tell us whether a Request is meant to take an absolute or a relative Url, for example. There may be clients that can take both, but not all clients can. A client that has no connection cannot take a relative Url. A client bound to a connection can take a (path-absolute) relative Url.

Further crazy requirements:

  3. Ideally (here we may be going beyond what is possible) it would be possible for the interface to be implemented so as to allow unique storage of URIs to re-use components. So one would be able to have a garbage collected store of URIs starting with "https" then followed by domain components, etc... so that Uris could be compared using eq for equality and use as little memory as possible.

  4. In RDF one has so many URIs the frameworks are careful to minimally parse them, leaving perhaps invalid uris in graphs rather than having to parse the structure of them. I wonder if there is a clever way of optionally integrating that requirement.

  5. Perhaps one also wants to map URIs to 64bit Long numbers. At the RDF layer one does not need to look into the URIs at all: one could just as well use 64bit long comparison to test for equality, once those are parsed and stored in a DB.
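Requirements 3 and 5 can be illustrated with a small interning table. This is only a sketch under strong simplifying assumptions: `InternedUri` and `UriInterner` are hypothetical names, whole URI strings are interned rather than shared components, and the table holds strong references, so nothing here is garbage collected. A real store would need weak references and component-level sharing.

```scala
import scala.collection.mutable

// Hypothetical canonical handle. It is deliberately not a case class,
// so `==` falls back to reference equality, like `eq`.
final class InternedUri(val value: String, val id: Long)

// Hypothetical interner: one canonical instance and one Long id per distinct URI string.
final class UriInterner:
  private val table = mutable.HashMap.empty[String, InternedUri]

  def intern(s: String): InternedUri =
    // The id is the table size before insertion: 0, 1, 2, ...
    table.getOrElseUpdate(s, new InternedUri(s, table.size.toLong))
```

After interning, two URIs are equal exactly when their handles are `eq`, or equivalently when their `id` values match, so a layer such as RDF that never looks inside the URI could compare the Long ids alone.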

Having an excellent real URI class that encompasses both URNs and URLs would be very helpful when writing application code. On the web, application code proceeds as follows:

  1. client fetches hyper-text or hyper-data resource on the web (using http4s)
  2. it parses the content, often containing URIs (relative and absolute URLs and sometimes even URNs) and may follow any number of those URLs in sequence or in parallel (going back to 1.).

If the code could use the same URI structure in the parsing stage as it uses in the fetching stage that would make for much nicer and smoother coding. Note in RDF URIs are the core of the naming scheme, so one is using them all the time for every data structure. In banana-rdf I added support for relative urls, as those are useful when fetching data and also when posting graphs to a container: the server decides on the URL of the created resource and so the graph sent has to be one that can contain relative Urls.
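As a rough illustration of sharing one URI model between the parsing and the fetching stage, the sketch below uses Scala 3 union types for `type URI = URL | URN` and distinguishes absolute from relative URLs. All type and function names are hypothetical, java.net.URI is used as the backing representation purely for convenience, and the case-class wrappers do allocate, unlike the opaque-type design proposed above. Treating every non-urn scheme as a URL is also a simplification.

```scala
import java.net.URI as JUri

// Hypothetical model, for illustration only.
final case class AbsoluteUrl(value: JUri)
final case class RelativeUrl(value: JUri)
final case class Urn(value: JUri)

type Url = AbsoluteUrl | RelativeUrl
type Uri = Url | Urn

// Parsing stage: classify a reference found in a document.
// Simplification: any scheme other than "urn" is treated as a URL.
def classify(s: String): Uri =
  val u = new JUri(s)
  if u.getScheme == null then RelativeUrl(u)
  else if u.getScheme.equalsIgnoreCase("urn") then Urn(u)
  else AbsoluteUrl(u)

// Only URLs can be resolved against a base; a URN is rejected by the type checker.
def resolve(base: AbsoluteUrl, ref: Url): AbsoluteUrl = ref match
  case a: AbsoluteUrl => a
  case RelativeUrl(r) => AbsoluteUrl(base.value.resolve(r))

// Fetching stage: keep what a connection-less client could request.
def fetchable(base: AbsoluteUrl, links: List[String]): List[AbsoluteUrl] =
  links.map(classify).collect {
    case a: AbsoluteUrl => a
    case r: RelativeUrl => resolve(base, r)
  }
```

A client that has no connection would accept only `AbsoluteUrl`, while a client bound to a connection could accept `Url`, which is the distinction asked for in requirement 2.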

I also wrote this up in the http4s discussion area: http4s/http4s#5930 (comment)

cats-uri should abstract all URI frameworks

It would help to have a URI spec that can abstract over any implementation of URIs, the way that banana-rdf can abstract over any RDF framework, without needing to create any new objects (essentially by using opaque types and match types).
Having such a library would allow one to write code that works in Java, JS, pure Scala and native land, using the libs most appropriate for each platform, and would make switching from one library to the next a simple one-line code change.

That would allow one to write code that could use any of the following libs:

  • java.net.URI
  • akka URI
  • Http4s URI
  • Jena URI
  • lemonlabs uri
  • JS URI libs
  • native URI libs
  • sttp.model.Uri
  • ...
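One possible way to combine match types with this idea is sketched below. It is not banana-rdf's actual encoding: the marker types `JavaNet` and `Plain`, the match type `UriOf` and the `UriOps` type class are all hypothetical, and only two of the libraries listed above are covered (java.net.URI and a bare string). The match type selects each backend's native representation, so no wrapper object is needed.

```scala
import java.net.URI as JUri

// Hypothetical backend markers; a real library would have one per supported URI library.
sealed trait Backend
final class JavaNet extends Backend
final class Plain extends Backend

// Match type: picks the concrete representation for each backend.
type UriOf[B <: Backend] = B match
  case JavaNet => JUri
  case Plain   => String

// Operations every backend has to provide.
trait UriOps[B <: Backend]:
  def parse(s: String): UriOf[B]
  def render(u: UriOf[B]): String
  def isAbsolute(u: UriOf[B]): Boolean

object JavaNetOps extends UriOps[JavaNet]:
  def parse(s: String): UriOf[JavaNet] = new JUri(s)
  def render(u: UriOf[JavaNet]): String = u.toString
  def isAbsolute(u: UriOf[JavaNet]): Boolean = (u: JUri).isAbsolute

object PlainOps extends UriOps[Plain]:
  def parse(s: String): UriOf[Plain] = s
  def render(u: UriOf[Plain]): String = u
  def isAbsolute(u: UriOf[Plain]): Boolean =
    val s: String = u
    val i = s.indexOf(':')
    // Absolute here means "has a scheme": a colon before any '/', '?' or '#'.
    i > 0 && !s.take(i).exists(c => c == '/' || c == '?' || c == '#')

given UriOps[JavaNet] = JavaNetOps
given UriOps[Plain] = PlainOps

// Written once; the backend is chosen by the type argument alone.
def absoluteOnly[B <: Backend](links: List[String])(using ops: UriOps[B]): List[String] =
  links.map(ops.parse).filter(ops.isAbsolute).map(ops.render)
```

Switching a project from one library to another then amounts to changing the type argument, for example from `absoluteOnly[JavaNet]` to `absoluteOnly[Plain]`, or changing a single project-wide type alias that stands in for it.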

The pattern for such an abstraction is demonstrated by the banana-rdf scala3 library:

To see how it is used, please take a look at the tests which are framework independent, such as this one:
https://github.com/bblfish/banana-rdf/blob/scala3/rdf-test-suite/shared/src/main/scala/org/w3/banana/GraphTest.scala

Banana-rdf actually has an initial abstraction for URI too:
rg/w3/banana/operations/URI.scala, to help us write code that can work with URIs across frameworks.
