http-apis / hydrus
REST server - Flask Hydra-powered for Semantic Web
Home Page: https://pypi.org/project/hydrus/
License: MIT License
As the first students have expressed interest in the project, here are some more insights at this very early stage.
The objective of this project is to create a demo Web API implementing the HYDRA draft, which is an RDF-based framework. The entities defined in the specs are meant to describe the structure and usage of a generic Web API, so that a HYDRA-enabled ("intelligent" or "smart") client can connect to the API's entrypoint and automatically find out where and how to get the data it needs.
In this scenario the layers involved are:
A. HYDRA server that can serve data and metadata to a client (this layer can be split into a traditional lower level server relying on a graph database plus a "HYDRA middleware"),
B. client that can "understand" HYDRA metadata and connect to HYDRA-enabled services, and possibly "learn and remember" past interactions with other services to build its own set of concepts for its domain of use.
The general objective is to let different HYDRA-enabled clients exchange data with each other. These clients can run on any kind of machine, but the focus of this automation is IoT (connected) devices (industrial, consumer or research).
Usage scenario:
Different concepts and classes are involved. An RDF domain has to be defined for the metadata exchange to work.
To make the demo interesting I suggested we leverage space exploration and astronomy, so the graph can be based on these vocabularies: https://github.com/chronos-pramantha/RDFvocab/blob/master/ld%2Bjson/Spacecraft.json. I can suggest possible operations to be requested from the API. Resources are well connected to popular repositories, so we can reach a great amount of knowledge without storing too much.
A very good starting design for the server is https://github.com/antoniogarrote/levanzo
A Python implementation for the server client: https://github.com/pchampin/hydra-py
Please list questions and comments below.
Some resources:
PS. The stack to be used has yet to be decided. We should tend towards a full-Python implementation, except for the lower layer where a graph database is required; we are free to experiment, so we can suggest anything in the beginning. At first impression I would avoid Triple Stores and try to use a Graph Database, or try to prototype something with Apache TinkerPop or Spark GraphX, to gain the flexibility to switch to different solutions. At first I would prefer not to be concerned with stability and scalability but just try to reach a first working tool, so that things can be iterated on.
UPDATE: For better insight, one of the proposed designs is described at #3
UPDATE:
There are different possible designs that I am proposing. I would like to discuss with all of you, students and mentors, which one is the most interesting and viable:
UPDATE: Gitter chat available here
UPDATE: Check also this architectural proposal
Now that we have basic models and some kind of structuring of data, we need to define the ApiDocument endpoint.
This is a more general overview of the API's endpoints. The implementation described here is the application of use-case no.2 described in issue #2.
For use-case no.2, the demo API we are going to develop is about Commercial Off-The-Shelf (COTS) spare parts for pico- and nano-satellites. There has been a recent trend in the commercial space industry to develop super-light and super-compact satellites for research, prototyping, Earth observation and product testing.
The kind of parts we are talking about are basically microcontrollers and processors that allow the spacecraft to operate. Each of these "boards" is meant to provide a function; there are 8-9 different functions (subsystems) to be carried out for the spacecraft to accomplish its mission. Some of them are critical, others are supporting functions, but almost every spacecraft likely needs almost all of them to work properly. This is a comprehensive list as they are written in the "subsystems" vocabulary:
This API will serve objects related to this kind of hardware. The simple starting endpoints can be:
/api/cots/: from which the single hardware parts are stored and retrieved. In particular:
GET /api/cots/<id>: to fetch a particular object,
PUT /api/cots/<id>: to store a new object with some characteristics,
PATCH /api/cots/<id>: to update only some properties,
DELETE /api/cots/<id>: to delete an object.
Mandatory fields for every category of subsystem are specified in the vocabulary itself.
An example of a COTS spare part is represented by this JSON, which describes a backup power subsystem:
{
  "id": 1,
  "object": {
    "category": "backup power",
    "maxWorkingTemp": 61,
    "volume": 126,
    "minWorkingTemp": -21,
    "mass": 252,
    "power": 638,
    "cost": 10208
  },
  "name": "2KV backup power",
  "year": "2015",
  "manufacturer": "Invented Corp."
}
This JSON is ported into JSON-LD using these two vocabularies, one for the spacecraft and one for the different subsystems:
{
  "@context": {
    "spacecraft": "http://ontology.projectchronos.eu/spacecraft/",
    "subsystems": "http://ontology.projectchronos.eu/subsystems/"
  },
  "@id": "/api/cots/1",
  "@type": "spacecraft:Subsystem_Spacecraft",
  "subsystems:subSystemType": "subsystems:Spacecraft_BackupPower",
  "subsystems:function": "Backup and restore energy if Power no. 1 goes down",
  "subsystems:isStandard": "Cubesat",
  "subsystems:maxWorkingTemp": 61,
  "subsystems:volume": 126,
  "subsystems:minWorkingTemp": -21,
  "subsystems:mass": 252,
  "subsystems:power": 638,
  "subsystems:cost": 10208,
  "subsystems:manufacturer": "Invented Corp.",
  "skos:name": "2KV backup power",
  "subsystems:year": "2015"
}
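The porting step from the plain JSON record to JSON-LD could be sketched roughly as follows. This is only an illustration: the helper name `to_jsonld` is hypothetical (it is not part of hydrus), and the mapping of `category` to `subsystems:subSystemType` is deliberately left out because it depends on the vocabulary.

```python
# Hypothetical helper: the prefixes mirror the example above,
# they are not taken from the hydrus codebase.
CONTEXT = {
    "spacecraft": "http://ontology.projectchronos.eu/spacecraft/",
    "subsystems": "http://ontology.projectchronos.eu/subsystems/",
}


def to_jsonld(part: dict) -> dict:
    """Wrap a plain COTS record into a JSON-LD document."""
    doc = {
        "@context": CONTEXT,
        "@id": "/api/cots/{}".format(part["id"]),
        "@type": "spacecraft:Subsystem_Spacecraft",
    }
    # Characteristics nested under "object" become prefixed properties;
    # "category" is skipped because its mapping depends on the vocabulary.
    for key, value in part["object"].items():
        if key != "category":
            doc["subsystems:" + key] = value
    doc["subsystems:manufacturer"] = part["manufacturer"]
    doc["subsystems:year"] = part["year"]
    return doc


part = {
    "id": 1,
    "object": {"category": "backup power", "mass": 252, "power": 638},
    "name": "2KV backup power",
    "year": "2015",
    "manufacturer": "Invented Corp.",
}
jsonld = to_jsonld(part)
```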
/api/spacecraft/<id>: an assembled group of COTS that can be used to make up an actual satellite. Each COTS object is categorized into a group that represents its function. For this endpoint:
GET /api/spacecraft/<id>: retrieve a blueprint
PUT /api/spacecraft/<id>: create a new blueprint
PATCH /api/spacecraft/<id>: amend some component of the blueprint
DELETE /api/spacecraft/<id>: remove a blueprint
These are the endpoints that need to be served by the server with the proper description provided using the HYDRA framework.
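To make the intended semantics of the four methods concrete, here is a minimal in-memory sketch. The `CotsStore` class and the status codes it returns are assumptions made for illustration; the real server would go through Flask views and the database.

```python
class CotsStore:
    """In-memory stand-in for the storage behind /api/cots/<id> (hypothetical)."""

    def __init__(self):
        self._items = {}

    def get(self, id_):
        # GET: fetch a particular object
        if id_ not in self._items:
            return 404, None
        return 200, self._items[id_]

    def put(self, id_, obj):
        # PUT: store a new object under the given id
        self._items[id_] = dict(obj)
        return 201, self._items[id_]

    def patch(self, id_, changes):
        # PATCH: update only the properties that were sent
        if id_ not in self._items:
            return 404, None
        self._items[id_].update(changes)
        return 200, self._items[id_]

    def delete(self, id_):
        # DELETE: remove the object if it exists
        if self._items.pop(id_, None) is None:
            return 404, None
        return 204, None


store = CotsStore()
```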
In the Wiki page, there's an idea related to "Designing Command Line Interface for Hydrus"
I found this very interesting and it matched my skill set. Can you please tell me whether there is any limitation on which tool we have to use to make the command line interface? I have started working on a command line interface for Hydrus, and it has already taught me a lot about how to write test cases and organise files for a project.
yes
Use Hydra client to write tests for more complex functionalities (using the demo API or the other Astronomy ontology)
Unexpected status code returned by the server:
Currently the hydrus server returns a status code 405 for non-existent collections in the database.
The server shall give a response with status code 404 for these endpoints.
To fix this behaviour we can add an additional return value in the checkEndpoint function, which denotes the expected status code to be returned, and abort the request with that status code.
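A rough sketch of the proposed fix is given below. The signature and the return shape are assumptions; the actual checkEndpoint in hydrus may take different arguments.

```python
def check_endpoint(method, path, collections):
    """Return whether the request is allowed and the status code to abort with.

    `collections` maps a collection path to the HTTP methods it supports
    (a hypothetical structure used only for this sketch).
    """
    if path not in collections:
        # Unknown collection: the resource does not exist.
        return {"allowed": False, "status": 404}
    if method not in collections[path]:
        # Known collection, but the method is not supported on it.
        return {"allowed": False, "status": 405}
    return {"allowed": True, "status": 200}


collections = {"/api/cots": {"GET", "PUT"}}
```

The caller can then abort the request with the returned status instead of always answering 405.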
yes
The logic that checks whether an HTTP request is authenticated is repeated inside every CRUD operation. The whole logic is repeated in 8 places. A few samples are provided here.
Line 103 in f88d0d0
Line 131 in f88d0d0
Line 169 in f88d0d0
There should be a single function to check the authentication, called by each CRUD operation.
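One way to factor the check out is a decorator applied to each CRUD operation. The sketch below is framework-agnostic; `requires_auth`, `has_token` and the dictionary-based request are hypothetical stand-ins, not the hydrus implementation.

```python
from functools import wraps


def requires_auth(check):
    """Build a decorator that runs `check(request)` before the wrapped operation."""
    def decorator(operation):
        @wraps(operation)
        def wrapper(request, *args, **kwargs):
            if not check(request):
                # Stop here so the operation never runs unauthenticated.
                return 401, {"error": "authentication required"}
            return operation(request, *args, **kwargs)
        return wrapper
    return decorator


def has_token(request):
    # Placeholder rule for the sketch; real checks would verify credentials.
    return request.get("token") == "secret"


@requires_auth(has_token)
def get_item(request, id_):
    return 200, {"id": id_}
```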
yes
Lacks support for sqlalchemy.orm.scoped_session.
Currently the code in set_session() checks only for sqlalchemy.orm.session.Session, while Flask applications may also use other types of sessions.
scoped_session in sqlalchemy.orm will be supported.
Import scoped_session (and sessionmaker, which the snippet below also needs):
from sqlalchemy.orm import scoped_session, sessionmaker
Replace the current session-creation code with the following:
session = scoped_session(sessionmaker(autocommit=False, autoflush=False, bind=engine))
Yes.
The auth tests (test_auth.py) need to be updated, as all 4 tests are currently failing.
All tests should pass.
Run hydrus/tests/test_auth.py
no
To the best of my knowledge, here are some of the key factors I think you should consider while reviewing Python-based web frameworks:
The three most used frameworks out there are definitely Django, Pyramid and Flask.
I would like everyone's opinion on which framework would be suitable to build the demo web application.
Here is what I think:
Given that this project would lay the foundation for something bigger, I would suggest using something robust that is both extendable and scalable. There is no saying what the project might branch out into in the future, so a strong foundation is recommended. Keeping this in mind, I would suggest that Django be used for development.
This is purely my opinion and I would like for alternate views on the same.
Extremely sorry if this thread was unnecessary, but I felt that it was better to have some sort of discussion to get things started.
The dependencies may conflict with any other modules presently installed in the user's machine. It's better to use a virtual environment to isolate development environments.
There should be a Pull Request template so that anyone who sends a pull request always references an issue in it.
Some contributing guidelines should also be added, for example that one should claim an issue before working on it, because someone else may already be working on it.
This will especially help the people very new to open source.
I want to work on this issue.
As most are aware, the wiki section is still very verbose and not beginner friendly. It would be great to have some refactoring done there to improve readability and help newcomers get started.
Maybe something like the blog would be a good way to arrange ideas?
Please give some suggestions.
Adding for reference:
https://gsocchrizandr.wordpress.com/the-book-of-hydrus/
We need a template to wrap the data into a well-formed JSON-LD
I have just noted that the use of PUT and POST is inverted compared to what is stated in the design.
https://github.com/HTTP-APIs/hydrus/blob/master/hydrus/hydraspec/crud_template.py#L22
PUT should be for creating a new resource and POST for updating an existing resource.
At first impression I would avoid Triple Stores and try to use a Graph Database or try to prototype something with Apache TinkerPop or also Spark GraphX, to gain in flexibility to switch to different solutions.
I tend to disagree. JSON-LD uses JSON as its concrete syntax, and RDF as its abstract syntax (i.e. data model), so the most natural fits would be either
distutils does not support the install_requires argument. It gives a warning when you try to install hydrus using setup.py, and the dependencies are not installed. The user has to install the dependencies using the requirements.txt file.
All the dependencies must be installed automatically when we set up hydrus.
Run the setup.py file as python3 setup.py install.
yes
We can move to setuptools (https://packaging.python.org/key_projects/#setuptools), which supports install_requires and will further ease the process of setting up hydrus as a package for pip.
See https://packaging.python.org/discussions/install-requires-vs-requirements/.
Hydrus uses a relational database model to store graph data, which has many limitations in querying and data access, such as scalability, the null factor and time-consuming queries.
Switch to a graph database like Redis. Its main features include simple, fast indexing and querying, data stored in RAM, etc. It will also help in the implementation of better API querying.
Use Hexastore as a data structure and Cypher as a query language.
We will discuss here the steps for adding this feature, so please help me in implementing it. I have read about Redis, Cypher and Hexastore. Now I want to know how I can implement those things in this codebase.
Yes, I am working on this.
Create an endpoint like /api/cots?collection=propulsion to fetch all the items in a given class.
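The filtering behind such an endpoint might look like the sketch below. It assumes each item carries a `category` field, as in the COTS JSON example; the function name `filter_by_collection` is hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def filter_by_collection(url, items):
    """Return the items whose category matches the `collection` query parameter."""
    query = parse_qs(urlparse(url).query)
    wanted = query.get("collection", [None])[0]
    if wanted is None:
        # No filter requested: return everything.
        return list(items)
    return [item for item in items if item.get("category") == wanted]


items = [
    {"id": 1, "category": "propulsion"},
    {"id": 2, "category": "backup power"},
]
```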
Add the HydraCollection to the vocabulary.
Add a contributing guide that compiles a list of points people need to keep in mind while reporting bugs or submitting patches to the repository.
This can be added as a CONTRIBUTING.md file in the .github/ folder.
Let us discuss here some points to keep in mind, after which someone can be assigned the task of creating the guide.
Write the REST API views to serve the data with method GET, POST, PUT as defined in #3
Write in the README how to run the example to dump the data and then run the server, explaining which logic is followed to define the API from the RDF vocabulary to the endpoints.
After a starting implementation that applies some nice basic concepts, we need to find the right way for improving the models.
I segregated the models for properties (AbstractProperty and Property), but I don't exclude that in the future they will be in the same table as records that inherit from the same base model. A similar thing can be designed for records in the Graph relation: we may have three different "kinds" of triples that inherit from the same "Triple" class and are then stored in the same table.
The Graph relation is really sparse at this moment. At first I came up with a solution like:
Add an Entity() class to build up every type of triple (instantiated, abstracted, terminated). To have for every record something like a triple:
(Entity(Instance('A')), Entity(Property('B')), Entity(Terminal('C'))).
Or split the triple into tuples:
(Entity(Instance('A')), Entity(Property('B'))) , (Entity(Property('B')), Entity(Terminal('C'))).
These solutions can lead to difficulties in querying by joining though.
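The two layouts can be compared with a small sketch. Here a single `Entity` dataclass with a `kind` tag stands in for the `Entity(Instance(...))` wrappers described above; this simplification and the helper names are assumptions, not the hydrus models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    kind: str   # "instance", "property" or "terminal"
    value: str


def make_triple(subject, predicate, obj):
    """Build one record holding the whole triple."""
    return (
        Entity("instance", subject),
        Entity("property", predicate),
        Entity("terminal", obj),
    )


def split_triple(triple):
    """Split a triple into two tuples that share the property entity."""
    subject, predicate, obj = triple
    return (subject, predicate), (predicate, obj)


t = make_triple("A", "B", "C")
```

Rebuilding the triple from the split form requires matching the two tuples on the shared property, which is where the joining cost comes from.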
Check this interesting video about optimizing hashmaps for ideas.
We are using try/except to check every key in the dictionary and assign it to a new variable.
Rather, to validate the current payload we should have a generic function to which we can pass the keys and values to check, and which returns the keys or raises an exception.
[Current Sample code structure] (https://ibb.co/ciM4Vn)
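A generic validator of that kind could be as small as the following sketch. The name `validate_payload` and the choice of `KeyError` are assumptions for illustration.

```python
def validate_payload(payload, required_keys):
    """Return the values for `required_keys`, or raise if any key is missing."""
    missing = [key for key in required_keys if key not in payload]
    if missing:
        # Report every missing key at once instead of failing on the first.
        raise KeyError("missing keys: " + ", ".join(missing))
    return {key: payload[key] for key in required_keys}


payload = {"name": "2KV backup power", "year": "2015"}
fields = validate_payload(payload, ["name", "year"])
```

A single call replaces one try/except block per key.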
yes
@Mec-iS -can you provide any suggestion here
see #11 (comment)
Methods' docstrings do not list the inputs in their signatures.
Every method should have a docstring in this format:
"""
Extensive explanation of what the method does.
:param input1: description of the input1
:param input2: description of input2
:return return_type: description of returned object
"""
A test suite to assert that the actual endpoints published are all the ones defined in the ApiDoc, with the expected functionalities.
Example of test case in pseudo-code:
api_doc = some_api_doc  # can be the subsystem API
class_ = api_doc.get_class(type_=some_type)  # get a random class from the ApiDoc
req = requests.get('/api/some_type')  # run requests to the class with different methods
req1 = requests.put('...', some_data)
assert req.json()['@type'] == expected_type
assert some_data == data_stored_in_the_db
...
Try to refactor the codebase by implementing type annotations and type checking before runtime using mypy
Hi guys,
Usage and Design page links are not working in README .
As of now, the server URL is not shown when the script is run.
Display URL in the output.
Yes
Since there is no __init__.py file in the tests folder, pytest can't resolve the different imports.
Hence there should be an __init__.py file in the tests folder.
For good maintenance and useful reports, we should add a CI service to the repository.
There are several providers that offer free CI services to open source projects.
In the Adding Classes and Properties section, properties = doc_parse.get_all_properties(doc.generate()) will raise a TypeError. I checked the code and found that the right parameter should be classes, defined above.
The README of this project does not specify whether the Python version is Python 2 or Python 3. It would be nice if it were specified.
I think that this project is using Python 2. Please enlighten me if I am wrong.
Can I work on this issue and update the README?
After having a good setup for #17, and once it is explained how to run the server, write a unit test to test the functionalities of the endpoints using requests with test payloads for every endpoint.
There are different approaches to be tried about how the user approaches the definition of his/her API.
The procedure we need to implement can be realized as:
ApiDocumentation first)
Check how this flow is handled in Levanzo.
Check official use cases: https://github.com/HydraCG/Specifications/commits/master
Currently, the feature is still in TODO.
Line 22 in f88d0d0
We can add a unique nonce by using random numbers generated using a seed, which can be the sha224 digest of user id or any other parameter.
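A minimal sketch of nonce generation follows. Note that it swaps the seeded random generator suggested above for the standard library `secrets` module, because a generator seeded from a fixed value such as a user id digest reproduces the same sequence on every run. The `make_nonce` name and the in-memory `used` set are assumptions.

```python
import secrets


def make_nonce(used, nbytes=16):
    """Return a hex nonce not present in `used`, and record it there."""
    while True:
        nonce = secrets.token_hex(nbytes)
        if nonce not in used:
            used.add(nonce)
            return nonce


used = set()
first = make_nonce(used)
second = make_nonce(used)
```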
Yes
@Mec-iS Please elaborate on this TODO.
PS: Working on this. PR coming up in a while.
Based on the papers listed in the Gitter channel, define a possible evolution for Hydrus as a mature Web server.
https://arxiv.org/abs/1610.06264
https://arxiv.org/abs/1603.04068
http://arxiv.org/abs/1701.05397
http://arxiv.org/abs/1707.06507
[UPDATE]
http://redisgraph.io/design/
Introduction and paper here: https://redis.io/topics/indexes#representing-and-querying-graphs-using-an-hexastore
It would be great to have a tutorial like this for developer: https://bamtech.gitbooks.io/dev-standards/content/backend/graphql-js/getting-started-with-apollo-server-dataloader-knex.mo.html
A HYDRA server that can serve data and metadata to a client (this layer can be split into a traditional lower level server relying on a graph database plus a "HYDRA middleware"), A client that can "understand" HYDRA metadata and connect to HYDRA-enabled services, and possibly "learn and remember" about past interactions with other services to store its own set of concepts to be used in the usage's domain.
I think we must insist on developing the client and server as independently as possible. Ideally, different people would work on each part. If we end up with tightly coupled client and server, we will fail to demonstrate the value of Hydra.
Now that we need to run computations on classes defined in the ApiDoc we need a way of creating Python classes or extending or wrapping SQLAlchemy models to add the needed methods for every type of class.
In general, we need some sort of abstract layer that:
# an instance of the Drone model is loaded from the db
instance = Instance()  # of `type_` equals `Drone`
class_ = instance.type_  # should find a way to return a class
# some attribute needs to be set taken from the ApiDoc, like SupportedOperations
# so we need some method (defined in the model) that allows something like Python `getattr` and `setattr`.
def get_status(drone):
    """retrieve a status object for an instance of a class that holds a Status"""
    return drone.status
def set_status(drone, value):
    """set the value for status if the class support the Status object (as in the ApiDoc)"""
    drone.status = value
class_.add_supported_property('get_status', get_status)
class_.add_supported_property('set_status', set_status)
instance.set_status(Status())  # also Status is a class generated from the
To summarize, we need a factory for Python classes (newly generated or extending the SQLAlchemy class) so that we can map the instances in the simulation. At simulation start-up all the RDF classes in the db are loaded into Python classes, and all the supported methods are added to them. Then all the instances from the db are instantiated, and all the methods are ready to be used to map the actions/movements of the components in the simulation.
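Such a factory can be prototyped with the built-in `type()`. In the sketch below, `make_class` and the list of supported properties are hypothetical; in practice they would be derived from the ApiDoc and possibly combined with a SQLAlchemy base class.

```python
def make_class(name, supported_properties):
    """Create a Python class whose attributes come from an ApiDoc-like description."""
    def __init__(self, **values):
        # Every supported property starts as None unless a value is given.
        for prop in supported_properties:
            setattr(self, prop, values.get(prop))

    namespace = {
        "__init__": __init__,
        "supported_properties": tuple(supported_properties),
    }
    return type(name, (object,), namespace)


def get_status(self):
    return self.status


def set_status(self, value):
    self.status = value


# Generate the class at start-up, then attach the supported methods.
Drone = make_class("Drone", ["status"])
Drone.get_status = get_status
Drone.set_status = set_status

drone = Drone()
drone.set_status("active")
```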
Mypy type annotations in some files are gone after merging #117
All files should have mypy type annotations
See app.py and several other files
https://wiki.python.org/moin/DocumentationTools
Autobuilding of documentation should be implemented with Sphinx or similar tool, and the documentation be published at https://readthedocs.org/
Docstrings and functions' signatures should be ported to supported formats.
see comment on #53 >>> #53 (comment)