data2semantics / d2s-tools
Some tools for dealing with RDF data in a Machine Learning/Complexity context
To reproduce: run the iterate-test workflow, open the report for the multiplier module. Set the horizontal axis and color to i:second. Dots with the same i:second have different colors.
When I log the colors to the JS console, everything looks alright, so the problem may be in the plotting code.
The repeat key for modules has not yet been implemented.
Currently, when a module has a non-public constructor, the workflow fails with a mysterious error. This should be checked beforehand, and a clearer error should be reported.
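A minimal sketch of such an up-front check, using plain Java reflection. The class and method names (`ConstructorCheck`, `requirePublicConstructor`) and the two example classes are hypothetical and not part of the project; where the check would be called from (for instance during parsing) is left open.

```java
// Hypothetical helper, not part of the project.
class ConstructorCheck {

    /** Throws a descriptive exception if the class has no public constructor. */
    static void requirePublicConstructor(Class<?> moduleClass) {
        // Class.getConstructors() returns only the public constructors.
        if (moduleClass.getConstructors().length == 0) {
            throw new IllegalArgumentException(
                "Module class " + moduleClass.getName()
                + " has no public constructor, so the workflow cannot instantiate it.");
        }
    }
}

// Example classes, for illustration only.
class Hidden { private Hidden() {} }
class Visible { public Visible() {} }
```

Calling `ConstructorCheck.requirePublicConstructor(Hidden.class)` fails immediately with a message naming the class, instead of failing later during execution.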
Search the classpath for classes tagged Domain, and add them automatically (see Global.static)
The methods rank(), ready() and finished() in AbstractModule have a greedy implementation, which means that every method call walks through the whole workflow multiple times. If this turns out to be a performance problem, they should cache their results.
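One way the caching could look for rank(), as a sketch only. `CachedRankModule` is a hypothetical stand-in, not the real AbstractModule, and it assumes that a module's rank is 0 when it has no inputs and otherwise one more than the highest rank among its inputs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in; the real AbstractModule is shaped differently.
class CachedRankModule {
    private final List<CachedRankModule> inputs = new ArrayList<>();
    private int cachedRank = -1;   // -1 means "not computed yet"
    int computations = 0;          // counts real computations, for illustration only

    void dependOn(CachedRankModule other) {
        inputs.add(other);
        cachedRank = -1;           // resets this module only, not modules downstream
    }

    int rank() {
        if (cachedRank >= 0) {
            return cachedRank;     // cached: no walk through the workflow
        }
        computations++;
        int rank = 0;
        for (CachedRankModule input : inputs) {
            rank = Math.max(rank, input.rank() + 1);
        }
        cachedRank = rank;
        return rank;
    }
}
```

ready() and finished() change while the workflow runs, so caching them would also need an invalidation step whenever an upstream module changes state; that part is not sketched here.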
My guess is that at the moment a workflow with a cyclic dependency fails by causing an infinite recursion in the rank method.
We should check during or after parsing whether the workflow has cycles and throw a nice supportive exception.
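The check could be a depth-first search over the module dependencies, sketched below. It assumes the parsed workflow can be expressed as a map from each module name to the names of the modules it takes input from; `CycleCheck` and `requireAcyclic` are hypothetical names, not part of the project.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the project.
class CycleCheck {

    /** Throws if the dependency map (module name -> names it depends on) contains a cycle. */
    static void requireAcyclic(Map<String, List<String>> dependsOn) {
        Set<String> done = new HashSet<>();        // modules already fully explored
        Deque<String> path = new ArrayDeque<>();   // modules on the current DFS path
        for (String module : dependsOn.keySet()) {
            visit(module, dependsOn, done, path);
        }
    }

    private static void visit(String module, Map<String, List<String>> dependsOn,
                              Set<String> done, Deque<String> path) {
        if (done.contains(module)) {
            return;
        }
        if (path.contains(module)) {
            // Reaching a module that is still on the current path means a cycle.
            throw new IllegalStateException(
                "Workflow has a cyclic dependency involving module '" + module
                + "' (path: " + path + " -> " + module + ")");
        }
        path.addLast(module);
        for (String dependency : dependsOn.getOrDefault(module, List.of())) {
            visit(dependency, dependsOn, done, path);
        }
        path.removeLast();
        done.add(module);
    }
}
```

Because a module is never entered twice on the same path, the recursion depth is bounded by the number of modules, so a cyclic workflow produces an exception naming the modules involved rather than an infinite recursion.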
For quick modules, with a single output, it would be nice to be able to use a shorthand. Something like the following:
module:
  name: big complex module
  inputs:
    dataset: smooth(csv('C:\data\dataset.csv'), 0.3)
Which would be equivalent to
- module:
    name: csv
    inputs:
      file: 'C:\data\dataset.csv'
- module:
    name: smooth
    inputs:
      data: csv.result
      smoothing factor: 0.3
- module:
    name: big complex module
    inputs:
      dataset: smooth.result
This should be thought out carefully on the design level before implementation.
For a small workflow, we should allow the user to start with the list of modules directly in the yaml file.
Specifically, the keys workflow, name and modules should be made optional.
Currently, we tend to assume that all variables have a JavaType DataType. We need to work out exactly how the platform should handle types in different languages and how these should be converted.
Currently providing a set of references as an input in the workflow description is not supported (only raw values).
We should even be able to create a mixed list, with some raw values, and some references. This will be tricky.
Consider the following way of describing modules:
@Module(name="Adder", description="Adds two numbers together")
public class Adder
{
    @Main
    public static int add(
        @In(name="first") int first,
        @In(name="second") int second)
    {
        return first + second;
    }
}
This is a concise way of writing small modules, but it's tricky to implement. This should be thought out carefully before implementation.
If a module has different methods of construction (like different constructors), the same @In name is allowed to occur in more than one of them, but only with the same type. This should be checked explicitly, probably by JavaDomain.
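A sketch of that check using reflection. The `In` annotation declared here is a stand-in so the example is self-contained; the platform's real annotation may be declared differently. `InTypeCheck`, `requireConsistentInputs` and the two example classes are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.Parameter;
import java.util.HashMap;
import java.util.Map;

// Stand-in for the platform's annotation (assumption: runtime-retained, on parameters).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.PARAMETER)
@interface In { String name(); }

// Hypothetical helper, not part of the project.
class InTypeCheck {

    /** Throws if the same @In name is used with different types across constructors. */
    static void requireConsistentInputs(Class<?> moduleClass) {
        Map<String, Class<?>> seen = new HashMap<>();
        for (Constructor<?> constructor : moduleClass.getDeclaredConstructors()) {
            for (Parameter parameter : constructor.getParameters()) {
                In in = parameter.getAnnotation(In.class);
                if (in == null) {
                    continue; // parameter is not a declared input
                }
                // putIfAbsent returns the type recorded earlier for this name, if any.
                Class<?> previous = seen.putIfAbsent(in.name(), parameter.getType());
                if (previous != null && previous != parameter.getType()) {
                    throw new IllegalArgumentException(
                        "Input '" + in.name() + "' of " + moduleClass.getName()
                        + " is declared as both " + previous.getName()
                        + " and " + parameter.getType().getName());
                }
            }
        }
    }
}

// Example classes, for illustration only.
class Consistent {
    public Consistent(@In(name = "first") int first) {}
    public Consistent(@In(name = "first") int first, @In(name = "second") int second) {}
}

class Conflicting {
    public Conflicting(@In(name = "first") int first) {}
    public Conflicting(@In(name = "first") String first) {}
}
```

`requireConsistentInputs(Consistent.class)` passes, while `requireConsistentInputs(Conflicting.class)` throws and names the input and both conflicting types.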