
ThingsJS is a framework for running JavaScript applications on IoT devices such as Raspberry Pis. For more details, see below:

Home Page: http://thingsjs.io

License: MIT License

JavaScript 99.93% HTML 0.07%

middleware iot-platform iot-framework iot-middleware iot-cloud

thingsjs's Introduction

ThingsJS

ThingsJS is a framework for running JavaScript applications on IoT devices such as Raspberry Pis.

NOTE: This repository is currently under active development and its contents are subject to breaking changes.

Directory Structure

/bin
/docs
/lib
    /core
    /util
/util
    /dashboard
/samples

This repository is organized thus:

bin contains the things-js CLI (shell script) that is installed with the package. Also contains default config files.
docs contains the API documentation generated by JSDoc.
lib contains the core ThingsJS code.
1. core contains the main ThingsJS objects such as Code, CodeEngine, Pubsub, Dispatcher.
2. util contains general-purpose utility modules used in the system.
util contains supplementary apps and debug tools.
1. dashboard contains the ThingsJS Dashboard application; it is an express web-application.
samples contains raw (non-instrumented) JavaScript sample code that can be dispatched to ThingsJS workers.

Dependencies

The ThingsJS framework uses MQTT Pub/Sub as its main communication mechanism and assumes the existence of an active MQTT service. Either Mosquitto or Mosca will do. ThingsJS provides a basic Mosca server that can be started with the things-js pubsub command. Within the ThingsJS framework, the Pub/Sub service is referenced only by its URL (e.g. mqtt://localhost), so no specific implementation of the service is assumed.

~$ things-js pubsub

For running the Dashboard web-application, MongoDB is required as the Dashboard uses it to store the end-user code.

~$ service mongod start

Getting Started

For trying out the framework, you can follow the steps below:

Installation

Option 1

git clone this repository

~$ git clone https://github.com/DependableSystemsLab/ThingsJS.git

Run npm install -g to install the package (this downloads all the npm dependencies and links the CLI tool). You may need to prefix the command with sudo, depending on your NodeJS installation. The -g (global installation) option is needed for using the CLI. If you don't plan on using the CLI, you can omit the -g option.

~$ cd ThingsJS
~/ThingsJS$ npm install -g

Option 2

Install via npm

~$ sudo npm install -g things-js

You may omit the sudo depending on your NodeJS install settings.

Using the CLI

Along with the API, a CLI is included for convenience. Here are some of the commands:

* things-js pubsub
* things-js dashboard
* things-js worker {config}
* things-js instrument {code}

You can find the full list of commands here.

To start the Dashboard Application:

~$ things-js dash

#OR

~$ things-js dashboard

By default it connects to MQTT at localhost:1883, MongoDB at localhost:27017/things-js-fs, and listens on localhost:3000. To start the dashboard with a different configuration, you can use the -c or --config options with the config file path provided as an argument. e.g.

~$ things-js dash -c my_config.conf

This will start a web-application served at the specified port. You can watch the demo of the Dashboard here:

Click Image to see Demo Video

Running a ThingsJS worker:

To start a ThingsJS worker, first you need to create a directory that will provide the NodeJS environment. This is because the worker needs to have a reference to the things-js module and any other npm modules that a ThingsJS user (developer) may require. If the worker cannot find a link to a node_modules directory, it will throw an error.

~$ mkdir hello_things
~$ cd hello_things
~$ npm link things-js

#create a config file for the worker first (e.g. node_00.conf) 

~/hello_things$ things-js worker node_00.conf

The configuration file is a required argument for starting the worker. It should contain the following information:

{
    "pubsub_url": "mqtt://localhost",
    "id": "node_00",
    "device": "raspberry-pi3"
}

To instrument raw JavaScript code into "ThingsJS-compatible" code:

~$ things-js inst my_code.js

#OR

~$ things-js instrument my_code.js

By default the output file will have the same file name, with the extension .things.js. To specify the output file name, provide the optional argument with -o or --output. e.g.

~$ things-js inst my_code.js -o my_code.instrumented.js

License

MIT

thingsjs's Issues

Process' EXIT status over Pubsub never broadcasted

Scheduler doesn't have API for kill() and pause()

Regression tests

We will be refactoring the core library and doing a major rewrite to optimize the migration routine. Most of the higher-level API, however, will remain unchanged. Once the CLI is in good shape, we should write regression tests to ensure the refactored code behaves the same.

Code Engine: Required module for blinking LED via GPIO will not compile on Windows

Describe the bug
The onoff module is being used to allow Code Engine to indicate a busy status via blinking LED. There is a problem requiring the onoff module on Windows because it assumes a Linux interface to access the GPIOs.

This isn't a major issue when we consider that ThingsJS is supposed to run on IoT devices but may be a problem for developers who have Windows computers.

To Reproduce
Run the things-js command on a Windows machine

Desktop:
Windows 10

We need a way to identify when a device is running more than one Code Engine instance

The Scheduler algorithm takes into account the current available memory of each live device.
Currently we expect that each device only runs one Code Engine instance, and the Code Engine instance will broadcast the device's available memory.

However, if one device runs n Code Engine instances, each instance reports the same device memory, so the Scheduler will think there is n times the available memory.

Perhaps each Code Engine can also provide in its meta what device it belongs to (e.g. via MAC address).
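
As a rough sketch of the MAC-address idea, the snippet below derives a device identifier with Node's os.networkInterfaces() and counts available memory once per device instead of once per engine. The function names and the device_id and available_memory fields are hypothetical and are not part of the current ThingsJS metadata.

```javascript
var os = require('os');

// Returns the first non-internal, non-zero MAC address, or null if none exists.
// `interfaces` can be injected for testing; it defaults to the real interfaces.
function getDeviceId(interfaces){
	interfaces = interfaces || os.networkInterfaces();
	var names = Object.keys(interfaces).sort();
	for (var i = 0; i < names.length; i++){
		var addrs = interfaces[names[i]] || [];
		for (var j = 0; j < addrs.length; j++){
			var addr = addrs[j];
			if (!addr.internal && addr.mac && addr.mac !== '00:00:00:00:00:00'){
				return addr.mac;
			}
		}
	}
	return null;
}

// Hypothetical Scheduler-side helper: engines that share a device_id
// contribute the device's available memory only once.
function availableMemoryByDevice(engineMetas){
	var perDevice = {};
	engineMetas.forEach(function(meta){
		perDevice[meta.device_id] = meta.available_memory;
	});
	return perDevice;
}
```

With something like this, two engines on the same Raspberry Pi would report the same device_id and the Scheduler could group them before computing a schedule.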

GFS: appendFile() and deleteFile() have small bug related to query

appendFile() currently modifies no object because there is no query passed in to find the object.

Currently in lib/extensions/gfs/gfs.js:169:

				return FSObject.findOneAndUpdate(node._id, {

It should be:

				return FSObject.findOneAndUpdate( { _id: node._id }, {

This is the same case for deleteFile() in lib/extensions/gfs/gfs.js:238.

Dispatcher can issue a migration request that comes back as failed even though it succeeded

Let's say we want to migrate Foo from Pi0 to Pi1, using Dispatcher.

The current protocol is then:

1. Dispatcher sends a pubsub message to Pi0 (migrate Foo).
2. Pi0 sends a pubsub message to Pi1 (run my snapshot of Foo).
3. Foo is successfully migrated to Pi1.
4. Pi1 sends an ACK back to Pi0.
5. Pi0 sends an ACK back to Dispatcher (your request was successful).

If Pi0 dies after step 3 but before step 5, Dispatcher will never receive an ACK, the migrate request will time out, and the user will think the migration failed even though it succeeded.
Perhaps Dispatcher should listen for the ACK from Pi1 instead of from Pi0?

Change code database interface to REST API (Dashboard)

Right now the interface for saving code in DB from front-end is done through WebSocket by defining the action key in the message. It is better to isolate this part out of the WebSocket message handling routine, both from the back-end and front-end.

We should instead create an express.Router and provide a RESTful endpoint for the code stuff. This requires updating the server-side code in util/dashboard/application.js as well as the client-side code in util/dashboard/www/app.js. We can use WebSocket just to broadcast the change in the DB if there was one.

By doing this we will limit WebSocket usage mostly to relaying the pub/sub messages.

Dispatcher.prototype.moveCode and Dispatcher.prototype.spawnCode missing arguments

Dispatcher uses the functions moveCode() and spawnCode() to make Code Engine migrate or spawn code via pubsub. Code Engine relies on the kwargs object sent by the Dispatcher over pubsub to execute the command correctly. For migration and spawning, kwargs needs to include an instance_id.

However, Dispatcher does not provide this to Code engine, e.g.:

Dispatcher.prototype.moveCode = function(from_id, to_id, code_name){
	this.sendCommand(from_id, 'migrate_code', {
		engine: to_id,
		code_name: code_name
	})
};

Here, Dispatcher only provides kwargs with engine and code_name, causing errors when Code engine receives the command, e.g.:

[ERROR] Code count.js ipcSend("undefined", "SNAPSHOT"): No child process
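
One possible shape of the fix is sketched below. The Dispatcher here is a minimal stand-in whose sendCommand only records what would be published, and passing instance_id as a fourth parameter is an assumed signature change, not the actual ThingsJS API.

```javascript
// Minimal stand-in for Dispatcher; sendCommand only records the command
// instead of publishing it over pubsub.
function Dispatcher(){
	this.sent = [];
}
Dispatcher.prototype.sendCommand = function(engine_id, command, kwargs){
	this.sent.push({ engine_id: engine_id, command: command, kwargs: kwargs });
};

// Assumed signature: instance_id is passed explicitly and forwarded in kwargs.
Dispatcher.prototype.moveCode = function(from_id, to_id, code_name, instance_id){
	if (!instance_id) throw new Error('moveCode requires an instance_id');
	this.sendCommand(from_id, 'migrate_code', {
		engine: to_id,
		code_name: code_name,
		instance_id: instance_id
	});
};
```

Failing early in the Dispatcher, rather than letting Code Engine call ipcSend with an undefined instance, keeps the error close to the caller that made the mistake.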

Scheduler: Incorrect view of the network for nodes that don't exit elegantly

Code Engine will report that it is dead over pubsub upon receiving a kill call, which Scheduler relies on to find available devices:

			if (engine.status !== 'dead'){
				var stat = engine.getStat();
				mapping[engine.id] = {
					available_memory: (stat.device_memory / 1000000),
					processes: {}
				};
			}

This is a problem if a Code Engine instance crashes or exits with Ctrl+C. Currently, these dead engines remain on the list of available devices for every new invoke period of Scheduler, but should be considered dead.
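
A common way to handle engines that disappear without announcing it is a heartbeat timeout: treat an engine as dead when its last status report is too old. The sketch below assumes each engine record carries a last_seen timestamp in milliseconds; that field and the function name are hypothetical.

```javascript
// Returns the ids of engines that are not marked dead and that have
// reported within the last timeout_ms milliseconds.
// last_seen is a hypothetical field holding the time of the last report.
function liveEngineIds(engines, now, timeout_ms){
	return Object.keys(engines).filter(function(id){
		var engine = engines[id];
		return engine.status !== 'dead' && (now - engine.last_seen) <= timeout_ms;
	});
}
```

The timeout would need to be a few multiples of the reporting interval so that a single delayed report does not remove a healthy engine from the schedule.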

Race condition for Pubsub unsubscribe function

Currently what happens is that in Pubsub.prototype.unsubscribe, the handler for the topic is first deleted before the mqtt client unsubscribes from the topic. During this time however (when the topic handler no longer exists but the mqtt client has not yet completed unsubscribing), a publisher can publish to the topic which will invoke a non-existent handler.
Error received:

   Uncaught TypeError: self.handlers[topic] is not a function
      at MqttClient.<anonymous> (C:\Users\Benji\ThingsJS\ThingsJS\lib\core\Pubsub.js:39:23)

The handler should probably be removed only after the mqtt client has successfully unsubscribed.
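
The ordering described above can be sketched as follows. The Pubsub constructor is a simplified stand-in rather than the real lib/core/Pubsub.js; the only thing assumed about the client is an unsubscribe(topic, callback) method, which the mqtt package's client provides. The onMessage guard is an extra safety net for messages that are already in flight.

```javascript
// Simplified stand-in for the Pubsub wrapper around an MQTT client.
function Pubsub(client){
	this.client = client;
	this.handlers = {};
}

// Invoked for every incoming message; topics without a handler are ignored
// instead of raising a TypeError.
Pubsub.prototype.onMessage = function(topic, message){
	var handler = this.handlers[topic];
	if (typeof handler === 'function') handler(message);
};

// The handler is deleted only after the client confirms the unsubscribe.
Pubsub.prototype.unsubscribe = function(topic, callback){
	var self = this;
	self.client.unsubscribe(topic, function(err){
		if (!err) delete self.handlers[topic];
		if (callback) callback(err);
	});
};
```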

Exporting the list of "output topics" for components (metadata)

In order to make the dashboard more generic (i.e., so that output can be shown, for any component), we need a mechanism by which components can "announce" their relevant "output topics", and the type of contents that they publish. This should be done through the metadata.

Example: for the video streamer, one output topic could be defined: the video stream, the type of contents would be "PNG" images. For the motion detector, two output topics could be defined: the motion-video stream (PNG images type) and whether motion is detected (perhaps JSON or text output).

Common output types could be: PNG (sequence of PNG images -- can emulate a video stream), text, JSON, CSV (all text-based formats would all be displayed in a text-like format, for the moment) etc. Eventually we could add support for real video streams (more efficient than PNG sequences), audio, etc.

Code Engine: Error thrown if reporting interval already begins as kill() is called

If CodeEngine.kill() is called before the asynchronous callback in CodeEngine.reportResources is executed (as a result of the call in CodeEngine.reportTimer), an error is thrown because the pubsub node is dead.

(node:9800) UnhandledPromiseRejectionWarning: Error: client disconnecting

Shell needs to be updated to use the latest API

Shell.js is currently using version 1 API, so some of the commands no longer work.
Since the last update to Shell.js, we have made breaking changes to:

Pubsub
Code
CodeEngine
Dispatcher

Perhaps we should take this opportunity to revise some of the shell commands as well.

Scheduler: First-fit algorithm throws exception for empty device list

Scheduler.js:282 : devices[0] is undefined

tasks.forEach(function(task){
	var most_space = Object.keys(mapping).reduce(function(acc, id){
		return (mapping[id].available_memory > mapping[acc].available_memory) ? id : acc;
	}, devices[0].id);

Error thrown:

     TypeError: Cannot read property 'id' of undefined
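
A minimal guard for the empty case might look like the sketch below. The helper name mostSpace is hypothetical; it derives the starting id from mapping itself, so it no longer depends on devices[0].

```javascript
// Returns the id of the device with the most available memory,
// or null when the mapping is empty.
function mostSpace(mapping){
	var ids = Object.keys(mapping);
	if (ids.length === 0) return null;
	// With no initial value, reduce() starts from the first id.
	return ids.reduce(function(acc, id){
		return (mapping[id].available_memory > mapping[acc].available_memory) ? id : acc;
	});
}
```

The caller would then need to decide what to do when null is returned, for example skipping the scheduling round until at least one device is available.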

npm install not working

Expected behavior
npm install -g things-js should work, but it gives errors.

Desktop (please complete the following information):

OS: Ubuntu 16.04
Browser Chrome
Version v2.0.0

Check for valid instance is only done in Code.prototype.ipcSend()

In lib/core/Code.js, Code.prototype.pause(), Code.prototype.resume(), and Code.prototype.snapshot() all call ipcSend() which checks that the instance_id passed in exists in the processes dict. However, pause(), resume(), and snapshot() also rely on the instance_id being defined in the processes dict, and will throw an error from a bad instance_id leading to an undefined object - e.g.:

	if (instance.status === Code.Status.RUNNING){
	             ^

TypeError: Cannot read property 'status' of undefined
    at Code.pause (ThingsJS\lib\core\Code.js:678:15)
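
One way to centralize the check is a small lookup helper that every method calls before touching the instance. The sketch below uses a stripped-down Code object; getInstance and the status strings are hypothetical and only illustrate the pattern.

```javascript
// Stripped-down stand-in for lib/core/Code.js.
function Code(){
	this.processes = {};
}
// Status values are placeholders for this sketch.
Code.Status = { RUNNING: 'Running', PAUSED: 'Paused' };

// Hypothetical helper: fail with a clear message for an unknown instance_id.
Code.prototype.getInstance = function(instance_id){
	var instance = this.processes[instance_id];
	if (!instance) throw new Error('No such instance: ' + instance_id);
	return instance;
};

Code.prototype.pause = function(instance_id){
	var instance = this.getInstance(instance_id);
	if (instance.status === Code.Status.RUNNING){
		instance.status = Code.Status.PAUSED;
	}
	return instance.status;
};
```

resume() and snapshot() could use the same helper, so a bad instance_id produces one descriptive error instead of a TypeError deep inside each method.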

Documentation update: install instructions

We need to add alternative install instructions:

git clone option
wget tarball option
npm install option

GFS: Race condition for multiple nodes writing to same file

Because there is no lock or queue for multiple writers of the same file, these situations arise:

appendFile:

Final file contents depend on what the last successful writer read before appending. This may cause the appends from other nodes to be overwritten.

Performance bottleneck in Timers

One important observation from profiling a raw code and instrumented code is that creating anonymous functions is expensive.

In our current implementation, we create a wrapper function for each user function. We also wrap all the timer callbacks to support pause/resume of timers. This is okay when there aren't that many timer operations, but when a program is using, for example, recursive setImmediate calls to implement a loop, the lazy-compilation of the wrapped callbacks starts to be a performance bottleneck. A rough measurement shows that factorial.js slows down by ~3x when instrumented.

We need a clever way to mitigate this issue, while preserving the pause/resume capability of timers.

Code Engine reports resource using closed pubsub node after kill() is called

Each Code Engine object reports its resource usage every second over pubsub through CodeEngine.prototype.reportResource, but when CodeEngine.prototype.kill is called, its pubsub object is also killed. As a result, an error is generated every second as the Code Engine keeps trying to report its resource usage through the dead pubsub node:

(node:9224) UnhandledPromiseRejectionWarning: Error: client disconnecting
    at MqttClient._checkDisconnecting (C:\Users\Benji\ThingsJS\ThingsJS\node_modules\mqtt\lib\client.js:337:16)
    at MqttClient.publish (C:\Users\Benji\ThingsJS\ThingsJS\node_modules\mqtt\lib\client.js:377:12)
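
A possible fix is to stop the reporting timer in kill() and to guard the report itself, which also covers a report that is already scheduled when kill() runs. The CodeEngine below is a simplified stand-in; the alive flag, the topic name, and the shape of the pubsub object (publish and kill methods) are assumptions made for the sketch.

```javascript
// Simplified stand-in for CodeEngine with a once-per-second resource report.
function CodeEngine(pubsub){
	var self = this;
	self.pubsub = pubsub;
	self.alive = true; // hypothetical flag
	self.reportTimer = setInterval(function(){ self.reportResource(); }, 1000);
}

// Publishes nothing once the engine has been killed.
CodeEngine.prototype.reportResource = function(){
	if (!this.alive) return false;
	// 'engine/resource' is a placeholder topic name.
	this.pubsub.publish('engine/resource', { memory: process.memoryUsage() });
	return true;
};

// Stop the timer before killing pubsub so no further reports are attempted.
CodeEngine.prototype.kill = function(){
	this.alive = false;
	clearInterval(this.reportTimer);
	this.reportTimer = null;
	this.pubsub.kill();
};
```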

GFS: API functions invoked before the mongodb connection fully resolves fail

Since the global file system bootstraps asynchronously,
API calls made right after require('things-js').GFS(mongoUrl) fail because the connection is not open by then.

Example:

var gfs = require('things-js').GFS('mongodb://localhost/my-global-filesystem');
gfs.readFile('/my-test-file.json', function(err, data){
    console.log(data.toString());
});

In this example, if the mongo client is not ready by line 2 (which it won't be for most devices)
the readFile call fails.

If we were to keep the synchronous interface so that people don't have to write for example:

var gfs = require('things-js').GFS('mongodb://localhost/my-global-filesystem');
gfs.on('ready', function(){
    gfs.readFile('/my-test-file.json', function(err, data){
        console.log(data.toString());
    });
});

we would need to implement GFS as a duplex stream and maintain a request/response buffer.
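
As a simpler alternative to a full duplex stream, calls made before the connection opens can be queued and replayed on 'ready'. The sketch below is not the GFS implementation: BufferedFS is a hypothetical wrapper, and the backend is assumed to be an EventEmitter that emits 'ready' and exposes readFile(path, callback).

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical wrapper that buffers calls until the backend is ready.
function BufferedFS(backend){
	var self = this;
	self.backend = backend;
	self.ready = false;
	self.queue = [];
	backend.once('ready', function(){
		self.ready = true;
		// Replay the buffered calls in the order they were made.
		self.queue.splice(0).forEach(function(call){ call(); });
	});
}

BufferedFS.prototype.readFile = function(path, callback){
	var self = this;
	var call = function(){ self.backend.readFile(path, callback); };
	if (self.ready) call();
	else self.queue.push(call);
};
```

With a wrapper like this, user code can call readFile immediately after construction, and the callback simply fires later, once the connection has opened.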

Relative paths not supported for required modules

Relative paths do not resolve properly for instrumented code in require statements because they take the path relative to the location of CodeV3.js, and not the location of the current working directory.
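
One way to address this is to rewrite relative requests against an explicit base directory before calling require. The helper below is a hypothetical sketch; base_dir could be process.cwd() or the directory of the user's source file, depending on which behaviour is wanted.

```javascript
var path = require('path');

// Resolves './x' and '../x' requests against base_dir; bare module names
// such as 'mqtt' are returned unchanged so normal module lookup applies.
function resolveRequest(request, base_dir){
	var isRelative = request.slice(0, 2) === './' || request.slice(0, 3) === '../';
	return isRelative ? path.resolve(base_dir, request) : request;
}
```

Instrumented code could then call require(resolveRequest(name, base_dir)) in place of require(name).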

Code's pubsub is not killed in Code.kill() if Code created from Snapshot

A new pubsub object is created when Code.fromSnapshot is called, but is not killed when Code.kill is executed.

Guide: issues and typos

tutorials/Setup-BeagleboneBlack.md

After "Install ThingsJS using the Getting started steps"

RIoTBenchmarks.html

Proofreading required

/Community/comittee.html

Should contain list of all HQPs
Rename the page

/api

Proof developer API when completed

(Mailing list)

Create 'dev' mailing list

things-js shell

Right now the things-js CLI provides basic functionality for starting up a worker, instrumenting code, and dispatching code to a things-js worker. It would be nice to have a things-js shell in which we can query a list of active workers and manage them. i.e. dispatch code, migrate code, stop code, collect statistics, etc.

Having a shell and a comprehensive CLI will make it possible to write things-js scripts that will allow us to automate system-level tasks, such as scheduled migration, logging, etc. This will also make it easy to write system-level regression test suites.

Code Engine: Support for process.argv

Is your feature request related to a problem? Please describe.
For the IoT benchmarks, we want to be able to easily vary the rate at which the spout node emits the data.

Currently, we can edit the setInterval rate in the source code, but this requires multiple "versions" of the same code with different setInterval rates.

Describe the solution you'd like
Something akin to process.argv for Code Engine that would allow the code to be more reusable, e.g. a CodeEngine.runCode equivalent of:

> node SenMLSpout.js 3000 : emit data at a rate of 3s

Describe alternatives you've considered
Just alter source code

Monitoring resource usage per individual program instances

After we updated the CodeEngine API to accommodate multiple program instances, the web dashboard no longer displays resource usage for the programs because each program now publishes its resource statistics individually.

Code not killed if CodeEngine.kill() called before Code.run() resolves

Scenario:

CodeEngine receives a run_code command -> CodeEngine.kill() called -> Code.kill() is called on all running code for this engine -> Code.run() didn't finish resolving yet, so Code.kill() does not kill the code that is about to run because Code.processes hasn't been updated -> Error thrown once code begins to run with a 'killed' CodeEngine

(node:9036) UnhandledPromiseRejectionWarning: Error: client disconnecting

Migration causes failure if the specific code already exists on the engine code is migrating to

Code Engine keeps track of all codes it is running in a field called codes, a mapping of the code name to Code object.
When code migrates from engine n₀ to n₁, n₁'s codes field is updated to the code recovered from the snapshot

		var code = Code.fromSnapshot(kwargs.snapshot);
		self.codes[code.name] = code;

This will cause problems in an example like this:

Code Engine n₀ is running Foo.js
Code Engine n₁ is running Foo.js
n₁ migrates its Foo.js instance to n₀
n₀'s original Code object for Foo.js is overwritten and its reference is lost

This will specifically cause problems for the Scheduler because it assumes n₀ still has reference to its original Foo.js. If the next schedule is computed where n₀ should migrate its original Foo.js, an error like this will be thrown:

TypeError: Cannot read property 'snapshot' of undefined at Object.migrate_code (ThingsJS\lib\core\CodeEngine.js:269:69)
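
One possible remedy is to key the registry by instance as well as by name, so that a migrated Foo.js does not replace a local one. The sketch below assumes every Code object has a unique instance_id; the helper names and the nested layout are hypothetical.

```javascript
// codes maps code name -> { instance_id -> Code object }.
function registerCode(codes, code){
	if (!codes[code.name]) codes[code.name] = {};
	codes[code.name][code.instance_id] = code;
	return code;
}

// Returns the registered Code object, or null if it is unknown.
function findCode(codes, name, instance_id){
	return (codes[name] && codes[name][instance_id]) || null;
}
```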

process.stdin.setRawMode does not work on Windows

In bin/things-js, process.stdin.setRawMode is called to control running instrumented code through terminal input. This needs a TTY stream which is not provided by Windows and will throw the following exception:

(node:9980) UnhandledPromiseRejectionWarning: TypeError: process.stdin.setRawMode is not a function

It might be a good idea to document that the CLI works best for Unix-like OS.

Running code without a Code Engine causes Scheduler to crash

For example, if you use the ThingsJS CLI and do:

things-js run <file>

The program does not "belong" to any Code Engine instance and will cause an exception to be thrown from Scheduler.js:147 because proc.engine does not exist:

		Object.values(self.programs).forEach(function(proc){
			if (self.engines[proc.engine].status !== 'dead'){
				mapping[proc.engine].processes[proc.id] = {
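
A defensive version of this loop could skip programs that do not belong to a known, live engine. In the sketch below the function name is hypothetical, and the value stored for each process is a placeholder because the original assignment is truncated above.

```javascript
// Attaches each program to its engine's entry in mapping, ignoring programs
// whose engine is missing, dead, or absent from mapping.
function assignProcesses(programs, engines, mapping){
	Object.keys(programs).forEach(function(key){
		var proc = programs[key];
		var engine = engines[proc.engine];
		if (!engine || engine.status === 'dead' || !mapping[proc.engine]) return;
		// Placeholder value; the real Scheduler stores its own process record.
		mapping[proc.engine].processes[proc.id] = proc;
	});
	return mapping;
}
```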

File system: Need to organise the directory structure.

New files are always created in the root directory. The directory structure needs to be organized.

test:scheduler: asynchronous mocha tests should use mocha's async API

Many of the scheduler tests are failing and while I haven't looked closely enough to pin-point the cause of the failures, I see many "hacky" routines implemented just to handle asynchronous operations.

Mocha provides async support for code using Promise or callbacks thus:

// synchronous test
it('should test synchronously', function(){
        // test body
});

// asynchronous test (callback)
it('should test asynchronously', function(done){
        // test body
});

// asynchronous test (Promise)
it('should pass when Promise resolves', function(){
    return new Promise(function(resolve, reject){
        // test body
    });
});

There seem to be a few places where the tests already use this API, but it should be used across all async tests so that we don't have to hack around it.

Global File system: path direct error in readfile and writefile function

The writeFile and readFile functions only work for files in the root folder. When I specify a subdirectory such as apps, they fail to navigate into it to perform the read or write. For example,
GFS.readFile('./code/test.json', function(err, data){
    console.log(data);
});
should read the file inside the code folder, but it throws a not-found error. Similarly,
GFS.writeFile('./code/test.json', testdata, function(err, data){
    console.log(data);
});
fails to write the data inside the code folder and writes it to the root folder instead.

Scheduler: Exception thrown when CodeEngine device stats are not available

When a new Code Engine instance begins to run, it publishes its existence to the topic engine-registry. Dispatcher listens to this topic to keep a mapping of all existing engines in this.engines.

Scheduler, which inherits from Dispatcher, uses this mapping when finding available devices (Scheduler.js:136 in Scheduler.prototype._assess):

		Object.values(self.engines).forEach(function(engine){
			if (engine.status !== 'dead'){
				var stat = engine.getStat();
				mapping[engine.id] = {
					available_memory: (stat.device_memory / 1000000),
					processes: {}
				};
			}
		});

If a Code Engine instance exits the network after it publishes its existence to engine-registry but before it publishes its status in a setInterval reportStatus() call, the following exception will be thrown:

[Scheduler] Invoke start
(node:1124) UnhandledPromiseRejectionWarning: TypeError: Cannot read property 'device_memory' of undefined
    at ThingsJS\lib\core\Scheduler.js:140:30
    at Array.forEach (<anonymous>)
    at ThingsJS\lib\core\Scheduler.js:136:31
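
A minimal guard is to skip engines that have not reported any stats yet. The sketch below mirrors the loop above as a standalone function; the name assessEngines is hypothetical, and an engine without stats is simply left out of the mapping until its first report arrives.

```javascript
// Builds the device mapping, skipping dead engines and engines whose
// stats have not been published yet.
function assessEngines(engines){
	var mapping = {};
	Object.keys(engines).forEach(function(key){
		var engine = engines[key];
		if (engine.status === 'dead') return;
		var stat = engine.getStat();
		if (!stat || typeof stat.device_memory !== 'number') return;
		mapping[engine.id] = {
			available_memory: (stat.device_memory / 1000000),
			processes: {}
		};
	});
	return mapping;
}
```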