sharmalab / datascope

Interactive linked visual query system for large datasets

Home Page: https://sharmalab.github.io/Datascope/

License: BSD 3-Clause "New" or "Revised" License

JavaScript 94.53% Python 0.11% CSS 3.10% HTML 2.23% Dockerfile 0.03%
nci-qin tcia-dac

datascope's Introduction

DataScope

We propose an environment for visualizing and exploring multidimensional data. It offers a new search interface as an alternate way to explore data, along with dynamic dashboards that can be extended to support data exploration using JavaScript libraries such as crossfilter and dc.js. The approach is extensible to data from other remote archives.

Quickstart guide

(requires docker)

  • Enter the datascope directory (this directory)
  • docker build -t datascope .
  • docker run -p 3001:3001 datascope

Running Without Containers

Prerequisites
  • Install Node.js and NPM
  • sudo npm install -g webpack
  • sudo npm install -g forever (optional; recommended for production deployments)
  • sudo npm install -g apidoc
Installation
  • Clone the repository
  • Enter the datascope directory (this directory)
  • Get dependencies with npm install
  • Run npm run-script build
Running
  • Copy the config and data folders of one of the examples (in examples) into this directory

  • Modify the files present in config to fit your needs:

    • dataSource.json
    • dataDescription.json
    • interactiveFilters.json
    • visualization.json
    • dashboard.json (For dashboard settings)
  • Run node app.js

  • Go to http://localhost:3000 in your favorite browser.

Read the User Guide for more details

Recommended production deployment

We recommend deploying Datascope with forever.js.

  • Install forever.js: npm install forever -g
  • Start the app: forever start app.js
  • forever ps lists the instances currently running, with uptime, log details, etc.

Developers

API Documentation

Head over to API Doc for documentation about Datascope's REST API.

datascope's People

Contributors

ashishof77, birm, dependabot[bot], lastlegion, loghijiaha, sharmaashish, srflorea

datascope's Issues

Refactor FilteringAttribute.jsx

Right now the code for all the interactive filter types is in FilteringAttribute.jsx.

We should consider splitting it into separate react components as we've done for Visualizations. So we'll have a directory of react components with each component having a separate file, e.g. PieChart.jsx, BarChart.jsx etc.

`Redo search in this area` approach for certain visualizations

For certain visualizations, to allow for smoother interactions we might want to consider using "redo search in current area" or for our purposes "filter in current area".

So whenever a user zooms into, say, a parallel coordinates plot or a map, they have the option to set the appropriate filters in the rest of the dashboard.

Backend Improvements

  • calls don't appear to be stateless
  • calls may be too specific to certain visualizations (e.g. table)
  • support for user queries on data through datascope
  • documentation

Data load fails for some cases

The join of two provided files and the manually joined data (~1k rows) both gave an error, which seems to suggest that the read/join operation is timing out.
There is likely more to discover here, and hopefully a fix that prevents this.

Add a numeric console to DataScope

The use case would be one where a user wants to do some simple number crunching on a column. Say splice data in a column and then do a unique count. Or maybe create a new column based on existing data.
This could be a major feature creep so we have to be careful. Maybe start with something simple and see how users respond to it.

Data in browser

Allow visualizations like parallel coordinates by showing at least some individual data.

Crossfilter in Browser

  • Add Rest call for crossfilter
  • Add module for crossfilter
  • Add front-end for crossfilter

DataTable pagination issues with large datasets

The DataTable is fairly slow on large datasets, with response times of 4-8 seconds. We need to look at an optimized pagination strategy.

Table pagination and backend processing are fairly unoptimized right now, since we call .top(Infinity) on one of the dimensions to get the entire dataset.

dc-js/dc.js#966 is a useful resource to solve this.
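
As a rough illustration of the direction, the sketch below fetches a single page instead of calling .top(Infinity). It assumes the dimension supports top(k, offset), as crossfilter2 does; whether the crossfilter version used by Datascope does is an assumption. The names fetchPage and makeStubDimension are hypothetical, and the stub only stands in for a real dimension so the example runs on its own.

```javascript
// Return one page of rows from a dimension-like object without
// materializing the whole dataset. Assumes top(k, offset) is available.
function fetchPage(dimension, pageIndex, pageSize) {
  return dimension.top(pageSize, pageIndex * pageSize);
}

// Hypothetical stand-in for a crossfilter dimension (not the real library).
function makeStubDimension(values) {
  const sorted = values.slice().sort((a, b) => b - a); // descending, like top()
  return {
    top: (k, offset = 0) => sorted.slice(offset, offset + k),
  };
}

const dimension = makeStubDimension([5, 3, 9, 1, 7, 2]);
const secondPage = fetchPage(dimension, 1, 2); // rows 3 and 4 in descending order
```

With this shape, the server only serializes pageSize rows per request, so response size no longer grows with the dataset.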

dashboard.json should be optional, instead breaks

Error: ENOENT: no such file or directory, open 'config/dashboard.json'
at Error (native)
at Object.fs.openSync (fs.js:641:18)
at Object.fs.readFileSync (fs.js:509:33)
at exports.getDashboardConfig (/Users/birm/Desktop/git/Datascope/routes/rest.js:44:16)
at Layer.handle [as handle_request] (/Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/layer.js:95:5)
at next (/Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/route.js:137:13)
at Route.dispatch (/Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/route.js:112:3)
at Layer.handle [as handle_request] (/Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/layer.js:95:5)
at /Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/index.js:281:22
at Function.process_params (/Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/index.js:335:12)
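
One possible way to make the file optional is to catch the ENOENT error and fall back to a default. This is a minimal sketch, not the project's actual fix: readOptionalConfig and DEFAULT_DASHBOARD_CONFIG are hypothetical names, and an empty object as the default is an assumption.

```javascript
const fs = require("fs");

// Hypothetical default used when config/dashboard.json is absent.
const DEFAULT_DASHBOARD_CONFIG = {};

// Read an optional JSON config file; fall back to a default if it is missing.
function readOptionalConfig(path, fallback) {
  try {
    return JSON.parse(fs.readFileSync(path, "utf8"));
  } catch (err) {
    if (err.code === "ENOENT") {
      return fallback; // file not present: treat as "not configured"
    }
    throw err; // malformed JSON or other I/O errors should still surface
  }
}

const dashboardConfig = readOptionalConfig(
  "config/dashboard.json",
  DEFAULT_DASHBOARD_CONFIG
);
```

Only the missing-file case is swallowed here; a present but invalid dashboard.json still raises, which keeps configuration mistakes visible.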

Handle Missing Data

We need to set up our filters and visualizations so that we can exclude missing/invalid data.
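
A minimal sketch of what excluding such rows could look like before data reaches the filters. The definition of "missing" used here (null, undefined, empty string, NaN) is an assumption, and isMissing and excludeMissing are hypothetical helpers rather than existing Datascope functions.

```javascript
// Treat null, undefined, empty strings, and NaN as missing (assumed definition).
function isMissing(value) {
  return (
    value === null ||
    value === undefined ||
    value === "" ||
    (typeof value === "number" && Number.isNaN(value))
  );
}

// Keep only rows where every listed attribute has a usable value.
function excludeMissing(rows, attributes) {
  return rows.filter((row) =>
    attributes.every((attr) => !isMissing(row[attr]))
  );
}

const rows = [
  { Age: 34, Specimen_Type: "tumor_tissue" },
  { Age: null, Specimen_Type: "normal_tissue" },
  { Age: 51, Specimen_Type: "" },
];
const clean = excludeMissing(rows, ["Age", "Specimen_Type"]);
```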

Disable statistics icon if there are no stats available

Currently the statistics icon is present by default, even when there are no stats associated with an attribute. We should only show the statistics icon when there are stats associated with the attribute. This also gives users a visual cue about which attributes have statistics associated with them.

Travis issue with old node versions: "Error: Cannot find module 'JSONStream'"

I'm honestly not sure why we're testing against 0.1x when Node is on 6.x/7.x. That said, it looks like JSONStream isn't compatible with these versions.
My instinct is to change the Node versions to the current ones (I've already added the last stable). Before I consider removing the old versions, I would like to know why they're there.

Consolidate Documentation

Since docs are spread across .md/html in repo and bitbucket wikis, we should consolidate them.
It looks like some of the code is formatted for jsdoc, so we should make sure that is consistent, and maybe add jsdoc to a release process.

Compressing data using data dictionary

The idea is to use compressed encodings of the data, with the mapping available in a dictionary format. Currently I'm using dataDescription.json as the place to define the dictionary for each attribute. For example, for the attribute Specimen_Type the values 0, 1, 2 and 3 correspond to tumor_tissue, normal_tissue, tumor_blood and tumor_marrow respectively:

    {
        "attributeName": "Specimen_Type",
        "attributeType": [
            "visual", "filtering"
        ],
        "dictionary": {
            "0": "tumor_tissue",
            "1": "normal_tissue",
            "2": "tumor_blood",
            "3": "tumor_marrow"
        }
    }

This involves modifying AppStore.jsx, the front-end data store, to encode and decode data. Decoding is done when the /data?filter={} API is called and returns data in encoded form from the server. Encoding is done on the filter JSON object before it is sent.
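
The encode and decode steps could be sketched roughly as follows. This is an illustration under assumptions, not the code in AppStore.jsx: the helper names are hypothetical, and the shape of the filter object (attribute name mapped to an array of values) is assumed for the example.

```javascript
// Dictionary for one attribute, as defined in dataDescription.json above.
const dictionary = {
  "0": "tumor_tissue",
  "1": "normal_tissue",
  "2": "tumor_blood",
  "3": "tumor_marrow",
};

// Build the reverse mapping (label -> code) used when encoding filters.
function invert(dict) {
  const inverse = {};
  for (const [code, label] of Object.entries(dict)) {
    inverse[label] = code;
  }
  return inverse;
}

const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Decode a value received from the server; unknown codes pass through.
function decodeValue(dict, code) {
  const key = String(code);
  return has(dict, key) ? dict[key] : code;
}

// Encode a label chosen in the UI; unknown labels pass through.
function encodeValue(inverse, label) {
  return has(inverse, label) ? inverse[label] : label;
}

const inverse = invert(dictionary);
const decoded = [0, 2, 1].map((code) => decodeValue(dictionary, code));
const encodedFilter = {
  Specimen_Type: ["tumor_blood", "tumor_marrow"].map((label) =>
    encodeValue(inverse, label)
  ),
};
```

Passing unknown values through unchanged keeps attributes without a dictionary working as before.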

Contextual Filter Visualizations

We need to be able to show both the original and filtered data in a way which describes the filter in the context of the data.

Flexible/Mobile Layouts

Should be able to respond to different size screens better.
Subtasks:

  • Maps currently require scrolling to see whole map
  • Filter width seems quite constant

Some examples silently exit

TitancSurvivors and newDataSourceConfig fail silently. Why is this, and what should be done or logged?

Replacing crossfilter with druid

This would involve

  1. Ingesting data into druid. We'll need to change the dataSource module to ingest data into druid instead of storing it in memory with crossfilter
  2. We'll also need to modify how we process the filtering requests in /data request handler to perform the filtering on druid using something like Plywood.

More CI Tests

The fact that webpack succeeds is not by itself a good indicator of a good commit/PR. Thus, we should at least unit test some core functionality.

  • Fix current tests
  • Add more tests
