This project is forked from shiyonglu/dataview

DATAVIEW is a big data workflow management system. It uses Dropbox as the data cloud and Amazon EC2 as the compute cloud. Current research focuses on the security and privacy aspects of DATAVIEW as well as performance and cost optimization for running workflows in clouds.

DATAVIEW

DATAVIEW (www.dataview.org) is a big data workflow management system. It uses Dropbox as the data cloud and Amazon EC2 as the compute cloud. Current research focuses on the performance and cost optimization for running workflows in clouds.

DATAVIEW supports two programming interfaces for developing and running workflows:

  1. JAVA API: A programmer can develop workflow tasks and workflows based on the DATAVIEW libraries. /DATAVIEW/src/test.java shows the six steps to create a customized workflow and execute it in Amazon EC2.
  • The external dependency libraries must be added to the Eclipse project from /DATAVIEW/WebContent/WEB-INF/lib.
  • The accessKey and secretKey must be updated in config.properties under /DATAVIEW/WebContent/workflowLibDir/.
  • After the workflow finishes, terminate all the EC2 instances manually from your AWS account.
  2. Visual Programming: DATAVIEW is deployed as a website in Tomcat, and a user can drag and drop tasks and link them into a workflow in a visual workflow design and execution environment called Webbench.
  • A Dropbox account is needed to store the input data, the workflow tasks, and the final output files produced by the workflow execution. The user needs to create three default folders: Dropbox/DATAVIEW/Tasks, which stores the task files (class files or jar files); Dropbox/DATAVIEW/Workflows, which stores the mxgraph file for the generated workflow; and Dropbox/DATAVIEW-INPUT, which stores the input files for a workflow. Four relational algebra tasks (jar files) and input files are already stored under the DATAVIEW/WebContent/workflowTaskDir folder.
  • A local account must be registered to show a visualized workflow.
  • A Dropbox token must be provided in the main interface when you log in. It can be generated by following this tutorial: https://blogs.dropbox.com/developers/2014/05/generate-an-access-token-for-your-own-account/
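
Before launching a workflow on EC2, it can help to confirm that the credentials in config.properties have actually been filled in. The following is a minimal sketch, not part of DATAVIEW itself: the class name ConfigCheck and its methods are hypothetical, and it assumes the file uses the standard Java properties format with keys named exactly accessKey and secretKey.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of the DATAVIEW code base.
class ConfigCheck {

    // Parses text in the standard Java .properties format.
    public static Properties parse(String text) {
        Properties props = new Properties();
        try (StringReader reader = new StringReader(text)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Returns the required keys that are absent or have a blank value.
    public static List<String> missingKeys(Properties props, String... required) {
        List<String> missing = new ArrayList<>();
        for (String key : required) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumed key names; the sample text stands in for the real file contents.
        String sample = "accessKey=\nsecretKey=placeholder\n";
        List<String> missing = missingKeys(parse(sample), "accessKey", "secretKey");
        System.out.println("Missing or blank keys: " + missing);
    }
}
```

To check the real file, read its contents (for example with java.nio.file.Files) and pass the text to parse. Keep real AWS keys out of version control.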

Download and configure DATAVIEW as JAVA API

Check out the tutorial at https://youtu.be/xJikeWptYSw or follow the instructions below:
  1. Download the DATAVIEW package from https://github.com/shiyonglu/DATAVIEW by clicking the "Clone or Download" button.
  2. Unzip the DATAVIEW-master.zip file and import the DATAVIEW project into Eclipse as "Existing Projects into Workspace" by selecting "Projects from Folder or Archive".
  3. Add the external dependency libraries to the Eclipse project from /DATAVIEW/WebContent/WEB-INF/lib.
  4. /DATAVIEW/src/test.java shows the six steps to create a new workflow and execute it with the local executor.

Download, configure, and deploy DATAVIEW as a Website

  1. Follow the first three steps from "Download and configure DATAVIEW as JAVA API".
  2. Create three default folders in your Dropbox: Dropbox/DATAVIEW/Tasks, which stores the task files (class files or jar files); Dropbox/DATAVIEW/Workflows, which stores the mxgraph file for the generated workflow; and Dropbox/DATAVIEW-INPUT, which stores the input files for a workflow.
  3. Get a Dropbox token.
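
If the Dropbox desktop client syncs your account to a local directory, the three default folders can be created locally and left to sync. The following is a minimal sketch under that assumption: the class name DropboxLayout is hypothetical, and the default root of ~/Dropbox is a guess that may not match your installation. Creating the folders through the Dropbox web interface works equally well.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the DATAVIEW code base.
class DropboxLayout {

    // Folder names taken from the setup steps, relative to the Dropbox root.
    public static final List<String> FOLDERS =
            Arrays.asList("DATAVIEW/Tasks", "DATAVIEW/Workflows", "DATAVIEW-INPUT");

    // Creates each folder (and any missing parents) under the given root.
    // Folders that already exist are left untouched.
    public static List<Path> createLayout(Path dropboxRoot) {
        List<Path> created = new ArrayList<>();
        for (String folder : FOLDERS) {
            Path dir = dropboxRoot.resolve(folder);
            try {
                Files.createDirectories(dir);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            created.add(dir);
        }
        return created;
    }

    public static void main(String[] args) {
        // Assumption: the local sync directory is ~/Dropbox unless a path is passed in.
        Path root = args.length > 0
                ? Paths.get(args[0])
                : Paths.get(System.getProperty("user.home"), "Dropbox");
        for (Path dir : createLayout(root)) {
            System.out.println("Ready: " + dir);
        }
    }
}
```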

DATAVIEW Tutorials

  1. Chapter 1: A gentle introduction to DATAVIEW (https://youtu.be/7S4iGKXpaAc)
  2. How to download DATAVIEW, import it into Eclipse as a Java API, and run a workflow with the local executor (https://youtu.be/xJikeWptYSw)
