
dataanalysis_is665's Introduction

PROJECT1:

  1. Find a data visualization tool and learn to use it;

  2. Find a publicly available data set and use the data visualization tool to present the data visually.

Using Tableau for Data Visualization:

Step 1: Go to http://www.tableau.com/public/ to download the free version of Tableau.

Step 2: Refer to the training page, which has more tutorials than you will need to build a good visualization project: http://www.tableau.com/public/training

Step 3: Look for inspiration: the gallery is a good place to look for ideas if you do not know where to begin: http://www.tableau.com/public/gallery

Step 4: Look for data: sample data sets can be found here: http://www.tableau.com/public/community/sample-data-sets. In addition, many other data sets are available online at sites such as Kaggle. You can also use your own data set if you have an interesting problem of your own to work on.

DELIVERABLE:

  1. Is the use of visual components effective and intuitive?

  2. Is the dashboard INTERACTIVE? Does it allow users to drill up and down (or across) through the use of various widgets?

  3. Is the message clear thanks to the visualization?

PROJECT2:

In this project, each team will use a publicly available data set, define a mining problem, then use at least TWO different mining algorithms to mine the data set. Compare the performance of the two models, choose the better one, and interpret your findings. Teams will write a report summarizing their findings. The report is due at the end of the semester.

To look for a data set, you can go to kaggle.com or the UC Irvine Machine Learning Repository. There is plenty of data to choose from. Make sure you read the files that describe the data before you start mining.

To mine, you can use RapidMiner. If you do use it, make sure to include a screenshot of EACH mining model in your presentation.

The resulting confusion matrix and performance metrics need to be shown in your presentation.

In your discussion, focus on your findings and their significance, both mathematically and practically.

STRUCTURE:

Part I: Introduction. Discuss the problem at hand and the background of the data set: who collected or created it, for what purpose, etc.

Part II: Data. Show summary statistics for your data: number of rows and columns, median, mean, standard deviation, etc. RapidMiner can be very helpful for this task.

Part III: Mining Algorithms. Introduce at least TWO CLASSIFICATION / PREDICTION algorithms covered in our class. Show a screenshot of the mining workflow.
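For teams that want to cross-check the summary statistics outside RapidMiner, the following is a minimal sketch in plain Python. The function name `summarize`, the column names, and the toy rows are all hypothetical and stand in for whatever data set the team chooses; only the standard-library `statistics` module is used.

```python
import statistics


def summarize(rows, columns):
    """Return row/column counts plus mean, median, and sample standard
    deviation for each named numeric column."""
    summary = {"n_rows": len(rows), "n_columns": len(columns), "columns": {}}
    for name in columns:
        values = [row[name] for row in rows]
        summary["columns"][name] = {
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            # statistics.stdev is the sample (n - 1) standard deviation
            "stdev": statistics.stdev(values),
        }
    return summary


# Hypothetical toy data; replace with rows loaded from your own data set.
rows = [
    {"age": 30, "balance": 1000.0},
    {"age": 40, "balance": 2000.0},
    {"age": 50, "balance": 6000.0},
]

result = summarize(rows, ["age", "balance"])
for name, stats in result["columns"].items():
    print(name, stats)
```

A gap between mean and median, as in the toy `balance` column, is a quick hint that a column is skewed and worth mentioning in the report.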

Part IV: Evaluation. This is the most important part. Address the following questions:

a. Do you choose precision or recall as the main measure for your task? Why?

b. Show the confusion matrix for the two algorithms. Which one is better?
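To make questions a and b concrete, here is a small sketch of how a binary confusion matrix, precision, and recall can be computed by hand for two models. The labels, the predictions for `model_a` and `model_b`, and the helper names are hypothetical illustrations, not output from RapidMiner or from any real data set.

```python
def confusion_counts(actual, predicted, positive):
    """Count true/false positives and negatives for one positive class."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            counts["tp"] += 1
        elif p == positive:
            counts["fp"] += 1
        elif a == positive:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def precision(c):
    """Share of predicted positives that are actually positive."""
    denom = c["tp"] + c["fp"]
    return c["tp"] / denom if denom else 0.0


def recall(c):
    """Share of actual positives that the model found."""
    denom = c["tp"] + c["fn"]
    return c["tp"] / denom if denom else 0.0


# Hypothetical labels and predictions for two models on the same test set.
actual  = ["yes", "yes", "yes", "no", "no", "no", "no", "no"]
model_a = ["yes", "yes", "no", "no", "no", "no", "no", "yes"]
model_b = ["yes", "yes", "yes", "yes", "yes", "no", "no", "no"]

counts_a = confusion_counts(actual, model_a, positive="yes")
counts_b = confusion_counts(actual, model_b, positive="yes")

for name, c in (("A", counts_a), ("B", counts_b)):
    print(name, c, "precision=%.2f recall=%.2f" % (precision(c), recall(c)))
```

In this toy case model A has the higher precision and model B the higher recall, so which one is "better" depends on whether false positives or false negatives are more costly for the task, which is exactly what question a asks you to argue.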

Project Link - https://public.tableau.com/profile/sirisha.bojjireddy#!/vizhome/BankData_15736791328020/Story1

