gettingandcleaningdatacp's Introduction

Getting And Cleaning Data - Cleaning and Tidying the Human Activity Recognition Using Smartphones Data Set

Warning - this README is written in GitHub Flavored Markdown; some formatting may be lost when viewed with a different Markdown parser.

This repository contains tools to extract the average of each feature variable from the training and test data sets of the UCI Machine Learning Repository: Human Activity Recognition Using Smartphones Data Set.

The original data set is available here: http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones .

This script:

  • extracts data from two data sets: training and test.
  • combines subject, activity labels and measurements for the training and test data sets. This results in two complete data sets: one for training data and one for test data.
  • combines the training and test data into one data set.
  • replaces column names with readable ones, using the features.txt file as the source.
  • replaces numeric values in the ACTIVITY column with activity names taken from the activity_labels.txt file.
  • selects only the mean and std measurements into a data frame.
  • uses the resulting data frame to create a tidy data set.
  • cleans the data, keeping only complete cases.
  • arranges values in ascending order by ACTIVITY and SUBJECT.
  • calculates the mean of every measurement for each activity and each subject.
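
The steps above can be sketched on a toy data set. This is a minimal base R illustration, not the code in run_analysis.R, which relies on dplyr and sqldf. The feature names, activity labels, numbers, the helper make_set and the underscore naming scheme are all assumptions made for the example.

```r
# Toy stand-ins for features.txt and activity_labels.txt (made-up values).
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-energy()-X")
activity_labels <- data.frame(id = 1:2, name = c("WALKING", "SITTING"),
                              stringsAsFactors = FALSE)

# Assumed naming scheme: drop "()" and turn "-" into "_".
readable <- gsub("-", "_", gsub("[()]", "", features))

# Hypothetical helper: bind subject, activity and measurements for one set.
make_set <- function(subject, activity, x) {
  colnames(x) <- readable
  data.frame(SUBJECT = subject, ACTIVITY = activity, x)
}

train <- make_set(c(1, 1, 2), c(1, 1, 2),
                  matrix(c(1, 3, 5, 2, 4, 6, 9, 9, 9), ncol = 3))
test  <- make_set(c(3, 3), c(2, 2),
                  matrix(c(10, 20, 1, 3, 7, 7), ncol = 3))

# One data set, with activity ids replaced by activity names.
combined <- rbind(train, test)
combined$ACTIVITY <- activity_labels$name[match(combined$ACTIVITY,
                                                activity_labels$id)]

# Keep only mean and std measurements, then only complete rows.
measures <- grep("_(mean|std)(_|$)", names(combined), value = TRUE)
selected <- combined[, c("SUBJECT", "ACTIVITY", measures)]
selected <- selected[complete.cases(selected), ]

# Mean of every measurement per activity and subject, sorted ascending.
tidy <- aggregate(selected[measures],
                  by = list(ACTIVITY = selected$ACTIVITY,
                            SUBJECT = selected$SUBJECT),
                  FUN = mean)
tidy <- tidy[order(tidy$ACTIVITY, tidy$SUBJECT), ]
print(tidy)
```

The result has one row per activity and subject pair, and the energy column is dropped because it is neither a mean nor a std measurement.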

To run this script you need to:

  • install the R packages dplyr and sqldf
  • clone this repository and check out the master branch
  • source the script run_analysis.R
  • run the function runAnalysis(directoryName)
  • the directoryName argument is a character string naming the directory, inside your working directory, that holds the UCI Machine Learning data. The directory should have this structure:
   ├── activity_labels.txt
   ├── features_info.txt
   ├── features.txt
   ├── README.txt
   ├── test
   │   ├── Inertial Signals
   │   │   ├── body_acc_x_test.txt
   │   │   ├── body_acc_y_test.txt
   │   │   ├── body_acc_z_test.txt
   │   │   ├── body_gyro_x_test.txt
   │   │   ├── body_gyro_y_test.txt
   │   │   ├── body_gyro_z_test.txt
   │   │   ├── total_acc_x_test.txt
   │   │   ├── total_acc_y_test.txt
   │   │   └── total_acc_z_test.txt
   │   ├── subject_test.txt
   │   ├── X_test.txt
   │   └── y_test.txt
   └── train
       ├── Inertial Signals
       │   ├── body_acc_x_train.txt
       │   ├── body_acc_y_train.txt
       │   ├── body_acc_z_train.txt
       │   ├── body_gyro_x_train.txt
       │   ├── body_gyro_y_train.txt
       │   ├── body_gyro_z_train.txt
       │   ├── total_acc_x_train.txt
       │   ├── total_acc_y_train.txt
       │   └── total_acc_z_train.txt
       ├── subject_train.txt
       ├── X_train.txt
       └── y_train.txt

The default value of this argument is UCI HAR Dataset. If your directory has the same name, you don't need to provide it.
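
Before calling runAnalysis, it can help to confirm that the directory matches the layout above. The sketch below is an assumption-laden convenience, not part of run_analysis.R: the helper missing_har_files is hypothetical, and the list of required files is a guess based on the files the processing steps mention (the Inertial Signals folders are not checked).

```r
# Files assumed to be read by the script (Inertial Signals are not checked).
required_files <- c(
  "activity_labels.txt",
  "features.txt",
  file.path("test",  c("subject_test.txt",  "X_test.txt",  "y_test.txt")),
  file.path("train", c("subject_train.txt", "X_train.txt", "y_train.txt"))
)

# Hypothetical helper: returns the relative paths that are missing.
missing_har_files <- function(directoryName = "UCI HAR Dataset") {
  required_files[!file.exists(file.path(directoryName, required_files))]
}

missing <- missing_har_files()
if (length(missing) > 0) {
  message("Missing from the data directory: ", paste(missing, collapse = ", "))
}
```

An empty result means every checked file was found relative to the current working directory.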

  • runAnalysis returns a data frame. All of its columns are described in CodeBook.md.
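
A session that follows the run instructions might look like the sketch below. Only run_analysis.R, runAnalysis, directoryName and the two package names come from this README; the wrapper run_har_analysis and its guard messages are hypothetical, added so the snippet stops cleanly instead of failing when something is missing.

```r
# Hypothetical wrapper around the documented steps: check packages,
# source the script, call runAnalysis.
run_har_analysis <- function(script = "run_analysis.R",
                             directoryName = "UCI HAR Dataset") {
  needed <- c("dplyr", "sqldf")
  absent <- needed[!vapply(needed, requireNamespace, logical(1),
                           quietly = TRUE)]
  if (length(absent) > 0) {
    message("Install these packages first: ", paste(absent, collapse = ", "))
    return(invisible(NULL))
  }
  if (!file.exists(script) || !dir.exists(directoryName)) {
    message("Script or data directory not found in ", getwd())
    return(invisible(NULL))
  }
  source(script, local = TRUE)  # defines runAnalysis inside this function
  runAnalysis(directoryName)
}

result <- run_har_analysis()
if (!is.null(result)) str(result)
```

Calling run_har_analysis() with no arguments uses the default directory name described above.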

gettingandcleaningdatacp's People

Contributors

lukgad
