gettingandcleaningdata's Introduction

Introduction

This repository contains the implementation of the course project for the MOOC Getting and Cleaning Data on Coursera. The purpose is to create a tidy dataset from another (less tidy) dataset.

The dataset is the Human Activity Recognition Using Smartphones dataset. It contains data from experiments in which people's movements were recorded using the sensor signals (accelerometer, gyroscope) of a Samsung Galaxy S II.

The dataset is split across several files:

| File | Content |
| --- | --- |
| activity_labels.txt | The activity labels (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING) |
| features_info.txt | Description of the features and how they are calculated |
| features.txt | List of all features, used to derive the column names |

There are also two datasets (train and test), each with the following files:

| File | Content |
| --- | --- |
| subject_{train,test}.txt | The subject for each observation |
| X_{train,test}.txt | The observations, with one column per feature listed in features.txt |
| y_{train,test}.txt | The observed activity for each observation |

Within each dataset, subject_{train,test}.txt, X_{train,test}.txt and y_{train,test}.txt have the same number of rows, one per observation.
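Because the three files line up row by row, one set can be assembled by binding them column-wise. The following is a minimal sketch in base R, not the code from run_analysis.R: the helper name `read_set`, its arguments, and the assumption that each set lives in its own subdirectory (`train/`, `test/`) are illustrative.

```r
# Sketch: assemble one data set (train or test) from its three files.
# `read_set` is a hypothetical helper; the <base_dir>/<set>/ layout is an assumption.
read_set <- function(base_dir, set) {
  set_dir <- file.path(base_dir, set)
  subject <- read.table(file.path(set_dir, paste0("subject_", set, ".txt")),
                        col.names = "subject")
  activity <- read.table(file.path(set_dir, paste0("y_", set, ".txt")),
                         col.names = "activity")
  features <- read.table(file.path(set_dir, paste0("X_", set, ".txt")))

  # The three files must describe the same observations, row by row.
  stopifnot(nrow(subject) == nrow(activity), nrow(activity) == nrow(features))

  # One row per observation: subject, activity code, then the feature columns.
  cbind(subject, activity, features)
}
```

With both sets on disk, the full table could then be built with `rbind(read_set(dir, "train"), read_set(dir, "test"))`.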

Extracted values

Variables are described in the CodeBook. Only the variables whose names match mean or std were extracted, so the analysis focuses on means and standard deviations. The variables were then renamed slightly: parentheses were removed and '-' was replaced by '.'.
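The selection and renaming step can be illustrated on a handful of feature names. This is a sketch under assumptions rather than the script's actual code: the sample names are a small illustrative subset in the style of features.txt, and the pattern assumes that only names containing `mean()` or `std()` are kept (so variants such as `meanFreq()` are excluded).

```r
# Illustrative feature names in the style of features.txt (a sample, not the full list).
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-Y", "tBodyAcc-mad()-X",
              "fBodyAcc-meanFreq()-Z", "tBodyGyro-mean()-Z")

# Keep only names containing "mean()" or "std()". The escaped parentheses
# exclude variants such as "meanFreq()" (an assumption about the intended match).
keep <- grep("(mean|std)\\(\\)", features)

# Remove the parentheses, then replace '-' by '.'.
no_parens <- gsub("()", "", features[keep], fixed = TRUE)
clean <- gsub("-", ".", no_parens, fixed = TRUE)

print(clean)
```

The indices in `keep` can also be used to select the matching columns of the X_{train,test}.txt data, since its columns follow the order of features.txt.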

Finally, the tidy dataset contains the mean of each extracted variable for each activity and each subject.
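The per-group averaging can be sketched on toy data. The example below uses base R's `aggregate` for brevity; the actual script may rely on data.table or plyr instead, and the values here are made up purely for illustration.

```r
# Toy data (made-up values): two subjects, two activities, one extracted variable.
obs <- data.frame(
  subject = c(1, 1, 1, 2, 2, 2),
  activity = c("WALKING", "WALKING", "SITTING", "WALKING", "SITTING", "SITTING"),
  tBodyAcc.mean.X = c(0.2, 0.4, 0.1, 0.3, 0.5, 0.7)
)

# Average every remaining column within each (activity, subject) group.
tidy <- aggregate(. ~ activity + subject, data = obs, FUN = mean)

print(tidy)
```

The result has one row per activity and subject combination present in the data, and one column per averaged variable.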

Generate the tidy dataset

Clone this repository, then run the script from R:

    source('run_analysis.R')

You might need to install the following packages first: data.table, plyr.

The data is downloaded from https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip and unzipped into the data/ directory.
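For reference, a download-and-unzip step of this kind might look like the sketch below. The helper name `fetch_dataset` and the local archive name `dataset.zip` are hypothetical; only the URL and the data/ directory come from this README.

```r
# Hypothetical helper: download the archive once and unzip it into data_dir.
fetch_dataset <- function(url, data_dir = "data") {
  dir.create(data_dir, showWarnings = FALSE, recursive = TRUE)
  zip_path <- file.path(data_dir, "dataset.zip")  # assumed local file name

  if (!file.exists(zip_path)) {
    # mode = "wb" keeps the archive intact on Windows.
    download.file(url, destfile = zip_path, mode = "wb")
  }

  # Extract the archive contents next to the downloaded file.
  unzip(zip_path, exdir = data_dir)
}

dataset_url <- "https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip"
# fetch_dataset(dataset_url)  # uncomment to download (requires network access)
```

Skipping the download when the archive already exists avoids fetching the file again on repeated runs.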

The run_analysis.R script writes the result to a file named Meandata.txt in the current working directory. The format of this file is described in the CodeBook.
