nunesma / data_science

Repos related to Johns Hopkins University and Coursera's Data Science specialization

Data Science

Table of contents:

  • Getting and Cleaning Data project
  • Exploratory Data Analysis project
  • Reproducible Research project
  • courses - Data Science Specialization
  • DataScienceSpCourseNotes

Getting and Cleaning Data

This course covers the basic ways that data can be obtained: from the web, from APIs, from databases, and from colleagues in various formats. It also covers the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speeds up downstream data analysis tasks. The course also covers the components of a complete data set, including raw data, processing instructions, codebooks, and processed data. In short, it covers the basics needed for collecting, cleaning, and sharing data.
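
As a rough illustration of what making data “tidy” can mean, the sketch below reshapes a small "wide" table (one column per measurement day) into a "long" one with one observation per row. It is not taken from the course materials: the course itself works in R, Python is used here only for illustration, and the table contents, column names, and the `to_tidy` helper are all hypothetical.

```python
# Hypothetical "wide" table: one row per subject, one column per measurement day.
wide_rows = [
    {"subject": "A", "day1": 5.1, "day2": 5.4},
    {"subject": "B", "day1": 4.8, "day2": None},  # None marks a missing reading
]


def to_tidy(rows, id_column, value_name):
    """Reshape wide rows into tidy rows: one observation per row."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_column or value is None:
                continue  # skip the identifier itself and missing readings
            tidy.append(
                {id_column: row[id_column], "variable": column, value_name: value}
            )
    return tidy


tidy_rows = to_tidy(wide_rows, id_column="subject", value_name="reading")
for tidy_row in tidy_rows:
    print(tidy_row)
```

In the long form, each row holds a single reading along with the subject and variable it belongs to, which tends to make later filtering, grouping, and plotting simpler.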

Exploratory Data Analysis

This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data.

Reproducible Research

This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations. Reproducibility allows for people to focus on the actual content of a data analysis, rather than on superficial details reported in a written summary. In addition, reproducibility makes an analysis more useful to others because the data and code that actually conducted the analysis are available. This course will focus on literate statistical analysis tools which allow one to publish data analyses in a single document that allows others to easily execute the same analysis to obtain the same results.

Data Science Specialization - (courses)

These are the course materials for the Johns Hopkins Data Science Specialization on Coursera.

Data Science Specialization Course Notes - (DataScienceSpCourseNotes)

Compiled notes for all 9 courses of the Johns Hopkins University/Coursera Data Science Specialization. The notes are all written in R Markdown format and cover all concepts covered in class, as well as additional examples compiled from lectures, exploration, StackOverflow, and Khan Academy. These documents are intended to be comprehensive sources of reference for future use.
