programming-for-data-analysis-2019's Introduction

Programming for Data Analysis, Higher Diploma in Data Analytics

Student: Niamh O'Leary

ID: G00376339

Task

The following assignment concerns the numpy.random package in Python. It required the creation of a Jupyter notebook to explain the use of the package, including detailed explanations of at least five of the distributions provided by the package.

There are four distinct tasks to be carried out in your Jupyter notebook.

  1. Explain the overall purpose of the package.
  2. Explain the use of the “Simple random data” and “Permutations” functions.
  3. Explain the use and purpose of at least five “Distributions” functions.
  4. Explain the use of seeds in generating pseudorandom numbers.
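As a rough orientation for the four tasks, the sketch below touches each of them once. It assumes the legacy numpy.random functions, whose documentation uses the "Simple random data", "Permutations" and "Distributions" headings. The variable names and parameter values (seed 42, a ten-element deck, a mean of 170) are illustrative only and are not taken from the notebook.

```python
import numpy as np

# Task 4: seeding makes the pseudorandom sequence reproducible.
np.random.seed(42)

# Task 2, "Simple random data": uniform floats in [0, 1) and random integers.
uniform_floats = np.random.rand(3)
dice_rolls = np.random.randint(1, 7, size=5)  # upper bound 7 is exclusive

# Task 2, "Permutations": permutation returns a shuffled copy,
# while shuffle reorders the array in place.
deck = np.arange(10)
shuffled_copy = np.random.permutation(deck)
np.random.shuffle(deck)

# Task 3, "Distributions": 1000 draws from a normal distribution.
heights = np.random.normal(loc=170.0, scale=10.0, size=1000)

# Re-seeding with the same value replays the same sequence of numbers.
np.random.seed(42)
replayed_floats = np.random.rand(3)

print(np.array_equal(uniform_floats, replayed_floats))
```

The final comparison shows why seeds matter for a notebook like this one: anyone who re-runs the cells with the same seed should see the same numbers.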

Getting started

Download and install Python and Anaconda. All files associated with this project are available at https://github.com/NiamhOL/programming-for-data-analysis-2019

Packages used in this project

The following packages were used to run statistical analysis and draw graphs for this project.

Python https://www.python.org/downloads/

Anaconda https://www.anaconda.com/distribution/ - is a Python distribution that simplifies setting up data science and machine learning tools on Linux, Windows and Mac OS.

IPython https://ipython.org/ - is an interactive command-line terminal for Python.

NumPy http://www.numpy.org/ - is the fundamental package for scientific computing within Python.

Jupyter Notebook https://jupyter.org/ - is an open-source web application that allows the creation and sharing of documents that contain live code, equations, visualisations and narrative text.

Importing packages

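IPython, NumPy and Matplotlib (used for plotting) can be imported in an IPython session or a notebook cell with the import statement, as follows:

'import IPython'
'import numpy as np'
'import matplotlib.pyplot as plt'

Jupyter Notebook is an application rather than an importable module. It is started from a terminal with the command 'jupyter notebook'.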

Background

"NumPy's random number routines produce pseudo random numbers using combinations of a BitGenerator to create sequences and a Generator to use those sequences to sample from different statistical distributions." [1] This random number generator was designed with a focus on modelling and simulation. A common task in data analysis is the creation of random samples. NumPy Random provides a way of creating random samples, which can then be used for data analysis.
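The BitGenerator and Generator split described in [1] can be sketched as follows. This assumes NumPy 1.17 or later, where the Generator and PCG64 classes are available. The seed value and the distribution parameters are arbitrary choices for illustration.

```python
import numpy as np
from numpy.random import Generator, PCG64

# PCG64 is a BitGenerator: it produces the underlying stream of random bits.
bit_generator = PCG64(seed=2019)

# Generator turns that bit stream into draws from statistical distributions.
rng = Generator(bit_generator)

normal_draws = rng.normal(loc=0.0, scale=1.0, size=5)
binomial_draws = rng.binomial(n=10, p=0.5, size=5)

# A second Generator built from the same seed repeats the same draws.
rng_again = Generator(PCG64(seed=2019))
normal_again = rng_again.normal(loc=0.0, scale=1.0, size=5)

print(normal_draws)
print(binomial_draws)
```

In everyday use, np.random.default_rng(seed) builds a Generator in a single call. The two-step version is shown here only to make the two roles visible.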

NumPy functions operate on numbers and are especially useful for data science, statistics and machine learning, which often use very large numeric datasets. An integral part of machine learning and deep learning is data manipulation. NumPy provides an excellent toolkit to help "clean up" data for data manipulation.

The core functionality of NumPy is its "ndarray" data structure, which describes a collection of items of the same type. "Every item in an ndarray takes the same size block in the memory" [2]. Ndarrays can be indexed to allow for analysis and data manipulation.
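A small sketch of these ndarray properties is given below. The array contents and variable names are made up for illustration.

```python
import numpy as np

# Every element shares one dtype, so each occupies the same number of bytes.
values = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

element_type = values.dtype      # int64
bytes_per_item = values.itemsize  # 8 bytes for a 64-bit integer
total_bytes = values.nbytes       # 6 items * 8 bytes = 48

# Indexing and slicing select parts of the array for analysis.
first_row = values[0]             # [1, 2, 3]
second_column = values[:, 1]      # [2, 5]

# Boolean indexing keeps only the elements that satisfy a condition.
large_values = values[values > 3]  # [4, 5, 6]

print(element_type, bytes_per_item, total_bytes)
```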

This assignment will focus on using NumPy to generate random samples of a population to check the validity of conclusions that are being drawn from the whole population.
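The idea of checking a conclusion about a population against a random sample can be sketched as below. The population here is synthetic (100,000 simulated values with an assumed mean of 50 and standard deviation of 5), so all numbers are illustrative and not results from the notebook. The sketch assumes NumPy 1.17 or later for default_rng.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Synthetic "population": simulated values, not real data.
population = rng.normal(loc=50.0, scale=5.0, size=100_000)

# Draw a random sample of 500 members without replacement.
sample = rng.choice(population, size=500, replace=False)

# Compare the sample estimate with the value for the whole population.
population_mean = population.mean()
sample_mean = sample.mean()
difference = abs(population_mean - sample_mean)

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
print(f"difference:      {difference:.2f}")
```

With a sample of this size, the sample mean is expected to fall close to the population mean, which is what makes random sampling useful when the whole population is too large to examine.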

Jupyter notebook

The Jupyter notebook attached to this project contains the answers to the four tasks.

References

[1] https://numpy.org/doc/1.17/reference/random/index.html

[2] https://www.tutorialspoint.com/numpy/numpy_ndarray_object.htm

Bibliography

Jupyter Documentation https://jupyter.org/documentation

Numpy.random https://docs.scipy.org/doc/numpy-1.15.0/reference/routines.random.html

https://www.python-course.eu/python_numpy_probability.php

https://www.r-craft.org/r-news/how-to-use-numpy-random-choice/

Author: Niamh O'Leary
