
daru

Data Analysis in RUby


Introduction

daru (Data Analysis in RUby) is a library for storage, analysis, manipulation and visualization of data.

daru is inspired by pandas, a very mature solution in Python.

daru is written in pure Ruby, so it should work with all Ruby implementations. It has been tested with MRI 2.0, 2.1 and 2.2.

Features

  • Data structures:
    • Vector - A basic 1-D vector.
    • DataFrame - A 2-D spreadsheet-like structure for manipulating and storing data sets. This is daru's primary data structure.
  • Compatible with IRuby notebook, statsample, statsample-glm and statsample-timeseries.
  • Support for time series.
  • Singly and hierarchically indexed data structures.
  • Flexible and intuitive API for manipulation and analysis of data.
  • Easy plotting, statistics and arithmetic.
  • Plentiful iterators.
  • Optional speed and space optimization on MRI with NMatrix and GSL.
  • Easy splitting, aggregation and grouping of data.
  • Quick data summaries with pivot tables.
  • Import and export of data from and to Excel, CSV, SQL databases and plain text files.

Notebooks

Notebooks on most use cases

Notebooks on Time series

Case Studies

Blog Posts

Time series

Documentation

Docs can be found here.

Roadmap

  • Enable creation of DataFrame by only specifying an NMatrix/MDArray in initialize. Vector naming happens automatically (alphabetic) or is specified in an Array.
  • Basic Data manipulation and analysis operations:
    • DF concat
  • Assignment of a column to a single number should set the entire column to that number.
  • Multiple column assignment with []=
  • Multiple value assignment for vectors with []=.
  • A #find_max function that evaluates a block for each row and returns the row for which the block's value is the maximum.
  • Sort by index.
  • Statistics on DataFrame over rows.
  • Calculate percentage change.
  • Have some sample data sets for users to play around with. Should be able to load these from the code itself.
  • Sorting with missing data present.
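
To make two of the roadmap items above concrete (the block-based maximum lookup and percentage change), here is a plain-Ruby sketch of the intended behaviour. It does not use daru and is not daru's API: the helper names `find_max_row` and `percent_change`, and the sample rows, are hypothetical.

```ruby
# Plain-Ruby sketch; rows are ordinary hashes, not daru objects.
rows = [
  { name: 'a', price: 10.0 },
  { name: 'b', price: 12.0 },
  { name: 'c', price: 9.0 }
]

# Hypothetical helper: return the row for which the block's value is largest.
def find_max_row(rows, &block)
  rows.max_by(&block)
end

# Hypothetical helper: percentage change between consecutive values.
# The first element has no predecessor, so its change is nil.
def percent_change(values)
  [nil] + values.each_cons(2).map { |prev, cur| (cur - prev) / prev * 100.0 }
end

best    = find_max_row(rows) { |r| r[:price] }
changes = percent_change(rows.map { |r| r[:price] })

puts best.inspect
puts changes.inspect
```

Returning nil for the first element keeps the result the same length as the input, which is one possible convention among several.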

Contributing

Pick a feature from the Roadmap or the issue tracker, or think of your own, and send me a Pull Request!

For details see CONTRIBUTING.

Acknowledgements

  • Google and the Ruby Science Foundation for the Google Summer of Code 2015 grant for further developing daru and integrating it with other Ruby gems.
  • Thank you last.fm for making user data accessible to the public.

Copyright (c) 2015, Sameer Deshmukh All rights reserved

