Coder Social home page Coder Social logo

ppsr's Introduction

R-CMD-check

ppsr - Predictive Power Score

ppsr is the R implementation of 8080labs Predictive Power Score.

The Predictive Power Score is an asymmetric, data-type-agnostic score that can detect linear or non-linear relationships between two columns. The score ranges from 0 (no predictive power) to 1 (perfect predictive power). It can be used as an alternative to the correlation (matrix).

Read more about the (dis)advantages of the Predictive Power Score in this blog post.

Installation

You can install the development version of ppsr using the following R code:

# You can get the development version from GitHub:
# install.packages('devtools')
devtools::install_github('https://github.com/paulvanderlaken/ppsr')

Computing PPS

You can think of the predictive power score as a framework for a family of scores. There is not one single best way to calculate a predictive power score. In fact, there are many possible ways to calculate a PPS that satisfy the definition mentioned before.

Currently, the ppsr package calculates PPS by default:

  • Using the default decision tree implementation of the rpart package, wrapped by parsnip
  • Using 0 cross-validations
  • Using F1 scores to evaluate classification models. Scores are normalized relatively to a naive benchmark consisting of predicting the modal or random y classes
  • Using MAE to evaluate regression models. Scores are normalized using relatively to a naive benchmark of predicting the mean y value

Usage

The ppsr package has three main functions that compute PPS:

  • score() - which computes an x-y PPS
  • score_predictors() - which computes X-y PPS
  • score_matrix() - which computes X-Y PPS

where x and y represent an individual feature/target, and X and Y represent all features/targets in a given dataset.

Examples:

ppsr::score(x = iris$Sepal.Length, y = iris$Sepal.Width)
#> [1] 0.1822185
ppsr::score_predictors(df = iris, y = 'Species')
#> $Sepal.Length
#> [1] 0.6142833
#> 
#> $Sepal.Width
#> [1] 0.4014296
#> 
#> $Petal.Length
#> [1] 0.9299341
#> 
#> $Petal.Width
#> [1] 0.9399976
#> 
#> $Species
#> [1] 1
ppsr::score_matrix(df = iris)
#>              Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#> Sepal.Length    1.0000000   0.1217743    0.6097753   0.4881395 0.4207886
#> Sepal.Width     0.1822185   1.0000000    0.3000489   0.3174639 0.2237120
#> Petal.Length    0.6687765   0.2815605    1.0000000   0.8072795 0.7972117
#> Petal.Width     0.5436489   0.2301854    0.7732164   1.0000000 0.7630875
#> Species         0.6142833   0.4014296    0.9299341   0.9399976 1.0000000

Visualizing PPS

Subsequently, there are two main functions that wrap around these computational functions to help you visualize your PPS using ggplot2:

  • visualize_predictors() - producing a barplot of all X-y PPS
  • visualize_matrix() - producing a heatmap of all X-Y PPS

Examples:

ppsr::visualize_predictors(df = iris, y = 'Species')

ppsr::visualize_matrix(df = iris)

Open issues & development

PPS is a relatively young concept, and likewise the ppsr package is still under development. If you spot any bugs or potential improvements, please raise an issue or submit a pull request.

On the developmental agenda are currently:

  • Implementation of different modeling techniques / algorithms
  • Implementation of different model evaluation metrics
  • Implementation of cross-validation
  • Implementation of downsampling for large datasets

Note that there’s also an unfinished R implementation of the PPS package by 8080labs.

ppsr's People

Contributors

paulvanderlaken avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.