
hneth / riskyr

A toolbox for rendering risk literacy more transparent

R 32.92% HTML 67.08%
2x2-matrix bayesian-inference contingency-table r r-package representation risk risk-literacy rstats visualization

riskyr's People

Contributors

hneth, ndphillips, nigradwohl

Forkers

nigradwohl

riskyr's Issues

Include vignettes and package loading message with link to main package guide

Hey guys, it looks like you have made tremendous progress on riskyr! I just installed it in the hopes of playing around, but without a package guide I wasn't sure how to proceed.

Do you guys plan to create one soon? It looks like you have plenty of existing documentation on GitHub, so it probably wouldn't be difficult to port it over to your package.

See my FFTrees package guide for an example: https://github.com/ndphillips/FFTrees/blob/master/vignettes/guide.Rmd

Looking forward to playing around with riskyr!!

Pass frequencies (instead of probabilities) to objects

To define a riskyr object, we currently pass 3 essential probabilities (prev, sens, spec). However, we also have functions translating probabilities into frequencies (and vice versa). Hence, why not allow passing either probabilities or frequencies to define a riskyr object?
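A minimal sketch of what this could look like (make_scenario() is a hypothetical name, not riskyr's actual constructor), using the fact that the 4 essential frequencies determine the 3 essential probabilities and N:

    make_scenario <- function(prev = NULL, sens = NULL, spec = NULL,
                              hi = NULL, mi = NULL, fa = NULL, cr = NULL,
                              N = NULL) {
      if (!is.null(hi) && !is.null(mi) && !is.null(fa) && !is.null(cr)) {
        # Derive the probabilities (and N) from the 4 essential frequencies:
        N    <- hi + mi + fa + cr
        prev <- (hi + mi) / N
        sens <- hi / (hi + mi)
        spec <- cr / (fa + cr)
      } else if (is.null(prev) || is.null(sens) || is.null(spec)) {
        stop("Provide either (prev, sens, spec) or (hi, mi, fa, cr).")
      }
      structure(list(prev = prev, sens = sens, spec = spec, N = N),
                class = "riskyr")
    }

    # Both calls describe the same scenario:
    make_scenario(prev = 0.10, sens = 0.90, spec = 0.80, N = 1000)
    make_scenario(hi = 90, mi = 10, fa = 180, cr = 720)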

Representing incomplete scenarios

Related to the issues of visualizing uncertainty and representing changes:

  • How can we express and depict incomplete or partial scenarios, in which some parameters are known, but others are unknown or may be irrelevant?

For instance, many well-known problems involving conditional probabilities (e.g., see the so-called prosecutor's fallacy) can be visualized and explained by showing partial frequency trees (with only 1 main branch being of interest). The confusion typically results from a (mis-)interpretation of 2 different conditional probabilities.

Without a way of plotting incomplete scenarios, we have no means of representing such problems.

Allow plotting riskyr objects

At the moment, we distinguish between plotting riskyr objects (via the plot.riskyr method) and using low-level plotting functions with parameter inputs (e.g., prev, sens, spec). Why not allow the low-level functions to accept riskyr objects as well?
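One possible pattern (a sketch, not the package's actual implementation): let each low-level function check for a riskyr object first and fall back to the individual parameters otherwise.

    plot_tree_sketch <- function(x = NULL, prev = NULL, sens = NULL, spec = NULL, ...) {
      if (inherits(x, "riskyr")) {
        # Extract the essential probabilities from the riskyr object:
        prev <- x$prev
        sens <- x$sens
        spec <- x$spec
      }
      # ... existing low-level plotting code would use prev, sens, spec here ...
      invisible(list(prev = prev, sens = sens, spec = spec))
    }

The plot.riskyr method could then simply dispatch to these low-level functions, avoiding duplicated plotting code.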

Generalize plots from 2 to 3 perspectives

Many riskyr plots (e.g., plot_fnet, plot_tree, and plot_mosaic) currently allow choosing between 2 perspectives (by splitting the population into 2 sub-groups by either condition or decision, i.e., by = "cd" vs. by = "dc"). Adding accuracy (by = "ac") as a 3rd perspective would support 3 x 2 different versions of each plot.
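For illustration, the calls would look as follows (assuming the parameter-based signature described in these issues; by = "ac" is the proposed addition, not an existing option):

    plot_tree(prev = 0.10, sens = 0.90, spec = 0.80, by = "cd")  # split by condition
    plot_tree(prev = 0.10, sens = 0.90, spec = 0.80, by = "dc")  # split by decision
    plot_tree(prev = 0.10, sens = 0.90, spec = 0.80, by = "ac")  # proposed: split by accuracy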

Turn riskyr into a package?

Hey Hans, I see you're still hard at work on riskyr, that's great!

I was just looking at the repository and wanted to suggest that you restructure the project as an R package. Wickham has a great tutorial here http://r-pkgs.had.co.nz/ on creating R packages, and here's one on how to include Shiny apps https://deanattali.com/2015/04/21/r-package-shiny-app/

I'd be happy to try and do it myself, but I really have to finish other things this month as you might have guessed :)

Visualizing uncertainty

As of now, only plot_curve has an option for expressing uncertainty about parameter values (by setting the uc argument to a percentage value, resulting in ranges of uncertainty around a given parameter value). One could argue that uc, being a numeric value, actually represents a form of risk. But irrespective of semantics, it would be desirable to include some means of expressing imprecision or vagueness in the other representations.

Perhaps a simple solution (or fall-back option) would be to specify parameter ranges (e.g., from min to max) and then create 2 graphs that represent the best and worst case scenarios?
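A rough sketch of that fall-back option (plot_range_sketch() is a hypothetical helper, not part of riskyr): take a [min, max] range for prev and draw the two extreme scenarios side by side.

    plot_range_sketch <- function(prev_range, sens, spec, N = 1000) {
      op <- par(mfrow = c(1, 2))            # two panels: best vs. worst case
      on.exit(par(op))
      for (prev in prev_range) {
        hi  <- N * prev * sens              # true positives
        fa  <- N * (1 - prev) * (1 - spec)  # false positives
        ppv <- hi / (hi + fa)               # positive predictive value
        barplot(c(hi = hi, fa = fa),
                main = sprintf("prev = %.2f (PPV = %.2f)", prev, ppv))
      }
    }

    plot_range_sketch(prev_range = c(0.05, 0.15), sens = 0.90, spec = 0.80)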

Distinguish between 2 types of scaling

When plotting frequencies as graphical objects (lines, boxes, or squares), their dimensions can be scaled by magnitude (e.g., plot_fnet with area = "sq", or the new plot_bar function). When rounding frequencies to integers (as is done by default), the scaled graph may deviate from the underlying probabilities (especially for small population sizes N). In the extreme, small frequencies may be rounded to 0 and disappear from plots.

To control this effect, introduce a scale option that defines whether objects are scaled by (rounded or non-rounded) frequencies or by (exact) probabilities. (See plot_bar for a first implementation and generalize to other plots.)
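A sketch of how such a scale option could behave (box_widths() is a hypothetical helper): compute the relative widths of the hi and mi boxes either from rounded frequencies or from exact probabilities.

    box_widths <- function(prev, sens, N, scale = c("freq", "prob")) {
      scale <- match.arg(scale)
      if (scale == "freq") {
        hi <- round(N * prev * sens)        # rounded frequencies (current default behavior)
        mi <- round(N * prev * (1 - sens))
        c(hi = hi, mi = mi) / (hi + mi)
      } else {
        c(hi = sens, mi = 1 - sens)         # exact probabilities (within condition = TRUE)
      }
    }

    # With a small N, rounding can make a small frequency vanish entirely:
    box_widths(prev = 0.10, sens = 0.95, N = 20, scale = "freq")  # mi is rounded to 0
    box_widths(prev = 0.10, sens = 0.95, N = 20, scale = "prob")  # mi remains at 0.05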

Determine necessary and sufficient conditions for a well-defined riskyr scenario

We know that providing 3 essential probabilities (prev, sens, spec) OR providing 4 essential frequencies (hi, mi, fa, cr) fully specifies a scenario. However, which combinations of (arbitrary) probabilities and frequencies are necessary or sufficient?

(Looking at a network diagram should tell us which parts are independent from vs. dependent on each other. Interestingly, probabilities allow abstracting from frequencies, but not vice versa...)
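The arithmetic behind the two known cases (a sketch; both helpers are hypothetical): frequencies determine the probabilities and N, whereas probabilities only determine frequencies once N is supplied.

    # Probabilities + N --> frequencies (N is needed to make the scenario concrete):
    prob_to_freq <- function(prev, sens, spec, N) {
      c(hi = N * prev * sens,
        mi = N * prev * (1 - sens),
        fa = N * (1 - prev) * (1 - spec),
        cr = N * (1 - prev) * spec)
    }

    # Frequencies --> probabilities (N is implied, so nothing else is needed):
    freq_to_prob <- function(hi, mi, fa, cr) {
      c(prev = (hi + mi) / (hi + mi + fa + cr),
        sens = hi / (hi + mi),
        spec = cr / (fa + cr))
    }

    prob_to_freq(prev = 0.10, sens = 0.90, spec = 0.80, N = 1000)
    freq_to_prob(hi = 90, mi = 10, fa = 180, cr = 720)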

Defining scenarios by description vs. from data/cases (i.e., by experience)

Idea

riskyr currently assumes that scenarios are defined by 3 essential probabilities (typically prev, sens, and spec or fart, plus some population size N) or 4 essential frequencies (typically hi, mi, fa, and cr).

A more flexible setup would allow defining scenarios either from parameters (i.e., "by description") or from data or cases (i.e., "by experience").

  1. By description: Define a scenario from parameters (to create/simulate cases):

    • provide 4 essential frequencies (i.e., specifying the result)
    • provide 3 essential probabilities, N, and round to exact frequencies
    • provide 3 essential probabilities, N, and sample from given probabilities
  2. By experience: Define scenario from data or cases (to compute/extract parameters):

    • provide a binary data frame of cases (or a 2x2 matrix of frequencies)
    • provide a non-binary data frame of cases plus a criterion to be maximized when binarizing the predictor variable

ToDo

See comp_popu() for a first function that generates data/cases (as df popu) from one type of description:

  • from 4 essential frequencies

Add option for generating corresponding simulations: Generate popu (as df):

  • from probabilities and N (using exact or rounded values)
  • from probabilities and N (and sample() from N)

Define a complementary function desc_data() that generates the description from (binary or binarized) data or cases.
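A sketch of what desc_data() could do (hypothetical implementation; the column names condition and decision are assumptions), deriving the 4 essential frequencies and 3 essential probabilities from a binary data frame of cases:

    desc_data <- function(data) {
      hi <- sum( data$condition &  data$decision)   # hits
      mi <- sum( data$condition & !data$decision)   # misses
      fa <- sum(!data$condition &  data$decision)   # false alarms
      cr <- sum(!data$condition & !data$decision)   # correct rejections
      list(hi = hi, mi = mi, fa = fa, cr = cr,
           N    = hi + mi + fa + cr,
           prev = (hi + mi) / (hi + mi + fa + cr),
           sens = hi / (hi + mi),
           spec = cr / (fa + cr))
    }

    # Example: simulate 1000 cases "by experience" and recover their description:
    set.seed(1)
    popu <- data.frame(condition = runif(1000) < 0.10)
    popu$decision <- ifelse(popu$condition, runif(1000) < 0.90, runif(1000) < 0.20)
    desc_data(popu)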

Representing changes

Related to the issue of visualizing uncertainty, it would be desirable to visualize changes in parameter values. For instance, if a condition's prevalence (prev) or a test's sensitivity (sens) changed by some percentage, how would this affect the entire scenario or some other parameter (e.g., PPV, hi, acc)?

Again, this could easily be expressed by 2 distinct representations (pre- vs. post-change). Are there better ways to integrate the effects of changes into 1 representation (similarly to showing ranges of variability in some types of plots)?
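As a minimal illustration of the 2-representation approach, one can compute a downstream parameter (here PPV, via Bayes' rule) before and after a change in prevalence (ppv() is a hypothetical helper):

    ppv <- function(prev, sens, spec) {
      (prev * sens) / (prev * sens + (1 - prev) * (1 - spec))
    }

    ppv(prev = 0.10, sens = 0.90, spec = 0.80)  # pre-change:  PPV = 0.33
    ppv(prev = 0.20, sens = 0.90, spec = 0.80)  # post-change: PPV = 0.53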

Suggestion: An interactive simulation?

The riskyr package is looking great!

As I was looking through the (very good!) documentation and examples on GitHub, I thought of one other communication tool you might consider. Namely, creating an interactive Shiny application that allows people to interactively sample cases and observe outcomes. I know this technique has gotten some buzz in JDM and forecasting areas recently (though I can't remember the exact papers right now...).

There are many ways you could do this, but one way would be a screen like this one:

[Screenshot of the suggested simulation screen omitted]

Every time the user clicks a "Next case" button, a ball falls from 'the sky'; balls are either Red (true positive) or Blue (true negative) cases. They then cross a decision line, where they are classified as either positive or negative based on the sensitivity / specificity of the test.

Positive classifications go to the left, and negative classifications go to the right.

Over time, the ever-growing group of classified cases would form an icon array.

Just an idea :)
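A rough Shiny sketch of this idea (all names and parameter values here are assumptions, not part of riskyr): each click of "Next case" samples one case, classifies it via sens/spec, and adds it to a growing icon array, with positive classifications on the left and negative ones on the right.

    library(shiny)

    prev <- 0.10; sens <- 0.90; spec <- 0.80   # illustrative parameter values

    ui <- fluidPage(
      actionButton("next_case", "Next case"),
      plotOutput("icon_array")
    )

    server <- function(input, output, session) {
      cases <- reactiveVal(data.frame(condition = logical(0), decision = logical(0)))

      observeEvent(input$next_case, {
        condition <- runif(1) < prev            # does this case have the condition?
        decision  <- if (condition) runif(1) < sens else runif(1) < (1 - spec)
        cases(rbind(cases(), data.frame(condition = condition, decision = decision)))
      })

      output$icon_array <- renderPlot({
        d <- cases()
        if (nrow(d) == 0) return(NULL)
        # Arrange cases in a 5-wide icon array per decision group:
        i <- ave(seq_len(nrow(d)), d$decision, FUN = seq_along)
        x <- ifelse(d$decision, 1, 2) + ((i - 1) %% 5) * 0.15
        y <- (i - 1) %/% 5 + 1
        plot(x, y, pch = 19, cex = 2,
             col = ifelse(d$condition, "red", "blue"),  # colour encodes the true state
             xlim = c(0.5, 3), xaxt = "n", xlab = "", ylab = "Row in icon array")
        axis(1, at = c(1.3, 2.3), labels = c("Decide positive", "Decide negative"))
      })
    }

    shinyApp(ui, server)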
