crosstabr's People

Contributors

tklebel

crosstabr's Issues

Missing values

Find a way to deal with NA. For starters, they could just be omitted.

Would there be any alternative? It doesn't seem to make sense to include them in any way in the table.
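A minimal sketch of the "just omit them" option, using only base R. The example data and the variable names are made up for illustration and are not part of crosstabr. Counting the dropped rows is one possible alternative to silently discarding them, since the count could be reported next to the table.

```r
# Small example data with a missing value in each variable (illustrative only)
df <- data.frame(
  gender = factor(c("male", "female", NA, "male", "female", "male")),
  smoke  = factor(c("yes", "no", "yes", NA, "no", "yes"))
)

# Keep only rows where both variables are observed
complete <- df[complete.cases(df[, c("smoke", "gender")]), ]

# Cross-tabulate the complete cases
tab <- table(smoke = complete$smoke, gender = complete$gender)

# Number of rows omitted because of NA; could be shown as a note below the table
n_dropped <- nrow(df) - nrow(complete)
```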

Better way to find dimnames for matrix

Currently, dimnames are retrieved from the proportional data.frame by levels.

  • Check whether this approach is always valid. It should always find the correct labels for the resulting table.
  • Think about what to do with unused levels (levels whose values were changed to NA while the level vector stays the same). Should they simply be dropped by default? Should there be an option to keep them? (If yes, why?)
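The unused-level question can be reproduced in base R without crosstabr. This is only a sketch with made-up vectors: recoding values to NA leaves levels() untouched, so labels taken from levels() include categories that no longer occur, while droplevels() removes them.

```r
gender <- factor(c("male", "female", "male", "female"))
smoke  <- factor(c("yes", "no", "yes", "no"))

# Recode one category to missing; the level vector is not updated
gender[gender == "female"] <- NA
levels(gender)

# Labels derived from levels() still contain the unused category
tab_keep <- table(smoke, gender)
colnames(tab_keep)

# droplevels() removes levels that have no observations
tab_drop <- table(smoke, droplevels(gender))
colnames(tab_drop)
```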

Using lapply or purrr::walk on crosstabr

Something along the following lines should work for end users.

list_fun <- function(indep) {
  print(crosstab(titanic, reformulate(indep, response = "Survived")))
}

vars <- names(titanic)[1:3]
purrr::walk(vars, list_fun)

Export Function

Think about a function to export the table. Export and copy to clipboard work in RStudio, but it could be useful to have a function similar to ggsave for programmatic use.
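One possible shape for such a function, as a rough sketch. save_table() is a hypothetical name, it takes a plain two-way table() object rather than a crosstabr object, and it writes bare HTML without styling or escaping of labels.

```r
# Hypothetical helper: write a two-way table to an HTML file
save_table <- function(tab, filename) {
  counts <- unclass(tab)
  # Header row: empty corner cell followed by the column labels
  header <- paste0(
    "<tr><th></th>",
    paste0("<th>", colnames(counts), "</th>", collapse = ""),
    "</tr>"
  )
  # One HTML row per table row: row label followed by the counts
  rows <- vapply(seq_len(nrow(counts)), function(i) {
    paste0(
      "<tr><th>", rownames(counts)[i], "</th>",
      paste0("<td>", counts[i, ], "</td>", collapse = ""),
      "</tr>"
    )
  }, character(1))
  writeLines(c("<table>", header, rows, "</table>"), filename)
  invisible(filename)
}

tab <- table(
  smoke  = c("yes", "no", "yes"),
  gender = c("male", "female", "male")
)
out <- save_table(tab, tempfile(fileext = ".html"))
```

Returning the file name invisibly keeps the function usable inside scripts and loops.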

drop unobserved factor levels

Think about the following case:

test_df2 <- data.frame(
  gender = factor(c("male", "female")),
  smoke = factor(c(rep("yes", 5), rep("no", 5))),
  age = factor(c("young", "old"))
)
cross_table(test_df2, smoke~gender)

# now we recode a level to be missing
test_df2$gender[test_df2$gender == "female"] <- NA

# females still show up
cross_table(test_df2, smoke~gender)

# levels should be removed too
test_df2$gender <- factor(test_df2$gender, levels = "male")
cross_table(test_df2, smoke~gender)

Should we simply do layout_column(drop = TRUE) to drop all unobserved factor levels? Or should we let the user specify which levels to remove?

For the first case the computation could make use of tidyr::complete(model_frame).

gmodels::CrossTable seems to drop unobserved factor levels. For exploratory analysis this is not optimal: you should be able to notice if some combinations were not observed in the data.

Maybe layout_column() could gain the argument drop:

  • drop can be TRUE or FALSE, with default TRUE.
  • Alternatively, you can provide a character vector specifying the levels to drop (if not all unobserved levels should be dropped).
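The two forms of the proposed drop argument could be handled roughly as follows. apply_drop() is a hypothetical helper, not an existing crosstabr function. It assumes that a character vector may only remove levels that are actually unobserved, which is a design choice the issue leaves open.

```r
# Hypothetical helper implementing the proposed `drop` semantics for one factor
apply_drop <- function(x, drop = TRUE) {
  stopifnot(is.factor(x))
  if (is.character(drop)) {
    # Assumption: only unobserved levels may be removed by name
    observed   <- unique(as.character(x[!is.na(x)]))
    unobserved <- setdiff(levels(x), observed)
    remove     <- intersect(drop, unobserved)
    factor(x, levels = setdiff(levels(x), remove))
  } else if (isTRUE(drop)) {
    # Drop every level without observations
    droplevels(x)
  } else {
    # drop = FALSE: keep all levels, including empty ones
    x
  }
}

x <- factor(c("a", "b"), levels = c("a", "b", "c", "d"))
levels(apply_drop(x))               # all unobserved levels removed
levels(apply_drop(x, drop = FALSE)) # nothing removed
levels(apply_drop(x, drop = "c"))   # only "c" removed
```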

add layer to cross_table

cross_table should have a separate argument layer, where you can specify a third variable, which you want to control for. This seems easier to remember than to add another variable to the input-formula.
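What a layer amounts to can be sketched in base R: one two-way table per level of the third variable. The data below are made up, and the layer argument itself is only the proposal from this issue, not an implemented interface.

```r
df <- data.frame(
  gender = factor(rep(c("male", "female"), 4)),
  smoke  = factor(rep(c("yes", "yes", "no", "no"), 2)),
  age    = factor(rep(c("young", "old"), each = 4))
)

# Split the data by the layer variable and cross-tabulate each part.
# This is roughly what a proposed call such as
# cross_table(df, smoke ~ gender, layer = age) would have to do internally.
by_layer <- lapply(split(df, df$age), function(d) {
  table(smoke = d$smoke, gender = d$gender)
})

names(by_layer)
by_layer$young
```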

Rethink internals for add_stats

Maybe incorporate the approach used by funs from dplyr?

At least use: lazyeval::lazy_dots(...) and lazyeval::as.lazy_dots().

Think about cases with user defined functions – are they working in the current approach?

build_tab

Extract parts which are building the table from print.cross_table.

Should be similar to ggvis:

  • at the top: guess layer, if none is provided
  • setup layout (column or row)
  • add custom labels, if provided
  • add stats computed by add_stats

implement add_stats

add_stats should take arguments as a vector and output a box with stats, taken from vcd::assocstats (chi-square, Cramér's V, ...).
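For reference, the two statistics named above can also be computed with base R alone. This is a sketch on made-up counts, not the add_stats implementation. Cramér's V is computed from the Pearson chi-square statistic as sqrt(chi2 / (n * (min(rows, cols) - 1))).

```r
# Made-up 2 x 2 table of counts
tab <- matrix(
  c(30, 10, 20, 40),
  nrow = 2,
  dimnames = list(smoke = c("yes", "no"), gender = c("male", "female"))
)

# Pearson chi-square without continuity correction
test <- chisq.test(tab, correct = FALSE)
chi2 <- unname(test$statistic)

# Cramér's V derived from the chi-square statistic
n <- sum(tab)
cramers_v <- sqrt(chi2 / (n * (min(dim(tab)) - 1)))

stats <- c(chi_square = chi2, cramers_v = cramers_v, p_value = test$p.value)
```

A named vector like stats would be one simple input for the box of statistics below the table.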

Plotting function

Think about plotting the table. Could possibly work with ggplot2 heat-map, where you plot the residuals.

Should be fairly simple, since the factors are in the data.frame already, so it is just a matter of gathering and plotting accordingly. A layered cross_table could be implemented with facet_wrap.
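A sketch of the residual heat-map idea, assuming Pearson residuals from chisq.test() are the residuals meant here. The counts are made up, and the plot is only built if ggplot2 is installed.

```r
# Made-up 2 x 2 table of counts
tab <- matrix(
  c(30, 10, 20, 40),
  nrow = 2,
  dimnames = list(smoke = c("yes", "no"), gender = c("male", "female"))
)

# Pearson residuals, reshaped to long format (one row per cell)
res <- chisq.test(tab, correct = FALSE)$residuals
res_df <- as.data.frame(as.table(res))
names(res_df) <- c("smoke", "gender", "residual")

if (requireNamespace("ggplot2", quietly = TRUE)) {
  # One tile per cell, coloured by the size and sign of the residual
  p <- ggplot2::ggplot(
    res_df,
    ggplot2::aes(x = gender, y = smoke, fill = residual)
  ) +
    ggplot2::geom_tile() +
    ggplot2::scale_fill_gradient2()
  # For a layered table, res_df would carry an extra layer column and
  # ggplot2::facet_wrap() could draw one panel per layer.
}
```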

Resizable table

Table should either be resizable by user (look into jQuery with ggvis) or should resize automatically (js?)

add_labels

Function to add labels which are to be used instead of variable names.

Manage css files for knit_print

Rename table class

  • crosstab_outer
  • crosstab_inner

Resolve compatibility issues between stylesheets

Currently the first row of the table is displayed in bold letters, which shouldn't be the case. Furthermore, the stylesheet of crosstabr should not change anything other than its own objects.

Output to Markdown

Think about how output could be created for RMarkdown. Would it be better to first create markdown-output and then convert to HTML for general viewing, or should output to markdown take a different path?
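To make the "markdown first" path concrete, here is a rough base R sketch that turns a two-way table into the lines of a pipe table. to_markdown() is a hypothetical name and works on a plain table() object, not on a crosstabr object. knitr::kable() already produces this kind of output and might be the simpler route.

```r
# Hypothetical helper: render a two-way table as Markdown pipe-table lines
to_markdown <- function(tab) {
  counts <- unclass(tab)
  row_line <- function(cells) {
    paste0("| ", paste(cells, collapse = " | "), " |")
  }
  header <- row_line(c("", colnames(counts)))
  rule   <- row_line(rep("---", ncol(counts) + 1))
  body   <- vapply(seq_len(nrow(counts)), function(i) {
    row_line(c(rownames(counts)[i], counts[i, ]))
  }, character(1))
  c(header, rule, body)
}

tab <- table(
  smoke  = c("yes", "no", "yes"),
  gender = c("male", "female", "male")
)
md <- to_markdown(tab)
cat(md, sep = "\n")
```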

basic structure

Flesh out basic structure for cross_table and column_layout (or better: layout_column?).

Should have:

  • Input as formula
  • Output to RStudio-viewer, similar to ggvis
  • simply display counts and column-wise percent
