hofnerb / papeR

A toolbox for writing Sweave or other LaTeX-based papers and reports and to prettify the output of various estimated models.

Home Page: http://cran.r-project.org/package=papeR

sweave r-language r-package reproducible-research reproducible reporting cran knitr latex

papeR's Introduction

papeR

papeR provides a toolbox for writing knitr, Sweave or other LaTeX- or markdown-based papers and reports and to prettify the output of various estimated models.

Installation:

  • Current version (from CRAN):
install.packages("papeR")
  • Latest development version from GitHub:
library("devtools")
install_github("hofnerb/papeR")
  • To be able to use the install_github() command, one needs to install devtools first:
install.packages("devtools")

Using papeR

Tutorials on how to use papeR can be found on CRAN or within R via

## introduction to papeR (in combination with Markdown)
vignette("papeR_introduction", package = "papeR")
## introduction to papeR with LaTeX
vignette("papeR_with_latex", package = "papeR")

papeR's Issues

\endhead missing

... at least for longtables. Does it break the code for normal tables?

small issue: prettify digits problem in CI upper

For a Cox regression model, using digits = 3, the CI (upper) column shows only 2 decimal digits.
My code:
prettify(summary(mcox.fit1.train), digits = 3)
My result:

               coef Hazard Ratio CI (lower) CI (upper) se(coef)      z Pr(>|z|)    
1     cpscore  0.80653        2.240      1.793       2.80   0.1137  7.096   <0.001 ***
2      sex: M -0.09104        0.913      0.654       1.27   0.1699 -0.536    0.592    
3 location: R  0.44916        1.567      1.099       2.23   0.1809  2.483    0.013   *

error while using knitr with papeR

Hello,

I get a compilation error when trying to use knitr and papeR to produce a PDF (through LaTeX).

The latex comments such as
%% Output requires \usepackage

seem to be escaped to \%\% during the rmarkdown/knitr processing, and LaTeX therefore fails to compile. I cannot see any option in your code to prevent such comments from being generated, and I cannot find out how to prevent those % from being escaped.

with the following code:

xtable(summarize(data, type = "factor", variables = "Type"))

here is the part where LaTeX fails during pandoc generation:

! LaTeX Error: Can be used only in preamble.

See the LaTeX manual or LaTeX Companion for explanation.
Type H for immediate help.
...

l.122 %% Output requires \usepackage

pandoc: Error producing PDF

I'm using RStudio 1.0.143, rmarkdown 1.5, knitr 1.15.1, R 3.3.2 on a Mac OS X 10.10.5

Clean up

Can and should we replace

as.labeled.data.frame(object, ...)
is.labeled.data.frame(object)

with

as.ldf(object, ...)
is.ldf(object)

?

Fix scoping in summarize function

require("nlme")
require("papeR")

data(Orthodont, package = "nlme")

test = function(type) {
    a1 = Orthodont
    print(summary(a1))
    ## Get summary for continuous variables
    (tab1 <- summarize(a1, type = type))
}
test("factor")
##  Error in mySapply(data[, variables], is.factor) : object 'a1' not found 

test("numeric")
##  Error in summarize_numeric(data = a1) : object 'a1' not found 

(spotted by Douglas Ezra Morrison)
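
The "object 'a1' not found" errors are what one typically sees when an unevaluated argument is evaluated in an environment other than the caller's. The following base R sketch reproduces that pattern under this assumption; summarize_bad() and summarize_good() are hypothetical stand-ins and not papeR's actual code.

```r
## Hypothetical stand-ins for illustration; NOT papeR's implementation.
summarize_bad <- function(data) {
    expr <- substitute(data)            # the unevaluated argument, e.g. `a1`
    eval(expr, envir = globalenv())     # looks for `a1` in the global environment
}

summarize_good <- function(data) {
    expr <- substitute(data)
    env <- parent.frame()               # the frame in which the caller defined `a1`
    eval(expr, envir = env)
}

test <- function(f) {
    a1 <- data.frame(x = 1:3)           # local to test(), as in the report above
    f(a1)
}

## Capture the error instead of stopping, so both variants can be compared
bad  <- tryCatch(test(summarize_bad), error = function(e) e)
good <- test(summarize_good)
```

Here bad is an error condition (as long as no global object named a1 exists), while good is the data frame defined inside test().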

summarise() masks dplyr::summarise()

I am trying out papeR for managing data labels. I didn't know it had a summarise function and was unhappy that it masks dplyr::summarise. Thankfully this only tripped me up for a few minutes but I could see it giving a newer user a pretty big headache.

With papeR loaded after dplyr:

> mtcars %>%
+   group_by(cyl) %>%
+   summarise(mpg = mean(mpg))
Error in is.data.frame(x) : 
  (list) object cannot be coerced to type 'double'
In addition: Warning message:
In mean.default(data, na.rm = TRUE) :
  argument is not numeric or logical: returning NA

Is the conflict with dplyr::summarise intentional? Would you consider a different function name to avoid this conflict? I'll load papeR first going forward but that won't help others.
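
Independent of the load order, the masking can be sidestepped by calling the dplyr functions with an explicit namespace prefix. A minimal sketch, assuming dplyr is installed; no package is attached here, so nothing can be masked:

```r
## Fully qualified calls always resolve to dplyr, whatever is attached later
by_cyl <- dplyr::group_by(mtcars, cyl)
res <- dplyr::summarise(by_cyl, mpg = mean(mpg))
res
```

The same prefix works inside a pipe, e.g. replacing summarise(...) with dplyr::summarise(...) in the example above.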

knitr::kable() summary output gives variable names *and* labels

The result of summarize_numeric contains the variable names as row names if labels = TRUE. These row names are printed in markdown output via the default knitr::kable(), which is unintended. A workaround is to call knitr::kable() with argument row.names = FALSE. In contrast, the print.xtable.summary method automatically hides the row names.

Illustration:

library("papeR")
#> Lade nötiges Paket: car
#> Lade nötiges Paket: carData
#> Lade nötiges Paket: xtable
#> 
#> Attache Paket: 'papeR'
#> The following object is masked from 'package:utils':
#> 
#>     toLatex
data(Orthodont, package = "nlme")
labels(Orthodont, "distance") <- "Fissure distance (mm)"

print(sum0 <- summarize(Orthodont))
#> Factors are dropped from the summary
#>              N    Mean   SD    Min Q1 Median Q3  Max
#> 1 distance 108   24.02 2.93   16.5 22  23.75 26 31.5
#> 2      age 108   11.00 2.25    8.0  9  11.00 13 14.0
print(sum1 <- summarize(Orthodont, labels = TRUE))
#> Factors are dropped from the summary
#>                           N    Mean   SD    Min Q1 Median Q3  Max
#> 1 Fissure distance (mm) 108   24.02 2.93   16.5 22  23.75 26 31.5
#> 2                   age 108   11.00 2.25    8.0  9  11.00 13 14.0

## xtable() uses *either* variable names *or* labels
xtable(sum0)
#> NOTE: Output requires \usepackage{booktabs} in your preamble.
#> \begin{center}
#> % latex table generated in R 3.4.4 by xtable 1.8-3 package
#> % Tue Feb  5 17:20:54 2019
#> \begin{tabular}{lrrrrrrrrrr}
#>   \toprule
#>    & N &   & Mean & SD &   & Min & Q1 & Median & Q3 & Max \\ 
#>     \cmidrule{2-2}  \cmidrule{4-5} \cmidrule{7-11}
#>  distance & 108 &  & 24.02 & 2.93 &  & 16.50 & 22.00 & 23.75 & 26.00 & 31.50 \\ 
#>   age & 108 &  & 11.00 & 2.25 &  & 8.00 & 9.00 & 11.00 & 13.00 & 14.00 \\ 
#>    \bottomrule
#> \end{tabular}
#> \end{center}
xtable(sum1)
#> NOTE: Output requires \usepackage{booktabs} in your preamble.
#> \begin{center}
#> % latex table generated in R 3.4.4 by xtable 1.8-3 package
#> % Tue Feb  5 17:20:54 2019
#> \begin{tabular}{lrrrrrrrrrr}
#>   \toprule
#>    & N &   & Mean & SD &   & Min & Q1 & Median & Q3 & Max \\ 
#>     \cmidrule{2-2}  \cmidrule{4-5} \cmidrule{7-11}
#>  Fissure distance (mm) & 108 &  & 24.02 & 2.93 &  & 16.50 & 22.00 & 23.75 & 26.00 & 31.50 \\ 
#>   age & 108 &  & 11.00 & 2.25 &  & 8.00 & 9.00 & 11.00 & 13.00 & 14.00 \\ 
#>    \bottomrule
#> \end{tabular}
#> \end{center}

## however, kable() gives both
knitr::kable(sum0)
              N    Mean    SD    Min   Q1   Median   Q3    Max
distance    108   24.02  2.93   16.5   22    23.75   26   31.5
age         108   11.00  2.25    8.0    9    11.00   13   14.0
knitr::kable(sum1)  # gives variable names *and* labels
                                      N    Mean    SD    Min   Q1   Median   Q3    Max
distance   Fissure distance (mm)    108   24.02  2.93   16.5   22    23.75   26   31.5
age        age                      108   11.00  2.25    8.0    9    11.00   13   14.0

The reason for this is that the data.frame() setup in summarize_numeric()

papeR/R/summarize.R

Lines 129 to 133 in 9e79d5b

sums <- data.frame(variable = variable.labels, group = NA, blank = "",
                   N=NA, Missing = NA, blank_1 = "",
                   Mean=NA, SD=NA, blank_2 = "",
                   Min=NA, Q1=NA, Median=NA, Q3=NA, Max=NA, var = variables,
                   stringsAsFactors = FALSE)

picks up names(variable.labels) as default row names.

rownames(sum0)
#> [1] "1" "2"
rownames(sum1)
#> [1] "distance" "age"

Are these row names used anywhere else?

For markdown output, it would be convenient to have no row names, i.e., force row.names = NULL in the above data.frame setup. Otherwise I need to manually call knitr::kable() with row.names = FALSE for every summary in my R Markdown document.
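
The row-name behaviour described above can be reproduced in base R without papeR. This is a reduced sketch: the data frame has only two of the columns from the original setup, and the object names with_names and without_names are made up for illustration.

```r
## A named character vector, like `variable.labels` when labels = TRUE
variable.labels <- c(distance = "Fissure distance (mm)", age = "age")

## Default: data.frame() picks up the names of the vector as row names
with_names <- data.frame(variable = variable.labels, N = NA,
                         stringsAsFactors = FALSE)

## row.names = NULL forces the automatic row names 1, 2, ...
without_names <- data.frame(variable = variable.labels, N = NA,
                            row.names = NULL, stringsAsFactors = FALSE)

rownames(with_names)     # "distance" "age"
rownames(without_names)  # "1" "2"
```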

Make new dependency checks happy

To satisfy the new dependency checks on CRAN, follow this recipe (as suggested by Kurt Hornik).

1.) Copy the functions listed after "Undefined global functions or variables:" in the output of R CMD check into a variable txt:

txt <- "abline as.formula barplot boxplot citation coef coefficients
     complete.cases fivenum lm plot qnorm qt sd sessionInfo symnum vcov"

2.) With the function

imports_for_undefined_globals <- function(txt, lst, selective = TRUE) {
    if(!missing(txt))
        lst <- scan(what = character(), text = txt, quiet = TRUE)
    nms <- lapply(lst, find)
    ind <- sapply(nms, length) > 0L
    imp <- split(lst[ind], substring(unlist(nms[ind]), 9L))
    if(selective) {
        sprintf("importFrom(%s)",
                vapply(Map(c, names(imp), imp),
                       function(e)
                           paste0("\"", e, "\"", collapse = ", "),
                       ""))
    } else {
        sprintf("import(\"%s\")", names(imp))
    }
}

one can obtain the imports via

writeLines(imports_for_undefined_globals(txt))

3.) Copy and paste the output to NAMESPACE.
4.) Add the packages to Imports in DESCRIPTION.

Passing group when summarizing factor variables

Hi all,

I have a simple dataset of factors that I am trying to summarize using latex.table.fac or xtable().

However, when I run xtable(summarize(data, type = "factor", variables = c(...), group = "...")), I get the following error:

Error in fisher.test(c(150L, 220L, 204L, 176L, 40L, 2L, 56L, 172L, 379L, : FEXACT error 501. The hash table key cannot be computed because the largest key is larger than the largest representable int. The algorithm cannot proceed. Reduce the workspace, consider using 'simulate.p.value=TRUE' or another algorithm.

I've no idea where this error is coming from and that's why I'm posting here. I know my example is not reproducible, but does anyone recognize this error and why it's appearing? Will try to replicate with a base R dataset now...
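
The message comes from stats::fisher.test(), and it names simulate.p.value = TRUE as one way out. A minimal base R sketch of that option on a made-up 2 x 3 table follows; the counts are purely illustrative, and whether summarize() can pass this argument through to fisher.test() is not something this thread confirms.

```r
set.seed(42)  # the simulated p-value depends on the random seed

## Illustrative counts only; not the data from the report above
tab <- matrix(c(12, 5, 7, 9, 3, 14), nrow = 2,
              dimnames = list(group = c("A", "B"),
                              level = c("low", "mid", "high")))

## Monte Carlo p-value based on B simulated tables instead of the exact
## network algorithm that raised the FEXACT error
res <- fisher.test(tab, simulate.p.value = TRUE, B = 2000)
res$p.value
```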

Error in UseMethod("prettify") : no applicable method for object of class "c('lmerMod', 'merMod')"

From https://cran.r-project.org/web/packages/papeR/papeR.pdf :

method for mixed models fitted with lme4 (vers. >= 1.0)

S3 method for class 'summary.merMod'

prettify(object, labels = NULL, sep = ": ", extra.column = FALSE,
         confint = TRUE, level = 0.95,
         smallest.pval = 0.001, digits = NULL, scientific = FALSE,
         signif.stars = getOption("show.signif.stars"),
         method = c("profile", "Wald", "boot"), B = 1000, env = parent.frame(), ...)

I am trying to prettify() lme4 models & receive this error (for any & all of the models):

Error in UseMethod("prettify") : 
  no applicable method for 'prettify' applied to an object of class "c('lmerMod', 'merMod')"

I tried with many types of lme4 models of different datasets. Same results across the board.

Thank you for your time and effort with this package!
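
The documentation quoted above registers the method for class 'summary.merMod', whereas the error reports class 'lmerMod', which suggests that the fitted model rather than its summary was passed to prettify(). The base R sketch below shows how S3 dispatch produces this kind of error; demo_prettify() and the class demoMod are invented for illustration and are not part of papeR or lme4.

```r
## Invented generic with a method for the *summary* class only
demo_prettify <- function(object, ...) UseMethod("demo_prettify")
demo_prettify.summary.demoMod <- function(object, ...) "prettified"

## Invented model class with its own summary() method
summary.demoMod <- function(object, ...) {
    structure(list(), class = "summary.demoMod")
}
mod <- structure(list(), class = "demoMod")

## Passing the model itself: no method for class "demoMod", so dispatch fails
res_model <- tryCatch(demo_prettify(mod), error = function(e) e)

## Passing the summary: dispatches to demo_prettify.summary.demoMod()
res_summary <- demo_prettify(summary(mod))
```

By analogy, calling prettify(summary(model)) instead of prettify(model) may be worth trying first.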

Use `xtable` to print output of `latex.table.xxx`

Use something like

print(xtable(TAB, caption = caption, label = label), 
        caption.placement = "top", hline.after = NULL,
        add.to.row = list(pos = list(-1, 1, nrow(TAB)),
                          command = c("\\toprule\n", "\\midrule\n", "\\bottomrule\n")))

with more complex \cmidrule{} commands.

Do not check error message

Test does not adhere to the guidance given in https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Writing-portable-packages:

Do not test the exact format of R messages (from R itself or from other
packages): They change, and they can be translated.

Packages have even tested the exact format of system error messages, which are platform-dependent and perhaps locale-dependent.

and thus results in an error:

 ── 1. Failure: write.bib (@test-toLatex.R#105) ────────────────────────────────
 `write.bib("nonexisting_pkg")` threw an error with unexpected message.
 Expected match: "package .* not found"
 Actual message: "there is no package called 'nonexisting_pkg'"
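
One way to follow this guidance is to assert only that an error is signalled, without matching its text. The sketch below uses base R and library() as a stand-in for write.bib(), on the assumption that no package called nonexisting_pkg is installed.

```r
## Capture the condition instead of comparing the (translatable) message
res <- tryCatch(
    library("nonexisting_pkg", character.only = TRUE),
    error = function(e) e
)

## The check relies on the condition class, not on the wording
is_error <- inherits(res, "error")
is_error
```

In a testthat test, the corresponding form would be expect_error() called without a regular expression.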

set labels via named vector

Setting labels for a subset of the variables is supported by the argument which of the labels<- function. So we can do:

data(Orthodont, package = "nlme")
labels(Orthodont, which = "distance") <- "Fissure distance (mm)"

I suggest allowing an alternative syntax for setting labels via a named vector, consistent with how labels() are extracted.

labels(Orthodont) <- c("distance" = "Fissure distance (mm)")

I would much prefer that syntax, especially for labelling a lot of variables simultaneously, because the variable names and labels are more closely connected.

This alternative syntax is probably straightforward to implement given the current implementation:

papeR/R/labels.R

Lines 59 to 61 in 9e79d5b

"labels<-" <- function(data, which = NULL, value){
which <- check_which(which, data, "define")

Maybe add the following at the beginning of the function body:

if (!is.null(names(value))) {
    if (!is.null(which))
        warning("ignoring argument 'which' since labels are named")
    which <- names(value)
}
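
To show the proposed logic end to end, here is a self-contained sketch. set_labels() and the attribute name "variable.labels" are hypothetical and chosen for this example only; they do not reflect how papeR stores labels internally.

```r
## Hypothetical helper; NOT papeR's `labels<-`
set_labels <- function(data, which = NULL, value) {
    ## Proposed addition: names of `value` take precedence over `which`
    if (!is.null(names(value))) {
        if (!is.null(which))
            warning("ignoring argument 'which' since labels are named")
        which <- names(value)
    }
    if (is.null(which))
        which <- names(data)
    stopifnot(all(which %in% names(data)), length(which) == length(value))

    ## Labels default to the variable names and are stored as an attribute
    labs <- attr(data, "variable.labels")
    if (is.null(labs))
        labs <- setNames(names(data), names(data))
    labs[which] <- unname(value)
    attr(data, "variable.labels") <- labs
    data
}

df <- data.frame(distance = c(26, 25), age = c(8, 10))
df <- set_labels(df, value = c(distance = "Fissure distance (mm)"))
attr(df, "variable.labels")
```

Only distance receives a new label; age keeps its variable name as the default label.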

Error in prettify(): Model can't be refitted and no labels are specified

Hi, I'm trying to prettify() output of a lme4::lmer() model. In trying to use this with my own output, I ran into some trouble, so tried to run the example, i.e.:


library(nlme)
library(papeR)
#> Loading required package: car
#> Loading required package: xtable
#> 
#> Attaching package: 'papeR'
#> The following object is masked from 'package:utils':
#> 
#>     toLatex
library(lme4)
#> Loading required package: Matrix
#> 
#> Attaching package: 'lme4'
#> The following object is masked from 'package:nlme':
#> 
#>     lmList

## Fit a model for distance with random intercept for Subject
mod4 <- lmer(distance ~ age + Sex + (1|Subject), data = Orthodont)
summary(mod4)
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: distance ~ age + Sex + (1 | Subject)
#>    Data: Orthodont
#> 
#> REML criterion at convergence: 437.5
#> 
#> Scaled residuals: 
#>     Min      1Q  Median      3Q     Max 
#> -3.7489 -0.5503 -0.0252  0.4534  3.6575 
#> 
#> Random effects:
#>  Groups   Name        Variance Std.Dev.
#>  Subject  (Intercept) 3.267    1.807   
#>  Residual             2.049    1.432   
#> Number of obs: 108, groups:  Subject, 27
#> 
#> Fixed effects:
#>             Estimate Std. Error t value
#> (Intercept) 17.70671    0.83392  21.233
#> age          0.66019    0.06161  10.716
#> SexFemale   -2.32102    0.76142  -3.048
#> 
#> Correlation of Fixed Effects:
#>           (Intr) age   
#> age       -0.813       
#> SexFemale -0.372  0.000
## Extract fixed effects table and make it pretty
prettify(summary(mod4))
#> Error in prettify.summary.merMod(summary(mod4)): Model can't be refitted and no labels are specified. Please specify labels.

How can I address this error: "Model can't be refitted and no labels are specified. Please specify labels."?

Do not replace registered S3 methods from base/recommended packages

Prof Brian Ripley wrote:

Do not replace registered S3 methods from base/recommended packages, something which is not allowed by the CRAN policies and will mean that everyone gets your method even if your namespace is unloaded.

There is a check for this in R-devel, and the details are shown on the CRAN results page for the package.

In some cases there appears to be modified copyrighted code from R/recommended packages used without giving credit in the DESCRIPTION file, so please review your compliance with that section of the CRAN policies.

Remedies depend on what you are trying to do. Ideas which have been used in other packages:

(a) if you want to make use of a class, say "lmList" but your objects are not really from the class defined in a standard package, you should give them an additional class, say c("lmList2", "lmList") and register methods for the additional class.

(b) if you want to change the behaviour of a generic, say predict(), for an existing class or two, you could add such a generic in your own package with default method stats::predict, and then register modified methods for your generic (in your own package).

Please submit an update correcting this and any other issues showing on the CRAN results page. Do not reply to this email to do so: use the webform.

summarize: Captions missing

when table = "tabular" and floating = FALSE. Can we re-add the LaTeX package capt-of via print.xtable.summary?

Check and suppress warning messages in summarize

summarize(Orthodont, type = "numeric", group = "Sex")
Factors are dropped from the summary
                Sex    N    Mean   SD    Min Q1 Median    Q3  Max   p.value
1   distance   Male   64   24.97 2.90   17.0 23  24.75 26.50 31.5        NA
1.1          Female   44   22.65 2.40   16.5 21  22.75 24.25 28.0      <NA>
2        age   Male   64   11.00 2.25    8.0  9  11.00 13.00 14.0         1
2.1          Female   44   11.00 2.26    8.0  9  11.00 13.00 14.0      <NA>
Warning message:
In names(sums)[names(sums) == "group"] <- labels(data, group) :
  number of items to replace is not a multiple of replacement length

Storage of LaTeX attributes

When should LaTeX specific attributes be computed and where should they be stored?
Currently, they are always computed and stored as attributes.

Relevant attributes are:

  • align
  • sep
  • sanitize
  • rules
  • header

Fix labels

This does not work in the tutorial:

> labels(Orthodont, which = "age")
 [1] "age" NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA   
 [28] NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA   
 [55] NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA   
 [82] NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA   
