hofnerb / paper

A toolbox for writing Sweave or other LaTeX-based papers and reports and to prettify the output of various estimated models.

Home Page: http://cran.r-project.org/package=papeR

R 100.00%

sweave r-language r-package reproducible-research reproducible reporting cran knitr latex

paper's Introduction

papeR

papeR provides a toolbox for writing knitr, Sweave or other LaTeX- or markdown-based papers and reports and to prettify the output of various estimated models.

Installation:

Current version (from CRAN):

install.packages("papeR")

Latest development version from GitHub:

library("devtools")
install_github("hofnerb/papeR")

To be able to use the install_github() command, one needs to install devtools first:

install.packages("devtools")

Using papeR

Tutorials on how to use papeR can be found on CRAN or within R via:

## introduction to papeR (in combination with Markdown)
vignette("papeR_introduction", package = "papeR")
## introduction to papeR with LaTeX
vignette("papeR_with_latex", package = "papeR")

paper's Issues

\endhead missing

... at least for longtables. Does it break the code for normal tables?

small issue: prettify digits problem in CI upper

For a Cox regression model, prettify() with digits = 3 shows only 2 decimal places in the CI (upper) column.
my code
prettify(summary(mcox.fit1.train),digits = 3)
my result

               coef Hazard Ratio CI (lower) CI (upper) se(coef)      z Pr(>|z|)    
1     cpscore  0.80653        2.240      1.793       2.80   0.1137  7.096   <0.001 ***
2      sex: M -0.09104        0.913      0.654       1.27   0.1699 -0.536    0.592    
3 location: R  0.44916        1.567      1.099       2.23   0.1809  2.483    0.013   *
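The output above is consistent with how base R treats digits as a number of significant digits, applied per column, and not as a number of decimal places. The sketch below uses only base R. The input values are made up to resemble the two CI columns, and whether prettify() relies on format() internally is an assumption that the report does not confirm.

```r
## Made-up values resembling the two CI columns in the report above
ci_lower <- c(1.793, 0.654, 1.099)
ci_upper <- c(2.798, 1.268, 2.234)

## digits = 3 asks for 3 significant digits:
## 0.654 needs 3 decimals, so the whole column is shown with 3 decimals
format(ci_lower, digits = 3)
#> [1] "1.793" "0.654" "1.099"

## every value here is >= 1, so 2 decimals already give 3 significant digits
format(ci_upper, digits = 3)
#> [1] "2.80" "1.27" "2.23"

## a fixed number of decimal places, if that is what is wanted
formatC(ci_upper, format = "f", digits = 3)
#> [1] "2.798" "1.268" "2.234"
```

If a fixed number of decimals is needed in every column, formatting the numeric columns explicitly (for example with formatC) before printing is one possible workaround.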

summarize: fix handling of latex escapes

\\%, \\sum, etc. need to be displayed correctly, in text and markdown output as well as in LaTeX output

Fix tests for grouped summary tables

library("papeR")
data(Orthodont, package = "nlme")
summarize(Orthodont, type = "numeric", group = "Sex")

does not show p-values

error while using knitr with papeR

Hello,

I get a compilation error when trying to use knitr and papeR to produce a PDF (through LaTeX).

The latex comments such as
%% Output requires \usepackage

seem to have their % characters escaped during the rmarkdown/knitr processing, and LaTeX therefore fails to compile. I cannot see any option in your code to prevent such comments from being generated, and I cannot find out how to prevent those % characters from being escaped.

with the following code:

xtable(summarize(data, type = "factor", variables = "Type"))

here is the part where LaTeX fails during pandoc generation:

! LaTeX Error: Can be used only in preamble.

See the LaTeX manual or LaTeX Companion for explanation.
Type H for immediate help.
...

l.122 %% Output requires \usepackage

pandoc: Error producing PDF

I'm using RStudio 1.0.143, rmarkdown 1.5, knitr 1.15.1, R 3.3.2 on a Mac OS X 10.10.5

Is it necessary to have a class `labeled.data.frame`?

Perhaps we can rewire everything to standard data frames?

Always separate p.value from 5-num summary

## wrong
xtable(summarize(Orthodont, type = "numeric", group = "Sex"))
## correct
xtable(summarize(Orthodont, type = "numeric", group = "Sex", quantiles = FALSE))

Clean up

Can and should we replace

as.labeled.data.frame(object, ...)
is.labeled.data.frame(object)

with

as.ldf(object, ...)
is.ldf(object)

summarize: Add comments that state which LaTeX packages are needed

E.g.

%% Output requires \usepackage{booktabs}.
%% Output requires \usepackage{capt-of}.

Fix captions for summarize(, type =factor, group =...)

Currently the group is not displayed.

Fix scoping in summarize function

require("nlme")
require("papeR")

data(Orthodont, package = "nlme")

test = function(type) {
    a1 = Orthodont
    print(summary(a1))
    ## Get summary for continuous variables
    (tab1 <- summarize(a1, type = type))
}
test("factor")
##  Error in mySapply(data[, variables], is.factor) : object 'a1' not found 

test("numeric")
##  Error in summarize_numeric(data = a1) : object 'a1' not found

(spotted by Douglas Ezra Morrison)
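One plausible cause of errors like these is that the expression passed as data is captured and later evaluated in an environment that cannot see the caller's local variables. This is an assumption and has not been checked against papeR's source. The sketch below reproduces the pattern in base R; bad_summary() and good_summary() are hypothetical stand-ins, not papeR functions.

```r
## Hypothetical stand-ins: both capture the expression passed as `data`
## and evaluate it later.
bad_summary <- function(data) {
    expr <- substitute(data)
    ## evaluated in the global environment: the caller's locals are invisible
    eval(expr, envir = globalenv())
}

good_summary <- function(data, env = parent.frame()) {
    expr <- substitute(data)
    ## evaluated in the calling frame, where the local variable lives
    eval(expr, envir = env)
}

test <- function() {
    a1 <- data.frame(x = 1:3)   # exists only inside test()
    list(bad  = tryCatch(bad_summary(a1),
                         error = function(e) conditionMessage(e)),
         good = good_summary(a1))
}

res <- test()
res$bad    # an "object 'a1' not found" style message
res$good   # the data frame defined inside test()
```

Passing the calling environment explicitly (here via a default of parent.frame()) is one common way to make such functions work when called from inside other functions.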

summarise() masks dplyr::summarise()

I am trying out papeR for managing data labels. I didn't know it had a summarise function and was unhappy that it masks dplyr::summarise. Thankfully this only tripped me up for a few minutes but I could see it giving a newer user a pretty big headache.

With papeR loaded after dplyr:

> mtcars %>%
+   group_by(cyl) %>%
+   summarise(mpg = mean(mpg))
Error in is.data.frame(x) : 
  (list) object cannot be coerced to type 'double'
In addition: Warning message:
In mean.default(data, na.rm = TRUE) :
  argument is not numeric or logical: returning NA

Is the conflict with dplyr::summarise intentional? Would you consider a different function name to avoid this conflict? I'll load papeR first going forward but that won't help others.
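Independent of the load order, a namespace-qualified call always reaches the intended function. The sketch below shows this workaround; it assumes dplyr is installed and skips the computation otherwise.

```r
## Workaround sketch: qualify the calls with their namespace so that the
## order in which papeR and dplyr were attached no longer matters.
if (requireNamespace("dplyr", quietly = TRUE)) {
    by_cyl <- dplyr::summarise(dplyr::group_by(mtcars, cyl), mpg = mean(mpg))
    print(by_cyl)   # one row per value of cyl
}
```

The same applies in the other direction: papeR::summarise() reaches the papeR version when dplyr was attached last.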

knitr::kable() summary output gives variable names and labels

The result of summarize_numeric contains the variable names as row names if labels = TRUE. These row names are printed in markdown output via the default knitr::kable(), which is unintended. A workaround is to call knitr::kable() with argument row.names = FALSE. In contrast, the print.xtable.summary method automatically hides the row names.

Illustration:

library("papeR")
#> Loading required package: car
#> Loading required package: carData
#> Loading required package: xtable
#> 
#> Attaching package: 'papeR'
#> The following object is masked from 'package:utils':
#> 
#>     toLatex
data(Orthodont, package = "nlme")
labels(Orthodont, "distance") <- "Fissure distance (mm)"

print(sum0 <- summarize(Orthodont))
#> Factors are dropped from the summary
#>              N    Mean   SD    Min Q1 Median Q3  Max
#> 1 distance 108   24.02 2.93   16.5 22  23.75 26 31.5
#> 2      age 108   11.00 2.25    8.0  9  11.00 13 14.0
print(sum1 <- summarize(Orthodont, labels = TRUE))
#> Factors are dropped from the summary
#>                           N    Mean   SD    Min Q1 Median Q3  Max
#> 1 Fissure distance (mm) 108   24.02 2.93   16.5 22  23.75 26 31.5
#> 2                   age 108   11.00 2.25    8.0  9  11.00 13 14.0

## xtable() uses *either* variable names *or* labels
xtable(sum0)
#> NOTE: Output requires \usepackage{booktabs} in your preamble.
#> \begin{center}
#> % latex table generated in R 3.4.4 by xtable 1.8-3 package
#> % Tue Feb  5 17:20:54 2019
#> \begin{tabular}{lrrrrrrrrrr}
#>   \toprule
#>    & N &   & Mean & SD &   & Min & Q1 & Median & Q3 & Max \\ 
#>     \cmidrule{2-2}  \cmidrule{4-5} \cmidrule{7-11}
#>  distance & 108 &  & 24.02 & 2.93 &  & 16.50 & 22.00 & 23.75 & 26.00 & 31.50 \\ 
#>   age & 108 &  & 11.00 & 2.25 &  & 8.00 & 9.00 & 11.00 & 13.00 & 14.00 \\ 
#>    \bottomrule
#> \end{tabular}
#> \end{center}
xtable(sum1)
#> NOTE: Output requires \usepackage{booktabs} in your preamble.
#> \begin{center}
#> % latex table generated in R 3.4.4 by xtable 1.8-3 package
#> % Tue Feb  5 17:20:54 2019
#> \begin{tabular}{lrrrrrrrrrr}
#>   \toprule
#>    & N &   & Mean & SD &   & Min & Q1 & Median & Q3 & Max \\ 
#>     \cmidrule{2-2}  \cmidrule{4-5} \cmidrule{7-11}
#>  Fissure distance (mm) & 108 &  & 24.02 & 2.93 &  & 16.50 & 22.00 & 23.75 & 26.00 & 31.50 \\ 
#>   age & 108 &  & 11.00 & 2.25 &  & 8.00 & 9.00 & 11.00 & 13.00 & 14.00 \\ 
#>    \bottomrule
#> \end{tabular}
#> \end{center}

## however, kable() gives both
knitr::kable(sum0)

            N     Mean    SD     Min   Q1  Median  Q3  Max
distance    108   24.02   2.93   16.5  22  23.75   26  31.5
age         108   11.00   2.25   8.0   9   11.00   13  14.0

knitr::kable(sum1)  # gives variable names *and* labels

                                    N     Mean    SD     Min   Q1  Median  Q3  Max
distance    Fissure distance (mm)   108   24.02   2.93   16.5  22  23.75   26  31.5
age         age                     108   11.00   2.25   8.0   9   11.00   13  14.0

The reason for this is that the data.frame() setup in summarize_numeric()

papeR/R/summarize.R

Lines 129 to 133 in 9e79d5b

    sums <- data.frame(variable = variable.labels, group = NA, blank = "",
                       N=NA, Missing = NA, blank_1 = "",
                       Mean=NA, SD=NA, blank_2 = "",
                       Min=NA, Q1=NA, Median=NA, Q3=NA, Max=NA, var = variables,
                       stringsAsFactors = FALSE)

picks up names(variable.labels) as default row names.

rownames(sum0)
#> [1] "1" "2"
rownames(sum1)
#> [1] "distance" "age"

Are these row names used anywhere else?

For markdown output, it would be convenient to have no row names, i.e., force row.names = NULL in the above data.frame setup. Otherwise I need to manually call knitr::kable() with row.names = FALSE for every summary in my R Markdown document.
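The row-name behaviour described above can be reproduced with base R alone. The following sketch uses a reduced data frame with made-up columns; only the named vector mirrors what variable.labels looks like when labels = TRUE.

```r
## Named character vector, as variable.labels is when labels = TRUE
variable.labels <- c(distance = "Fissure distance (mm)", age = "age")

## Default: data.frame() takes the names of the first named component
with_names <- data.frame(variable = variable.labels, N = NA,
                         stringsAsFactors = FALSE)
rownames(with_names)
#> [1] "distance" "age"

## Explicit row.names = NULL: automatic row names 1, 2, ...
without_names <- data.frame(variable = variable.labels, N = NA,
                            row.names = NULL, stringsAsFactors = FALSE)
rownames(without_names)
#> [1] "1" "2"
```

For an existing summary object, rownames(x) <- NULL before calling knitr::kable() should have a similar effect, assuming no other code depends on the row names.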

Add vignette to package

Add markdown vignette based on https://github.com/hofnerb/RR_Course/blob/master/Using_papeR.Rmd

Make new dependency checks happy

To make the new dependency checks on CRAN happy, follow this recipe (as suggested by Kurt Hornik).

1.) Copy the functions listed after "Undefined global functions or variables:" in the output of R CMD check into a variable txt:

txt <- "abline as.formula barplot boxplot citation coef coefficients
     complete.cases fivenum lm plot qnorm qt sd sessionInfo symnum vcov"

2.) With the function

imports_for_undefined_globals <- function(txt, lst, selective = TRUE) {
    if(!missing(txt))
        lst <- scan(what = character(), text = txt, quiet = TRUE)
    nms <- lapply(lst, find)
    ind <- sapply(nms, length) > 0L
    imp <- split(lst[ind], substring(unlist(nms[ind]), 9L))
    if(selective) {
        sprintf("importFrom(%s)",
                vapply(Map(c, names(imp), imp),
                       function(e)
                           paste0("\"", e, "\"", collapse = ", "),
                       ""))
    } else {
        sprintf("import(\"%s\")", names(imp))
    }
}

one can obtain the imports via

writeLines(imports_for_undefined_globals(txt))

3.) Copy and paste the output to NAMESPACE.
4.) Add the packages to Imports in DESCRIPTION.

Passing group when summarizing factor variables

Hi all,

I have a simple dataset of factors that I am trying to summarize using latex.table.fac or xtable().

However, when I run xtable(summarize(data, type = "factor", variables = c(...), group = "...")), I get the following error:

Error in fisher.test(c(150L, 220L, 204L, 176L, 40L, 2L, 56L, 172L, 379L, : FEXACT error 501. The hash table key cannot be computed because the largest key is larger than the largest representable int. The algorithm cannot proceed. Reduce the workspace, consider using 'simulate.p.value=TRUE' or another algorithm.

I've no idea where this error is coming from and that's why I'm posting here. I know my example is not reproducible, but does anyone recognize this error and why it's appearing? Will try to replicate with a base R dataset now...
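The error message comes from stats::fisher.test(), whose exact algorithm can fail for large tables, and the message itself points to simulate.p.value = TRUE. The sketch below shows that option on a small base R table. Whether summarize() lets the user pass this argument through is not stated in the report, so the example calls fisher.test() directly.

```r
set.seed(1)  # the simulated p-value depends on the random seed

## small 3 x 3 contingency table from a base R data set
tab <- table(mtcars$cyl, mtcars$gear)

## Monte Carlo p-value instead of the exact network algorithm
p_sim <- fisher.test(tab, simulate.p.value = TRUE, B = 2000)$p.value
p_sim
```

For tables with large counts, a chi-squared test (chisq.test()) is another alternative that avoids the exact computation.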

Fix warning on CRAN: not overwrite nlme:::confint.lme

Registered S3 method from a standard package overwritten by 'papeR':
 method from
 confint.lme nlme

Error in UseMethod("prettify") : no applicable method for object of class "c('lmerMod', 'merMod')"

From https://cran.r-project.org/web/packages/papeR/papeR.pdf :

method for mixed models fitted with lme4 (vers. >= 1.0)

S3 method for class 'summary.merMod'

prettify(object, labels = NULL, sep = ": ", extra.column = FALSE,
confint = TRUE, level = 0.95,
smallest.pval = 0.001, digits = NULL, scientific = FALSE,
signif.stars = getOption("show.signif.stars"),
method = c("profile", "Wald", "boot"), B = 1000, env = parent.frame(), ...)

I am trying to prettify() lme4 models & receive this error (for any & all of the models):

Error in UseMethod("prettify") : 
  no applicable method for 'prettify' applied to an object of class "c('lmerMod', 'merMod')"

I tried with many types of lme4 models of different datasets. Same results across the board.

Thank you for your time and effort with this package!
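The documentation quoted above describes a method for class 'summary.merMod', that is, for the summary of a model and not for the model object itself. S3 dispatch therefore finds no method when the fitted model is passed directly. The toy example below illustrates this with a hypothetical generic pretty_sketch() and an lm fit; it does not use papeR or lme4.

```r
## Toy generic with a method for the *summary* class only (hypothetical names)
pretty_sketch <- function(object, ...) UseMethod("pretty_sketch")
pretty_sketch.summary.lm <- function(object, ...) round(coef(object), 3)

fit <- lm(dist ~ speed, data = cars)

## the model itself has class "lm": no applicable method
on_model <- tryCatch(pretty_sketch(fit),
                     error = function(e) "no applicable method")

## the summary has class "summary.lm": dispatch succeeds
on_summary <- pretty_sketch(summary(fit))
on_summary
```

By analogy, calling prettify(summary(model)) instead of prettify(model) may resolve the dispatch error for lme4 fits.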

Use `xtable` to print output of `latex.table.xxx`

Use something like

print(xtable(TAB, caption = caption, label = label), 
        caption.placement = "top", hline.after = NULL,
        add.to.row = list(pos = list(-1, 1, nrow(TAB)),
                          command = c("\\toprule\n", "\\midrule\n", "\\bottomrule\n")))

with more complex \cmidrule{} commands.

`table.data.frame` must work with `kable` etc.

Make all clean up in prettify.table.xxx:

remove "blank" etc. from object
rename columns
remove duplicate variable names

include.rownames = TRUE breaks table layout

The lines are not set correctly with include.rownames = TRUE. Either we should disallow this completely or we need to fix the lines.

Add CRAN Badge

print(latex.table.fac(), table = "longtable") broken

print(latex.table.fac(), table = "longtable")

doesn't work, nor does

print(latex.table.fac(), tabular.environment = "longtable")

Do not check error message

Test does not adhere to the guidance given in https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Writing-portable-packages:

Do not test the exact format of R messages (from R itself or from other
packages): They change, and they can be translated.

Packages have even tested the exact format of system error messages, which are platform-dependent and perhaps locale-dependent.

and thus results in an error:

 ── 1. Failure: write.bib (@test-toLatex.R#105) ────────────────────────────────
 `write.bib("nonexisting_pkg")` threw an error with unexpected message.
 Expected match: "package .* not found"
 Actual message: "there is no package called 'nonexisting_pkg'"
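A test that follows this guidance checks that an error is raised without matching its wording. The sketch below shows the idea in base R, using loadNamespace() as a stand-in for write.bib(); with testthat, calling expect_error() without a regexp argument expresses the same check.

```r
## Check that an error is raised, without matching its (translatable) message
res <- tryCatch(loadNamespace("nonexisting_pkg"), error = function(e) e)

## TRUE if any error condition was signalled
inherits(res, "error")
```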

set labels via named vector

Setting labels for a subset of the variables is supported by the argument which of the labels<- function. So we can do:

data(Orthodont, package = "nlme")
labels(Orthodont, which = "distance") <- "Fissure distance (mm)"

I suggest allowing an alternative syntax for setting labels via a named vector, consistent with how labels() are extracted.

labels(Orthodont) <- c("distance" = "Fissure distance (mm)")

I would much prefer that syntax, especially for labelling a lot of variables simultaneously, because the variable names and labels are more closely connected.

This alternative syntax is probably straightforward to implement given the current implementation:

papeR/R/labels.R

Lines 59 to 61 in 9e79d5b

    "labels<-" <- function(data, which = NULL, value){
        which <- check_which(which, data, "define")

Maybe add the following at the beginning of the function body:

if (!is.null(names(value))) {
    if (!is.null(which))
        warning("ignoring argument 'which' since labels are named")
    which <- names(value)
}
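To see the proposed logic in isolation, the following self-contained sketch applies it to a plain data frame. The function set_labels_sketch() and the attribute name "labels_sketch" are hypothetical and only serve this illustration; papeR's actual storage of labels may differ.

```r
set_labels_sketch <- function(data, which = NULL, value) {
    ## named labels take precedence over 'which', as proposed above
    if (!is.null(names(value))) {
        if (!is.null(which))
            warning("ignoring argument 'which' since labels are named")
        which <- names(value)
    }
    if (is.null(which))
        which <- names(data)
    stopifnot(all(which %in% names(data)), length(which) == length(value))
    ## start from existing labels, defaulting to the variable names
    labs <- attr(data, "labels_sketch")
    if (is.null(labs))
        labs <- setNames(names(data), names(data))
    labs[which] <- unname(value)
    attr(data, "labels_sketch") <- labs
    data
}

df <- data.frame(distance = c(26, 25), age = c(8, 10))
df <- set_labels_sketch(df, value = c(distance = "Fissure distance (mm)"))
attr(df, "labels_sketch")
#>                distance                     age 
#> "Fissure distance (mm)"                   "age" 
```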

use `formattable` for more pretty

Nice package. For additional functionality and option for Prettify Output, formattable from @renkun-ken might be helpful. I'll try to be more thorough after more completely inspecting papeR.

Reference hofnerb/RR_Course#1

Use AppVeyor for CI on Windows

http://www.appveyor.com/

Hello! I was wondering if the function "prettify" would be implemented for objects of class 'aov' and 'listof'?

Hello, Mr.Benjamin. I'm an undergrad student from the US; I think that the papeR package is very useful and I use it a lot in my work. I notice that the function "prettify" is currently not implemented for an object of class like aov. Any chance that this would be implemented in the near future? Many thanks and best of luck.

Allow summarize to work with dates

e.g. via summarize_numeric()

Store labels as attributes of the variables in a data set

Think about storing variable labels as an attribute of each variable instead of as an attribute of the data.frame. This might ease handling of data.frames, such as subsetting.
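A minimal base R sketch of this idea follows. The attribute name "label" is an assumption chosen for illustration. It also shows a caveat: attributes on plain vectors survive column selection but are dropped by row subsetting unless a dedicated class with its own subsetting method is used.

```r
df <- data.frame(distance = c(26, 25), age = c(8, 10))

## store the label on the variable itself
attr(df$distance, "label") <- "Fissure distance (mm)"

## selecting columns keeps the attribute ...
by_col <- df[, "distance", drop = FALSE]
attr(by_col$distance, "label")
#> [1] "Fissure distance (mm)"

## ... but selecting rows drops it for plain numeric vectors
by_row <- df[1, , drop = FALSE]
attr(by_row$distance, "label")
#> NULL
```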

Error in prettify(): Model can't be refitted and no labels are specified

Hi, I'm trying to prettify() output of a lme4::lmer() model. In trying to use this with my own output, I ran into some trouble, so tried to run the example, i.e.:

library(nlme)
library(papeR)
#> Loading required package: car
#> Loading required package: xtable
#> 
#> Attaching package: 'papeR'
#> The following object is masked from 'package:utils':
#> 
#>     toLatex
library(lme4)
#> Loading required package: Matrix
#> 
#> Attaching package: 'lme4'
#> The following object is masked from 'package:nlme':
#> 
#>     lmList

## Fit a model for distance with random intercept for Subject
mod4 <- lmer(distance ~ age + Sex + (1|Subject), data = Orthodont)
summary(mod4)
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: distance ~ age + Sex + (1 | Subject)
#>    Data: Orthodont
#> 
#> REML criterion at convergence: 437.5
#> 
#> Scaled residuals: 
#>     Min      1Q  Median      3Q     Max 
#> -3.7489 -0.5503 -0.0252  0.4534  3.6575 
#> 
#> Random effects:
#>  Groups   Name        Variance Std.Dev.
#>  Subject  (Intercept) 3.267    1.807   
#>  Residual             2.049    1.432   
#> Number of obs: 108, groups:  Subject, 27
#> 
#> Fixed effects:
#>             Estimate Std. Error t value
#> (Intercept) 17.70671    0.83392  21.233
#> age          0.66019    0.06161  10.716
#> SexFemale   -2.32102    0.76142  -3.048
#> 
#> Correlation of Fixed Effects:
#>           (Intr) age   
#> age       -0.813       
#> SexFemale -0.372  0.000
## Extract fixed effects table and make it pretty
prettify(summary(mod4))
#> Error in prettify.summary.merMod(summary(mod4)): Model can't be refitted and no labels are specified. Please specify labels.

How can I address this error: "Model can't be refitted and no labels are specified. Please specify labels."?

Allow to change main and other plot parameters in plot.ldf

Do not replace registered S3 methods from base/recommended packages

Prof Brian Ripley wrote:

Do not replace registered S3 methods from base/recommended packages, something which is not allowed by the CRAN policies and will mean that everyone gets your method even if your namespace is unloaded.

There is a check for this in R-devel, and the details are shown on the CRAN results page for the package.

In some cases there appears to be modified copyrighted code from R/recommended packages used without giving credit in the DESCRIPTION file, so please review your compliance with that section of the CRAN policies.

Remedies depend on what you are trying to do. Ideas which have been used in other packages:

(a) if you want to make use of a class, say "lmList" but your objects are not really from the class defined in a standard package, you should give them an additional class, say c("lmList2", "lmList") and register methods for the additional class.

(b) if you want to change the behaviour of a generic, say predict(), for an existing class or two, you could add such a generic in your own package with default method stats::predict, and then register modified methods for your generic (in your own package).

Please submit an update correcting this and any other issues showing on the CRAN results page. Do not reply to this email to do so: use the webform.
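Remedy (b) can be sketched as follows, using confint() as the generic to wrap. The names my_confint and its methods are hypothetical, and the rounding only stands in for whatever modified behaviour a package needs; in a package, the methods would additionally be registered in NAMESPACE.

```r
## own generic, so no method of a base/recommended package is replaced
my_confint <- function(object, ...) UseMethod("my_confint")

## default: fall back to the standard generic
my_confint.default <- function(object, ...) stats::confint(object, ...)

## modified behaviour for one class only (here: rounded intervals)
my_confint.lm <- function(object, ...) round(stats::confint(object, ...), 2)

fit <- lm(dist ~ speed, data = cars)
ci <- my_confint(fit)
ci
```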

summary.coxph might already provide CI

It might not be necessary to refit the Cox model for CIs as these are included in the summary per default in

summary(mod)$conf.int
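A short illustration, assuming the survival package is installed: the model and data set below are arbitrary examples and are not taken from papeR.

```r
if (requireNamespace("survival", quietly = TRUE)) {
    mod <- survival::coxph(survival::Surv(time, status) ~ age + sex,
                           data = survival::lung)
    ## matrix with exp(coef), exp(-coef) and the confidence limits
    ci <- summary(mod)$conf.int
    print(ci)
}
```

The confidence level is controlled by the conf.int argument of summary() for coxph fits, which defaults to 0.95.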

Move part on xtable from markdown to LaTeX vignette

See #35

Create merge function

Make a clean-up function for markdown output

along the lines of print.xtable.summary if possible.

Group header missing

... for grouped statistics of factor variables.

summarize: Captions missing

when table = "tabular" and floating = FALSE. Can we re-add the LaTeX package capt-of via print.xtable.summary?

Check and suppress warning messages in summarize

summarize(Orthodont, type = "numeric", group = "Sex")
Factors are dropped from the summary
                Sex    N    Mean   SD    Min Q1 Median    Q3  Max   p.value
1   distance   Male   64   24.97 2.90   17.0 23  24.75 26.50 31.5        NA
1.1          Female   44   22.65 2.40   16.5 21  22.75 24.25 28.0      <NA>
2        age   Male   64   11.00 2.25    8.0  9  11.00 13.00 14.0         1
2.1          Female   44   11.00 2.26    8.0  9  11.00 13.00 14.0      <NA>
Warning message:
In names(sums)[names(sums) == "group"] <- labels(data, group) :
  number of items to replace is not a multiple of replacement length
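This warning is what base R emits when a single element is replaced by a vector of a different length. A plausible reading, not confirmed here, is that labels(data, group) returns more than one value. The sketch below reproduces the warning with made-up names.

```r
nms <- c("variable", "group", "N")

## one matching slot, three replacement values: R warns
msg <- tryCatch({
    nms[nms == "group"] <- c("Sex", NA, NA)
    NULL
}, warning = function(w) conditionMessage(w))
msg
```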

Storage of LaTeX attributes

When should LaTeX specific attributes be computed and where should they be stored?
Currently, they are always computed and stored as attributes.

Relevant attributes are:

align
sep
sanitize
rules
header

Allow scientific p-values with proper numbers

Print exact p-values e.g. as

1.3 * 10^-5

and provide a toLatex function to prettify this.
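One way this could look is sketched below. The function format_sci_sketch() is hypothetical and not part of papeR. It assumes strictly positive p-values and does not handle the edge case where rounding pushes the mantissa to 10.

```r
format_sci_sketch <- function(p, digits = 2, latex = FALSE) {
    expo <- as.integer(floor(log10(p)))   # power of ten
    mant <- signif(p / 10^expo, digits)   # mantissa in [1, 10)
    if (latex) {
        sprintf("$%s \\times 10^{%d}$", format(mant), expo)
    } else {
        sprintf("%s * 10^%d", format(mant), expo)
    }
}

format_sci_sketch(1.3e-5)
#> [1] "1.3 * 10^-5"
cat(format_sci_sketch(1.3e-5, latex = TRUE), "\n")
#> $1.3 \times 10^{-5}$
```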
(suggested by A. Ziegler)

Fix labels

This does not work in the tutorial:

> labels(Orthodont, which = "age")
 [1] "age" NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA   
 [28] NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA   
 [55] NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA   
 [82] NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
