Coder Social home page Coder Social logo

strengejacke / ggeffects Goto Github PK

View Code? Open in Web Editor NEW
513.0 10.0 33.0 238.34 MB

Estimated Marginal Means and Marginal Effects from Regression Models for ggplot2

Home Page: https://strengejacke.github.io/ggeffects

License: Other

R 99.46% TeX 0.28% SCSS 0.04% CSS 0.22%
hacktoberfest marginal-effects prediction estimated-marginal-means

ggeffects's Introduction

Hello there!

GitHub followers Twitter Follow Research gate

I'm a senior researcher working in the Institute of Medical Sociology (IMS) at the University Medical Center Hamburg-Eppendorf (UKE). I chair the working group on health and health care among the aged.

My research interests are

  • Health and health care among the aged
  • Family carers of older people
  • Organizational behaviour and health care research
  • Social inequalities in health care

Scientific and social media profiles

               

You can find further information on my website:

👉👉 www.danielluedecke.de 👈👈

including a list of publications.

Statistical skills

I'm co-chairing the working group "Research Methods" of the German Society of Medical Sociology (AG Methoden / DGMS) and do a lot R programming. Currently, I'm mostly involved in projects related to R statistics (through the easystats project or my other packages like ggeffects). However, my previous main activity in developing research software was the Zettelkasten, in Java.

Here are the main R-package projects I'm working on:

Data wrangling and preparation

  • datawizard 🧙 Magic potions to clean and transform your data

Statistics and regression modelling

  • parameters: 📊 Computation and processing of models' parameters
  • performance: 💪 Models' quality and performance metrics (R2, ICC, LOO, AIC, BF, ...)
  • bayestestR: 👻 Utilities for analyzing Bayesian models and posterior distributions
  • effectsize: 🐉 Compute and work with indices of effect size and standardized parameters
  • correlation: 🔗 Methods for Correlation Analysis
  • modelbased: 📈 Estimate effects, contrasts and means based on statistical models
  • report: 📜 🎉 Automated reporting of objects in R
  • insight: 🔮 Easy access to model information for various model objects

Data Visualization

  • ggeffects: Estimated marginal means and effects plots for regression models
  • see: 🎨 Visualisation toolbox for beautiful and publication-ready figures
  • sjPlot: Plots and tables for summary statistics, descriptive statistics and regression models

ggeffects's People

Contributors

b0ydt avatar crsh avatar gregorsteiner avatar indrajeetpatil avatar kant avatar mattansb avatar sam-crawley avatar seananderson avatar strengejacke avatar vincentarelbundock avatar wibeasley avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

ggeffects's Issues

A bug in ggpredict()?

Hi,

Thank you for sharing very nice package! I might have a bug with ggpredict().
It didn't seem to work with glmer.nb and nbinom1 and nbinom2 families in glmmTMB properly. The following is an example using the Owls dataset.

m00 <- glmmTMB(SiblingNegotiation~SexParent+(1|Nest),data=Owls,family=nbinom1)
m01 <- glmmTMB(SiblingNegotiation~SexParent+(1|Nest),data=Owls,family=nbinom2)
m02 <- glmer.nb(SiblingNegotiation~SexParent+(1|Nest),data=Owls)

ggpredict of the first two models returned an error.

> ggpredict(m00,terms="SexParent")
Error in lf(prdat$fit) : could not find function "lf"

ggpredict of the last model returned results, but the confidence intervals were very large.
Is this expected?

> ggpredict(m02,terms="SexParent")
Note: uncertainty of the random effects parameters are not taken into account for confidence intervals.
# A tibble: 2 x 5
      x predicted conf.low conf.high  group
  <dbl>     <dbl>    <dbl>     <dbl> <fctr>
1     1  5.721317 250.0507  372.7732      1
2     2  6.433730 517.0217  749.4767      1

Thanks!!

ci.style for plot()-method

dot-style and dash-style, currently also available throught the ci-argument, will be options for the ci.style-argument. This argument may also get an option to plot lines and error bars, like in this example: #48

Error since updating to version 0.2.2

I've been using ggpredict on a polr object without any problem, but since updating to the latest CRAN version, I get the following error:

Error in UseMethod("droplevels") :
no applicable method for 'droplevels' applied to an object of class "character"

The command that generates this error is:
ggpredict(fit.tri.int.b, terms = c('chain_type', 'sector'))

The sector variable in the data is of class character rather than factor, which seems to be the source of the error. I converted sector to a factor and the error disappears, and that is a simple workaround, but I'd prefer not to have to use factors if I can avoid it. Is this something that could be fixed?

Thanks,
David

ggeffect applied to a factor term

When you apply ggeffect to a factor term, e.g.

ggeffect(mod, "sex")

the values of x in the return object are numbers (i.e. 1 and 2) instead of the factor labels.

Incorrect Predictions for Non-linear Transformations

When using the log transform within the model, the wrong predictions are generated. The ggpredict function seems to treat the logged values of income as the raw values of income and then logs them again to generate the prediction.

> library(car)
> library(ggeffects)
> data(Prestige)
> summary(Prestige$income)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    611    4106    5930    6798    8187   25879 
> mod <- lm(prestige ~ log(income), data=Prestige)
> ggpredict(mod, terms="income")
Following variables had many unique values and were prettified: income. Use `pretty = FALSE` to get smoother plots with all values, however, at the cost of increased memory usage.
# A tibble: 4 x 5
      x predicted conf.low conf.high group
  <int>     <dbl>    <dbl>     <dbl> <fct>
1     6    -101.     -128.     -74.8 1    
2     8     -95.0    -120.     -69.7 1    
3    10     -90.2    -115.     -65.8 1    
4    12     -86.3    -110.     -62.5 1    

The values above in x are the logged values of income. Using the raw values of income to generate predictions from the model, we get something different and more resonable

> predict(mod, newdata=data.frame(income = exp(6)))
        1 
-10.51755 

problems install nloptr causing ggeffects not to install

see astamm/nloptr#40

sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] colorspace_1.3-2 scales_0.4.1     assertthat_0.1   lazyeval_0.2.0   plyr_1.8.4       tools_3.3.3      gtable_0.2.0     tibble_1.2      
 [9] remotes_1.1.0    Rcpp_0.12.8      ggplot2_2.2.0    grid_3.3.3       munsell_0.4.3   

Request: could you make ggeffects take objects from metafor (class rma.mv rma)?

Dear SJ

I was wondering whether you could get your wonderful package to work with rma.mv and rma objects? I just recently came across (yesterday) your sjstats and this package so wondering how difficult it would be to do this for the metafor package (a package for meta-analysis). In a meta-analysis, we often have a lot of categories so it would be good to know marginal effects but currently, this is not possible (at least easily) with metafor. Anyway, many thanks for these cool packages!

Thanks!

Shinichi

Cannot get ggeffects for glm.nb and glm with contiunous covariate

I'm using ggeffects to get marginal effects plotted from different glmer and glm models and I've found out ggeffects gives the following result when asking for a marginal effect of one of the continuous variables with glm.nb or glm models:

[1] x predicted conf.low conf.high
<0 rows> (or 0-length row.names)

For example, I cannot get ggeffects for anything with anosc in this model:
fit<- glm.nb(formula = ncone.san ~ Tratamiento * C + anosc + I(anosc^2) +
anosc:Tratamiento + I(anosc^2):Tratamiento + ivsc, data = datos,
init.theta = 1.133641349, link = log)

ggeffect(fit, c("anosc")) gives 0 rows, while ggeffect(fit, c("ivsc")) gives correct ggeffects:
# Predicted incidents for N.sanas.sum
# x = ivsc

x predicted conf.low conf.high
-1 442.571 254.379 769.991
0 274.681 170.779 441.798
1 170.480 82.409 352.674
2 105.808 35.151 318.496
3 65.670 14.490 297.622
4 40.758 5.898 281.638

Both are scales variables and are numeric (class(fit$model$anosc) = class(fit$model$ivsc)=numeric). The only difference is that anosc has only four unique values, but I don't think this should make a difference.

Any hint of what's going on?
Best regards,
Asier

Error message while installing: Ggeffects is corrupt

Hi!

I tried to install sjPlot and all relevant dependencies via github (to run the syntax for plotting my graphs, which you provided earlier), using the following syntax:

library(devtools)
devtools::install_github("strengejacke/sjlabelled", force= TRUE)
devtools::install_github("strengejacke/sjmisc", force= TRUE)
devtools::install_github("strengejacke/sjstats", force= TRUE)
devtools::install_github("strengejacke/ggeffects", force= TRUE)
devtools::install_github("strengejacke/sjPlot", force= TRUE)

However, when I try to install ggeffects, I get the following error message:

Downloading GitHub repo strengejacke/ggeffects@master
from URL https://api.github.com/repos/strengejacke/ggeffects/zipball/master
Installation failed: lazy-load database 'C:/Users/Marco/Documents/R/win-library/3.5/ggeffects/R/ggeffects.rdb' is corrupt
Warning messages:
1: In eapply(ns_env(pkg), force, all.names = TRUE) :
restarting interrupted promise evaluation
2: In eapply(ns_env(pkg), force, all.names = TRUE) :
internal error -3 in R_decompress1

Can you please tell me how to fix this issue? Thanks in advance!

Kind regards,

Marco

Automated tests

Hi @strengejacke. I'm currently moving my way through your JOSS review. I can't seem to find any information about how the function of the software is verified. Is this done through either manual or automatic tests? My apologies if this information is listed in documentation or paper, but I couldn't find it in either. Thanks!

`ggeffect()` not working for `mgcv::gam()`

Hello, I'm getting errors when trying to use ggeffect() on a model fitted by mgcv::gam() with and without spline terms.

> library(mgcv)
Loading required package: nlme
This is mgcv 1.8-24. For overview type 'help("mgcv-package")'.
> library(ggeffects)
> 
> fit1 <- gam(mpg ~ s(wt), data=mtcars)
> ggeffect(fit1, "wt")
Error in model.frame.default(mpg ~ s(wt) + wt, data = list(mpg = c(21,  : 
  invalid type (list) for variable 's(wt)'
> 
> fit2 <- gam(mpg ~ wt, data=mtcars)
> ggeffect(fit2, "wt")
Error in estimate.gam(G, method, optimizer, control, in.out, scale, gamma,  : 
  unknown smoothness selection criterion
> 
> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Manjaro Linux

Matrix products: default
BLAS: /usr/lib/libblas.so.3.8.0
LAPACK: /usr/lib/liblapack.so.3.8.0

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8       
 [4] LC_COLLATE=en_GB.UTF-8     LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggeffects_0.4.0 mgcv_1.8-24     nlme_3.1-137   

loaded via a namespace (and not attached):
 [1] tidyr_0.8.1        splines_3.5.0      carData_3.0-1      modelr_0.1.2      
 [5] assertthat_0.2.0   stats4_3.5.0       coin_1.2-2         pillar_1.2.3      
 [9] lattice_0.20-35    glue_1.2.0         glmmTMB_0.2.2.0    snakecase_0.9.1   
[13] minqa_1.2.4        colorspace_1.3-2   sandwich_2.4-0     Matrix_1.2-14     
[17] survey_3.33-2      plyr_1.8.4         psych_1.8.4        pkgconfig_2.0.1   
[21] broom_0.4.5        haven_1.1.2        purrr_0.2.5        xtable_1.8-2      
[25] mvtnorm_1.0-8      scales_0.5.0       stringdist_0.9.5.1 lme4_1.1-17       
[29] emmeans_1.2.2      tibble_1.4.2       effects_4.0-2      bayesplot_1.5.0   
[33] ggplot2_3.0.0      sjlabelled_1.0.11  TH.data_1.0-8      nnet_7.3-12       
[37] TMB_1.7.14         lazyeval_0.2.1     mnormt_1.5-5       survival_2.41-3   
[41] magrittr_1.5       crayon_1.3.4       estimability_1.3   MASS_7.3-49       
[45] forcats_0.3.0      foreign_0.8-70     tools_3.5.0        data.table_1.11.4 
[49] hms_0.4.2          multcomp_1.4-8     stringr_1.3.1      munsell_0.5.0     
[53] prediction_0.3.6   bindrcpp_0.2.2     compiler_3.5.0     rlang_0.2.1       
[57] grid_3.5.0         nloptr_1.0.4       ggridges_0.5.0     rstudioapi_0.7    
[61] gtable_0.2.0       codetools_0.2-15   sjstats_0.15.0     reshape2_1.4.3    
[65] sjmisc_2.7.3       R6_2.2.2           zoo_1.8-2          pwr_1.2-2         
[69] dplyr_0.7.6        bindr_0.1.1        modeltools_0.2-21  stringi_1.2.3     
[73] parallel_3.5.0     Rcpp_0.12.17       tidyselect_0.2.4   lmtest_0.9-36

x.as.factor = TRUE does not work for several levels?

Hi Daniel, again thank you very much for this nice package!

I have used your package a lot in the last few months to plot numeric as well as factorial explanatories. Until now, everything was working as expected but today I increased the numbers of levels in my factorial explanatory (going from seasons = 4 levels to months = 12 levels). As usual, I transformed the variable to a factor before fitting the model. But however ggpredict() plots a line as for numeric values, even when trying to force it not to do so by adding x.as.factor = TRUE.

Is this the behaviour we should expect? Is there any way to plot this as factors (point + errorbar), even though it are many levels?

Thank you very much in advance!

Memory error with high number of observations

Hi,
I'm trying to plot the fixed effects of a lmer model, but the number of data points seems to high for ggeffects despite being a reasonnalby large dataset.

reproducible example :

library(lme4)
data = data.frame(X=rnorm(30000), Y=rnorm(30000), Z=rep(1:2),cell=rep(c(1:10),3000))
LME <- lmer(X ~ Y * Z + (1|cell), data=data)
library(ggeffects)
dat <- ggpredict(LME, terms = c("Y","Z"))
plot(dat)

Erreur : impossible d'allouer un vecteur de taille 26.8 Go

sessionInfo() :

R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS
ggeffects_0.3.1 lme4_1.1-15 

Is there a way to reduce the object size ?

Thanks for your package by the way !

Confidence intervals not included for polr

First of all, a great package – thanks a lot for your work. I ran an ordinal logistic regression using the polr function from the MASS package. While the predicted probabilities are returned, the data frame does not include confidence intervals. How can I retrieve them? Thanks for a reply and your help!

library(MASS)
options(contrasts = c("contr.treatment", "contr.poly"))
house_plr <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)

predict_house <- ggpredict(house_plr, terms = "Type")

head(predict_house)
# # A tibble: 6 x 4
# x predicted response.level  group
# <dbl>     <dbl>          <chr> <fctr>
#   1     1 0.3784493            Low      1
# 2     1 0.2876752         Medium      1
# 3     1 0.3338755           High      1
# 4     2 0.5190445            Low      1
# 5     2 0.2605077         Medium      1
# 6     2 0.2204478           High      1

ggpredict is not working for glmmTMB hurdle-models

library(glmmTMB)
library(ggeffects)

fit1 <- glmmTMB(SiblingNegotiation~FoodTreatment+ArrivalTime+(1|Nest),
                data=Owls,
                ziformula=~1,
                family=truncated_poisson)

When I try to use ggeffects on the fit object, I get the following error:

> ggpredict(fit1, "FoodTreatment")
Error in lf(prdat$fit) : could not find function "lf"

(It's working for zero-inflated-models with family=poisson)

clustered standard errors with glm logit

Hi,

I am wondering if it is possible to use clustered standard errors with ggeffects to plot confidence intervals accordingly. I found this thread on how to do it with the effects package, but it's somewhat counterintuitive, and I was not sure I really get the syntax right. Therefore I wanted something more straightforward, preferably in a format I can then plot with ggplot (hence this package). I need this from a glm logit.

Thanks!

typical values for mermod object

Hello, I am trying to hold other variables in my model at 0 for continuous and the mode for categorical variables, using ggpredict. When I try to run this:
effect<- ggpredict(model, terms = "x1", type = c("fe", "re"),ci.lvl = NA,typical =c(numeric = "zero", factor = "mode"))

I get the following error:
Error in model.frame.default(delete.response(Terms), newdata, na.action = na.action, : factor x2 has new level 1.

Setting it just to zero gives me the same problem. Is there a way around this? I would ideally like to set factors at their reference level in the model. Thanks.

Richard

Question regarding squared terms and multiple interactions

Hi,

first of all thanks for yet another nice package :)
I don't fully understand whether in ggeffects all interactions and polynomial terms are correctly accounted for when computing average marginal effects.
As an example, this is a nbreg model, effects calculated with your package:

fit <- glm.nb(comments_count_fb ~ maxprop + I(day^2) + day * maxprop , data = d)
ame <- ggaverage(fit, terms = c("day", "maxprop"))

ame %>% filter(group %in% c(6,9)) %>% 
ggplot( aes(x, predicted, colour = group)) + 
  geom_line()  + ylim(0,2000) + theme_bw() + 
  scale_x_continuous(breaks = c(6,63,123,184,246,308, 369, 429,489),
                     labels = c("15/01", "15/03", "15/05",
                                "15/07", "15/09", "15/11",
                                "16/01", "16/03", "16/05"))

image

And this is the same in Stata (the coefficients are equivalent):

 nbreg comments_count_fb c.day##c.day##i.maxprop
 margins, at(day=(6  63  123  184  246 308 369  429  489) maxprop=(9 6)) 

 marginsplot, legend(ring(0) position(12) bmargin(large)) title("") ///
  xlabel(6 "15/01" 63 "15/03"  123 "15/05" ///
 184 "15/07" 246  "15/09"  308 "15/11" 369  ///
 "16/01"  429 "16/03" 489 "16/05",  angle(vertical)) ///
 ytitle("Predicted number of Comments") scheme(s1mono)

graph

Especially for the red curve this looks a little different to me. Do you have an idea why? And is there a reason why confidence intervals are missing for ggaverage() estimates? The two columns conf.low and conf.high are filled with NA values for me.

Support

It would be nice to include in the README file some discussion of what users should do if they need support.

Problems with offsets

For calls to ggpredict on glm model objects from poisson family with more than one offset, the following error occurs:
Error in offset(log(xxxxxx)) :
object 'xxxxxx' not found

For calls to ggpredict on glm model objects from poisson family with an offset with arithmetic performed within the offset call (e.g. offset(log(xxxxxx + 0.00001)), the same error as above occurs.

Support for factors in plot.ggeffects()

At the moment plot.ggeffects() handle variables that are labelled but fails with factors. Could it be possible to make it handle also factors, as this is rather widely used data type in R.

data(efc)
fitLab <- lm(barthtot ~ c12hour + neg_c_7 + c161sex + c172code, data = efc)
(efLab = ggeffect(fitLab, "c161sex"))
plot(efLab)

efc$c161sex = factor(efc$c161sex, labels = names(attributes(efc$c161sex)$labels))
fitFact <- lm(barthtot ~ c12hour + neg_c_7 + c161sex + c172code, data = efc)
(efFact = ggeffect(fitFact, "c161sex"))
plot(efFact)

Allow "constant" option

Add argument "constant", that allows specific values to hold a variable constant at, e.g. constant = c(age = 40) would use "40" for age, instead of the typical value like mean.

Confidence intervals with survey weighted data

I am running into an issue with the confidence intervals produced after ggpredict. Well, not an issue per se, but they don't match the output produced by Stata, although the point estimates match Stata's exactly. Here is a working example:

library(tibble)
library(survey)
library(ggeffects)

# Data set 1

set.seed(123)

df1 <- tibble(id      = seq(1, 100, by = 1),
              gender  = as.factor(rbinom(n = 100, size = 1, prob = 0.50)),
              working = as.factor(rbinom(n = 100, size = 1, prob = 0.40)),
              income  = sample(0:100000, 100),
              pweight = sample(50:500, 100,  replace   = TRUE))


# Apply weights via svydesign

weights <- svydesign(id      = ~id,
                      weights = ~pweight,
                       data    = df1)
                                            

# Logit model with weights

logit <- svyglm(gender ~ working*income,
                family = quasibinomial(),
                data = df1,
                design = weights)

ggpredict(logit, terms = c("working [0, 1]", "income [100, 1000, 10000]"))

I get the following results:

# A tibble: 6 x 5
      x predicted  conf.low conf.high  group
  <dbl>     <dbl>     <dbl>     <dbl> <fctr>
1     0 0.4906395 0.2467707 0.7390463    100
2     0 0.4893027 0.2484972 0.7351775   1000
3     0 0.4759466 0.2652582 0.6955592  10000
4     1 0.5794210 0.2617570 0.8425931    100
5     1 0.5773197 0.2637649 0.8388980   1000
6     1 0.5561654 0.2830412 0.7990959  10000

I do the same exact analysis in Stata. The output from svyglm in R and svy: glm in Stata match perfectly (i.e., point estimates, t-values, confidence intervals). When I run the margins command in Stata, which is just like ggpredict in this example, I get the same point estimate, but different confidence intervals:

margins, at(working=(0 1) income=(100 1000 10000))

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   .4906395   .1375135     3.57   0.001     .2177829    .7634961
          2  |   .4893027   .1356353     3.61   0.000     .2201729    .7584325
          3  |   .4759466   .1173992     4.05   0.000     .2430012     .708892
          4  |    .579421   .1687542     3.43   0.001      .244576    .9142659
          5  |   .5773197    .166619     3.46   0.001     .2467115    .9079279
          6  |   .5561654   .1454702     3.82   0.000      .267521    .8448097
------------------------------------------------------------------------------

As you can see, the confidence intervals don't match. In this case they are relatively similar, but in my real-world analyses, there is a bigger discrepancy. I was wondering if you could shed some light on what is happening? Are the confidence intervals in ggpredict incorporating the survey weights? I know they don't have to exactly match, but since everything else does, it made me wonder how ggpredict is handling the confidence intervals.

memory allocation error with ggeffect, ggpredict

I have a three-way interaction from an lmer model that I was able to summarize and plot easily with the effects package. However, when I try to summarize it using ggeffect I get the following error message: Error: cannot allocate vector of size 84619.5 Gb. With ggpredict I get a similar but less alarming message (the vector size is ~8.3 Gb). I have no sample data I can share, but the fact that this works in effects but not ggeffects makes me suspect a bug.

maintain variable names in data frame

first of all: thank you so much for this work! I've been searching for something like this for years. I will most likely introduce ggeffects in the book I am writing.

Still, I am wondering, why in the produced data frames, the original effect names are not kept, but replaced by x and group. As ggpredict even takes the variable names as argument, this seems a missed chance to create fully fledged plots right out of the box.

Error when plotting raw data for values at specific levels

Gives error:

data(efc)
m <- lm(neg_c_7 ~ e42dep * c160age, data = efc)
dat <- ggpredict(m, c("e42dep", "c160age [40,50,60]"))
plot(dat, rawdata = T)

Workaround:

p <- plot(dat, rawdata = F)
library(ggplot2)
p + geom_point(data = attr(dat, "rawdata"),
               mapping = aes(x = x, y = as.numeric(group)), colour = "black", fill = "black")

closure error

I have a simple multilevel (mixed) model I am fitting with glmer(). Here is the call:

fit.r <- glmer(switch~(1|st)+(1|year)+(1|k.id)+diff, family="binomial", data=subset(sw.sample,party=="R"),control=glmerControl(optimizer="bobyqa"))

I have no problem fitting it, I get sensible results, etc.

> summary(fit.r)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
 Family: binomial  ( logit )
Formula: switch ~ (1 | st) + (1 | year) + (1 | k.id) + diff
   Data: subset(sw.sample, party == "R")
Control: glmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid 
   697.2    726.4   -343.6    687.2     2549 

Scaled residuals: 
   Min     1Q Median     3Q    Max 
 -1.62  -0.17  -0.10  -0.06 323.79 

Random effects:
 Groups Name        Variance  Std.Dev. 
 k.id   (Intercept) 2.297e-14 1.516e-07
 st     (Intercept) 1.605e+00 1.267e+00
 year   (Intercept) 2.852e-01 5.341e-01
Number of obs: 2554, groups:  k.id, 2229; st, 49; year, 31

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  -4.4517     0.3195 -13.934   <2e-16 ***
diff         -4.3385     0.4461  -9.724   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
     (Intr)
diff 0.446 

When I try to use ggeffects on the fit object and the primary variable of interest (diff), I get the following error:

ggpredict(fit.r,terms="diff")
Error in model.frame.default(object, data, xlev = xlev) : 
  invalid type (closure) for variable 'diff'

No applicable method for 'select_'

I figured out that the error comes from a somewhat damaged tidyverse installation. Even though I had some issues with the SE ranges beforehand, now it seems to result in the same plot for both, old and new versions of R and ggplot2.

Since I cannot delete this issue, I am goign to close it again.

Thank you very much for the package!

ggpredict fail when model is coxph with splines and polynomial

Thanks for your awesome package
It, seems that ggpredict fails when I use a coxph model with a polynomial when the variable in Surv() are not called time and status. Works fine for glm()

d1 <- tibble(status = sample(0:1, 100, replace = TRUE), time = rnorm(100, 60, 30), age = rnorm(100, 60,15))
d2 <- tibble(alive = sample(0:1, 100, replace = TRUE), delay = rnorm(100, 60, 30), age = rnorm(100, 60,15))

mod1 <- coxph(Surv(time, status) ~ poly(age, 3), data = d1)
mod2 <- coxph(Surv(delay, alive) ~ poly(age, 3), data = d2)

# the first one will work while the second one will fail
ggpredict(mod1, "age")
ggpredict(mod2, "age")


## Not happenning without poly or ns
mod2_ter <- coxph(Surv(delay, alive) ~ age, data = d2)
ggpredict(mod2_ter)

## glm works
mod2_bis <- glm(alive ~ poly(age, 3), data = d2)
ggpredict(mod2_bis, "age")

The functions fails with "Error: Unknown columns time and status"

Hope it won't be to hard to fix
Thanks again, work like a charm for interaction and others and will likely replace rms::Predict() for me !

Differences between ggeffect and ggpredict

The package is really useful. I have been using Effects for a while, and having it tidy is great.

May I ask about the differences between ggeffect, ggpredict and ggaverage? I am having problems understanding the differences between the first and the other two. Adding ggeffect to the Vignette might be very useful. For instance, I have been trying to replicate the results provided by effects:Effect using predict with no success.

modify_if() has stricter requirements with data frame columns

This line is problematic with the new implementation of modify():

mf <- suppressWarnings(as.data.frame(purrr::modify_if(mf, is.array, as.vector)))

We're now wrapping around [[<- instead of [<-. The data frame method for [[<- is stricter and throws an error when the output is too long. The method for [<- just truncates with a warning, which you're hiding with suppressWarnings.

Could you fix this problem please? We're planning to release purrr in a couple of weeks.

Change default for parameter `ppd` to TRUE

Since the recommendation for running ggpredict with stan models is to set ppd = TRUE, with which I agree that it is the sensible thing to do, shouldn't the default be set to TRUE? Unless I am mistaken, I don't see any unintended side-effect from this choice.

`ggpredict` with `ppd = TRUE` fails for multivariate `brmsfit` objects

Here is a reproducible example:

library(brms)
library(ggeffects)

x <- rnorm(10, 0)
b <- runif(2)
s <- ifelse(diag(2) == 0, 0.23, 1)
er <- mvtnorm::rmvnorm(10, rep(0, 2), s)
y <- apply(t(b), 2, `*`, x) + er

d <- data.frame(y1 = y[,1], y2 = y[,2], x)

m <- brm(cbind(y1, y2) ~ 1 + x,
        data = d)

p <- ggpredict(m, ppd = TRUE)
Error in predictive_interval.default(prdat2) : 
  For the default method 'object' should be a matrix.
traceback()
9: stop("For the default method 'object' should be a matrix.")
8: predictive_interval.default(prdat2)
7: rstantools::predictive_interval(prdat2)
6: get_predictions_stan(model, expanded_frame, ci.lvl, type, faminfo, 
       ppd, terms, ...)
5: select_prediction_method(fun, model, expanded_frame, ci.lvl, 
       type, faminfo, ppd, terms = ori.terms, typical, vcov.fun, 
       vcov.type, vcov.args, ...)
4: ggpredict_helper(model = model, terms = .x, ci.lvl = ci.lvl, 
       type = type, full.data = full.data, typical = typical, ppd = ppd, 
       x.as.factor = x.as.factor, condition = condition, vcov.fun = vcov.fun, 
       vcov.type = vcov.type, vcov.args = vcov.args, ...)
3: .f(.x[[i]], ...)
2: purrr::map(predictors, function(.x) {
       tmp <- ggpredict_helper(model = model, terms = .x, ci.lvl = ci.lvl, 
           type = type, full.data = full.data, typical = typical, 
           ppd = ppd, x.as.factor = x.as.factor, condition = condition, 
           vcov.fun = vcov.fun, vcov.type = vcov.type, vcov.args = vcov.args, 
           ...)
       tmp$group <- .x
       tmp
   })
1: ggpredict(m, ppd = TRUE)

Please see the associated PR #45, which should fix the issue.
You may also consider including a test on a case that looks like the above.

Passing poly objects to ggpredict

I've been struggling for a while now to visualize a set of relatively complex three-way interactions arising from a series of linear mixed effects model in R. I recently came across the ggeffects package, which looked like it might finally offer a solution, and I've mostly been having a lot of success using it. The problem I'm having is that one of my models involves a polynomial term -- for simplicity sake, let's say that the model takes the form of

m <- lmer(poly(X, 2)*Y*Z + (1|SubjectID))

where X and Y are continuous variables, and Z is a categorical variable. However, when I attempt to run the command...

dat <- ggpredict(model.Test1, terms = c("X", "Y [alpha, beta, gamma]", "Z"), type = "re", allow.new.levels = TRUE)

...it produces the following error: "Column poly(x, 2) must be a 1d atomic vector or a list." I'm wondering if there are any good solutions that might allow me to visualize the three-way interaction while still retaining the polynomial term. Thanks!

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.