richardli / summer
SAE Unit/area Models and Methods for Estimation in R
Home Page: https://richardli.github.io/SUMMER/
When using data from multiple surveys, we need to make sure that the values in the $cluster column are unique by survey. It's probably best to have people submit a data.frame with one extra column, $surveyyear, so that unique cluster IDs can be built internally in a new cluster column and we can figure out how to combine across surveys.
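A minimal sketch of the idea, assuming a toy data.frame with columns named surveyyear and cluster. The cluster_uid column name is hypothetical and is not something SUMMER creates:

```r
# Toy data: two surveys that reuse the same cluster numbers.
# The column names below are assumptions for illustration only.
births <- data.frame(
  surveyyear = c(2008, 2008, 2013, 2013),
  cluster    = c(1, 2, 1, 2)
)

# Combine survey year and cluster number into one ID per survey-cluster pair.
births$cluster_uid <- paste(births$surveyyear, births$cluster, sep = "_")

# The raw cluster column has 2 distinct values, the combined ID has 4.
n_raw    <- length(unique(births$cluster))
n_unique <- length(unique(births$cluster_uid))
```

The combined ID could then be passed wherever a cluster column is expected, so that cluster 1 in one survey is never pooled with cluster 1 in another.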
getDirect assumes that if there is a strata column, it's called "strata", but this isn't documented anywhere (you can only find it by looking through the code). It would be nice to either have an input argument strataVar (similar to the other column name parameters), or at least make a note of this in the documentation.
Error message: Error in maptools::unionSpatialPolygons: isTRUE(gpclibPermitStatus()) is not TRUE
Solved by installing gpclib from source, according to this discussion:
if (!require(gpclib)) install.packages("gpclib", type="source")
or maybe by just installing rgeos, according to the same discussion.
Somehow it seems mapproj is also required somewhere in mapPlot.
(Discovered these when playing with packrat to pack the SUMMER app.)
Great package,
Thank you. :)
Hi, we refer to strata throughout SUMMER as the strata variable, possibly v024xv025. But maybe we need a separate argument for urban/rural in the formula, since the strata for the direct estimates will be different from the one used for the fixed effect in the formula in fitINLA2.
Hi, I fitted a spatio-temporal smoothing model for under-five mortality rates with the SUMMER package, specifically with the fitINLA function. The related references (Mercer et al.; Li et al.) show that the fitted yearly model is a spatio-temporal one with structured and unstructured spatial effects (ICAR, iid), structured and unstructured temporal effects (RW2, iid), and a spatio-temporal interaction based on Knorr-Held. However, when I fitted the model and summarized it, the unstructured random effect was not included, and most of the effects are "generic", which means an iid model for all of them. Moreover, I tried modifying the function, but the latent model objects iid.new and rw.model are hidden and R doesn't recognize them.
thanks
Hi,
I got this error message when running the code below:
smoothed <- smoothSurvey(data = data, geo = districts, Amat = mat,
responseType = "binary", responseVar = "OW", strataVar = NULL, weightVar = NULL,
regionVar = "district", clusterVar = NULL, CI = 0.95)
summary(smoothed)
Strata not defined. Ignoring sample design
cluster not specified. Ignoring sample design
Error in fit$marginals.linear.predictor[[i]] : subscript out of bounds
There was no problem when running the weighted and smoothed estimates with the code below.
svysmoothed <- smoothSurvey(data = data, geo = districts, Amat = mat,
responseType = "binary", responseVar="OW", strataVar = "PSU", weightVar ="Weight_Final",
regionVar ="district", clusterVar = "~1", CI = 0.95)
summary(svysmoothed)
I really appreciate your help. Thank you.
Dear Dr. Z. Richard Li!
While trying to fit direct estimates with the fitINLA() function to estimate the random walk 2 (RW2) random effects on the yearly scale, I get an error message.
fit2 <- fitINLA(data = data, geo = NULL, Amat = NULL, year_label = years.all,
year_range = c(1985, 2019), rw = 2, is.yearly = TRUE, m = 5, type.st = 4)
returns:
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 42, 72
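This is not a diagnosis of the error above, but one thing worth checking in calls like this is whether year_label is consistent with year_range and m. A minimal sketch, assuming period labels follow the two-digit "85-89" style used elsewhere in this thread; period_labels is a hypothetical helper, not a SUMMER function:

```r
# Hypothetical helper: build the m-year period labels implied by a year range.
period_labels <- function(year_range, m) {
  starts <- seq(year_range[1], year_range[2], by = m)
  ends   <- starts + m - 1
  # Two-digit years, e.g. 1985-1989 becomes "85-89"
  sprintf("%02d-%02d", starts %% 100, ends %% 100)
}

expected <- period_labels(c(1985, 2019), m = 5)

# Compare against the labels that are about to be passed as year_label.
# years.all here is a made-up example that is missing the last period.
years.all <- c("85-89", "90-94", "95-99", "00-04", "05-09", "10-14")
missing_labels <- setdiff(expected, years.all)
```

If missing_labels is non-empty, the labels and the year range describe different numbers of periods, which is one plausible way to end up with mismatched row counts.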
INLA Repository in the Vignette was not updated in v0.2.1. Should match DESCRIPTION file.
Hi Richard -- I'm getting the following error with getBirths which is unclear to me. Can you please take a look? I'll email you the data file.
Thanks,
Katie
data.raw <- readRDS("~/Desktop/data.raw.rds")
data <- SUMMER::getBirths(
data = data.raw,
surveyyear = 1996,
year.cut = seq(1970, 1997, 1),
variables = intersect(c("caseid", "v001", "v002", "v004", "v005", "v021", "v022", "v023", "v024",
"v025", "v101", "v102", "v139", "bidx", "v012"), names(data.raw)),
strata = intersect(c("v022", "v024", "v025", "v101", "v102"), names(data.raw)),
compact = F
)
#> Children with age at least 24 months are assumed to have recorded age truncated to full years.
#> Recorded age + 5 months is used to adjust for the truncation for ages >= 24 and are multiples of 12.
#> Error in `$<-.data.frame`(`*tmp*`, "age", value = "0"): replacement has 1 row, data has 0
Created on 2024-01-25 with reprex v2.1.0
Dear Z. Richard Li,
I am working with 4 DHS surveys conducted in Nigeria (2003, 2008, 2013, 2018).
Using getDirectList(), I end up with HT estimates for each period, region and survey. I combine them with aggregateSurvey(). Running fitINLA on this output works fine. However, running getSmoothed() on the fitted object does not work and prints the error message:
"Error in mod$marginals.lincomb.derived[[index]] :
attempt to select less than one element in get1index"
In addition, aggregateSurvey() inverts the lower and upper bound (at least in my case).
Code
data_multi <- getDirectList(births = data[[2]], years = years,regionVar = "region", timeVar = "time",
clusterVar = "~clustid + id", ageVar = "age", weightsVar = "weights",
geo.recode = NULL)
est <- aggregateSurvey(data_multi)
fit1 <- fitINLA(data = est, geo = NULL, Amat = NULL, year_label = years,
year_range = c(1989, 2018), rw = 2, m = 5, is.yearly = TRUE)
out1 <- getSmoothed(fit1)
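As a stopgap for the inverted bounds, a check along these lines could be run on the aggregated estimates before fitting. This is a sketch on a toy data.frame. The column names lower and upper are assumptions about the output layout, and the numbers are made up:

```r
# Toy stand-in for the aggregated estimates; values are illustrative only.
est <- data.frame(
  region = c("All", "north"),
  mean   = c(0.10, 0.12),
  lower  = c(0.13, 0.09),  # first row has its bounds inverted
  upper  = c(0.08, 0.15)
)

# Flag rows where the interval is reversed.
inverted <- est$lower > est$upper

# Reorder the bounds row by row without touching the point estimate.
fixed <- est
fixed$lower <- pmin(est$lower, est$upper)
fixed$upper <- pmax(est$lower, est$upper)
```

This only swaps the two columns where needed. It does not address why the bounds come out inverted in the first place.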
All functions should have either an example or not be exported.
checking Rd line widths ... NOTE
Rd file 'mapPlot.Rd':
\examples lines wider than 100 characters:
mapPlot(countryname = "Uganda", results = results_rw2, geo = geo, countrysum = data, inlamod = inla_model)
The current hatchPlot function is not very flexible (e.g., it is difficult to transform scales and color gradients). It would be great to implement it in ggplot. I'm keeping this thread open in case anyone wants (and has the time and curiosity!) to help. I'll post related stuff here too.
For an illustration, see https://imaddowzimet.github.io/crosshatch/, which implements hatching but does not seem easy to adapt for changing the hatching density or showing it in legends.
Hi there, I would like to use the SUMMER package with the Nigeria DHS 2018 at the district level to generate small area estimates of malaria parasitemia. When I run fitGeneric() to obtain smoothed estimates without weights,
smoothed <- fitGeneric(data = pfpr_df_2, geo = LGA, Amat = mat, responseType = "binary",
responseVar = "hml32", strataVar = NULL, weightVar = NULL, regionVar = "LGA",
clusterVar = NULL, CI = 0.95)
I get the error
Error in fitGeneric(data = pfpr_df_2, geo = admin1shp, Amat = mat, responseType = "binary", :
Exist regions in data but not in the Amat.
However, when I check the two datasets with the code below, which I think is what the function is doing (not sure), it counts 7933 LGA names. The actual number of LGAs is 653 in pfpr_df_2 and 774 in mat. I am using the same name formats in both. I am not sure how to fix this and would appreciate any assistance.
sum(!pfpr_df_2$LGA %in% colnames(mat))
[1] 7933
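Note that sum(!x %in% y) counts rows, not distinct names, so a large number can come from a handful of mismatched labels. A sketch of how the offending names could be listed instead, using toy data; the region names and the norm helper are made up for illustration:

```r
# Toy data: region labels in the survey data and in the adjacency matrix.
dat <- data.frame(
  LGA = c("Aba North", "aba south ", "aba south ", "Zuru"),
  stringsAsFactors = FALSE
)
regions <- c("Aba North", "Aba South", "Zuru")
mat <- matrix(0, nrow = 3, ncol = 3, dimnames = list(regions, regions))

# Row count of unmatched records (what the check above reports).
n_rows_unmatched <- sum(!dat$LGA %in% colnames(mat))

# Distinct names that fail to match, which is usually easier to act on.
unmatched <- setdiff(unique(dat$LGA), colnames(mat))

# Hypothetical normalisation: strip whitespace and ignore case before comparing.
norm <- function(x) tolower(trimws(x))
still_unmatched <- setdiff(unique(norm(dat$LGA)), norm(colnames(mat)))
```

Printing unmatched often reveals stray spaces, case differences, or alternative spellings that look identical at a glance.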
svysmoothed.year <- smoothSurvey(data = BRFSS, geo = KingCounty, Amat = mat,
responseType = "binary", responseVar = "diab2", strataVar = "strata", weightVar = "rwt_llcp",
regionVar = "hracode", clusterVar = "~1", timeVar = "year", time.model = "rw1",
type.st = 1)
Error in smoothSurvey(data = BRFSS, geo = KingCounty, Amat = mat, responseType = "binary", :
Exist regions in data but not in the Amat.
The years variable is character. The years variable from getDirect is a factor (and is usually used to extract the year levels without typing them out, e.g., when you see years = levels(birth$years) in vignettes).
Maybe it is worth making both factors. Is the consistency worth the risk of using factors?
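One concrete risk with factors here is level ordering. A small base R sketch with made-up period labels: by default factor() sorts levels alphabetically, which does not match chronological order once the labels cross the year 2000.

```r
# Period labels in chronological order (illustrative values).
years_chr <- c("85-89", "90-94", "95-99", "00-04", "05-09")

# Default factor: levels are sorted alphabetically, so "00-04" comes first.
f_default <- factor(years_chr)

# Explicit levels keep the chronological order.
f_ordered <- factor(years_chr, levels = years_chr)

first_default <- levels(f_default)[1]
first_ordered <- levels(f_ordered)[1]

# Converting back to character recovers the original labels either way.
round_trip <- as.character(f_ordered)
```

So code that relies on levels() to recover the time order only works if the levels were set explicitly when the factor was created.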
years <- levels(DemoData[[1]]$time)
data_multi <- getDirectList(births = DemoData, years = years,
regionVar = "region", timeVar = "time", clusterVar = "~clustid+id",
ageVar = "age", weightsVar = "weights", geo.recode = NULL)
data <- aggregateSurvey(data_multi)
years.all <- c(years, "15-19")
fit1 <- smoothDirect(data = data, Amat = NULL,
year_label = years.all, year_range = c(1985, 2019),
time.model = "rw2",
is.yearly=FALSE, m = 5)
out1 <- getSmoothed(fit1)
plot(out1, is.subnational=FALSE)
fit2 <- smoothDirect(data = data, Amat = DemoMap$Amat,
year_label = years.all, year_range = c(1985, 2019),
time.model = "rw2", is.yearly=TRUE, m = 5, type.st = 4)
out2 <- getSmoothed(fit2)
plot(out2, is.subnational=TRUE)
Line 1244 in 92d0073
When time.model is "rw1" or "rw2", the slope.fixed.output object is never defined by the smoothCluster function, so we get this error:
Error in smoothCluster(data = counts.all, Amat = DemoMap$Amat, family = "betabinomial", :
object 'slope.fixed.output' not found
This is resolved when we use time.model == "ar1".
I found this issue when running the benchmarking example code (https://github.com/richardli/SUMMER/blob/master/R/benchmark.R#L33):
fit.bb <- smoothCluster(data = counts.all, Amat = DemoMap$Amat,
family = "betabinomial",
year_label = periods,
survey.effect = TRUE,
linear.trend = TRUE,
time.model = "ar1")
Thanks!
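The general pattern behind an "object not found" error like this, and one common way to guard against it, can be sketched as follows. summarise_fit is a hypothetical toy function and is not SUMMER's actual code:

```r
# Hypothetical toy function: a value that is only computed on one branch
# is initialised up front so that every code path can safely refer to it.
summarise_fit <- function(time.model, slope = 0.1) {
  slope.fixed.output <- NULL  # defined on every path
  if (time.model == "ar1") {
    slope.fixed.output <- slope
  }
  list(time.model = time.model, slope.fixed = slope.fixed.output)
}

res_rw2 <- summarise_fit("rw2")  # no error, slope.fixed is NULL
res_ar1 <- summarise_fit("ar1")  # slope.fixed is filled in
```

Without the initial assignment, the "rw2" call would fail when building the returned list, which mirrors the error reported above.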
Added CIs currently need to have different names from the CI columns in the dataset being plotted (because of the merge). It would be good to rename them internally to avoid that.
The NFSH data has birth records up to 2020. But after running the program, it gives projections from 2015 onward, whereas I want projections after 2020. How can I get that? Please suggest.
Aloha!
I am currently running through the vignette, and I am running into the following error:
fit2 <- fitINLA(data = data, geo = NULL, Amat = NULL, year_names = years.all,
year_range = c(1985, 2019), priors = priors, rw = 2,
is.yearly=TRUE, m = 5)
Error in exists("my.cache", envir = envir, mode = "list") :
use of NULL environment is defunct
I am running R 3.5.1 on Windows.
smooth.year <- smoothSurvey(data =COD_ANC_comb_data, geo = geo,
Amat = Amat, responseType ="binary", responseVar = "anc_timing",
strataVar = "residence", weightVar = "sample_weight", regionVar = "region",
clusterVar = "~DHSCLUST+caseid",timeVar = "year", time.model="rw2",
CI = 0.95, formula=formula1, type.st = 4, nest=TRUE)
I get the error below when I run the code above. Please help.
Error in fit$marginals.linear.predictor[[i]] : subscript out of bounds
Dear Richard!
The getDirectList() function failed to pick up the regionVar argument. It returns:
Error in getDirect(births = births[[1]], years = years, regionVar = regionVar, : region variable not defined, and no v101 or v024!
Originally posted by @Awugchew in #15 (comment)
ERROR: this R is version 3.3.2, package 'SUMMER' requires R >= 3.4.2
Can fix by changing R requirement.
I'm using data from a two-stage cluster sample and have some duplicate IDs between clusters. To fix this with svydesign(), I would just set nest=T. There is no option to do that in fitGeneric().
As an aside, is there a reason it is preferable to have to recreate the svydesign object in the fitGeneric() call, rather than just having the option to reference an existing svydesign object?
fitGeneric(data = khi_base, geo = UCmap, Amat = mat, responseType = "binary",
responseVar = "novac", strataVar = "mc_104", weightVar = "dsgnwt", regionVar = "mc_104",
clusterVar = "~mc_105", CI = 0.95)
Error in svydesign.default(ids = stats::formula(clusterVar), weights = ~weights0, :
Clusters not nested in strata at top level; you may want nest=TRUE.
Hi, I followed the tutorial in the vignette and noticed that the newformula shown below was commented out. I tried to use it anyway, since it uses the penalized complexity priors and I wanted to see how the results differed from those under the default priors. However, Amat in "graph = Amat" was not previously defined, so I replaced it with "mat", which was used to define the adjacency matrix earlier, but my results differed (see attachment). The code below is from the vignette.
newformula <- "f(region.struct, model = 'bym2', graph = Amat, constr = TRUE, scale.model = TRUE, hyper = list(phi = list(prior = 'pc', param = c(0.5, 2/3), initial = -3), prec = list(prior = 'pc.prec', param = c(0.2/0.31, 0.01), initial = 5)))"
svysmooth.2 <- fitSpace(data = BRFSS, geo = KingCounty, Amat = mat, family = "binomial", responseVar="diab2", strataVar="strata", weightVar="rwt_llcp", regionVar="hracode", clusterVar = "~1", hyper=NULL, CI = 0.95, newformula = newformula)
Error in f(region.struct, model = "bym2", graph = Amat, constr = TRUE, : object 'Amat' not found
In addition: Warning message:
In inla.model.properties.generic(inla.trim.family(model), (mm[names(mm) == :
Model 'bym2' in section 'latent' is marked as 'experimental'; changes may appear at any time.
Use this model with extra care!!! Further warnings are disabled.
I replaced graph = Amat with graph = mat and it worked. The problem is that my results were different with the PC prior.
svysmooth.2 <- fitSpace(data = BRFSS, geo = KingCounty, Amat = mat, family = "binomial", responseVar="diab2", strataVar="strata", weightVar="rwt_llcp", regionVar="hracode", clusterVar = "~1", hyper=NULL, CI = 0.95, newformula = newformula)
See https://stat.ethz.ch/pipermail/r-devel/2014-December/070252.html
Changing the vignette encoding may fix this.
Currently it just merges on the names. That works, but it is not informative when errors show up. We should not require that all regions in the data also be in the map (since there is usually a region = "All" when plotting direct estimates), or that all map regions be in the data (since some could be missing?). Maybe make sure that the merge keeps all regions from the map.