The bmstdr from sujit-sahu

Random error from the total time measurement in the Bsptime function in a foreach loop

I have discovered an error that occurs randomly in a non-parallelized foreach loop. I use the foreach loop to compute a cross-validation. The error is

"Error in { : task 1 failed - "argument is of length zero""

and it refers to the total time measurement that is built into the function Bsptime. I have not yet found the exact line.

After various test runs, the error always occurs when the function cannot calculate the "Total time taken", for whatever reason. The actual model (spTimer) is fitted correctly beforehand, and spTimer displays its output correctly.

The simplest way to eliminate the error would be to remove the total time measurement or to guard it against a zero-length value. A different method of measuring the time might also be a solution.
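
As a minimal sketch of the zero-length guard suggested above, the timing could be wrapped so that a missing start or end time yields NA instead of an error. The helper names safe_elapsed_seconds and report_total_time are hypothetical and are not taken from bmstdr; how Bsptime actually measures the time is not shown in this report.

```r
# Hypothetical helper: elapsed wall-clock seconds, or NA if a timestamp is missing
safe_elapsed_seconds <- function(start, end = proc.time()) {
  if (length(start) == 0L || length(end) == 0L) {
    return(NA_real_)  # nothing to subtract, so report "unknown" instead of failing
  }
  end[["elapsed"]] - start[["elapsed"]]
}

# Hypothetical helper: a message that never stops the run
report_total_time <- function(start, end = proc.time()) {
  secs <- safe_elapsed_seconds(start, end)
  if (is.na(secs)) {
    "Total time taken: unavailable"
  } else {
    sprintf("Total time taken: %.2f seconds", secs)
  }
}

start_time <- proc.time()
message(report_total_time(start_time))  # normal case
message(report_total_time(NULL))        # missing start time, no error
```

With a guard of this kind, a failed time measurement would only affect the printed message and not the fitted model that is returned.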

Because of this sporadic error, the function "Bsptime" unfortunately cannot currently be used within a foreach loop.

My system:
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server x64 (build 17763)

Version bmstdr: 0.2.2

Error in predict function

The predict function fails on a model fitted with bmstdr. The error is:

"Error in paste(call.f, sep = "")[[3]] : subscript out of bounds"

Here is an example:

library(bmstdr)
library(spTimer)
library(stringr)  # provides str_split_fixed(), used in the temporary fix

set.seed(11)
# Randomly select about 20% of the sites
s <- sort(sample(unique(nysptime$s.index), size = floor((length(unique(nysptime$s.index))/100)*20)))
# Sites not in s are used for fitting; sites in s are held out for prediction
DataFit <- spT.subset(data = nysptime, var.name = "s.index", s = s, reverse = TRUE)
DataValPred <- spT.subset(data = nysptime, var.name = "s.index", s = s, reverse = FALSE)

mod <- Bsptime(package = "spTimer",
               model = "GPP",
               formula = as.formula(y8hrmax ~ xmaxtemp + xwdsp + xrh),
               data = DataFit,
               n.report = 5,
               coordtype = "utm",
               coords = 4:5,
               scale.transform = "NONE",
               g_size = 4,
               N = 2000,
               mchoice = FALSE,
               plotit = FALSE)

# Spatial prediction using spT.Gibbs output
pred.gp <- predict(mod$fit, tol.dist=0.0, newdata = DataValPred, newcoords = ~ Longitude + Latitude)

# result: Error in paste(call.f, sep = "")[[3]] : subscript out of bounds

A temporary fix is:
mod$fit$call <- c(mod$fit$call,"", str_split_fixed(mod$fit$call, " ", n = 3)[3])

The cause is that the predict function cannot split the regression formula into its dependent and independent variables; my fix does this manually. The failing line (line 674) is in the file spGPP.r of the spTimer package on GitHub (https://github.com/cran/spTimer/blob/master/R/spGPP.r), in the function "spGPP.prediction". The line is call.f<-as.formula(paste("tmp~",paste(call.f,sep="")[[3]])).
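
The behaviour of that line can be illustrated in isolation. The sketch below assumes, as the workaround suggests but as is not confirmed from the bmstdr source, that the stored call reaches spTimer as a single character string rather than as a formula object. The helper rhs_of is hypothetical.

```r
f_obj <- y8hrmax ~ xmaxtemp + xwdsp + xrh     # formula object
f_chr <- "y8hrmax ~ xmaxtemp + xwdsp + xrh"   # same formula as one string

# paste() on a formula yields three strings: "~", the response, the right-hand side
rhs_ok <- paste(f_obj, sep = "")[[3]]

# paste() on a single string yields one element, so [[3]] is out of bounds
rhs_err <- tryCatch(paste(f_chr, sep = "")[[3]],
                    error = function(e) conditionMessage(e))

# Hypothetical helper: extract the right-hand side from either representation
rhs_of <- function(x) {
  f <- stats::as.formula(x)                # accepts a formula or a string
  paste(deparse(f[[3]]), collapse = " ")   # third element of a two-sided formula is the RHS
}

print(rhs_ok)
print(rhs_err)
print(rhs_of(f_chr))
```

Converting with as.formula() before indexing would make the extraction independent of how the call was stored, which is one possible direction for a permanent fix.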

It is very important that you fix this in the program code, because this error occurs in all models. My temporary fix is only a stopgap.

My system:
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server x64 (build 17763)

Version bmstdr: 0.2.2

Random Error in Function Bsptime

I looked at everything again, and it seems that the total time is not the problem after all (Issue). I do not know what the actual problem is. However, I have observed that running the "Bsptime" function with the same data and settings in a loop sporadically produces errors.

Here is a minimal example with my data where the error occurs from time to time:

My data: data.csv

library(bmstdr)
library(foreach)  # provides foreach() and %do%

data <- read.csv("S:/data.csv")  # Customize path

N <- 2000      # number of MCMC iterations
nBurn <- 1000  # number of burn-in iterations
f <- as.formula(y_spec ~ x_precipitation_HYRAS_std + x_DGM1 + x_air_temperature_mean_std * x_radiation_global_mean + x_air_temperature_mean_std * x_Auenlehmmaechtigkeit)
plotit <- FALSE
mchoice <- FALSE
vrows <- c(13,14,15,16,17,18,43,44,45,46,47,48,61,62,63,64,65,66,79,80,81,82,83,84,91,92,93,94,95,96,127,128,129,130,131,132,133,134,135,136,137,138,157,158,159,160,161,162,181,182,183,184,185,186,193,194,195,196,197,198,217,218,219,220,221,222,253,254,255,256,257,258,289,290,291,292,293,294,295,296,297,298,299,300,319,320,321,322,323,324,337,338,339,340,341,342,349,350,351,352,353,354,367,368,369,370,371,372,379,380,381,382,383,384,409,410,411,412,413,414,421,422,423,424,425,426,457,458,459,460,461,462,517,518,519,520,521,522,535,536,537,538,539,540,541,542,543,544,545,546,565,566,567,568,569,570,571,572,573,574,575,576,589,590,591,592,593,594,601,602,603,604,605,606,613,614,615,616,617,618,655,656,657,658,659,660,667,668,669,670,671,672,709,710,711,712,713,714,721,722,723,724,725,726,727,728,729,730,731,732,823,824,825,826,827,828,853,854,855,856,857,858,865,866,867,868,869,870,901,902,903,904,905,906,907,908,909,910,911,912,919,920,921,922,923,924,955,956,957,958,959,960,997,998,999,1000,1001,1002,1009,1010,1011,1012,1013,1014,1033,1034,1035,1036,1037,1038,1087,1088,1089,1090,1091,1092,1105,1106,1107,1108,1109,1110)
coords <- as.matrix(data.frame(x = c(314456.1,305362.6,308393.7,311424.9),y = c(5692110,5695567,5695567,5695567)))
n.report <- 1
package <- "spTimer"
model <- "GPP"
coordtype <- "plain"
coords_column <- which(colnames(data) %in% c("coord_x","coord_y"))
scale.transform <- "NONE"
                        
                        
# Fit the same model 10 times with identical data and settings
foreach(a = 1:10) %do% {
    Bsptime(package = package, model = model, knots.coords = coords,
            formula = f, data = data, n.report = n.report,
            coordtype = coordtype, coords = coords_column,
            N = N, burn.in = nBurn, mchoice = mchoice,
            scale.transform = scale.transform, validrows = vrows,
            plotit = plotit)
}

Note: The error occurs randomly. If you do not get the error immediately, just increase the number of loop passes.
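
To find out which passes fail without aborting the whole loop, each pass can be wrapped in tryCatch() and given its own seed. The following is only a sketch: run_repeatedly and flaky_fit are hypothetical names, and flaky_fit is a stand-in that fails at random in place of the real Bsptime(...) call. Replaying a failed pass from its seed only helps if the sampler draws from R's random number generator, which is an assumption here.

```r
# Hypothetical helper: run fit_fun n_runs times and record the outcome of each pass
run_repeatedly <- function(fit_fun, n_runs, base_seed = 1L) {
  lapply(seq_len(n_runs), function(i) {
    seed <- base_seed + i
    set.seed(seed)  # a distinct, known seed per pass so a failure can be replayed
    tryCatch(
      list(run = i, seed = seed, ok = TRUE, value = fit_fun(), message = NA_character_),
      error = function(e) {
        list(run = i, seed = seed, ok = FALSE, value = NULL, message = conditionMessage(e))
      }
    )
  })
}

# Stand-in for the model fit: fails at random with the message from the report
flaky_fit <- function() {
  if (runif(1) < 0.3) stop("argument is of length zero")
  "fitted"
}

results <- run_repeatedly(flaky_fit, n_runs = 20)
failed <- Filter(function(r) !r$ok, results)
cat("failed passes:", vapply(failed, function(r) r$run, integer(1)), "\n")
```

In real use, flaky_fit would be replaced by a function that calls Bsptime with the settings above, and the recorded seeds and messages could be attached to the bug report.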

Here is a screenshot of a minimal example with 3 passes showing the error:

You can see that the model ran twice without error and produced the error only on the third run. All settings and data are the same!

Here is a screenshot of the same function being executed twice outside a loop (same data and settings), once without error and once with the error:

What is very strange is that the model sometimes works without error and sometimes does not. Or is it my data?

Maybe the functions must not be executed so quickly one after the other, because something has not yet finished processing in the background. I do not know.

Please take a look at this; otherwise I cannot use your wonderful package, because it does not work reliably.

My system:
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server x64 (build 17763)

Version bmstdr: 0.2.2

Error in spTimer model with truncated GPP and validrows

Unfortunately, I have encountered another error. If you fit a truncatedGPP spTimer model with validation, you get the following error (see screenshot):

Error in if (length(yval) != nrow(yits)) { : argument is of length zero

The error does not occur without a validation for the same model!

The error is raised in the function 'calculate_validation_statistics', which is called from 'Bsptime/BspTimer_sptime'. Its root cause, however, lies further up in 'Bsptime/BspTimer_sptime', which accesses a reference (op = MCMC samples for the true observations) that does not exist in a truncated GPP model. The result of this access is NULL! In the following screenshot I have marked the line with the faulty reference.

Here are the objects that actually exist in gp_fit.
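
The failure mode described above can be reproduced in base R without any package. In the sketch below, yval and yits are stand-in objects named after the variables in the error message, and the helper dims_match is hypothetical; the actual bmstdr code is not reproduced here.

```r
yval <- c(0.2, 0.5, 0.9)  # stand-in for the validation observations
yits <- NULL              # stand-in for the missing MCMC samples

# nrow(NULL) is NULL, so the comparison has length zero and if() cannot evaluate it
msg <- tryCatch(
  if (length(yval) != nrow(yits)) "mismatch" else "match",
  error = function(e) conditionMessage(e)
)
print(msg)

# Hypothetical guard: only compare dimensions when the samples exist as a matrix
dims_match <- function(yval, yits) {
  !is.null(yits) && is.matrix(yits) && length(yval) == nrow(yits)
}

print(dims_match(yval, yits))                       # FALSE, and no error
print(dims_match(yval, matrix(0, nrow = 3, ncol = 10)))
```

A check of this kind before the comparison would turn the cryptic message into a condition the calling function can handle, for example by skipping the validation statistics for models that do not return these samples.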

Please fix quickly... ;)

The "plotit" option has no effect

Since the new version, the plot can no longer be deactivated via the "plotit" option. No matter whether you set the option to TRUE or FALSE, the plot always appears.
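
Until the option works again, one possible workaround is to send any plot to a null graphics device. This is only a sketch: with_plots_suppressed is a hypothetical helper, plot(1:10) stands in for a call such as Bsptime(...), and the approach assumes that the function draws on the current device rather than opening a device of its own.

```r
# Hypothetical helper: evaluate expr while a null PDF device is the current device
with_plots_suppressed <- function(expr) {
  grDevices::pdf(file = NULL)  # null PDF device: nothing is written to disk
  dev_id <- grDevices::dev.cur()
  on.exit(grDevices::dev.off(dev_id), add = TRUE)
  force(expr)                  # expr is evaluated lazily, i.e. after the device is open
}

devices_before <- length(grDevices::dev.list())
value <- with_plots_suppressed({
  plot(1:10)    # stand-in for a model call that always plots
  "fit result"  # the value of the expression is returned unchanged
})
devices_after <- length(grDevices::dev.list())
```

The result of the wrapped call is returned as usual, and the temporary device is closed again when the helper exits.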

My system:
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server x64 (build 17763)

Version bmstdr: 0.2.2

Release bmstdr 0.1.0.9000

First release:

usethis::use_cran_comments()
Proof read Title: and Description:
Check that all exported functions have @returns and @examples
Check that Authors@R: includes a copyright holder (role 'cph')
Check licensing of included files
Review https://github.com/DavisVaughan/extrachecks

Prepare for release:

urlchecker::url_check()
devtools::check(remote = TRUE, manual = TRUE)
devtools::check_win_devel()
rhub::check_for_cran()
rhub::check(platform = 'ubuntu-rchk')
rhub::check_with_sanitizers()
Draft blog post

Submit to CRAN:

usethis::use_version('minor')
devtools::submit_cran()
Approve email

Wait for CRAN...

Error in predict function

Sujit said:

"Thanks for this. But sorry, I have not been able to reproduce the error. I checked both on my Linux and Windows machines. I will ask some of my students to double check this. Please can you give me code which results in error."

My system:
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server x64 (build 17763)

Version bmstdr: 0.2.2

Truncated distribution with two bounds?

Hi

Would it be possible to support a truncated distribution with two bounds? Currently the truncated distribution has only one (lower) bound. The goal is to predict only within a defined range of values (in this case 0 <= y <= 1, or alternatively 0 <= y <= 100). My input data have the same range!

If I fit a GPP model (spTimer) without a truncated distribution, the prediction contains many values < 0. This is not acceptable in my case. If I instead use the truncated distribution with the lower limit = 0, my prediction does not contain any values < 0. That is already very good.

Two questions:
Does a limit of 0 mean that 0 itself is included in the distribution, or only values > 0?
How do I determine the correct/optimal lambda value for the option 'truncation.para'? Currently I use the default setting.

However, I would need the same for the upper limit as well. As far as I know, there are truncated distributions with two limits (upper and lower) (example here, here and here).
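
To illustrate what a distribution with both limits looks like, the sketch below draws from a normal distribution truncated to [lower, upper] with the inverse-CDF method in base R. The function name r_two_sided_truncnorm and the parameter values are chosen only for this illustration; this is not how spTimer implements its truncated model, and it does not add two-sided truncation to the package.

```r
# Illustration only: sample from a normal distribution truncated to [lower, upper]
r_two_sided_truncnorm <- function(n, mean = 0, sd = 1, lower = -Inf, upper = Inf) {
  stopifnot(lower < upper, sd > 0)
  p_lo <- stats::pnorm(lower, mean, sd)         # probability mass below the lower limit
  p_hi <- stats::pnorm(upper, mean, sd)         # probability mass below the upper limit
  u <- stats::runif(n, min = p_lo, max = p_hi)  # uniform draws restricted to the allowed mass
  x <- stats::qnorm(u, mean, sd)                # map back through the quantile function
  pmin(pmax(x, lower), upper)                   # clamp to guard against rounding at the limits
}

set.seed(1)
draws <- r_two_sided_truncnorm(1000, mean = 0.5, sd = 0.4, lower = 0, upper = 1)
print(range(draws))
```

All draws fall inside the interval from 0 to 1, which is the property requested for the predictions.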

For non-Bayesian regression, I always used the inflated beta distribution (value range: 0 <= x <= 1), which gave me predictions only within the defined range.

Is this possible?


bmstdr's Introduction

bmstdr: Bayesian Modeling of Spatio-Temporal Data with R

Author: Sujit K. Sahu

Date: April 07, 2022
