Comments (19)
Hei @aminorberg
The precise answer to your question very much depends on what you refer to as a full working example. As far as I am aware of, the functionality from the original methodological paper is properly implemented and anything not working with of that shall be treated as bug.
On the other hand, there are currently a lot of post processing utilities in the package, which bring the raw results of the MCMC model fitting closer to ecological interpretation. These are not guaranteed to work fine with variable latent loadings, both due to
- the fact that often it is improper to interprete them in similar manner as the constant latent loadings and the underlying math is yet to be designed,
- in some of those cases, when the interpretation is transferrable, the current implementation is deficient due to limited robustness caused by iterative development process.
In personal communication, @ovaskain claimed that they encountered multiple challenges with this class of models, both in terms of sensitivity of prior assumptions and/or sophisticated exploration of the posterior typical set using available MCMC algorithm.
from hmsc.
Thanks @gtikhonov!
By working example I mean e.g. an example using the data provided with the package, since I can't even get that working, and I think it's just that I don't know the syntax. This is what I tried:
studyDesign <- data.frame(sample = as.factor(1:50),
plot = as.factor(sample(1:20, 50, replace = TRUE)))
rL2 <- HmscRandomLevel(units = TD$studyDesign$plot)
rL1 <- HmscRandomLevel(xData = data.frame(x1 = rep(1, length(TD$X$x1)),
x2 = TD$X$x2))
m <- Hmsc(Y = TD$Y, XData = TD$X,
XFormula = ~x1+x2,
studyDesign = studyDesign,
ranLevels = list("sample" = rL1, "plot" = rL2))
ps <- sampleMcmc(m, samples = 1000)
And when attempting to sampleMcmc, I get an error:
Error in (Eta[[r]][Pi[, r], ] * rL[[r]]$x[as.character(dfPi[, r]), k]) %*% :
non-conformable arguments
In addition: Warning message:
In Ops.factor(Eta[[r]][Pi[, r], ], rL[[r]]$x[as.character(dfPi[, :
‘*’ not meaningful for factors
from hmsc.
Looks like a bug. Will have a close look at this.
from hmsc.
Looks like this comes from updateZ
in loop of lines 22 to 30, and exactly there in line 28:
for(r in seq_len(nr)){
if(rL[[r]]$xDim == 0){
...
} else{
LRan[[r]] = matrix(0,ny,ns)
for(k in 1:rL[[r]]$xDim)
LRan[[r]] = LRan[[r]] + (Eta[[r]][Pi[,r],]*rL[[r]]$x[as.character(dfPi[,r]),k]) %*% Lambda[[r]][,,k]
}
}
and basically the error is caused by this statement with innermost loop at k == 2
rL[[r]]$x[as.character(dfPi[,r]),k]
[1] o o o o o o o o o o o o o o o o o o o o o o o o o c c c c c c c c c c c c c
[39] c c c c c c c c c c c c
Levels: c o
which is a factor that is not meaningful in multiplication (like the warning said), and when used within
Eta[[r]][Pi[, r], ] * rL[[r]]$x[as.character(dfPi[,r]),k]
The result drops dimensions (50,2) and returns a vector of 100 NA
values and this fails in the following matrix multiplication (%*%
) with the error of non-conformable arguments. So it is about handling factor variables.
from hmsc.
Thanks, @jarioksa, I was going to mark the same thing. We have not yet implemented the XFormula
type of data specification for this feature. However, there is one more issue - namely in the alignPosterior()
function, which is be default called after the MCMC is done. I have not fixed it yet, but the following code is a fully operating example of model fitting for covariate-dependent associations:
library(Hmsc)
studyDesign <- data.frame(sample = as.factor(1:50),
plot = as.factor(sample(1:20, 50, replace = TRUE)))
rL2 <- HmscRandomLevel(units = TD$studyDesign$plot)
xData = data.frame(x1=rep(1, length(TD$X$x1)), x2=as.numeric(TD$X$x2=="c"))
rL1 <- HmscRandomLevel(xData = xData)
m <- Hmsc(Y = TD$Y,XData = TD$X,
XFormula = ~x1+x2,
studyDesign = studyDesign,
ranLevels = list("sample" = rL1, "plot" = rL2))
ps <- sampleMcmc(m, samples = 1000, alignPost=FALSE)
from hmsc.
@gtikhonov : It works in this case where x2
is a two-level factor which can be turned into one contrast variable. A more general solution is:
model.matrix(reformulate(names(TD$X[, -1])), TD$X)
model.matrix
generates an (Intercept)
which already is in TD$X
and one of the replicated intercepts can be removed – this is one way, another way is to just drop the first column of the resulting model matrix. This solution will work when there is any number of factors with any number of levels. The generation of the model matrix could be incorporated in updateZ
to allow flexible use of factor variables.
from hmsc.
Hi, I have also been exploring this recently. Despite the minor bugs the models seem to fit. Regarding how to compute the Lambdas per site as a function of the covariates, I have tried (continuing from Gleb's working example above):
#Covariates for plot-level latent variable
xmat = as.matrix(m$rL[[1]]$x)
dim(xmat)
#Residual correlations given observed covariates
OmegaCor = getPostEstimate(ps, "OmegaCor", r=1, x=xmat)$mean
OmegaCor
#Residual correlations given observed covariates for first site
OmegaCor = getPostEstimate(ps, "OmegaCor", r=1, x=xmat[1,])$mean
OmegaCor
from hmsc.
@oysteiop, this does not look like a valid approach:
- you repeat
l
variable (small L) both for the loop index and accumulated sum. At least my copy of your code indicates this, maybe you meantI
andl
. Such repetition would not bring you any good. - you shall not sum over the
l1
at all, and your output should have dimension [2,4,50] ([nf,ns,ny]).
The package tensorA
features a function that can replace the whole loop with single line. Additionally, the Hmsc
function getPostEstimate
features parameter x
that controls the condition for association matrix if you set parName="Omega"
.
from hmsc.
Thanks for spotting this typo. I have edited my comment above to you much simpler solution.
from hmsc.
I think you are on a dangerous path: these tricks work when you study a case with a two-level factor where the internal numeric coding will give the correct contrast. However, this approach fails when you have a factor with three levels, where the internal numeric coding just changes the factor into an arbitrary continuous variable. Most dangerously, it will fail silently: no error messages, but wrong results. You must study the model matrix which changes a three-level factor into two contrast variables (and p-level factor into p-1 contrast variables). Solving a margin case will bring trouble and grievance later in life.
from hmsc.
Hmm, @gtikhonov, running your example results in a new error:
Error in lambda * array(rep(rL[[r]]$x[unLdfPi[q], ], each = nf * ns), :
non-numeric argument to binary operator
from hmsc.
@jarioksa , I am not sure whom did you address, but here are some thoughts of mine regarding this aspect. To my mind, any of the package's internal functionality on for simplifying user's life when specifying the covariate matrix (e.g. XData + XFormula way) are potentially unsafe, since a user can easily define something very different that he/she desires without noticing it. However, I also do fully recognise that it is often nuch more gandy in this way + Otso had some ideas on using this formula-empowered way for more informative postprocessing (not implemented yet).
from hmsc.
@aminorberg is the problem still actual? As far as I understood from the responses by other collaborators on this branch, for them that example code was executing without issues...
from hmsc.
@gtikhonov I reinstalled Hmsc (master version), and now your example works without errors. So only the CRAN version gives the error. Thank you!
from hmsc.
@gtikhonov my problems with cov-dep latents variables continued as I tried to compute predicted values:
cv_partition <- createPartition(ps, nfolds = 2, column = "sample")
cv_preds <- computePredictedValues(ps, partition = cv_partition, expected = FALSE)
Error in LambdaPostMean[k, ] : incorrect number of dimensions
Related to the alignPost = FALSE
?
from hmsc.
Immediate reason for this is that in your example, internal item LambdaPostMean
is a 3-dim array, and too few indices are given when it is referred: It should be referred as LambdaPostMean[k, , ]
– that is, one comma more since three indices are needed.
from hmsc.
@gtikhonov Is there any chance that the predict function would be updated for cov-dep latent variables some time soon?
from hmsc.
Hi @aminorberg - I've been following this discussion and also have a few projects where I'm hoping to use the covariate-dependent association features.
Am I correct in understanding that covariate(s) must be categorical?
from hmsc.
Hi, I'm still struggling to get the covariate-dependent functionality to work.
The code that @gtikhonov provoided on 28th Jan 2019 makes a model, but computeAssociations fails:
Error in dimnames(x) <- dn :
length of 'dimnames' [2] not equal to array extent
This seems to be because in the postList object created by poolMcmcChains in computeAssociations, the element of Lambda that corresponds to the covariate-dependent random level (rL1) has two species * species matrices (see also @jarioksa comment on 9th Dec 2019):
postList[[1]]$Lambda
[[1]]
, , 1
[,1] [,2] [,3] [,4]
[1,] 0.0262100640 0.0186711924 5.804635e-02 0.1077498535
[2,] -0.0136396407 -0.0261227143 -5.455211e-03 -0.0122868940
[3,] 0.0013262262 0.0002285267 6.567342e-03 -0.0011980602
[4,] -0.0004133883 -0.0007863265 -3.854753e-05 0.0004773631
, , 2
[,1] [,2] [,3] [,4]
[1,] -0.2373420188 -0.0828967318 -0.066711818 -0.1553379441
[2,] -0.0478642459 0.0118648434 -0.032209240 -0.0578322325
[3,] 0.0001250210 0.0081198502 -0.003553992 0.0140523202
[4,] -0.0002287827 -0.0001740797 0.001141877 0.0001855759
[[2]]
[,1] [,2] [,3] [,4]
[1,] 0.20153787 -0.10386425 -0.08746174 -0.04134730
[2,] 0.04805722 0.01782121 0.03143461 0.02981933
I can update computeAssociations to work around this, but I'm not certain how to use these two matrices to understand how the species associations are affected by the environmental variable. Am I right in thinking that the first matrix is the species associations independent of the environmental variable and the second matrix is the species associations dependent on the selected environmental variable?
Presumably this is explained in the Matlab code for the 2017 paper, but that's not accessible to me.
convertToCodaObject and the predict functions etc... also fails for the same reason I think. Again, I can fix for my data, but would need to understand what the two Lambda matrices are.
Thanks for any light you're able to shed!
from hmsc.
Related Issues (20)
- Chains in model cross validation not running parallel HOT 1
- Code quit working HOT 5
- plotBeta with ggplot2
- Error in computeDataParameters(hM) HOT 7
- Error in computePredictedValues() HOT 2
- More aggressive parallelization in computePredictedValues
- offset HOT 5
- How do you get the posterior standard deviation for each species?
- predict over millions of sites (non-spatially explicit); out of memory
- All 0's in Alpha posterior? HOT 6
- Interaction term interpretation and prediction HOT 7
- Can Hmsc accomodate repeated measure design? HOT 2
- How to specify the studyDesign and random level for a repeated measure design HOT 1
- Need help to speed up model fitting: How can we go parallel processing using different cores for fitting all the models at the same time? HOT 4
- Hmsc models for presence-only data HOT 2
- Structuring Hurdle Model for Y matrix with mix of zero-inflated and uninflated response variables HOT 6
- variance partitioning - proportion of explained /raw variance for species present at all sites HOT 1
- Memory leak when performing NNGP HOT 1
- Order of Random Levels in a Spatiotemporal Nested Design HOT 2
- plotBeta cannot handle phylogenies where the tip names contain parentheses HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from hmsc.