Comments (8)
The error message may indeed be misleading (perhaps we should just say that the tree does not define a full-rank metric distance matrix?). We do not use the tree as such; it is translated to a phylogenetic distance matrix (function ape::vcv). However, your phylogenetic distance matrix is such that we cannot use it in Hmsc. We need to be able to invert the matrix, and that fails. I think the assumption in the error message was that ultrametric trees give metric distances that can be inverted, and the error message just replaces an older one (Error in chol.default ... the leading minor of order 4 is not positive definite), which we thought was a less useful message than the one we have now. However, I can also imagine cases where this error appears with basically metric distances, for instance, if you have identical taxa. We do not have your phylogenetic tree, so we do not know what is the case here. If you send the Newick tree (privately, we do not need taxon names), I can have a look at the issue.
If you got as far as the computeDataParameters() function, the tree was technically OK. If there had been some technical problem with tree construction, you would have got the error much earlier (in the Hmsc() model definition).
from hmsc.
Thank you Jari for the prompt response. I've emailed you my Newick tree.
I don't think it's a technical problem with the tree construction since I don't get an error earlier.
-Sara
Sara, thanks for the Newick tree. It was a large one, with 1440 taxa. However, when I tried with that tree, ape reported that it is not ultrametric:
> library(ape)
> tre <- read.tree("Fryxell_1440_tree.tre")
## NOT ultrametric
> is.ultrametric(tre)
[1] FALSE
## try with phylogenetic correlations similarly as in Hmsc
> phylcor <- vcv(tre)
## phylcor needs to be inverted, and for that its determinant should be > 0
> det(phylcor)
[1] 0
## Finally check the rank
> attr(chol(phylcor, pivot=TRUE), "rank")
[1] 1400
Warning message:
In chol.default(phylcor, pivot = TRUE) :
the matrix is either rank-deficient or indefinite
## dim is 1440, and rank 1400
> dim(phylcor)
[1] 1440 1440
It looks like several taxa are "identical" or nearly identical. I don't know how to handle this, but we cannot use that phylogeny. At least the following 40 taxa are duplicates of some earlier ones:
> which(duplicated(phylcor))
'BOLD:ABZ7767' 'BOLD:AAA4393' 'BOLD:AAA3933' 'BOLD:AAB8468' 'BOLD:AAA7565'
2 3 4 10 25
'BOLD:AAB4054' 'BOLD:ACE4385' 'BOLD:ACE4734' 'BOLD:ACE7664' 'BOLD:ABY4439'
34 63 75 84 90
'BOLD:AAA8814' 'BOLD:AAB5993' 'BOLD:AAA8386' 'BOLD:AAB6246' 'BOLD:ACE7380'
111 112 151 152 153
'BOLD:ACF1624' 'BOLD:AAB6095' 'BOLD:ACF4111' 'BOLD:AAB4640' 'BOLD:AAC5412'
161 188 194 196 197
'BOLD:AAD7518' 'BOLD:AAA9420' 'BOLD:AAB0268' 'BOLD:ABY9168' 'BOLD:AAI9560'
216 246 259 270 277
'BOLD:ACF0609' 'BOLD:AAB7992' 'BOLD:ABZ7431' 'BOLD:AAB2296' 'BOLD:AAB0890'
294 295 320 321 322
'BOLD:AAB0754' 'BOLD:AAA7669' 'BOLD:ABY7901' 'BOLD:ACE9386' 'BOLD:AAU8534'
323 338 339 424 543
'BOLD:ACF3126' 'BOLD:ADJ1669' 'BOLD:ADI7158' 'BOLD:ABA9093' 'BOLD:ACL7379'
775 781 798 940 975
Without these duplicates, there are no numerical problems.
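The check above can be turned into a small clean-up step. The following is only a sketch under stated assumptions: the four-tip tree is a hypothetical toy example (not Sara's tree), and passing corr = TRUE to vcv() is an assumption about how the correlations are requested. Note that duplicated() only finds rows that are exactly identical, so near-duplicates would not be caught.

```r
library(ape)

# Hypothetical toy tree: t1 and t2 are joined by zero-length edges,
# so the tree cannot tell them apart.
tre <- read.tree(text = "((t1:0,t2:0):1,(t3:0.4,t4:0.6):0.5);")

# Phylogenetic correlation matrix (corr = TRUE is an assumption here).
phylcor <- vcv(tre, corr = TRUE)

# Tips whose row exactly repeats an earlier row.
dup <- rownames(phylcor)[duplicated(phylcor)]
dup

# Drop the duplicated tips and recompute the matrix.
tre_unique <- drop.tip(tre, dup)
phylcor_unique <- vcv(tre_unique, corr = TRUE)

# The rank from the pivoted Cholesky should now equal the number of tips.
rank_unique <- attr(chol(phylcor_unique, pivot = TRUE), "rank")
c(rank = rank_unique, tips = Ntip(tre_unique))
```

Dropping a tip also means deciding what to do with the community data of the dropped taxon (for example merging it with its duplicate), which is a modelling decision rather than something this sketch handles.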
Hi Jari,
Thanks for the reply. Apologies, I forgot to mention that I tried converting my tree to 'ultrametric' using these suggestions (1, 2) but still got the original error I shared (after checking that the tree is considered ultrametric).
Thanks for looking into this. I'm not sure I fully understand what you mean by "several taxa are 'identical'". Do you mean that several leaf nodes have the same taxon name? I double-checked, and I do have 1,440 unique taxon names (BOLD_ids). Or did you mean that several BOLD_ids have exactly the same phylogeny as other BOLD_ids? I should note that I constructed the tree using COI sequences, and I have several BOLD_ids that belong to the same family/genus, so their phylogenies are very similar to one another since they're closely related. Would it not be possible to use HMSC if I have several taxa with closely related phylogenies?
Thanks for your help.
-Sara
Sara, they are not only "closely related": the tree implies they are identical, one and the same taxon. The edges connecting these tips (taxa) have zero length (that is, no difference), and from the tree's point of view these are the same taxon with several alternative names ("synonyms"). I don't have your sequence data, but this could mean that they also have identical sequences. From the Hmsc point of view, these are then duplicated taxa, and if we have a matrix with duplicated taxa, we cannot use it: the mathematics won't work. Technically, we need to invert the phylogenetic correlation matrix, and if it is based on duplicated taxa it will be rank-deficient, and rank-deficient matrices do not have a (normal) inverse.
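To see why duplicated taxa break the inversion, here is a minimal base-R sketch. The 3 × 3 matrix is a made-up toy correlation matrix (not derived from any real tree) in which taxa A and B are perfectly correlated, as zero-length edges would imply; the helper try_chol() is a hypothetical name introduced only for this illustration.

```r
# Toy correlation matrix: A and B are "identical" (correlation exactly 1).
C <- matrix(c(1.0, 1.0, 0.5,
              1.0, 1.0, 0.5,
              0.5, 0.5, 1.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

det(C)       # 0: the matrix is singular
qr(C)$rank   # 2, although the matrix is 3 x 3

# Hypothetical helper: report "ok" or the error message from chol().
try_chol <- function(m) {
  tryCatch({ chol(m); "ok" }, error = function(e) conditionMessage(e))
}

try_chol(C)            # an error message: C is not positive definite
try_chol(C[-2, -2])    # "ok" once the duplicated taxon B is removed
```

The same thing happens at a larger scale with the 1440-taxon matrix: every exact duplicate removes one from the rank, and the Cholesky factorization needed for the inverse fails.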
As a simple check, it seems that the first four entries (tips) in your tree are identical ('BOLD:AAA6619' 'BOLD:ABZ7767' 'BOLD:AAA4393' 'BOLD:AAA3933'). You may check these to see if they are different. If they have different sequences, you need a way to build a tree that shows this difference, or you need a trick that makes them different, such as replacing zero-length edges with a tiny non-zero value (several recognized taxa can share the same barcode). It really must be tiny, because the shortest non-zero edge length is now 4.593 × 10^-5. Alternatively, if these are the same taxon, their data should be merged.
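The "tiny value" trick could look roughly like the sketch below. Everything here is an assumption for illustration: the tree is a hypothetical four-tip toy, and the choice of one hundredth of the shortest non-zero edge for eps is arbitrary. Whether such a perturbation is biologically defensible is a separate question.

```r
library(ape)

# Hypothetical toy tree with zero-length edges leading to t1 and t2.
tre <- read.tree(text = "((t1:0,t2:0):1,(t3:0.4,t4:0.6):0.5);")

# Pick a value well below the shortest non-zero edge (arbitrary factor).
min_nonzero <- min(tre$edge.length[tre$edge.length > 0])
eps <- min_nonzero / 100

# Replace only the zero-length edges.
tre_fixed <- tre
tre_fixed$edge.length[tre_fixed$edge.length == 0] <- eps

# The correlation matrix should no longer contain duplicated rows.
phylcor_fixed <- vcv(tre_fixed, corr = TRUE)
any(duplicated(phylcor_fixed))

# Check the rank in the same way as earlier in this thread.
rank_fixed <- attr(chol(phylcor_fixed, pivot = TRUE), "rank")
rank_fixed
```

With a very small eps on a large tree, the matrix can be full rank in exact arithmetic yet still badly conditioned numerically, so it is worth re-checking the rank on the real data rather than relying on this toy case.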
Hi Jari, thanks for the explanation! I will look into it.
Thanks for the suggestion Otso, I'll run the scripts with a smaller number of taxa before moving on to a bigger dataset.
-Sara
I see this issue was closed, but I am having the same issue with similar data, and I am fairly confident that I have no repeated sequence data in my tree.
I am happy to supply simplified data to recreate the problem. Should I open a new issue, or discuss here?