Comments (11)
I've just realized that there is a "--predixcan-method" option for focus import. Based on my test though, specifying it as MASHR did not change any results in the end, so I assume this is more for descriptive purposes and I can still use the old results generated, after changing the respective column (from ElasticNet -> MASHR). Please correct me if I am wrong.
However, I am still curious how rsID/GTEx IDs are handled in the FOCUS pipeline?
from focus.
Hi @maegsul , thanks for the kind words. Glad you enjoy the tool.
As you noted, there is a newer dev version that added the predixcan method option to account for different training approaches, which should suit your needs. Operationally I think it won't make a difference, but the output should now properly reflect the underlying method.
In order for FOCUS to match SNPs with the LD reference and GWAS results, all marker identifiers need to be the same. One workaround would be to update model SNPs to dbSNP rsIDs.
I'm hoping to get more of the work on the FOCUS dev branch merged into the main branch soon.
from focus.
Thank you very much @quattro !
Indeed, my LD reference and GWAS summary statistics files are both manually annotated the same way, so they both contain rsIDs based on dbSNPv151 (GRCh38), and if there is no available corresponding rsID, they are annotated as "CHR_POS_REF_ALT".
However, as I indicated, the PrediXcan GTEx v8 MASHR models (.db and .txt.gz files available at http://predictdb.org/) use the chr1_100_A_T_b38 annotation system for the variants in the model, regardless of whether they have a corresponding rsID. This is not the case for the ElasticNet models, which use rsID annotations.
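To make the two naming schemes concrete, here is a minimal Python sketch that splits a GTEx v8-style ID into its fields and rewrites it in the CHR_POS_REF_ALT form mentioned above. The function names are hypothetical, and dropping the "chr" prefix is an assumption about how CHR_POS_REF_ALT is written; neither comes from FOCUS or PrediXcan.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    build: str


def parse_gtex_id(var_id: str) -> Variant:
    """Split an ID like 'chr1_100_A_T_b38' into its five fields."""
    parts = var_id.split("_")
    if len(parts) != 5:
        raise ValueError(f"unexpected variant ID format: {var_id!r}")
    chrom, pos, ref, alt, build = parts
    return Variant(chrom, int(pos), ref, alt, build)


def to_chr_pos_ref_alt(var_id: str, strip_chr: bool = True) -> str:
    """Rewrite a GTEx-style ID as CHR_POS_REF_ALT (build suffix dropped)."""
    v = parse_gtex_id(var_id)
    # Assumption: the target scheme omits the 'chr' prefix.
    chrom = v.chrom[3:] if strip_chr and v.chrom.startswith("chr") else v.chrom
    return f"{chrom}_{v.pos}_{v.ref}_{v.alt}"
```

For example, to_chr_pos_ref_alt("chr1_100_A_T_b38") gives "1_100_A_T". This only changes the label; it does not look up rsIDs.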
So I also tested providing a modified GWAS summary statistics to focus finemap where SNPs were annotated as chr1_100_A_T_b38 (the same one I use for running PrediXcan/MetaXcan for MASHR models), but it gave an error:
"No overlap between LD reference and GWAS"
I guess this is expected, as I still provide LD reference data with rsID annotation (this file is not used for running PrediXcan/MetaXcan, as the .txt.gz files are used for LD information).
Anyway, I guess this might be sorted out at the weight database creation step somehow. I think the MASHR .db files contain both "rsid" and "varID" columns, where "varID" corresponds to the variant ID in the .txt.gz covariance file (used for running PrediXcan/MetaXcan only), and "rsid" column matches with the LD reference & summary statistics files provided in focus finemap step; so maybe that is why I did not see any problematic/weird FOCUS results in this scenario.
But of course if I interpret this wrong and I should be careful with this step, please do let me know! Thanks again!
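One way to check this guess is to count how many model variants match the GWAS IDs through each ID column. The sketch below is only an illustration: the table name weights is an assumption about the PredictDB layout, the rsid and varID column names are the ones described above, and the in-memory database is toy data standing in for a real MASHR .db file.

```python
import sqlite3


def id_overlap(conn, gwas_ids):
    """Count model variants whose rsid or varID appears among the GWAS IDs."""
    rows = conn.execute("SELECT rsid, varID FROM weights").fetchall()
    gwas = set(gwas_ids)
    return {
        "n_weights": len(rows),
        "rsid_matches": sum(1 for rsid, _ in rows if rsid in gwas),
        "varID_matches": sum(1 for _, var_id in rows if var_id in gwas),
    }


# Toy in-memory database (made-up rows, not real model weights).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weights (gene TEXT, rsid TEXT, varID TEXT, weight REAL)")
conn.executemany(
    "INSERT INTO weights VALUES (?, ?, ?, ?)",
    [
        ("ENSG0001", "rs123", "chr1_100_A_T_b38", 0.5),
        ("ENSG0001", "rs456", "chr1_200_G_C_b38", -0.2),
    ],
)
result = id_overlap(conn, ["rs123", "rs456", "rs789"])
print(result)
```

With a real file, the connection would come from sqlite3.connect on the .db path, and gwas_ids from the SNP column of the summary statistics. A high rsid match count together with zero varID matches would be consistent with the explanation above.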
from focus.
Hi,
Did your import step for the GTEx v8 MASHR models produce any errors? I keep hitting [ERROR] 'cv_R2_avg' when running
focus import $tissue predixcan --tissue $TISSUE --name GTEx --assay rnaseq --output gtex_v8
I downloaded the same MASHR models from http://predictdb.org/ and would like to ask how you imported them into one .db file. Thanks!
from focus.
Hi @WeiCSong,
I am experiencing the same issue with [ERROR] 'cv_R2_avg', and also the other error ([ERROR] 'symbol') which you mention in another thread. May I ask if you have since managed to import the GTEx v8 MASHR models and resolve the [ERROR] 'cv_R2_avg' issue?
Any advice would be much appreciated!
Thanks very much!!
GG
from focus.
Hi @gg-n
I wrote an R script to manually concatenate all databases into the five data frames required by FOCUS and wrote them to a .db file. However, only the elastic net model worked with this script. Since my project needs robustness more than precision, I used this elasticnet.db and did not debug the MASHR model. You may take a look at my script and write a similar one to concatenate the MASHR models. The workflow is straightforward: update refpanel -> molecularfeature -> model -> modelattribute -> weight. You may also update the precise start and stop positions of each gene in the molecularfeature data frame (I haven't done that, since it did not affect the results).
from focus.
Hi @WeiCSong @quattro,
I tried focus import on one tissue from the GTEx v8 MASHR models, but I get an error about mygene and rpy2. How do I install mygene and rpy2? Any help?
===================================
FOCUS v0.6.10
focus import
/mashr/mashr_Prostate.db
predixcan
--name GTEx
--assay rnaseq
--tissue Prostate
--use-ens-id --out /mashr/output/mashr_Prostate
Starting log...
[2021-11-27 08:27:33 - INFO] Preparing weight database
[2021-11-27 08:27:34 - ERROR] Import submodule requires mygene and rpy2 to be installed.
[2021-11-27 08:27:34 - ERROR] No module named 'mygene'
[2021-11-27 08:27:34 - INFO] Finished importing prediction models
from focus.
Hi @sookwah-yee , you need to run pip install mygene and pip install rpy2. These aren't strict requirements for focus since the primary use case is fine-mapping, rather than importing new weights. We will update the error message to better reflect this.
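To confirm that both packages are visible to the Python environment that runs focus, a small check like the following can be run first. It is a sketch using only the standard library; the helper name missing_modules is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


missing = missing_modules(["mygene", "rpy2"])
if missing:
    print("Not found in this environment:", ", ".join(missing))
else:
    print("mygene and rpy2 are both importable")
```

Run it with the same interpreter (for example, inside the same virtual environment) that focus itself uses; otherwise the result may not match what focus sees.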
from focus.
Hi @quattro , thank you for your response above. I installed both packages as you suggested, but now I get another error that I don't understand. How can I fix the 'cv_R2_avg' error?
Starting log...
[2021-11-29 09:28:17 - INFO] Preparing weight database
[2021-11-29 09:28:18 - INFO] Starting import from PrediXcan database mashr_Prostate.db
[2021-11-29 09:28:18 - INFO] Querying mygene servers for gene annotations
[2021-11-29 09:29:58 - INFO] Starting individual model conversion
[2021-11-29 09:29:58 - ERROR] 'cv_R2_avg'
[2021-11-29 09:29:58 - INFO] Finished importing prediction models
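The bare 'cv_R2_avg' message may mean that the importer looked up a field by that name and did not find it, though that reading is a guess. One way to check is to list the columns actually present in the model database. In the sketch below, the table name extra is an assumption about the PredictDB layout, and the toy table and its columns are made up for illustration.

```python
import sqlite3


def table_columns(conn, table):
    """Return the column names of one table in an SQLite database."""
    # PRAGMA table_info yields one row per column; the name is field 1.
    # The table name is interpolated directly, so pass trusted names only.
    return [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]


# Toy in-memory database standing in for a real .db file.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE extra (gene TEXT, genename TEXT, "pred.perf.R2" REAL)')

cols = table_columns(conn, "extra")
has_cv = "cv_R2_avg" in cols
print(cols, has_cv)
```

If the column is absent in the real file, that would point to a mismatch between the model format and what this FOCUS version expects, rather than to a problem with the command itself.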
from focus.
Hi @sookwah-yee , can you try cloning the latest dev branch from the repo and installing that? It should be FOCUS v0.7.
from focus.
Hi @quattro , "module show focus" says that I have version 0.7, but I don't know why the log for each run I did shows 0.6. Could you explain what you mean by cloning the latest dev branch from the repo? Where can I get it?
module show focus # gives instructions for how to use
/software/c4/wittelab/modulefiles/focus/0.7.0.lua:
help([[focus: a set of tools to finemap twas statistics
]])
whatis("Version: 0.7.0")
whatis("Keywords: twas")
whatis("URL: https://github.com/bogdanlab/focus")
whatis("Description: FOCUS (Fine-mapping Of CaUsal gene Sets) is software to fine-map transcriptome-wide association study statistics at genomic risk regions. The software takes as input summary GWAS data along with eQTL weights and outputs a credible set of genes to explain observed genomic risk. Example: source $ENV; focus --help; deactivate")
setenv("ENV","/software/c4/wittelab/software/focus-0.7.0/focus_venv/bin/activate")
prepend_path("PATH","/software/c4/wittelab/software/focus-0.7.0/bin")
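When the module file and the run log disagree about the version, it can help to see which focus executable is first on PATH and which version is installed in the active Python environment. The sketch below uses only the standard library; the distribution name pyfocus is an assumption about how the package is registered, and describe_install is a hypothetical helper.

```python
import shutil
from importlib import metadata


def describe_install(command, dist_name):
    """Report the executable found on PATH and the installed package version."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # not installed in this Python environment
    return {"path": shutil.which(command), "version": version}


# "pyfocus" is an assumed distribution name; adjust if it differs.
print(describe_install("focus", "pyfocus"))
```

If the reported path points outside the focus-0.7.0 directory shown in the module file, an older install is probably being picked up first.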
from focus.