atkinson-lab / tractor
Scripts for implementing the Tractor pipeline
License: MIT License
I have come across a minor issue where extract_tracts.py allows for compressing output, yet run_tractor.R cannot accept these .gz files.
PR to follow.
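Until run_tractor.R accepts compressed input, one workaround is to decompress the extract_tracts.py outputs first. A minimal sketch, assuming the compressed files end in .gz and sit in a single directory; the function name gunzip_outputs is illustrative and not part of Tractor:

```python
import gzip
import shutil
from pathlib import Path


def gunzip_outputs(directory):
    """Decompress every *.gz file in `directory`, keeping the originals.

    Returns the list of decompressed paths that were written.
    """
    written = []
    for gz_path in sorted(Path(directory).glob("*.gz")):
        out_path = gz_path.with_suffix("")  # drop only the trailing ".gz"
        with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        written.append(out_path)
    return written
```

The decompressed files keep the same prefix, so the prefix passed to --hapdose should not need to change.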
Hello:
I was wondering whether it is possible to run a GWAS with a genotype x phenotype interaction using Tractor and, if so, how this would be done.
Thank you for your help
Hi,
After running
$ python3 Tractor/scripts/extract_tracts.py --vcf subset1/query_file_phased.vcf --msp subset1/query_results.msp --output-dir output/ --num-ancs 8
INFO (__main__ 91): # VCF File : subset1/query_file_phased.vcf
INFO (__main__ 92): # Prefix of output file names : query_file_phased
INFO (__main__ 93): # VCF File is compressed? : False
INFO (__main__ 94): # Number of Ancestries in VCF : 8
INFO (__main__ 95): # Output Directory : output/
INFO (__main__ 101): Creating output files for 8 ancestries
INFO (__main__ 116): Iterating through VCF file
Traceback (most recent call last):
File "/path/Tractor/scripts/extract_tracts.py", line 240, in <module>
extract_tracts(**vars(args))
File "/patj/Tractor/scripts/extract_tracts.py", line 170, in extract_tracts
window = (ancs_entry[0], int(ancs_entry[1]), int(ancs_entry[2]))
ValueError: invalid literal for int() with base 10: '0.0'
Both the vcf and the msp are output files from the LAI tool G-Nomix, run with the pre-trained model and 8 ancestries. The vcf seems complete, so I suspect the issue lies in the msp, which has the following header:
#Subpopulation order/codes: EUR=0 EAS=1 NAT=2 AFR=3 SAS=4 AHG=5 OCE=6 WAS=7
#chm spos epos sgpos egpos n snps sample_1 sample_2 ... sample_n
13273 779322 0.0 2.02544 696 5 5 5
...
Maybe the issue is with the EUR=0 tag, the empty chromosome field, or the 0.0 in the centimorgan positions.
Any help will be appreciated.
Thank you.
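The traceback shows int() being applied to the second and third fields of an msp row. In the sample row above the chromosome field is empty, so if the leading empty field is dropped when the line is parsed (for example by stripping whitespace before splitting), every column shifts left by one and the third field becomes the genetic position 0.0. A minimal sketch of a pre-flight check, assuming a tab-delimited msp file whose data rows begin with chm, spos, epos; check_msp_row is a hypothetical helper, not part of Tractor:

```python
def check_msp_row(line):
    """Return a list of problems found in one msp data row (empty if none).

    Assumes tab-delimited fields in the order:
    chm, spos, epos, sgpos, egpos, n snps, then one column per haplotype.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 6:
        return ["fewer than 6 fields"]
    problems = []
    if fields[0].strip() == "":
        problems.append("empty chromosome field")
    # spos and epos are the two fields the traceback converts with int().
    for name, value in (("spos", fields[1]), ("epos", fields[2])):
        try:
            int(value)
        except ValueError:
            problems.append(f"{name} is not an integer: {value!r}")
    return problems
```

Running this over the first few data rows should show whether the chromosome column is missing or whether the columns have already shifted.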
Hi,
I got the error below while running the tract-extraction script, ExtractTracts.py. The input vcf file is not phased, though RFMix still gave reasonable results. Does Tractor require a phased vcf file for this step? Thanks a lot for the help!
INFO (main 42): Creating output files for 2 ancestries
INFO (main 48): Opening input and output files for reading and writing
Traceback (most recent call last):
File "/Tractor/ExtractTracts.py", line 184, in
extract_tracts(**vars(args))
File "/Tractor/ExtractTracts.py", line 126, in extract_tracts
geno_b = str(geno[1])
~~~~^^^
IndexError: list index out of range
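The failing line reads the second element of a split genotype, which is consistent with a genotype field that contains no "|" separator, for example an unphased 0/1 call or a missing "." call. A minimal sketch for locating such calls before running the extraction, assuming a plain or gzipped VCF with GT as the first FORMAT subfield; find_unphased is a hypothetical helper, not part of Tractor:

```python
import gzip


def find_unphased(vcf_path, limit=10):
    """Return up to `limit` (chrom, pos, sample, genotype) tuples whose GT lacks '|'."""
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    hits = []
    samples = []
    with opener(vcf_path, "rt") as vcf:
        for line in vcf:
            if line.startswith("##"):
                continue  # meta-information lines
            cols = line.rstrip("\n").split("\t")
            if line.startswith("#CHROM"):
                samples = cols[9:]  # sample IDs follow the 9 fixed columns
                continue
            for sample, entry in zip(samples, cols[9:]):
                gt = entry.split(":")[0]
                if "|" not in gt:
                    hits.append((cols[0], cols[1], sample, gt))
                    if len(hits) >= limit:
                        return hits
    return hits
```

An empty result suggests that every call is phased and the cause lies elsewhere, such as the column delimiter.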
Hi,
Based on what I can see from the tractor output, there is no error term output for the SNP effect. I was wondering if there is some option I can set in the program so that the error term for each ancestral effect estimate is outputted?
Thanks for any help you can offer.
Hi,
I was trying to run the extract_tracts.py script, but it threw an error:
$ python3 /u/home/b/biona001/Tractor/scripts/extract_tracts.py \
--vcf /u/home/b/biona001/project-loes/ForBen_genotypes_subset/LAI/vcf_phased/chr22.vcf.gz \
--msp /u/home/b/biona001/project-loes/ForBen_genotypes_subset/LAI/output/chr22.msp.tsv \
--num-ancs 3 \
--output-dir /u/home/b/biona001/project-loes/ForBen_genotypes_subset/LAI/tracks
INFO (__main__ 90): # VCF File : /u/home/b/biona001/project-loes/ForBen_genotypes_subset/LAI/vcf_phased/chr22.vcf.gz
INFO (__main__ 91): # Prefix of output file names : chr22
INFO (__main__ 92): # VCF File is compressed? : True
INFO (__main__ 93): # Number of Ancestries in VCF : 3
INFO (__main__ 94): # Output Directory : /u/home/b/biona001/project-loes/ForBen_genotypes_subset/LAI/tracks
INFO (__main__ 100): Creating output files for 3 ancestries
Traceback (most recent call last):
File "/u/home/b/biona001/Tractor/scripts/extract_tracts.py", line 239, in <module>
extract_tracts(**vars(args))
File "/u/home/b/biona001/Tractor/scripts/extract_tracts.py", line 102, in extract_tracts
output_files[f"dos{i}"] = f"{output_path}anc{i}.dosage.txt{file_extension}"
NameError: name 'output_files' is not defined
Any tips/suggestions would be highly appreciated.
Hello,
I am unable to find the exact format of the Phe.txt file described here:
https://github.com/Atkinson-Lab/Tractor-tutorial/blob/main/Local.md
python RunTractor.py --hapdose ADMIX_COHORT/ASW.phased --phe PHENO/Phe.txt --method linear --out SumStats.tsv
Please help
Thank you
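Another issue on this page shows a phenotype file whose header line is "IID y". A minimal sketch that writes a file in that shape, assuming tab-separated columns with the sample ID first, the phenotype second, and any covariates after it; the covariate names and sample IDs below are invented, and the exact layout should be confirmed against the tutorial:

```python
import csv


def write_phenotype_file(path, rows, covariates=()):
    """Write dict rows as a tab-separated file with header: IID, y, <covariates...>."""
    header = ["IID", "y", *covariates]
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        for row in rows:
            writer.writerow([row[name] for name in header])


# Hypothetical example: two samples, one binary phenotype, two covariates.
example_rows = [
    {"IID": "SAMPLE_1", "y": 1, "AGE": 54, "SEX": 0},
    {"IID": "SAMPLE_2", "y": 0, "AGE": 61, "SEX": 1},
]
```

Calling write_phenotype_file("Phe.txt", example_rows, covariates=("AGE", "SEX")) would produce a three-line file starting with the header IID, y, AGE, SEX.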
Hi, we have a genotyping cohort (N>5000) with samples of multiple races, which has been imputed using the TOPMed Imputation Server (https://topmedimpute.readthedocs.io/en/latest/getting-started/) because it is the largest multi-ethnic reference panel to date. The imputed output from the server is not phased, and it is in the hg38 build.
Can you briefly describe how I should proceed if my intent is to run tractorGWAS with all 5 major ancestries using the logistic model?
I'm not sure if I'm asking all the right questions to plan out my work.
Thank you for your time.
Hi,
I would like to apply Tractor to a dataset with high relatedness, but it looks like Tractor does not take in GRM.
Besides using an unrelated subset, do you have other suggestions?
Thanks,
Wanying
Hi,
I've seen that the Hail version of Tractor pipeline only does linear regression (in the example).
What about logistic regression? Does Hail perform logistic regression too?
Thank you
After running the script, I got the following error:
Error in rep(NA, ncol(mat)) : invalid 'times' argument
Calls: RunTractor -> subset_mat_NA -> t -> sapply -> lapply -> FUN
Execution halted
The output was as follows:
CHR POS ID REF ALT AF_anc0 AF_anc1 AF_anc2 AF_anc3 AF_anc4 AF_anc5 AF_anc6 AF_anc7 LAprop_anc0 LAprop_anc1 LAprop_anc2 LAprop_anc3 LAprop_anc4 LAprop_anc5 LAprop_anc6 LAprop_anc7 LAeff_anc0 LAeff_anc1 LAeff_anc2 LAeff_anc3 LAeff_anc4 LAeff_anc5 LAeff_anc6 LApval_anc0 LApval_anc1 LApval_anc2 LApval_anc3 LApval_anc4 LApval_anc5 LApval_anc6 Geff_anc0 Geff_anc1 Geff_anc2 Geff_anc3 Geff_anc4 Geff_anc5 Geff_anc6 Geff_anc7 Gpval_anc0 Gpval_anc1 Gpval_anc2 Gpval_anc3 Gpval_anc4 Gpval_anc5 Gpval_anc6 Gpval_anc7
chr1 662622 chr1:727242:G:A G A NA NA NA NA NA 0.025 NA NA 0 0 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.726245 NA NA NA NA NA NA NA 0.0792779608694168 NA NA
I suspect that R is not handling the NAs properly?
Hello,
I am having an issue with the ExtractTracts.py portion of Tractor. I have a fairly large primate dataset of 887 individuals, 9 of which are reference animals (equally split between two closely related species). I have used Tractor on this same dataset about 4 or 5 times now without issue. I have been trying different combinations of reference panels following the protocol of RFMix and then Tractor and it has worked up until now. Using samples extracted from the same master VCF as before, run through RFMix using the same code, I am now getting this error using ExtractTracts.py and am not sure what to do.
The following is the code I used to generate the error:
module load python
for i in {1..20}; do
python ExtractTracts.py \
--msp $SCRATCH/rfmix/Apr2_2023_UnrelatedandFounders.QueryPanel.Chr${i} \
--vcf-prefix $SCRATCH/Beagle_Software/Beagle5.4/Apr2_2023_UnrelatedandFounders.QueryPanel.Chr${i} \
--zipped \
--output-path Apr2_2023_UnrelatedandFounders.QueryPanel.Chr${i} \
--num-ancs 2; done 2>Tractor_error5.log
The error being thrown is below; it also occurs if I run a single chromosome on its own:
INFO (main 42): Creating output files for 2 ancestries
INFO (main 48): Opening input and output files for reading and writing
Traceback (most recent call last):
File "ExtractTracts.py", line 184, in
extract_tracts(**vars(args))
File "ExtractTracts.py", line 126, in extract_tracts
geno_b = str(geno[1])
IndexError: list index out of range
Thank you for your help!
Hi,
Thanks for creating a very useful method. I was wondering if you have example Hail code (similar to Tractor-Example-GWAS.py / Tractor-Example-GWAS.ipynb) for running Tractor on binary traits? The issue is that the hl.agg.linreg() Hail function used in the example code doesn't have an equivalent function for logistic regression. There is the hl.logistic_regression_rows() function, but it only allows a single predictor (x) to be used, thus it's not possible to also include the haplotype counts or the non-index allele dosage. Of course one could implement this outside of Hail, but if you already have a solution it would be easier. Any insight would be very helpful.
Thanks,
Stephane
Hi!
I am encountering an error for which a few issues have already been raised, but I have been trying to troubleshoot it and still haven't worked it out. The thing is I am using imputed files (from TopMed), but they have been filtered (by MAF and INFO) using PLINK. RFMix handled these vcf without problems, but when running the ExtractTracts.py, I get this message:
File "/mnt/lustre/scratch/nlsas/home/usc/gb/sdd/lat23/TRACTOR/Tractor/scripts/ExtractTracts.py", line 126, in extract_tracts
geno_b = str(geno[1])
This is the VCF header:
##fileformat=VCFv4.3
##fileDate=20231123
##source=PLINKv2.00
##filedate=2023.3.13
##INFO=<ID=AF,Number=1,Type=Float,Description="Estimated Alternate Allele Frequency">
##INFO=<ID=MAF,Number=1,Type=Float,Description="Estimated Minor Allele Frequency">
##INFO=<ID=R2,Number=1,Type=Float,Description="Estimated Imputation Accuracy (R-square)">
##INFO=<ID=ER2,Number=1,Type=Float,Description="Empirical (Leave-One-Out) R-square (available only for genotyped variants)">
##INFO=<ID=IMPUTED,Number=0,Type=Flag,Description="Marker was imputed but NOT genotyped">
##INFO=<ID=TYPED,Number=0,Type=Flag,Description="Marker was genotyped AND imputed">
##INFO=<ID=TYPED_ONLY,Number=0,Type=Flag,Description="Marker was genotyped but NOT imputed">
##pipeline=michigan-imputationserver-1.7.1
##imputation=minimac4-1.0.2
##phasing=eagle-2.4
##panel=apps@[email protected]
##r2Filter=0.3
##contig=<ID=5>
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
Sample genotypes are split into columns by "\t", and genotype calls are separated by "|". It works fine when using the raw files from imputation instead (without filtering), but it takes a long time just to run chr22 (and the output files are also very large). I have tried modifying line 87 of the script in case the problem was the "\t" separator between samples, but it still throws the error. I would much appreciate your help here!
Thank you! :)
I'm running a 3-ancestry Tractor analysis and I get NaN for some variants. I've noticed that it happens only for variants where one or more populations have AF_ancX = 0.
I verified that when I run Tractor on the same input but give only one population's hap and dosage files as input, I get p-value results for all those positions. Does this mean that Tractor is not doing a local-ancestry-aware GWAS when only one population's dosages are provided?
Example output:
Tractor output based on 3 populations. anc0 = AFR, anc1 = EUR and anc2 = AMR (my cohort is predominantly AFR and EUR).
CHROM POS ID REF ALT AF_anc0 AF_anc1 AF_anc2 LAprop_anc0 LAprop_anc1 LAprop_anc2 LAeff_anc0 LAeff_anc1 LApval_anc0 LApval_anc1 Geff_anc0 Geff_anc1 Geff_anc2 Gpval_anc0 Gpval_anc1 Gpval_anc2
8 205821 . C T 0.03498 0.00047 0.00094 0.41455 0.53202 0.05342 0.5867933365769665 -0.275980166234514 0.0012209901508823107 0.15753876256506594 0.04648066093110761 -17.450806869947634 -18.175099461425173 0.863883930882097 0.9989229240723104 0.9995039199153428
8 206716 . G C 0.01773 0.00218 0.0066 0.41455 0.53202 0.05342 0.5843741523379564 -0.28597165650323303 0.001261827582320208 0.1432488601362374 0.055573211202005056 1.330571431049564 -17.54550391111194 0.8882255510926501 0.19749680296697525 0.9988180297161879
8 206747 . C G 0.01713 0.00218 0.0066 0.41455 0.53202 0.05342 0.5837735937670595 -0.2860281420231862 0.001276367249093116 0.14317199618539617 0.09385740867275061 1.3307863458279647 -17.54529555393314 0.8125228667403726 0.1974232992189573 0.9988180556108245
8 207242 . G A 0.04616 0.00019 0.0 0.41455 0.53202 0.05342 nan nan nan nan nan nan nan nan nan nan
8 208175 . T G 0.06486 9e-05 0.0 0.41455 0.53202 0.05342 nan nan nan nan nan nan nan nan nan nan
8 208227 . A C 0.03462 0.00047 0.00094 0.41455 0.53202 0.05342 0.58649948554488 -0.275949037607527 0.0012281548283880866 0.1575858179938755 0.05462753081609236 -17.450673605916016 -18.17479306605417 0.8410591302452655 0.9989229409247578 0.9995039282782405
8 209006 . G C 0.0611 0.00028 0.0 0.41455 0.53202 0.05342 nan nan nan nan nan nan nan nan nan nan
8 211365 . G A 0.0345 0.00047 0.00094 0.41455 0.53202 0.05342 0.5864994855448779 -0.27594903760753037 0.0012281548283881785 0.1575858179938706 0.05462753081608959 -17.450673605916016 -18.174793066054686 0.8410591302452743 0.9989229409247578 0.9995039282782406
8 211559 . A T 0.03547 0.00558 0.0 0.41455 0.53202 0.05342 nan nan nan nan nan nan nan nan nan nan
8 211635 . A C 0.00255 0.1308 0.03393 0.41455 0.53202 0.05342 0.5633672510332808 -0.27457186750119905 0.001879978203044346 0.16478825079831816 -21.427942353297524 -0.2360630636660421 -20.542693693652247 0.9994105894895378 0.4225715386127862 0.9992785348646407
8 211831 . A G 0.03437 0.00047 0.00094 0.41455 0.53202 0.05342 0.5863420974023827 -0.27594723308428054 0.0012318491446336143 0.1575886887838127 0.05907664030957955 -17.450584915698936 -18.174632779883186 0.8283067215504775 0.9989229507441875 0.9995039326531734
8 212134 . T C 0.06595 9e-05 0.0 0.41455 0.53202 0.05342 nan nan nan nan nan nan nan nan nan nan
8 212431 . G A 0.01628 0.0 0.0 0.41455 0.53202 0.05342 nan nan nan nan nan nan nan nan nan nan
8 212585 . C T 0.03243 0.00047 0.00094 0.41455 0.53202 0.05342 0.586727459572817 -0.27604712744082566 0.0012228544583496716 0.15743135595710156 0.04933957840521465 -17.450826454128542 -18.175186862413863 0.8606224880774684 0.9989229226291162 0.999503917529778
8 212596 . G A 0.03207 0.00047 0.00094 0.41455 0.53202 0.05342 0.5862340972984453 -0.27606973908617627 0.0012343370077641566 0.1573955498682199 0.0635453815950705 -17.450556446357133 -18.174733095827484 0.8210885360654389 0.9989229521486803 0.9995039299150978
8 212796 . G C 0.0351 0.00047 0.00094 0.41455 0.53202 0.05342 0.5872028540737119 -0.27596227263118217 0.0012117519959782528 0.15756547160577028 0.035074669225241245 -17.45105960063877 -18.175506282013092 0.8975416557249976 0.9989228975956651 0.999503908811391
8 213962 . T G 0.01385 0.00019 0.00094 0.41455 0.53202 0.05342 0.5818012611002261 -0.2746938967833825 0.0013349880275129622 0.1596021880992482 0.4904680546030868 -17.893114903301132 -18.167808800088824 0.18618597007533477 0.9993093215293589 0.9995041189102106
8 213988 . C G 0.02769 9e-05 0.0 0.41455 0.53202 0.05342 nan nan nan nan nan nan nan nan nan nan
8 214676 . C T 0.06413 9e-05 0.0 0.41455 0.53202 0.05342 nan nan nan nan nan nan nan nan nan nan
8 281474 . A G 0.0 0.00151 0.00094 0.41322 0.53319 0.05359 nan nan nan nan nan nan nan nan nan nan
8 294589 . A C 0.0 0.02304 0.00468 0.41298 0.53324 0.05378 nan nan nan nan nan nan nan nan nan nan
Tractor output based on 1 population, showing the same positions as above. anc0 = AFR
CHROM POS ID REF ALT AF_anc0 LAprop_anc0 Geff_anc0 Gpval_anc0
8 205821 . C T 0.03498 1.0 0.6412522731228197 0.016018128393202897
8 206716 . G C 0.01773 1.0 0.6131118431244662 0.11886720160894383
8 206747 . C G 0.01713 1.0 0.6453075687149562 0.10098343499619976
8 207242 . G A 0.04616 1.0 0.12872366437608104 0.6611541190663489
8 208175 . T G 0.06486 1.0 0.9339493605726387 4.185585197316541e-07
8 208227 . A C 0.03462 1.0 0.6622026139834044 0.013408363791798283
8 209006 . G C 0.0611 1.0 0.3743355007576348 0.10392807669612138
8 211365 . G A 0.0345 1.0 0.6622026139834056 0.013408363791798042
8 211559 . A T 0.03547 1.0 0.6668868413362581 0.00735160056366909
8 211635 . A C 0.00255 1.0 -20.060908661749846 0.9990991633464656
8 211831 . A G 0.03437 1.0 0.6659527732795265 0.012891117266950454
8 212134 . T C 0.06595 1.0 0.9099151786072148 7.449635967596704e-07
8 212431 . G A 0.01628 1.0 1.273847584139422 1.1416550191617484e-05
8 212585 . C T 0.03243 1.0 0.6537431021225084 0.018000536254355784
8 212596 . G A 0.03207 1.0 0.6657178634342557 0.015993569517582372
8 212796 . G C 0.0351 1.0 0.6436369321530563 0.01624838163305082
8 213962 . T G 0.01385 1.0 1.0937502687294598 0.0028722620116477713
8 213988 . C G 0.02769 1.0 0.1295030868752873 0.7353893286125512
8 214676 . C T 0.06413 1.0 0.9386673906669657 3.2801637359264637e-07
8 281474 . A G 0.0 1.0 nan nan
8 294589 . A C 0.0 1.0 nan nan
I'm also attaching these outputs as files here, since they may not render properly.
github_issue_output_1pop.txt
github_issue_output_3pops.txt
Can you help me understand this, and advise how I could combine the two approaches to get the best of both results?
Thank You.
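Every nan row in the 3-population output has at least one AF_anc value of exactly 0. When an ancestry carries no copies of the alternate allele, its dosage column is all zeros, which is one plausible reason the joint model cannot be fitted at that site. A minimal sketch for listing such variants from a whitespace-delimited summary-statistics table like the ones shown; zero_af_variants is a hypothetical helper, not part of Tractor:

```python
def _is_zero(text):
    """True if `text` parses as a number equal to 0 (NA and similar count as non-zero)."""
    try:
        return float(text) == 0.0
    except ValueError:
        return False


def zero_af_variants(lines):
    """Return (CHROM, POS, [zero-AF column names]) for rows where any AF_anc* is 0.

    `lines` is a list of strings; the first one is the header.
    """
    header = lines[0].split()
    af_cols = [i for i, name in enumerate(header) if name.startswith("AF_anc")]
    flagged = []
    for line in lines[1:]:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        zero = [header[i] for i in af_cols if _is_zero(fields[i])]
        if zero:
            flagged.append((fields[0], fields[1], zero))
    return flagged
```

Flagged sites could then be reported separately or filtered before the multi-ancestry run, depending on the analysis goal.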
I have encountered the following error a few times using the updated ExtractTracts script:
Traceback (most recent call last):
File "/home/puckett3/software/Tractor/ExtractTracts.py", line 163, in <module>
extract_tracts(**vars(args))
File "/home/puckett3/software/Tractor/ExtractTracts.py", line 106, in extract_tracts
geno_b = str(geno[1])
IndexError: list index out of range
I have checked that the number of samples in the VCF is half the number in the MSP file.
I have also run the script changing the second header line in the MSP file from:
#chm spos epos sgpos egpos n snps
to
#chm spos epos sgpos egpos nsnps
Yet I get the same error.
Do you have any suggestions for troubleshooting?
With thanks,
Emily
Hi! I have been looking at Tractor for use in a new project. The program for local ancestry inference that we are likely to use in this project is Flare. This program outputs predicted ancestry in VCF format with AN1 & AN2 as fields. The program predicts ancestry only for the variants at which the input file has GT data.
Given that Tractor seems to require input in RFMIX format, there are some concerns with attempting to use the program. Is there anything in development to support other LA programs such as Flare?
Any help would be appreciated, thanks!
I'm trying to perform a 3-way model.
Is the R script limited to a maximum number of samples/features? After running
Rscript path/Tractor/scripts/run_tractor.R \
--hapdose path/chr21/chr21.annotated.LiftOver.dose \
--phe path/phe.txt \
--method logistic --out sumstats.tsv
I get the following output:
Tractor Script Version: 1.1.0
Loading required package: optparse
Running Tractor...
Error in data[[1]] : subscript out of bounds
Calls: RunTractor
Execution halted
After a quick look up, it's clear that I'm having an issue with the hapcount/dosage files, but I'm unsure what's going on.
On the other hand, I would like to use PLINK in order to test alternative models using the VCFs that one gets after running extract_tracts.py. According to your published paper, I would need to run 3 different GWAS and then perform a meta-analysis, although I'm unsure of something.
As far as I understand, in this case the complete model would be
If, say, I would like to test the deconvolved VCF from ancestry 0, would you recommend using the haplotype counts from the other two ancestries as covariates? Why or why not? In your wiki you ignore the counts. Why? It also intrigues me that you are not using principal components as covariates.
Thanks.
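For reference, a three-ancestry joint model consistent with the output columns shown elsewhere on this page (two LAeff terms, because one local-ancestry term is dropped, and three Geff terms) can be written as below. The symbols are chosen here for illustration, and this is a sketch rather than necessarily the exact form the poster had in mind:

```latex
\[
g\bigl(\mathbb{E}[Y_i]\bigr)
  = \beta_0
  + \sum_{k=0}^{1} \beta^{\mathrm{LA}}_{k}\, H_{ik}
  + \sum_{k=0}^{2} \beta^{\mathrm{G}}_{k}\, D_{ik}
  + \boldsymbol{\gamma}^{\top} \mathbf{c}_i
\]
```

Here $H_{ik}$ is the number of haplotypes of individual $i$ assigned to ancestry $k$ at the site, $D_{ik}$ is the alternate-allele dosage carried on ancestry-$k$ haplotypes, $\mathbf{c}_i$ holds any covariates, and $g$ is the identity link for the linear method or the logit link for the logistic method.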
Hello Dr. Atkinson,
Thank you so much for these scripts and example code.
In the RFMix_v2 step, your example code produces per chromosome rfmix output.
In the Extract Tracts step, it's unclear whether you've used per chromosome or an autosome file (I used per chromosome MSP files and VCF files).
However in the example Hail code you've read in an autosomes.anc0.dosage.txt
file.
I'm wondering where, when, and how the autosomes file should be made. Or rather, did you run RFMix_v2 on the whole genome instead of by chromosome as the example suggests?
Thank you for clarifying!
Heidi
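If the per-chromosome outputs need to be combined into one genome-wide file such as the autosomes.anc0.dosage.txt read in the Hail example, one option is to concatenate them while keeping only the first header line. A minimal sketch, assuming every per-chromosome file has the same header and sample order; the file names are illustrative, and whether the original example was built this way is not stated:

```python
def concat_with_single_header(input_paths, output_path):
    """Concatenate text files in the given order, writing the header line only once.

    Raises ValueError if a later file's header differs from the first one.
    """
    first_header = None
    with open(output_path, "w") as out:
        for path in input_paths:
            with open(path) as handle:
                header = handle.readline()
                if first_header is None:
                    first_header = header
                    out.write(header)
                elif header != first_header:
                    raise ValueError(f"header mismatch in {path}")
                for line in handle:
                    out.write(line)


# Hypothetical per-chromosome file names for ancestry 0, chromosomes 1 to 22.
paths = [f"chr{i}.anc0.dosage.txt" for i in range(1, 23)]
```

The same call would be repeated for each ancestry and for the hapcount files, for example concat_with_single_header(paths, "autosomes.anc0.dosage.txt").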
I am applying the pipeline on imputed data (chr 21) of 934 cases and 946 controls. The output I got contains all nan. Is it possibly due to the small sample size?
mspfile = open(args.msp, + '.msp.tsv', 'r')
#comma after args.msp should be removed
Hi, all,
We are writing a paper that benchmarks the performance of Tractor against other GWAS methods. Is there a version number for Tractor, so that it would be easier for readers to track our simulation scenario? We would like to use the latest Tractor release and cite it by version number. Thanks!
Best,
Zikun
Hi,
I was following the tutorial with the example dataset, and I got an error message when running the following code:
python Tractor/ExtractTracts.py \
--msp ADMIX_COHORT/ASW.deconvoluted \
--vcf ADMIX_COHORT/ASW.phased \
--zipped \
--num-ancs 2
The error message is:
File "Tractor/ExtractTracts.py", line 24
def extract_tracts(msp: str, vcf_prefix: str, zipped: bool, zip_output:bool, output_path:str, num_ancs: int = 2):
^
SyntaxError: invalid syntax
Could you let me know where I went wrong? Thanks!
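A SyntaxError on a function signature that uses type annotations is what a Python 2 interpreter reports, since it does not support that syntax; running the script with a Python 3 interpreter (for example python3 Tractor/ExtractTracts.py ...) is the usual remedy. A minimal sketch of a guard that reports the problem more clearly, where the minimum version (3.6) is an assumption rather than a documented requirement:

```python
import sys


def require_python(minimum=(3, 6)):
    """Exit with a readable message if the interpreter is older than `minimum`.

    Uses %-formatting so that the check itself still parses on old interpreters.
    """
    if sys.version_info[:2] < minimum:
        found = ".".join(map(str, sys.version_info[:2]))
        wanted = ".".join(map(str, minimum))
        sys.exit("Python %s+ is required, but this is Python %s" % (wanted, found))


require_python()
```

Because a SyntaxError is raised when a file is compiled, such a guard only helps in a small launcher script that runs before the annotated module is imported.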
Hi again, I am currently running some Tractor analysis and have the need to compile multiple cohorts from separate Tractor runs and meta analyze the results. To do that, we need standard error in the output files in addition to the BETA & P columns. The way your code is written, it extracts these values from the coefficient matrix of each glm (through the matrix helper function). It only keeps BETA & P, however, standard error & z score are also values in that matrix. Z score may not be important, but standard error should be reported. The changes are minimal to allow for this, but it would change the output files by default for everyone. I would say it should be that way, but maybe not what you want. With more of a script overhaul, BETA/SE/P could become a parameter for the user to specify. After all, there are also cases where users may exclusively care about effect size and not P due to low sample size or something.
One other enhancement I would recommend is to allow the user to specify the number of decimals to round to. The script prints 6 decimals, but some users may want fewer. This is an easy change using an optional flag with the default set to 6, so that nothing changes for users who don't care. Edit: On second thought, P will always need to report as many decimals as possible, so this option could be confusing and is probably not worth it, since it would apply to non-P columns only.
I may open a PR soon for the SE & rounding issues since I am already writing those for my own use case, but feel free to comment any thoughts.
Best -
Kyle
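Until standard errors are written to the output, they can be approximated from BETA and P when the p-value comes from a two-sided Wald test with a normal reference distribution, since |z| is the normal quantile at 1 - P/2 and SE = |BETA| / |z|. A minimal sketch of that back-calculation plus a fixed-effect inverse-variance meta-analysis; both function names are hypothetical, and the recovered SE is only approximate if the p-value came from a t distribution:

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def se_from_beta_p(beta, p):
    """Approximate the standard error from an effect size and a two-sided Wald p-value."""
    if not 0.0 < p < 1.0 or beta == 0.0:
        raise ValueError("need 0 < p < 1 and a non-zero beta")
    # Use the lower tail so that very small p-values do not round 1 - p/2 up to 1.
    z = -_STD_NORMAL.inv_cdf(p / 2.0)
    return abs(beta) / z


def inverse_variance_meta(betas, ses):
    """Fixed-effect meta-analysis; returns (pooled beta, pooled SE, two-sided p)."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = sqrt(1.0 / sum(weights))
    z = abs(pooled_beta) / pooled_se
    p = 2.0 * _STD_NORMAL.cdf(-z)
    return pooled_beta, pooled_se, p
```

Each ancestry-specific effect (for example Geff_anc0 with Gpval_anc0) would be pooled separately across cohorts.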
Hi,
I am wondering if the pheno file accepts missing values when running the GWAS locally...
Thanks
Hello,
I ran tractor on the same cohort using the same rfmix output and noticed that the AF for some variants differed between my original tractor output and the v1.1.0 output. Was there any difference in how AF was calculated in the new version of tractor?
Thanks for any help you can provide.
Hi
After running RunTractor.py (version 0.0.1), I am getting an error but also an output summary stats file. I think my output summary stats is reasonable, and this could just be a minor bug but I'm not sure. Hence I'm raising an issue here.
python3.7 Tractor/RunTractor.py --hapdose MEG_phase2.chr22.phase \
--method logistic \
--phe summ_using_PCsAPOL/Phe.eskd2021_nicole.txt \
--out summ_using_PCsAPOL/MEG_phase2.chr22.eskd2021.summ.tsv &> summ_using_PCsAPOL/MEG_phase2.chr22.eskd2021.summ.tsv.log
cat summ_using_PCsAPOL/MEG_phase2.chr22.eskd2021.summ.tsv.log
v 0.0.1
Reading files....
------
MEG_phase2.chr22.phase.anc0.hapcount.txt
------
MEG_phase2.chr22.phase.anc1.hapcount.txt
------
MEG_phase2.chr22.phase.anc0.dosage.txt
------
MEG_phase2.chr22.phase.anc1.dosage.txt
------
Notice:
Tractor drop one local ancestry term for regression. Therefore, MEG_phase2.chr22.phase.anc1.hapcount.txt will not be used.
------
ERROR: Phenotype ID must match with Hapdose file ID
END of calculation
wc -l summ_using_PCsAPOL/Phe.eskd2021_nicole.txt
2775 summ_using_PCsAPOL/Phe.eskd2021_nicole.txt
head -1 summ_using_PCsAPOL/Phe.eskd2021_nicole.txt
IID y
head -1 MEG_phase2.chr22.phase.anc0.dosage.txt |wc -w
2779 (CHROM POS ID REF ALT 1000560 1000625 1000628 1000633 1000634 ..... )
head -1 MEG_phase2.chr22.phase.anc0.hapcount.txt |wc -w
2779 (CHROM POS ID REF ALT 1000560 1000625 1000628 1000633 1000634 ..... )
You can see the error message above. However, the header of the pheno file looks correct, and the number of samples in the pheno file matches the number in the hapcount and dosage files.
I just want to make sure that everything is correct.
Thank You.
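The counts do agree (2,774 samples in each file once the phenotype header line and the five leading variant columns are subtracted), so the message may instead come from IDs that differ in order or spelling. A minimal sketch of a comparison, assuming the phenotype file has IDs in its first column and the dosage header lists sample IDs after CHROM, POS, ID, REF, ALT as shown above; compare_ids is a hypothetical helper, and the exact check Tractor performs is not shown here:

```python
def compare_ids(pheno_path, dosage_path):
    """Compare phenotype IDs with the sample IDs in a dosage or hapcount header.

    Returns a dict listing IDs missing from either file and whether the order matches.
    """
    with open(pheno_path) as handle:
        next(handle)  # skip the "IID y ..." header line
        pheno_ids = [line.split()[0] for line in handle if line.strip()]
    with open(dosage_path) as handle:
        dosage_ids = handle.readline().split()[5:]  # drop CHROM POS ID REF ALT
    return {
        "only_in_pheno": sorted(set(pheno_ids) - set(dosage_ids)),
        "only_in_dosage": sorted(set(dosage_ids) - set(pheno_ids)),
        "same_order": pheno_ids == dosage_ids,
    }
```

If both difference lists are empty but same_order is False, reordering the phenotype rows to match the dosage header would be the first thing to try.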