mitty's Issues

Issue with reads when template length is shorter than read length

Hello,

I am trying to simulate reads with a template length distribution in which some templates are shorter than the read length. I am using a read length of 75bp, so not every read in the file should reach the full 75bp. However, all reads in these files are 75bp.

As an example of the issue, I have provided a link to a Dropbox folder containing two files: the simulated reads and the read model used to create them. The template length is set to a mean of 50 and a standard deviation of 0, meaning that the DNA fragments should all be 50bp, but all of the reads are 75bp (the read length set in the model).

https://www.dropbox.com/sh/uz2zjo2ze33978f/AAC8OXPwwnOtevohZ5dv_qjka?dl=0
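
To make the expectation in this report concrete, here is a minimal sketch of the behaviour the reporter describes: a read cannot be longer than the fragment it is sequenced from, so its length would be capped at the template length. This is an illustration of the expected outcome only, not Mitty's implementation; the function name `expected_read_lengths` is hypothetical.

```python
def expected_read_lengths(template_lengths, read_length):
    """Cap each read at the length of its template (hypothetical helper).

    A read sequenced from a fragment shorter than the configured read
    length would be expected to stop at the end of the fragment.
    """
    return [min(read_length, t) for t in template_lengths]


if __name__ == "__main__":
    # Values from the report: template mean 50, std 0, read length 75.
    print(expected_read_lengths([50, 50, 50], read_length=75))
```

With the values from the report, every expected length is 50 rather than the 75 observed in the output files.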

Getting error error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version

While running the command

pip install git+https://github.com/sbg/Mitty.git

I get the following error:
error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version

Details are in the following screenshot:

[screenshot: cloning error]

When I tried to install the packages listed in setup.py, I got the following error during the pysam installation:

[screenshot: pysam error]

Searching for information about pysam suggested that it is not supported on Windows 10, which is what I am using. Is that the reason for the cloning error?

conda version is 4.3.30
Python 3.5.6

Thanks
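
The "tlsv1 alert protocol version" message generally means the server rejected the TLS version the client offered. One possible cause, which is an assumption and not confirmed in this thread, is a Python or OpenSSL build that predates TLS 1.2 support (added in OpenSSL 1.0.1). The sketch below checks the OpenSSL version that the running Python is linked against; the helper name `openssl_supports_tls12` is hypothetical. Note that pip and git may use different TLS libraries, so this only covers the Python side.

```python
import ssl


def openssl_supports_tls12(version_info):
    """Return True if the OpenSSL version tuple is at least 1.0.1.

    TLS 1.2 support first appeared in OpenSSL 1.0.1, so older builds
    cannot negotiate it. `version_info` has the same shape as
    ssl.OPENSSL_VERSION_INFO: (major, minor, fix, patch, status).
    """
    return tuple(version_info[:3]) >= (1, 0, 1)


if __name__ == "__main__":
    print("Linked OpenSSL:", ssl.OPENSSL_VERSION)
    print("TLS 1.2 capable:", openssl_supports_tls12(ssl.OPENSSL_VERSION_INFO))
```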

Unusable variants present in VCF

Hello,

I am trying to generate simulated human reads. I filtered my VCF using mitty filter-variants and indexed the result, but mitty generate-reads still complains that some variants in the VCF file are unusable, without saying which ones or why.

mitty -v2 generate-reads \
  ~/Projects/Refs/ucsc.hg19/ucsc.hg19.fasta \
   IN.vcf.gz \
   SAMPLE \
   IN.bed \
   Custom-model.pkl \
   20 \
   7 \
   >(gzip > r1.fq.gz) \
   lq.txt \
   --fastq2 >(gzip > r2.fq.gz) \
   --threads 2
ERROR:mitty.lib.vcfio:Unusable variants present in VCF. Please filter or refactor these.
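
Since the error message does not name the offending records, one way to narrow them down is to scan the VCF for alleles that read simulators commonly reject. The sketch below is not Mitty's own check: the criteria (symbolic or breakend alleles, `*` or `.` alleles, and records where both REF and ALT are longer than one base) are assumptions chosen for illustration, and the function names are hypothetical.

```python
def classify_alt(ref, alt):
    """Return a reason string if the allele looks problematic, else None.

    The criteria are illustrative assumptions, not Mitty's actual rules.
    """
    if alt.startswith("<") or "[" in alt or "]" in alt:
        return "symbolic or breakend allele"
    if alt in ("*", "."):
        return "missing or spanning-deletion allele"
    if len(ref) > 1 and len(alt) > 1:
        return "complex variant (REF and ALT both longer than 1 base)"
    return None


def flag_suspect_variants(vcf_lines):
    """Yield-free scan: collect (chrom, pos, ref, alt, reason) tuples."""
    flagged = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):  # handle multi-allelic records
            reason = classify_alt(ref, alt)
            if reason:
                flagged.append((chrom, pos, ref, alt, reason))
    return flagged


if __name__ == "__main__":
    import gzip
    import sys

    # Usage: python scan_vcf.py IN.vcf.gz
    with gzip.open(sys.argv[1], "rt") as handle:
        for record in flag_suspect_variants(handle):
            print(*record, sep="\t")
```

Any record this prints is only a candidate; the log output at a higher verbosity level may give more direct information.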

Is there a new version available?

Hey guys, is there a newer version of Mitty available to download? I seem to be running into versioning problems because some pinned packages, such as pysam==0.10.0, are not available for x86.

I saw a mitty3 image on Docker, but it didn't work for me. Could you help?

Excluding regions with masked reference sequence when simulating reads

In your "Fast and accurate genomic analyses using genome graphs" paper you use Mitty to simulate reads, but only from regions without masked reference sequence (repetitive regions). Is it possible to do this directly with Mitty, e.g. by providing it with a hard-masked reference FASTA file, or is there another way to tell Mitty not to simulate reads from masked regions? I am asking because I am trying to reproduce the read simulations from your paper, and I cannot find any details about how Mitty was used for the experiments shown there.

In advance, thanks!
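
The generate-reads commands elsewhere on this page take a BED file of regions, so one possible approach is to derive a BED of unmasked intervals from the reference and pass that in. The sketch below is an assumption about how this could be done, not a description of what the paper's authors did. It treats lowercase bases (soft-masking) and N (hard-masking) as masked; the function names are hypothetical.

```python
import re

# Runs of uppercase A/C/G/T are treated as unmasked sequence.
UNMASKED_RUN = re.compile(r"[ACGT]+")


def unmasked_intervals(sequence):
    """Return 0-based, half-open (start, end) intervals of unmasked bases."""
    return [(m.start(), m.end()) for m in UNMASKED_RUN.finditer(sequence)]


def bed_lines(chrom, sequence):
    """Format the unmasked intervals of one sequence as BED lines."""
    return [
        "{}\t{}\t{}".format(chrom, start, end)
        for start, end in unmasked_intervals(sequence)
    ]


if __name__ == "__main__":
    # Toy sequence: soft-masked (lowercase) and hard-masked (N) stretches.
    for line in bed_lines("chrToy", "ACGTNNNNACacgtGG"):
        print(line)
```

For a real genome the sequence would be read chromosome by chromosome from the FASTA file, and very short intervals would probably need to be dropped before simulating paired-end reads.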

infra: Set up a CI to run tests on commits and PRs

Probably use Circle CI as that is what I know how to do.

  • Revise how to set up Circle CI
  • Create a branch for setting up the CI, running the tests
  • Add a badge (all the cool kids do it)
  • Merge branch to master

GodAligner "IndexError: list index out of range"

Hi,
I have been trying to use the GodAligner to recover the true alignment of my generated reads.
However, when I run the code below, I get the error "IndexError: list index out of range". After some digging, I found that the variable ap2 is an empty list ("ap2=[]") at some point in the loop (line 200 of god_aligner.py), but I cannot figure out why.

Here is the command I used to create the reads (it worked perfectly):

#!/usr/bin/env bash
set -ex

FASTA=../data/human_g1k_v37.fa.gz
SAMPLEVCF=../data/1kg.20.22.vcf.gz
REGION_BED=ch20.bed
FILTVCF=Ch20filt.vcf.gz
SAMPLENAME=HG00119
COVERAGE=2
READ_GEN_SEED=7
FASTQ_PREFIX=Ch20-PEreads
READ_CORRUPT_SEED=7
READMODEL=1kg-pcr-free.pkl

mitty -v4 filter-variants \
  ${SAMPLEVCF} \
  ${SAMPLENAME} \
  ${REGION_BED} \
  - \
  2> vcf-filter.log | bgzip -c > ${FILTVCF}

tabix -p vcf ${FILTVCF}

mitty -v4 generate-reads \
  ${FASTA} \
  ${FILTVCF} \
  ${SAMPLENAME} \
  ${REGION_BED} \
  ${READMODEL} \
  ${COVERAGE} \
  ${READ_GEN_SEED} \
  >(gzip > ${FASTQ_PREFIX}1.fq.gz) \
  ${FASTQ_PREFIX}-lq.txt \
  --fastq2 >(gzip > ${FASTQ_PREFIX}2.fq.gz) \
  --threads 2

mitty -v4 corrupt-reads \
  ${READMODEL} \
  ${FASTQ_PREFIX}1.fq.gz >(gzip > ${FASTQ_PREFIX}-corrupt1.fq.gz) \
  ${FASTQ_PREFIX}-lq.txt \
  ${FASTQ_PREFIX}-corrupt-lq.txt \
  ${READ_CORRUPT_SEED} \
  --fastq2-in ${FASTQ_PREFIX}2.fq.gz \
  --fastq2-out >(gzip > ${FASTQ_PREFIX}-corrupt2.fq.gz) \
  --threads 2

and here is the command I used to run GodAligner:

#!/usr/bin/env bash
set -ex

FASTA=../data/human_g1k_v37.fa.gz
FASTQ_PREFIX=Ch20-PEreads
GODBAM=Ch20-god.bam
DO_NOT_INDEX=${1}

mitty -v4 god-aligner \
  ${FASTA} \
  ${FASTQ_PREFIX}-corrupt1.fq.gz \
  ${FASTQ_PREFIX}-corrupt-lq.txt \
  ${GODBAM} \
  --fastq2 ${FASTQ_PREFIX}-corrupt2.fq.gz \
  --threads 2

The full error is as follows:

Process Process-1:
Traceback (most recent call last):
  File "/Users/anaconda3/envs/mymitty/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
    self.run()
  File "/Users/anaconda3/envs/mymitty/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/anaconda3/envs/mymitty/lib/python3.5/site-packages/mitty/benchmarking/god_aligner.py", line 140, in disciple
    write_perfect_reads(qname, rg_id, long_qname_table, ref_dict, read_data, cigar_v2, fp)
  File "/Users/anaconda3/envs/mymitty/lib/python3.5/site-packages/mitty/benchmarking/god_aligner.py", line 206, in write_perfect_reads
    p10, p11, p20, p21 = ap1[0][1], ap1[-1][1], ap2[0][1], ap2[-1][1]
IndexError: list index out of range

I am working on macOS using Mitty version 2.28.3.

Thanks a lot,
Adrien

EDIT: I tried simulating reads from another chromosome (Chr22) using the same pipelines (changed the seeds and coverage) and this time, everything went smoothly for the new dataset. However the previous one (Chr20) is still bugged.
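
The failing line in the traceback indexes `ap2[0]` and `ap2[-1]`, which raises IndexError whenever `ap2` is empty, matching the `ap2=[]` observation above. The sketch below shows a defensive guard around that expression. It assumes `ap1` and `ap2` are lists of (read position, reference position) pairs; the helper name `alignment_span` is hypothetical and this is not Mitty's actual code. A guard like this would hide the symptom, not explain why one mate ends up with no aligned positions.

```python
def alignment_span(ap1, ap2):
    """Return (p10, p11, p20, p21), or None if either mate has no pairs.

    ap1 and ap2 are assumed to be lists of (read_pos, ref_pos) tuples
    for mate 1 and mate 2. Indexing [0] or [-1] on an empty list raises
    IndexError, so the empty case is handled explicitly.
    """
    if not ap1 or not ap2:
        return None
    return ap1[0][1], ap1[-1][1], ap2[0][1], ap2[-1][1]


if __name__ == "__main__":
    mate1 = [(0, 100), (1, 101), (2, 102)]
    print(alignment_span(mate1, [(0, 300), (1, 301)]))
    print(alignment_span(mate1, []))  # the case reported in this issue
```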

BQ of generated reads do not match model specification

Hi,

I've created a read model with the following script:

mitty create-read-model synth-illumina \
  100.pkl \
  --read-length 100 \
  --mean-template-length 250 \
  --std-template-length 20 \
  --bq0 30 \
  --k 200 \
  --sigma 5

When I check the read model in Mitty with the following command:

mitty describe-read-model 100.pkl 100.png

it looks as expected:

[image: 100]

But when I generate reads using the model with the following code:
k=HG00632
i=100

mitty -v4 generate-reads GRCh38.p12.fa \
  ./final_vcfs/${k}all.vcf.gz \
  ${k} all_merged_sorted.bed \
  ${i}.pkl \
  40 \
  7 \
  ${k}_${i}reads-test.1.fq \
  ${k}_${i}-lq.txt \
  --fastq2 ${k}_${i}reads-test.2.fq \
  2> vcf-${i}_${k}.log

The generated reads have a flat BQ of 9 when I check them with FastQC:

[image]

And when I run the god-aligner to create a bam file, I can see in IGV that the reads are a mess. I've tried running different individuals, different read lengths but get the same pattern.

Have I misunderstood something with the read model generation?

Thank you very much for any help you can provide on the matter.
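
A quick way to confirm the FastQC observation independently is to compute the mean base quality at each read position directly from the FASTQ file. The sketch below assumes four-line FASTQ records and Phred+33 encoding (under which a quality of 9 appears as the character `*`); the function name is hypothetical. It only measures what is in the file and says nothing about why the qualities are flat.

```python
def mean_quality_per_position(fastq_lines, offset=33):
    """Return the mean Phred quality at each read position.

    Assumes four-line FASTQ records (header, sequence, '+', qualities)
    and the given ASCII offset (33 for Phred+33).
    """
    totals, counts = [], []
    for index, line in enumerate(fastq_lines):
        if index % 4 != 3:
            continue  # only the fourth line of each record holds qualities
        for pos, char in enumerate(line.rstrip("\n")):
            if pos == len(totals):
                totals.append(0)
                counts.append(0)
            totals[pos] += ord(char) - offset
            counts[pos] += 1
    return [total / count for total, count in zip(totals, counts)]


if __name__ == "__main__":
    import sys

    # Usage: python mean_bq.py reads.fq
    with open(sys.argv[1]) as handle:
        for pos, mean_q in enumerate(mean_quality_per_position(handle), start=1):
            print(pos, round(mean_q, 2), sep="\t")
```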

ImportError: No module named 'mitty.benchmarking'

When I try running the god-aligner on generated reads, I get the error below:

(mymitty)$ mitty -v4 god-aligner ~/Refs/ucsc.hg19/ucsc.hg19.fasta r1c.fq.gz lqc.txt perfectc.bam --fastq2 r2c.fq.gz --threads 2
Traceback (most recent call last):
  File "/Users/u1/anaconda/envs/mymitty/bin/mitty", line 11, in <module>
    load_entry_point('mitty==2.9.1.dev0', 'console_scripts', 'mitty')()
  File "/Users/u1/anaconda/envs/mymitty/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/Users/u1/anaconda/envs/mymitty/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/Users/u1/anaconda/envs/mymitty/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/u1/anaconda/envs/mymitty/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/smiao/anaconda/envs/mymitty/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/Users/u1/anaconda/envs/mymitty/lib/python3.5/site-packages/mitty/cli.py", line 255, in god_aligner
    import mitty.benchmarking.god_aligner as god
ImportError: No module named 'mitty.benchmarking'

I am running the following version:

(mymitty)$ mitty --version
mitty, version 2.9.1.dev0
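
To check whether the installed package actually contains the submodule named in the traceback, the environment can be probed with the standard library's `importlib.util.find_spec`. The sketch below is a generic diagnostic, not something taken from Mitty; the helper name `has_module` is hypothetical. If it reports the submodule as missing, one possible explanation (an assumption) is that the installed build did not package that subpackage.

```python
import importlib.util


def has_module(dotted_name):
    """Return True if `dotted_name` can be located in this environment.

    find_spec imports parent packages first, so a missing parent raises
    ModuleNotFoundError (a subclass of ImportError); a missing submodule
    of an existing package returns None.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        return False


if __name__ == "__main__":
    for name in ("mitty", "mitty.benchmarking", "mitty.benchmarking.god_aligner"):
        print(name, "found" if has_module(name) else "missing")
```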
