
Comments (5)

ShujiaHuang commented on June 14, 2024

Hi, thank you for using BaseVar. I'll check this error and get back to you ASAP. And I want to make sure that you're using BaseVar to call variants for NIPT data, right? BaseVar is only suited for super low-depth (<1x) sequencing data.

Best

On Aug 5, 2020, at 9:57 AM, liuhankui wrote:

Hi, I have two errors when using the software to call variants from 20,000 BAMs. The first is that it sometimes reports that it cannot open the index of a BAM, but I am sure the BAMs and their corresponding indexes are all good. The second is that it reports too many reads in some regions and suggests reducing --buffer-size or increasing --max_reads. Even after I increased the max reads to 5000000000, the software still did not work. Could you please explain how these errors happen and give me some advice?

liuhankui commented on June 14, 2024


Thank you. Yes, we used BaseVar for NIPT sequencing data. There are a few samples with more than 1X, about 1.5X; I will remove these samples and try again. I set --batch-count to a small number and the software seems to work now, but it still reports that it cannot open some BAMs or that there is a problem with the index.


ShujiaHuang commented on June 14, 2024


You don't have to delete the samples with more than 1x; BaseVar just takes one read from each position for each sample. I think you are using the development version of BaseVar, in which I have found some small bugs that I need more time to fix. You'd better use the stable version, v0.8.0, released here: https://github.com/ShujiaHuang/basevar/archive/v0.8.0.tar.gz. The usage is the same, and there is no need to set --buffer-size or --max_reads. You can just set --batch-count; I think --batch-count 200 or --batch-count 500 may fit your 20,000-BAM list better.
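For a 20,000-sample BAM list, a run with the stable release might look roughly like the sketch below. Only --batch-count comes from the advice above; the basetype subcommand name, the -R/-L/--regions/--output-vcf/--output-cvg/--nCPU options, and all file names are assumptions about typical BaseVar usage and should be checked against the v0.8.0 documentation.

# Hypothetical sketch; verify every option name against the v0.8.0 docs.
# hg38.fasta and bamfile.list (one BAM path per line) are placeholder names.
basevar basetype \
    -R hg38.fasta \
    -L bamfile.list \
    --regions chr20 \
    --batch-count 200 \
    --output-vcf chr20.vcf.gz \
    --output-cvg chr20.cvg.tsv.gz \
    --nCPU 4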

And when you use BaseVar for variant calling, I suggest you add sub-population group information for your samples by setting the parameter --pop-group sample_subpopulation_group.info, which lets you calculate the allele frequency for each sub-population group, such as the Guangdong and Beijing populations. The format of sample_subpopulation_group.info is very simple (only two columns) and looks like this (one way to build such a file is sketched after the example):

Sample1_id         Beijing
Sample2_id         Beijing
Sample3_id         Beijing
Sample4_id         Guangdong
Sample5_id         Guangdong
Sample6_id         Henan
Sample7_id         Henan
...
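One way such a two-column file might be put together for a large cohort is sketched below. This helper is not part of BaseVar; it assumes a lookup table sample_to_group.tsv (sample_id and group, whitespace-separated) and BAM files named <sample_id>.bam, all of which are placeholders.

# Hypothetical helper: build sample_subpopulation_group.info from the BAM list.
# bamfile.list, sample_to_group.tsv and the <sample_id>.bam naming are assumptions.
while read -r bam; do
    sid=$(basename "$bam" .bam)
    awk -v s="$sid" '$1 == s' sample_to_group.tsv
done < bamfile.list > sample_subpopulation_group.info

# The resulting file is then passed to the calling step by adding
#   --pop-group sample_subpopulation_group.info
# to the (hypothetical) command sketched earlier.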

I hope this version suits you.


liuhankui commented on June 14, 2024

Hi Shujia, thanks a lot for your help. I will try this version and report back.


liuhankui commented on June 14, 2024


Version 0.8.0 works. It is great.

