Too many reads (5000001) in region (basevar, closed)

shujiahuang commented on June 14, 2024

Too many reads (5000001) in region

from basevar.

Comments (5)

ShujiaHuang commented on June 14, 2024

Hi, thank you for using BaseVar. I'll check this error and get back to you ASAP. I want to make sure that you're using BaseVar to call variants for NIPT data, right? BaseVar is only suited to super-low-depth (<1x) sequencing data.

Best

On Aug 5, 2020, at 9:57 AM, liuhankui wrote:

Hi, I am getting two errors when using the software to call variants on 20,000 BAMs.

The first is that it sometimes reports that it cannot open the index of a BAM, although I am sure the BAM and its index are both fine.

The second is that it reports too many reads in some regions and suggests reducing --buffer-size or increasing --max_reads. Even after I increased the max reads to 5000000000, the software still does not work.

Could you please help explain how these errors happen and give me some advice?

liuhankui commented on June 14, 2024

Thank you. Yes, we used BaseVar for NIPT sequencing data. There are a few samples with more than 1x depth, about 1.5x. I will remove these samples and try again. I set --batch-count to a small number and the software seems to work now, but it still reports that it cannot open some BAMs or that there is a problem with the index.
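One way to narrow down the "cannot open BAM or index" reports is to check the files independently of BaseVar. The sketch below is a generic, standard-library check and not part of BaseVar: the function name `find_index_problems` is hypothetical, and it assumes BAI indexes named either `sample.bam.bai` or `sample.bai` next to each BAM. It does not cover `.csi` indexes or CRAM files, and it cannot detect other causes such as system limits on open files.

```python
import os


def find_index_problems(bam_paths):
    """Return (bam_path, problem) pairs for BAMs whose index looks suspect.

    Assumption: indexes are BAI files stored beside the BAM, named either
    <name>.bam.bai or <name>.bai. This is a convention check only; it does
    not open or validate the BAM or index contents.
    """
    problems = []
    for bam in bam_paths:
        if not os.path.isfile(bam):
            problems.append((bam, "BAM file not found"))
            continue
        # Try both common index naming conventions.
        candidates = [bam + ".bai", os.path.splitext(bam)[0] + ".bai"]
        index = next((p for p in candidates if os.path.isfile(p)), None)
        if index is None:
            problems.append((bam, "no .bai index found"))
        elif os.path.getmtime(index) < os.path.getmtime(bam):
            # An index older than its BAM is often stale and worth rebuilding.
            problems.append((bam, "index is older than the BAM"))
    return problems
```

Running this over the paths in the BAM list and re-indexing whatever it flags would at least rule out missing or stale indexes before looking further.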

ShujiaHuang commented on June 14, 2024

You don't have to delete the samples with more than 1x; BaseVar takes just one read at each position for each sample. I think you are using the development version of BaseVar, in which I have found some small bugs that I need more time to fix. You'd better use the stable version, v0.8.0, released here: https://github.com/ShujiaHuang/basevar/archive/v0.8.0.tar.gz. The usage is the same, and there is no need to set --buffer-size or --max_reads. You can just set --batch-count; I think you can try --batch-count 200 or --batch-count 500, which may fit your 20,000-sample BAM list better.
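To get a feel for what those two values mean for a 20,000-sample list, here is a small arithmetic sketch. It assumes that --batch-count is the number of samples placed in each batch, which is an interpretation and should be checked against the tool's help output; the function name `batch_layout` is hypothetical.

```python
import math


def batch_layout(n_samples, batch_count):
    """Return (number_of_batches, size_of_last_batch).

    Assumption: batch_count is the number of samples per batch.
    """
    n_batches = math.ceil(n_samples / batch_count)
    last_batch = n_samples - (n_batches - 1) * batch_count
    return n_batches, last_batch


for batch_count in (200, 500):
    n_batches, last_batch = batch_layout(20000, batch_count)
    print(f"--batch-count {batch_count}: {n_batches} batches, last one holds {last_batch}")
```

Under that assumption, 200 gives 100 batches and 500 gives 40, so the larger value means fewer but bigger batches.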

When you use BaseVar for variant calling, I suggest adding sub-population group information for your samples by setting the parameter --pop-group sample_subpopulation_group.info, which lets you calculate the allele frequency for each sub-population group, such as the Guangdong population and the Beijing population. The format of sample_subpopulation_group.info is very simple (only two columns) and looks like:

Sample1_id         Beijing
Sample2_id         Beijing
Sample3_id         Beijing
Sample4_id         Guangdong
Sample5_id         Guangdong
Sample6_id         Henan
Sample7_id         Henan
...
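Before passing such a file to --pop-group, it can be worth checking that every row really has two columns and that no sample appears twice. The sketch below is a standalone check, not BaseVar code: `parse_pop_group` is a hypothetical helper, and it assumes the columns are separated by whitespace (spaces or tabs), so group names containing spaces would be rejected.

```python
from collections import Counter


def parse_pop_group(lines):
    """Parse 'sample_id  group' rows into a dict, rejecting malformed rows.

    Assumption: columns are whitespace-separated and neither contains spaces.
    """
    groups = {}
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"line {number}: expected 2 columns, got {len(fields)}")
        sample, group = fields
        if sample in groups:
            raise ValueError(f"line {number}: duplicate sample id {sample!r}")
        groups[sample] = group
    return groups


example = """\
Sample1_id         Beijing
Sample2_id         Beijing
Sample3_id         Beijing
Sample4_id         Guangdong
Sample5_id         Guangdong
Sample6_id         Henan
Sample7_id         Henan
"""

groups = parse_pop_group(example.splitlines())
# Number of samples per sub-population group.
print(Counter(groups.values()))
```

For a real file, `parse_pop_group(open(path))` works the same way, since a file object iterates over its lines.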

I hope this version works for you.

liuhankui commented on June 14, 2024

Hi Shujia, thanks a lot for your help. I will try this version and report back.

liuhankui commented on June 14, 2024

Version 0.8.0 works. It is great.
