
dark_and_camouflaged_genes's People

Contributors

mebbert, mpage21, tjense25


dark_and_camouflaged_genes's Issues

Missing annotation files?

Hi Mark,

Thanks for making this, I'm really excited to try this out.

I'm trying to use your supplied .bed files to run steps 6-10. I think there are some files missing for steps 9 & 10. Can you upload the corresponding annotation files?

The following files seem to be missing:

09_FIND_FALSE_POSITIVES/submit.sh
CAMO_ANNOTATION="../results/hg38_no_alt/illuminaRL100/illuminaRL100.hg38_no_alt.camo_annotations.txt"

10_VARIANT_FILTERING
ANNO_BED = "./results/annotations/hg38/Homo_sapiens.GRCh38.93.annotation.hg38.bed"

Could you put the annotation files (that correspond to your bed files) in github? I'm using b37.

Thank you,
Pauline

Problems while running 05_CREATE_BED_FILE

Hello, I tried to use your script to detect the camo regions, but I encountered the following error when I ran 05_CREATE_BED_FILE (extract_camo_regions.py):

Wed Jul 26 20:08:19 CST 2023 python extract_camo_regions.py
Traceback (most recent call last):
  File "extract_camo_regions.py", line 113, in <module>
    main(sys.argv[1], sys.argv[2], sys.argv[3], sys.argv[4], sys.argv[5])       
  File "extract_camo_regions.py", line 94, in main
    group_pos = [regions[region_id]]
KeyError: 'DDX11L1_1::chr1:11869-12227'

I'm not sure what causes this. Is there something wrong with the format of the region ID that the KeyError points to?
Looking forward to your reply!
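
A KeyError like the one above means the key 'DDX11L1_1::chr1:11869-12227' was looked up in a dictionary that does not contain it. One possible cause, which is an assumption and not confirmed anywhere in this thread, is that the region IDs in one input carry a "::chrom:start-end" suffix (some bedtools versions append it with `getfasta -name`) while the dictionary is keyed by the bare name. The sketch below is a hypothetical diagnostic, not code from extract_camo_regions.py; `find_missing_region_ids`, `regions`, and `requested` are illustrative names.

```python
def find_missing_region_ids(regions, requested_ids):
    """Return the requested IDs that are not keys of the regions mapping."""
    return [rid for rid in requested_ids if rid not in regions]


# Hypothetical inputs: the dictionary is keyed by the bare region name,
# while the lookup uses a name with a "::chrom:start-end" suffix.
regions = {"DDX11L1_1": ("chr1", 11869, 12227)}
requested = ["DDX11L1_1::chr1:11869-12227"]

# Lists every ID that would raise KeyError on regions[rid].
missing = find_missing_region_ids(regions, requested)
print("missing before normalizing:", missing)

# Assumed fix: drop everything after "::" so both sides use the same ID form.
normalized = [rid.split("::", 1)[0] for rid in requested]
still_missing = find_missing_region_ids(regions, normalized)
print("missing after normalizing:", still_missing)
```

If the IDs only differ by that suffix, the second list is empty, which would point at a naming mismatch between the inputs and not at the regions themselves.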

Alt-aware alignment files?

Hi there,

Thank you for your research, and for making this repository public.

Our group is running the pipeline using our own data. We were wondering whether to use our alt-aware alignment files or our non-alt-aware files.

Would this make a difference?

Thanks,
Kaitlyn

Issues with reproducing the .realign.sorted.bed file

Hello,

First of all, thank you for this interesting work and for making your code available to the community.

I am trying to reproduce some of the steps of your pipeline using our own data, and I noticed something strange in the .realign.sorted.bed I am producing compared to the one you are making available here.
Particularly, your .realign.sorted.bed contains the following coordinates for CR1 (among other coordinates):

chr1	207533408	207533588	CR1_10;CR1_26;CR1_18	3
chr1	207551963	207552143	CR1_18;CR1_10;CR1_26	3
chr1	207569058	207569238	CR1_26;CR1_18;CR1_10	3

But using the reference genome hg38 and Ensembl build 93, I obtain:

chr1	207533408	207533588	CR1_10;CR1_26;CR1_18	3
chr1	207551963	207552143	CR1_18;CR1_10;CR1_26	3
chr1	207569058	207569122	CR1_26;CR1_18;CR1_10	3
chr1	207569122	207569238	CR1_UTR_2;CR1_18;CR1_10	3

Basically, your last CR1_26 row spans a 180 bp region, but in my case it is CR1_26 plus CR1_UTR_2 (the last two rows) that together span 180 bp. While CR1_10, CR1_18 and CR1_26 are each reported 3 times, CR1_UTR_2 is reported only once. I checked the GFF3 file of Ensembl build 93 and, indeed, CR1_26 has coordinates chr1:207,569,058-207,569,122 and should be 64 bp long.
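
The arithmetic above can be checked with a short sketch. It assumes BED-style half-open intervals, so that length is end minus start, and it merges rows that touch or overlap. The helper names `bed_length` and `merge_adjacent` are hypothetical and are not part of the pipeline; whether the published file was produced by such a merge is not stated in this thread.

```python
def bed_length(start, end):
    """Length of a half-open BED interval [start, end)."""
    return end - start


def merge_adjacent(intervals):
    """Merge (chrom, start, end) rows that overlap or touch on the same chromosome."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# The two rows obtained with build 93 (CR1_26 and CR1_UTR_2).
split_rows = [
    ("chr1", 207569058, 207569122),
    ("chr1", 207569122, 207569238),
]

lengths = [bed_length(start, end) for _, start, end in split_rows]
merged = merge_adjacent(split_rows)

print(lengths)       # [64, 116]
print(sum(lengths))  # 180
print(merged)        # [('chr1', 207569058, 207569238)]
```

The merged row has the same coordinates as the single 180 bp row in the published file, which is consistent with the two rows being adjacent pieces of one region.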

Using Ensembl build 98, I obtain something even stranger:

chr1	207533408	207533588	CR1_10;AL137789.1_intron_2;AL137789.1_1;CR1_UTR_2;CR1_18	5
chr1	207551963	207552143	CR1_18;CR1_10;AL137789.1_intron_2;AL137789.1_1;CR1_UTR_2	5
chr1	207569058	207569122	CR1_26;AL137789.1_intron_;CR1_182;AL137789.1_1;CR1_10	5
chr1	207569122	207569238	CR1_UTR_2;CR1_18;CR1_10;AL137789.1_intron_2;AL137789.1_1	5

This time CR1_26 is reported only once, similar to CR1_10 and CR1_18, while CR1_UTR_2 is reported 3 times.

Maybe I am missing a step somewhere in your pipeline where CR1_26 is merged with CR1_UTR_2, but at this point I do not see how to obtain the coordinates that you have for chr1:207,569,058-207,569,122 (CR1_26), since I get 2 rows corresponding to CR1_26 and CR1_UTR_2.

I would appreciate any help and insights on this problem.

Thank you.
