bopsexchr's Introduction

This page demonstrates the analyses of sex-linked sequences, including the demarcation of evolutionary strata and the verification of W-linked sequences. Other analyses can be found in the directories Annotation, Divergent_rate analysis, synteny and RNA-seq analysis.

Sequence similarity between the Z and W

Analysing sequence similarity between the Z and W chromosomes helps demarcate the evolutionary strata of avian sex chromosomes (Fig. 2B).

# calculating sequence similarity over 100k sliding windows on the Z chromosome
sh lastz.psl-100k_sim.sh [out_dir] [chrZ sequence] [W scaffolds]

out_dir: the output directory.

chrZ sequence: the Z chromosome sequence in fasta format. Z-linked scaffolds need to be joined into a single pseudochromosome.

W scaffolds: W-linked scaffolds or contigs in fasta format. Repeats need to be masked.

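LASTZ and the UCSC Genome Browser command-line utilities (e.g. faSize, faToTwoBit, chainPreNet, chainNet, netSyntenic, netToAxt, axtSort, axtToMaf, mafToPsl and pslScore) need to be installed and available on your PATH.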
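Before launching the script, it can help to confirm that the external tools are actually reachable. The following is a minimal sketch, not part of this repository: the tool list is taken from the commands reported as "not found" in the issue further down this page, plus lastz, so the script may call others. The function name `missing_tools` is hypothetical.

```python
import shutil

# Tool names come from the "not found" messages in the issue on this page,
# plus lastz itself. Treat the list as illustrative, not exhaustive.
REQUIRED_TOOLS = [
    "lastz", "faSize", "faToTwoBit", "chainPreNet", "chainNet", "netSyntenic",
    "netToAxt", "axtSort", "axtToMaf", "mafToPsl", "pslScore",
]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed tools were found.")
```

If anything is reported as missing, install it or add its directory to PATH before running lastz.psl-100k_sim.sh.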

The final output is z-w.psl.score.ide95.filt.ide-100k in out_dir. The second column is the position of the alignment on the Z chromosome, the third column is the total number of mismatches, and the fourth column is the alignment length. The last column gives the sequence similarity of the 100k sliding window on the Z.
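To illustrate how a per-window similarity could be derived from the mismatch and length columns, here is a rough sketch. It rests on several assumptions that are not confirmed by the script itself: that similarity is 1 - mismatches / aligned length, that columns 2 to 4 are integers, and that non-overlapping 100 kb bins are an acceptable stand-in for the sliding window. The function names are hypothetical.

```python
from collections import defaultdict

WINDOW = 100_000  # window size in bp, assumed from the "100k" in the file name


def parse_line(line):
    """Read columns 2-4 (Z position, mismatches, alignment length) of one row."""
    fields = line.split()
    return int(fields[1]), int(fields[2]), int(fields[3])


def window_similarity(rows, window=WINDOW):
    """Pool mismatches and aligned length per bin; return {bin_start: similarity}.

    Assumption: similarity = 1 - (pooled mismatches / pooled aligned length).
    Bins are non-overlapping, which simplifies a true sliding window.
    """
    mismatches = defaultdict(int)
    aligned = defaultdict(int)
    for z_pos, n_mismatch, aln_len in rows:
        start = (z_pos // window) * window
        mismatches[start] += n_mismatch
        aligned[start] += aln_len
    return {
        start: 1 - mismatches[start] / aligned[start]
        for start in sorted(aligned)
        if aligned[start] > 0
    }
```

Pooling mismatches and lengths before dividing weights long alignments more heavily than short ones, which is one reasonable choice among several.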

# Plotting sequence similarity along the Z chromosome
Rscript sim100k.r [ide-100k] [alignment size] [output name]

ide-100k: z-w.psl.score.ide95.filt.ide-100k file produced by lastz.psl-100k_sim.sh

alignment size: alignments shorter than this value are removed. The default is 3000.
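The length filter can be pictured as follows. This is only a sketch of the idea, not the code in sim100k.r: it assumes each row has already been split into fields and that the alignment length is the fourth column, as described for the ide-100k file. The function name is hypothetical.

```python
def keep_long_alignments(rows, min_length=3000):
    """Keep rows whose alignment length (4th column) is at least min_length."""
    return [row for row in rows if int(row[3]) >= min_length]
```

For example, with the default threshold a row whose fourth column is 2999 is dropped, while one with 3000 is kept.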

output name: the output pdf name

R package ggplot2 needs to be installed. An example input file 'lawesii.z-w.psl.score.ide95.filt.ide-100k' is provided.

W-linked sequence verification

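W-linked scaffolds can be verified only when male sequencing data is available: W-linked sequences are expected to be covered by female reads but to have little or no coverage in males (ZZ).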

# Calculating sequencing depth and coverage
sbatch m-f.coverag.sh [genome] [male reads 1] [male reads 2] [female reads 1] [female reads 2]

genome: the genome assembly in fasta format.

male/female reads *: full paths to the male and female paired-end read files in fastq format.

This generates the files male.BAM.coverage and female.BAM.coverage. Each row represents one scaffold and gives the total scaffold size (3rd column), sequencing depth (4th column) and sequencing coverage (5th column). The ratio of mappable sites is calculated by dividing the sequencing coverage by the scaffold size.
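The ratio described above, and the male-to-female comparison built on it, can be sketched as below. The column positions follow the description of the coverage files; treating the first column as the scaffold name is an assumption, and the function names are hypothetical.

```python
def mappable_ratio(line):
    """Return (scaffold, ratio), with ratio = sequencing coverage / scaffold size."""
    fields = line.split()
    scaffold = fields[0]         # assumed: 1st column is the scaffold name
    size = float(fields[2])      # 3rd column: total scaffold size
    covered = float(fields[4])   # 5th column: sequencing coverage
    return scaffold, covered / size


def male_to_female(male_lines, female_lines):
    """Male/female ratio of mappable sites for scaffolds present in both files."""
    male = dict(mappable_ratio(line) for line in male_lines)
    female = dict(mappable_ratio(line) for line in female_lines)
    return {
        scaffold: male[scaffold] / female[scaffold]
        for scaffold in male
        if scaffold in female and female[scaffold] > 0
    }
```

Under this reading, a scaffold that is well covered in the female but barely covered in the male yields a ratio close to 0, which is the pattern one would expect for W-linked sequence.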

After collecting the scaffolds derived from chr5, chrZ and chrW into a single file, e.g. 'femaleCov.m2f', one can plot female coverage (Y axis) against the m/f ratio of mappable sites (X axis).

Rscript m2f_ratio.r [species]

species: species name, e.g. medium_ground_finch.

This script can reproduce Fig. 1A.

Genome assemblies and raw data

NCBI BioProject PRJNA491255

bopsexchr's People

Contributors

lurebgi


bopsexchr's Issues

Need help to run lastz.psl-100k_sim.sh

Dear Luohao,

I am trying to investigate ZW similarity and evolutionary strata using your script "lastz.psl-100k_sim.sh", following an analysis similar to the one reported in your paper Xu et al. 2019. I have assembled a pseudo-scaffold for each of the Z and W chromosomes. I tried to run the script according to your protocol with the following command:

sh ./BOPsexChr-master/lastz.psl-100k_sim.sh result_test Z_masked.fasta W-scaff.masked.fasta

But I get the following error messages.

FAILURE: in init_from_anchors(), structure size would exceed 2^32 (362128816 + 128*45266102)
consider raising scoring threshold (--hspthresh or --exact) or breaking your target sequence into smaller pieces
./BOPsexChr-master/lastz.psl-100k_sim.sh: 17: ./BOPsexChr-master/lastz.psl-100k_sim.sh: faSize: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 18: ./BOPsexChr-master/lastz.psl-100k_sim.sh: faSize: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 19: ./BOPsexChr-master/lastz.psl-100k_sim.sh: chainPreNet: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 21: ./BOPsexChr-master/lastz.psl-100k_sim.sh: chainNet: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 21: ./BOPsexChr-master/lastz.psl-100k_sim.sh: netSyntenic: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 23: ./BOPsexChr-master/lastz.psl-100k_sim.sh: faToTwoBit: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 24: ./BOPsexChr-master/lastz.psl-100k_sim.sh: faToTwoBit: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 26: ./BOPsexChr-master/lastz.psl-100k_sim.sh: ./BOPsexChr-master/lastz.psl-100k_sim.sh: 26: ./BOPsexChr-master/lastz.psl-100k_sim.sh: netToAxt: not found
axtSort: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 28: ./BOPsexChr-master/lastz.psl-100k_sim.sh: axtToMaf: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 30: ./BOPsexChr-master/lastz.psl-100k_sim.sh: mafToPsl: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 33: ./BOPsexChr-master/lastz.psl-100k_sim.sh: pslScore: not found

Could you please guide me on how to fix these errors and run the script successfully?
Many thanks,
Regards,
Farhan
