
angsd-wrapper's People

Contributors

arundurvasula, mojaveazure, pmorrell, tomjkono, tvkent

angsd-wrapper's Issues

document defaults

Document which parameters have defaults, and note that those defaults can be changed in the config file (i.e. the user never has to edit the bash script)!
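
A minimal sketch of one way to keep defaults out of the user's way, assuming the config file is sourced first and the script then fills in whatever was left unset. The function name apply_defaults and the variables WINDOW_SIZE and N_CORES (with their values) are hypothetical; UNIQUE_ONLY=0 follows the "uniqueonly" issue in this list.

```shell
# Hypothetical helper: call this AFTER sourcing the user's config file.
# ${VAR:=value} assigns the default only when VAR is unset or empty,
# so anything the user set in the config file is left alone.
apply_defaults() {
    : "${UNIQUE_ONLY:=0}"      # default suggested in the "uniqueonly" issue
    : "${WINDOW_SIZE:=1000}"   # illustrative value, not from the project
    : "${N_CORES:=1}"          # illustrative value, not from the project
}
```

Because the expansion also works under "set -u", a script that treats unbound variables as errors could still use this pattern.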

lowess on plots

Could add a checkbox if user wants to plot a lowess fit to the data using lowess(). Could just use lowess defaults, which work pretty well, or allow user to change the f parameter of the function.

check if init.sh needs to be run

ANGSD_SFS.sh should check whether the init script needs to be run and, if so, run it. This can be done with:

if [ ! -d "$DIRECTORY" ]; then
  # Control will enter here if $DIRECTORY doesn't exist.
  :  # run init.sh here (an empty "then" body is a bash syntax error)
fi

That should also be in a function in a utils.sh script so that it can be easily run in other scripts.
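
The reusable function could be sketched roughly as follows. The name ensure_initialized and its two arguments are assumptions made for illustration; the issue only asks for a directory check wrapped in a function inside utils.sh.

```shell
# utils.sh (file name from the issue; the function itself is a hypothetical sketch)
# Usage: ensure_initialized DIRECTORY INIT_SCRIPT
# Runs INIT_SCRIPT only when DIRECTORY does not exist yet.
ensure_initialized() {
    local directory="$1"
    local init_script="$2"
    if [ ! -d "$directory" ]; then
        echo "Directory '$directory' not found; running $init_script" >&2
        bash "$init_script" || return 1
    fi
}
```

A calling script would source utils.sh and then call ensure_initialized with its results directory and the path to init.sh before starting ANGSD.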

Init.sh doesn't work when script is run using Slurm

Init.sh does not create the proper directories when it is called from 2DSFS, SFS, or Thetas, resulting in a loss of work because ANGSD can't save the results anywhere. We may need to tell users to run init.sh separately before starting the scripts.

Smoothing function on current window

From Jeff:
"An awesome feature would be to have a radio button that throws on a default lowess or other smoothing function to plot trends in the current window."

New GFFs

Add ability to select tracks/features from a GFF file and display multiple GFFs.

NgsF

Script to run ngsF from ngstools. TVK is working on an initial version. It should ideally output a file of values for use in the thetas etc. scripts.

override thetas

Current version of the code ignores thetas files if present. This is a good default, but I think there should be an option to override, perhaps in config file?

create makefile for whole project

Will accomplish the following (and other steps as necessary):

  • run init.sh
  • make angsd
  • make ngstools
  • download shiny and genomeIntervals

This will make installation much easier and faster.

overlap on rug?

Is the considerable overlap we see on the rug rectangles due to a weird subsetting of the GFF file? Shouldn't be lots of gene annotations within a single 1-2kb window.

question about message: key already exists

I have a regions file that looks like this:
9:52173261-52173990
5:12995975-12997206
9:113061855-113063368
9:15043238-15044776
2:14957924-14959456

After running ANGSD_Theta.sh I got a message: Problem with chr: 9, key already exists. 9:113061855-113063368 is not being written either. Is that because this region has less than 10 reads?
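
This is not a confirmed diagnosis, but one cheap thing to rule out is the ordering of the regions file: in the example above, chromosome 9 appears on non-adjacent lines. Assuming that is what triggers the message, the sketch below groups regions by chromosome and orders them by start position. The function name sort_regions is hypothetical.

```shell
# Hypothetical helper: sort "chr:start-end" lines so that all regions of a
# chromosome are adjacent and ordered by start position.
# Reads the named files, or standard input when no file is given.
sort_regions() {
    # -t: splits on the colon; field 1 is the chromosome (byte order),
    # field 2 is "start-end", compared numerically on its leading number.
    LC_ALL=C sort -t: -k1,1 -k2,2n "$@"
}
```

Running the original file through sort_regions and rerunning the thetas script would show whether ordering is the cause; if the message persists, the read-depth explanation is still open.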

path

Maybe instead of asking for the unix user, ask for the path to the project home dir. For example, I don't want things in ~/data; I want them in ~/projects/bigd/angsdbigd/ etc. The script should make a new directory called "angsdwrapper" in whatever dir they give, and then generate subdirectories.
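
A rough sketch of that behaviour, assuming the user passes the base directory as the first argument. The function name setup_project and the subdirectory names data and results are placeholders; the issue does not say which subdirectories are needed.

```shell
# Hypothetical helper: create BASE_DIR/angsdwrapper plus its subdirectories
# and print the resulting project directory.
setup_project() {
    local base_dir="$1"
    local project_dir="${base_dir%/}/angsdwrapper"   # drop one trailing slash
    local sub
    for sub in data results; do                      # placeholder names
        mkdir -p "$project_dir/$sub" || return 1
    done
    printf '%s\n' "$project_dir"
}
```

For example, PROJECT_DIR=$(setup_project ~/projects/bigd/angsdbigd) would leave the project under ~/projects/bigd/angsdbigd/angsdwrapper.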

Papers

Add list of papers and which methods come from each to wiki.

Add SGE/slurm scripts, add automagic cluster submission

angsd-wrapper can be run directly from the command line (as in the wiki), but it can also be submitted to a cluster queueing system. Write an example file to show how this is done:

#!/bin/bash
#SBATCH -D /home/adurvasu/angsd-wrapper
#SBATCH -J Slurm-Thetas
#SBATCH -o /home/adurvasu/angsd-wrapper/results/out-%j.txt
#SBATCH -e /home/adurvasu/angsd-wrapper/results/error-%j.txt

bash scripts/ANGSD_Thetas.sh scripts/thetas_example.conf

Also, this can be added automagically to the configuration script. I.e., set SLURM=true and then use the info in the conf file to point slurm to the project directory ($PROJECT_DIR). Can also add more cluster support later.
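
One possible shape for the automagic part, as a sketch only. SLURM and PROJECT_DIR come from the text above; the function names write_slurm_script and submit_or_run are hypothetical, and the sketch relies on sbatch reading the batch script from standard input when no file is given.

```shell
# Hypothetical helper: print a batch script modelled on the example above.
# Usage: write_slurm_script PROJECT_DIR JOB_NAME SCRIPT CONFIG
write_slurm_script() {
    local project_dir="$1" job_name="$2" script="$3" config="$4"
    cat <<EOF
#!/bin/bash
#SBATCH -D ${project_dir}
#SBATCH -J ${job_name}
#SBATCH -o ${project_dir}/results/out-%j.txt
#SBATCH -e ${project_dir}/results/error-%j.txt

bash ${script} ${config}
EOF
}

# Hypothetical helper: submit to Slurm when SLURM=true, otherwise run locally.
# Usage: submit_or_run JOB_NAME SCRIPT CONFIG
submit_or_run() {
    if [ "${SLURM:-false}" = "true" ]; then
        write_slurm_script "$PROJECT_DIR" "$1" "$2" "$3" | sbatch
    else
        bash "$2" "$3"
    fi
}
```

Keeping script generation separate from submission means support for another scheduler such as SGE could later be added as a second generator function.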

rug!

Gene annotation might be better shown using rug() or some other way to just plot genes along bottom rather than taking up the whole plot.

uniqueonly

default value for uniqueonly option should be 0

makefile

Write makefile to manage installation of necessary software initially.

shiny graphs

All theta estimates displayed on shiny graphs should always be the value divided by window size to get a per bp value.

Shiny graphs should probably use dots with alpha to make graphs easier to read for larger chromosome pieces.

An awesome feature would be to have a radio button that throws on a default lowess or other smoothing function to plot trends in the current window.

check if partial analysis is already done

Some steps in ANGSD require the same initial analyses to be done. There should be a function that checks whether these analyses have already been done and skips them if they have.
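
A minimal sketch of such a check, assuming a step counts as done when its output file exists and is non-empty. The function name run_unless_done and the OVERRIDE variable are hypothetical; OVERRIDE is one way the config-file override requested in the "override thetas" issue might look.

```shell
# Hypothetical helper: skip a step when its output already exists.
# Usage: run_unless_done OUTPUT_FILE COMMAND [ARGS...]
# Set OVERRIDE=true (for instance in the config file) to force a rerun.
run_unless_done() {
    local output_file="$1"
    shift
    if [ -s "$output_file" ] && [ "${OVERRIDE:-false}" != "true" ]; then
        echo "Found $output_file; skipping: $*" >&2
        return 0
    fi
    "$@"
}
```

A non-empty file is only a rough proxy for a finished analysis, since an interrupted run can leave partial output behind, so a stricter check may be needed in practice.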

SNP call w/ANGSD

Going to need to have a wrapper script that calls SNPs. I will make a mock script in angsd that you could wrapperize for me?

Annotation shapes

Make GFFs polygons instead of rug. Use pentagon with pointy end point in direction for strand.

Update README to reflect creation of wiki

Remove redundant and old information from the README and point a link to the wiki. Add basic, overall information to README:

  • what this is
  • supported methods
  • contributing information

file names

"If these files are not present, the script will not work correctly"

The script should throw an error if there are no or incorrect files.
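
As a sketch, the check could look like the following. The function name require_files is hypothetical, and treating an empty file as an error is an assumption about what "incorrect" means here.

```shell
# Hypothetical helper: report every missing or empty input file, then fail.
# Usage: require_files FILE [FILE...]
require_files() {
    local missing=0 f
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "Error: required file '$f' is missing or empty" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A script would call it near the top, for example require_files "$SAMPLE_LIST" "$REGIONS" || exit 1, where both variable names are placeholders for whatever the config file defines.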

should common.conf be sourced in the scripts?

The latest commit, 445b861, moved a lot of variable declarations to a common.conf file, but that file needs to be sourced in the script files in order to be used, right? Running ANGSD_SFS with the default conf gives the following error:

scripts/ANGSD_SFS.sh: line 95: ANGSD_DIR: unbound variable

I think this is because it can't find that declaration. Adding source common.conf before loading the user-supplied configuration should fix this.
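
The loading order could be sketched like this. The function name load_config is hypothetical, and the sketch assumes both files are plain shell variable assignments.

```shell
# Hypothetical helper: source the shared defaults first, then the user's
# configuration, so the user's values take precedence.
# Usage: load_config COMMON_CONF USER_CONF
load_config() {
    local common_conf="$1" user_conf="$2"
    . "$common_conf" || return 1
    . "$user_conf" || return 1
}
```

With this order, a variable such as ANGSD_DIR that is declared only in common.conf is already bound by the time the rest of the script reads it.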

Extra classes in the SFS

I have 10 highly inbred samples. I set the inbreeding coefficient to 0.99 for each individual in the respective file (x_F.txt). However, the X_DerivedSFS file gives me 21 values when I am expecting a maximum of 10 or 11 (if adjusted for differences in sample size). The inbreeding coefficients are probably being ignored, so the chromosome number is doubled.

Thetas

Theta statistics need to be divided by number of bp in a window. Windows with 0bp should not have points plotted.
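
A sketch of the per-bp conversion, done on a tab-separated table of windows before plotting. The function name per_bp_theta is hypothetical, and the column positions are passed as arguments because this sketch does not assume any particular layout of the thetas output.

```shell
# Hypothetical helper: append a per-bp value to each window and drop
# windows that contain 0 bp.
# Usage: per_bp_theta THETA_COL NSITES_COL < windows.tsv
per_bp_theta() {
    awk -F'\t' -v t="$1" -v n="$2" '
        BEGIN { OFS = "\t" }
        # "+ 0" forces a numeric comparison, so header lines and windows
        # with 0 sites are skipped instead of causing a division by zero.
        ($n + 0) > 0 { print $0, $t / $n }'
}
```

For instance, a window with a theta of 4 over 2000 sites gets 0.002 appended, while a window with 0 sites produces no output line and therefore no plotted point.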

Running thetas first results in a crash

If you run the thetas script without first running the SFS script, it will crash. Should call SFS before thetas if SFS results don't already exist. Also need to make sure that the regions are the same between both files.
