jvrana / dasi-dna-design

A Python molecular biology CAD tool with a dead-simple user interface

Home Page: https://jvrana.github.io/DASi-DNA-Design/

License: GNU General Public License v3.0

Makefile 0.44% Python 89.98% HTML 0.60% CSS 0.39% Batchfile 0.13% Jupyter Notebook 8.47%
dna biotechnology bioinformatics synthetic-biology molecular-biology cad

dasi-dna-design's Introduction

DASi DNA Design

DASi is an automatic DNA cloning plan designer aimed at labs operating on small budgets. It keeps costs down by focusing on material re-use.

The software converts a nucleotide sequence, or a library of sequences, to an executable molecular assembly plan while optimizing material cost, assembly efficiency, and assembly time.

The key design paradigm for DASi is that no molecular biology expertise is required to use it. Complete novices should be able to use the software to design and construct new genetic sequences. The same simplicity lets automated software programs design and construct new genetic sequences without human intervention.

The software goals are reminiscent of j5 or TeselaGen, but DASi focuses on:

  1. A dead-simple API usable by lab novices, experts or automated software programs.
  2. Utilizing information about current laboratory inventory in its optimization algorithm to minimize costs and turn-around time.

Status

DASi is currently under development, funded by the DARPA Synergistic Discovery and Design program. DASi is currently being used to connect automatically generated DNA designs to automated biological fabrication facilities (e.g. University of Washington Biofab).

Usage

DASi completely automates the cloning design work. It finds approximately optimal cloning steps, preferentially re-using existing plasmids, linear DNA fragments, and primers.

The following command designs the cloning steps for a library of designs. The user only needs to specify the sequences they wish to construct and currently available primers and DNA templates as .genbank or .fasta files. DASi handles all design aspects. No molecular biology expertise is required to use DASi.

dasi library_design --designs mydesigns/*.gb --fragments fragments/*.gb --primers primers.fasta --templates plasmids/*.gb --cost_model cost.b --out results
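
When the command above is launched from a script rather than typed into a shell, the wildcards are not expanded automatically. The following is a minimal sketch, assuming the flags behave exactly as shown in the command above; the helper name `build_library_design_command` is hypothetical and is not part of DASi.

```python
import glob


def build_library_design_command(designs, fragments, primers, templates, cost_model, out):
    """Assemble the argument list for the `dasi library_design` command.

    Hypothetical helper: only the flag names are taken from the README.
    """
    def expand(pattern):
        # A shell would expand wildcards; a subprocess call does not,
        # so expand them here. Fall back to the literal pattern if
        # nothing matches.
        return sorted(glob.glob(pattern)) or [pattern]

    return [
        "dasi", "library_design",
        "--designs", *expand(designs),
        "--fragments", *expand(fragments),
        "--primers", primers,
        "--templates", *expand(templates),
        "--cost_model", cost_model,
        "--out", out,
    ]


command = build_library_design_command(
    "mydesigns/*.gb", "fragments/*.gb", "primers.fasta",
    "plasmids/*.gb", "cost.b", "results",
)
print(" ".join(command))
# To actually run it (requires DASi to be installed):
# import subprocess; subprocess.run(command, check=True)
```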

Customization

DASi optimization parameters are completely customizable. The following are examples of parameters and aspects of DASi that are customizable:

  • primer synthesis costs
  • primer design parameters
  • synthetic fragment costs
  • vendor-specific synthetic fragment complexity
  • sequence dependent plasmid assembly efficiencies
  • optimizing over efficiency vs material costs
  • etc.

Planned Features

  • Golden-gate support
  • hierarchical assembly
  • library support (with Bayesian search to optimize shared parts)
  • front-end
  • connection to fabrication facility

DASi optimization problem

Briefly, DASi approximates a solution to the following optimization problem:

Given a set of 'goal' double-stranded sequences, a set of available single-stranded and double-stranded sequences, and a set of actions that can create new sequences, find the optimal set of operations that produces the 'goal' sequences.

Formalization of this optimization problem is coming soon.
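
To give a feel for the problem statement above, here is a deliberately simplified toy version. It is not DASi's algorithm: it assumes a single linear goal, treats each available fragment as a `(start, end, cost)` interval that already matches the goal, allows any base to be synthesized at a flat per-base cost, and ignores junction overlaps, primers, and assembly efficiency. All names and prices are hypothetical.

```python
def cheapest_cover(goal_length, fragments, synthesis_cost_per_bp):
    """Minimum cost to build positions [0, goal_length) of a linear goal.

    fragments: list of (start, end, cost) half-open intervals on the goal.
    Costs are in integer cents to avoid floating-point rounding.
    """
    # best[i] = cheapest known cost to construct the first i bases.
    best = [0] + [float("inf")] * goal_length
    for i in range(1, goal_length + 1):
        # Option 1: synthesize base i-1 de novo.
        best[i] = best[i - 1] + synthesis_cost_per_bp
        # Option 2: finish with an existing fragment that ends at i.
        for start, end, cost in fragments:
            if end == i:
                best[i] = min(best[i], best[start] + cost)
    return best[goal_length]


# Hypothetical inventory: two cheap fragments and one expensive one.
inventory = [(0, 400, 500), (400, 900, 500), (300, 1000, 2000)]
print(cheapest_cover(1000, inventory, synthesis_cost_per_bp=10))  # 2000
```

In this example the cheapest plan re-uses the two cheap fragments and synthesizes only the last 100 bases, which mirrors the material re-use idea at a very small scale.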

dasi-dna-design's People

Contributors

jvrana

Forkers

lgtm-migrator

dasi-dna-design's Issues

RuntimeWarning: invalid value encountered in true_divide

Describe the bug
The following warning often occurs during normal execution of DASi:
RuntimeWarning: invalid value encountered in true_divide

To Reproduce
Run DASi on any sequence design

Expected behavior
No warnings

Additional context
This occurs in the networkx utilities algorithms.
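
One generic way to locate where such a warning originates is to escalate `RuntimeWarning` to an exception, so that the traceback points at the offending call. The sketch below uses only the standard-library `warnings` module; `noisy_division` is a hypothetical stand-in for the real computation and simply emits the same warning text.

```python
import warnings


def run_with_strict_warnings(func, *args, **kwargs):
    """Call func with RuntimeWarning raised as an exception."""
    with warnings.catch_warnings():
        # Inside this block, any RuntimeWarning becomes an error,
        # which produces a traceback at the point it was issued.
        warnings.simplefilter("error", RuntimeWarning)
        return func(*args, **kwargs)


def noisy_division():
    # Hypothetical stand-in for a computation that divides 0 by 0.
    warnings.warn("invalid value encountered in true_divide", RuntimeWarning)
    return float("nan")


try:
    run_with_strict_warnings(noisy_division)
except RuntimeWarning as exc:
    print("caught:", exc)
```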

Vendor Configurations (e.g. IDT, Twist)

Is your feature request related to a problem? Please describe.
There should be default vendor configurations for costs.

Describe the solution you'd like
Vendor considerations
IDT

  • Well-defined mapping from fragment length to cost and time estimates

TWIST

  • TWIST adds an adapter to each fragment, so primers must be ordered for assemblies in most cases. This optimization procedure is not implemented. For TWIST-synthesized fragments, the cost would be equivalent to the synthesis cost plus the cost of new primers. There may be further optimizations in ordering the TWIST fragment together with different primers to amplify it.
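
The TWIST cost rule described above (synthesis cost plus the cost of new primers) can be sketched as simple arithmetic. The function name and all prices below are hypothetical placeholders, not vendor quotes; costs are in integer cents.

```python
def twist_fragment_cost(fragment_length_bp, synthesis_cents_per_bp,
                        primer_lengths, primer_cents_per_base):
    """Total cost in cents of a synthesized fragment plus its new primers."""
    synthesis = fragment_length_bp * synthesis_cents_per_bp
    primers = sum(primer_lengths) * primer_cents_per_base
    return synthesis + primers


# Hypothetical example: a 1000 bp fragment and two 20-base primers.
print(twist_fragment_cost(1000, 9, [20, 20], 25))  # 10000
```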

Optimal Partitioning Schema

Milestones

  1. Implement DNA complexity to efficiency for graphs
  2. Implement partitioning for B->A gaps
  3. Implement partitioning algorithm to find optimal synthesis partitioning

JSON output schema

DASi should produce clear, easy-to-understand JSON output for its designs.

Bug: `Sequence is cyclic but there are no precessors` error

Describe the bug

Designs occasionally produce this bug:

dasi.exceptions.DasiSequenceDesignException: Sequence is cyclic but there are no precessors for AssemblyNode(index=0, expandable=True, type='A', overhang=False)

Unclear how to reproduce.

Implement DNA complexity to efficiency for graphs

The efficiency of a gapped region should be lower for highly complex DNA. DNA counts as highly complex if it has any of the following:

  • less than 25% GC content
  • higher than 75% GC content
  • homopolymeric runs of 10 or more A/Ts
  • homopolymeric runs of 6 or more G/Cs
  • repeats greater than 14bp
  • hairpins
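
Most of the criteria listed above can be checked with plain string operations. The following is a minimal sketch under stated assumptions: the function names are hypothetical, "repeats greater than 14bp" is read as any 15-mer that occurs more than once on the same strand, and hairpins are not checked at all. How these flags would translate into an efficiency penalty is left open.

```python
import re


def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_homopolymer(seq, bases, min_run):
    """True if any base in `bases` occurs in a run of at least min_run."""
    pattern = "|".join(f"{base}{{{min_run},}}" for base in bases)
    return re.search(pattern, seq.upper()) is not None


def has_repeat(seq, min_length=15):
    """True if any substring of min_length occurs more than once."""
    seq = seq.upper()
    seen = set()
    for i in range(len(seq) - min_length + 1):
        kmer = seq[i:i + min_length]
        if kmer in seen:
            return True
        seen.add(kmer)
    return False


def complexity_flags(seq):
    """Report which of the listed complexity criteria a sequence triggers."""
    gc = gc_fraction(seq)
    return {
        "low_gc": gc < 0.25,
        "high_gc": gc > 0.75,
        "at_homopolymer": has_homopolymer(seq, "AT", 10),
        "gc_homopolymer": has_homopolymer(seq, "GC", 6),
        "repeat": has_repeat(seq, 15),
    }


print(complexity_flags("A" * 12 + "GCGCGC"))
```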

Degenerate designs

In many cases, users may not care about the exact sequence between parts (e.g. scar sequences) or about which of several equivalent sequences joins fusion proteins (e.g. a GSG linker).

Better Logger

  • Log to file
  • Better logging levels
  • Logging for CLI only

Invalid fragment PCR products are produced.

Describe the bug
DASi can design PCR products that are not amplifiable, for example PCR reactions whose primers have multiple binding sites on the template.

Expected behavior
DASi should detect bad PCRs and heavily penalize their use during the optimization process.

TODO

  • make test that forces a bad PCR product.
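
A first-pass check for the multiple-binding-site case could count exact primer matches on both strands of the template. This is only a sketch with hypothetical function names: it assumes a linear template and exact matches, whereas real primer binding also depends on mismatches and 3' end annealing.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_overlapping(needle, haystack):
    """Count occurrences of needle in haystack, including overlapping ones."""
    count = 0
    start = haystack.find(needle)
    while start != -1:
        count += 1
        start = haystack.find(needle, start + 1)
    return count


def count_binding_sites(primer, template):
    """Exact primer matches on either strand of a linear template."""
    primer = primer.upper()
    template = template.upper()
    return (count_overlapping(primer, template)
            + count_overlapping(primer, reverse_complement(template)))


def is_ambiguous_pcr(forward_primer, reverse_primer, template):
    """True if either primer does not bind exactly once."""
    return (count_binding_sites(forward_primer, template) != 1
            or count_binding_sites(reverse_primer, template) != 1)


template = "AAAAACCCCCAAAAAGGGGG"
print(count_binding_sites("AAAAAC", template))  # 1
print(count_binding_sites("AAAAA", template))   # 2
```

A test that forces a bad PCR product could build a template with a duplicated primer site in this way and assert that the design is penalized.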

Implement tblastn for CDS searches

In many cases, a better assembly may be found by using a slightly different DNA sequence that encodes the same protein:

  • In designs, identify CDS
  • use tblastn to find fragments that align to goal
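
The motivation can be illustrated without tblastn itself: two different DNA sequences can translate to the same protein, so a DNA-level search would miss a fragment that a protein-level search would find. The sketch below builds the standard genetic code table and translates two hypothetical encodings of a GSG linker; it does not call BLAST.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in frame 0, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


goal = "GGTTCTGGT"       # hypothetical goal sequence
available = "GGCAGCGGC"  # hypothetical fragment already in inventory
print(goal == available)                        # False
print(translate(goal) == translate(available))  # True
```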

Improve Documentation

  • Better Usage page and Installation page up front
  • Better format for JSON Schema
  • Point users to use output JSON format
