
corona's Introduction

Reverse engineering the coronavirus (SARS-CoV-2)

Start here: corona.py

💭 Background

This project applies techniques from reverse engineering to understand the SARS-CoV-2 virus. The goal here is simply to build an understanding of the virus from first principles.

Biology vs. software

Biological systems are fundamentally information processing systems. Although the analogy is imperfect, software provides a useful framework for thinking about biology. The table below gives a rough outline of this analogy.

| 🔬 Biology | 💻 Software | Notes |
| --- | --- | --- |
| nucleotide | byte | |
| genome | bytecode | |
| translation | disassembly | 3 byte wide instruction set with arbitrary "reading frames" |
| protein | function | a polyprotein is a function with multiple pieces |
| protein secondary structure | basic blocks | 80% accuracy in prediction |
| protein tertiary structure | | This seems like the hard one to predict: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0205819 |
| quaternary structure | compiled function with inlining | https://en.wikipedia.org/wiki/Protein%E2%80%93protein_interaction_prediction |
| gene | library | bacteria are statically linked, viruses are dynamically linked |
| transcription | loading | |
| protein structure prediction | library identification | |
| genome analysis | static analysis | |
| molecular dynamics simulations of protein folding | dynamic analysis | Simulation doesn't seem to work yet. Constrained by tooling and compute. |
| no equivalent | execution | We are reverse engineering a CAD format. Runs more like FPGA code, all at once. No serial execution. (What are the FPGA reverse engineering tools?) |
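
The "translation" row of the table can be made concrete with a short sketch. If each nucleotide is a "byte", a codon is a 3-nucleotide instruction, and the same sequence decodes into different instructions depending on the offset at which reading starts. The function name `reading_frames` and the toy sequence are illustrative and are not taken from the repository.

```python
def reading_frames(seq):
    """Split a nucleotide string into codons for each of the three forward frames."""
    frames = []
    for offset in range(3):
        # Drop any trailing partial codon so every entry is exactly 3 nucleotides.
        usable = len(seq) - (len(seq) - offset) % 3
        frames.append([seq[i:i + 3] for i in range(offset, usable, 3)])
    return frames


if __name__ == "__main__":
    # Toy sequence, not a real gene: print the codons seen from each offset.
    for offset, codons in enumerate(reading_frames("AUGGCCAUUGUA")):
        print(offset, codons)
```

Only the frame starting at offset 0 begins with the start codon `AUG` here, which is why choosing the frame matters before any "disassembly" is meaningful.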

🔧 Progress

Downloading the SARS-CoV-2 genome

GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA and RNA sequences. The script download_sequences.py downloads the SARS-CoV-2 sequences available in GenBank.
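
As a rough sketch of what such a download involves, the snippet below builds an NCBI E-utilities `efetch` request and parses a FASTA response. It does not reproduce download_sequences.py. The helper names and the choice of accession NC_045512.2 (the SARS-CoV-2 reference genome) are assumptions made for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession):
    """Build an efetch URL that returns one nucleotide record as FASTA text."""
    query = urlencode({"db": "nuccore", "id": accession,
                       "rettype": "fasta", "retmode": "text"})
    return f"{EFETCH}?{query}"


def parse_fasta(text):
    """Return (header, sequence) for a single-record FASTA string."""
    lines = [line.strip() for line in text.strip().splitlines()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    return lines[0][1:], "".join(lines[1:])


def download_genome(accession="NC_045512.2"):
    """Fetch one record over the network (assumed accession; needs internet access)."""
    with urlopen(efetch_url(accession)) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

Keeping URL construction and parsing separate from the network call means both can be checked offline. Only `download_genome` touches the network.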

Translating RNA to proteins

lib.py contains a function translate that converts an RNA sequence to a chain of amino acids. This function is used in corona.py.
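
A minimal sketch of such a function follows, assuming the standard genetic code and translation that stops at the first stop codon. The real `translate` in lib.py may differ in signature and behavior. The `to_stop` parameter and the table layout here are illustrative.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq, to_stop=True):
    """Translate a nucleotide sequence into a string of one-letter amino acids.

    Accepts RNA (U) or DNA-style (T) letters. Ambiguous bases such as N are
    not handled in this sketch and raise KeyError.
    """
    rna = seq.upper().replace("T", "U")  # GenBank records are written with T
    protein = []
    for i in range(0, len(rna) - 2, 3):  # ignore a trailing partial codon
        amino = CODON_TABLE[rna[i:i + 3]]
        if amino == "*" and to_stop:
            break
        protein.append(amino)
    return "".join(protein)
```

For example, `translate("AUGGCCUAA")` reads `AUG` (M), `GCC` (A), then stops at `UAA`.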

Annotating functions

The translate function is used in corona.py to identify and annotate functions for all proteins encoded by the genome.
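
One common way to locate candidate protein-coding regions is to scan each reading frame for open reading frames: a start codon followed by in-frame codons up to a stop codon. The sketch below illustrates that idea and is not necessarily how corona.py does it. The function name `find_orfs` and the `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orfs(seq, min_codons=2):
    """Return sorted (start, end, frame) tuples for AUG...stop runs, forward strand only.

    Positions are 0-based; `end` is exclusive and includes the stop codon.
    `min_codons` counts the codons before the stop, including the AUG.
    """
    rna = seq.upper().replace("T", "U")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None and codon == "AUG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)
```

A simple scan like this does not account for programmed ribosomal frameshifting, which coronaviruses use to produce orf1ab, so its output is a starting point for annotation, not a final gene list.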

Folding proteins

The OpenMM toolkit is used for molecular simulation of protein folding in fold.py.

💡 Work to be done

  • Automatic extraction of genes from different coronaviruses
  • A good multi-sequence comparison tool
  • Molecular dynamics?
  • Secondary Structure prediction on orf1a?

❓ Open questions

💧 Testing

How tests work

Homemade test?

💊 Possible treatments and prophylactics

⚠️ Disclaimer: The information in this repository is for informational purposes only. It is not medical advice.

Hydroxychloroquine + zinc

RdRP inhibitors

Dexamethasone

Lopinavir-Ritonavir (AIDS cocktail)

📚 Resources

Coronavirus-related publications

Biology

Bioinformatics

Epidemic modeling

Antibodies

Masks

Vaccines

Genome studies (what genes = bad covid)

corona's People

Contributors

geohot, ajaykarpur, emilwidlund, miugel, hexpwn, wjjmjh
