
Kenneth Y. Wertheim's Projects

alignment_variantcalling_annotation

On the Galaxy platform, I aligned three sets of targeted re-sequencing data to the human reference genome hg19, called the variants, and annotated them.

alignmentdynamicprogramming_overlaps

The Python program main.py aligns DNA sequencing reads with an excerpt of human chromosome 1 by dynamic programming, finding the minimum edit distance in each case. It also finds the overlaps between sequencing reads from Phi-X.
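
As an illustration only (this is not the code in main.py), approximate matching by dynamic programming can be sketched as follows. The function name `min_edit_distance_in_text` and the unit costs for substitutions, insertions, and deletions are assumptions. Leaving the first row of the table at zero lets the read start at any position in the reference, so the minimum of the last row is the smallest edit distance between the read and any substring of the reference.

```python
def min_edit_distance_in_text(pattern: str, text: str) -> int:
    """Smallest edit distance between `pattern` and any substring of `text`.

    Hypothetical helper for illustration; unit costs are assumed.
    """
    rows, cols = len(pattern) + 1, len(text) + 1
    # dist[i][j]: best edit distance between pattern[:i] and some
    # substring of text that ends at position j.
    dist = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i  # pattern[:i] against an empty substring
    # Row 0 stays all zeros: an alignment may begin anywhere in text.
    for i in range(1, rows):
        for j in range(1, cols):
            mismatch = 0 if pattern[i - 1] == text[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + mismatch,  # match or substitution
                dist[i - 1][j] + 1,             # pattern character unmatched
                dist[i][j - 1] + 1,             # text character unmatched
            )
    return min(dist[-1])
```

For example, `min_edit_distance_in_text("ACGT", "TTACCTTT")` returns 1, because the closest substring, "ACCT", differs from the read by one substitution.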

bashscripting_genomicfeatures

The Bash script 'analysis.sh' uses samtools and bedtools to quantify the relations between a set of sequencing reads and their alignments on the genome of one strain (wu_0_A) of the plant Arabidopsis thaliana.

bashscripting_genomics

The Bash script 'analysis.sh' analyses the files in an archive and reports key statistics about the genome of Malus domestica.

bashscripting_tuxedo

The Bash script 'analysis.sh' runs the RNA-seq Tuxedo pipeline: TopHat, Cufflinks, Cuffcompare, Cuffmerge, and Cuffdiff.

bioconductor_epigenomics

The R script uses the AnnotationHub package to obtain data on human CpG islands and histone modifications (H3K4me3 and H3K27me3), and the GenomicRanges package to extract basic statistics about the epigenome from the data.

boyer_moore_index_pigeonhole

The Python program main.py aligns DNA sequencing reads with an excerpt of human chromosome 1 using the naive exact matching and Boyer-Moore algorithms, as well as k-mer indices (substrings and subsequences) and the pigeonhole principle.
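
The pigeonhole principle mentioned above can be sketched as follows; this is an illustration under stated assumptions, not the code in main.py. If a read may contain at most k mismatches, splitting it into k + 1 segments guarantees that at least one segment matches exactly, so exact hits for each segment give candidate positions that are then verified in full. The function name `approximate_match` is hypothetical, only substitutions are counted, and Python's built-in `str.find` stands in for the Boyer-Moore or k-mer index search.

```python
def approximate_match(pattern: str, text: str, max_mismatches: int) -> list[int]:
    """Offsets where `pattern` occurs in `text` with at most
    `max_mismatches` substitutions (hypothetical helper)."""
    n_segments = max_mismatches + 1
    seg_len = len(pattern) // n_segments
    if seg_len == 0:
        raise ValueError("pattern is too short to split into k + 1 segments")
    hits = set()
    for s in range(n_segments):
        start = s * seg_len
        # The last segment absorbs any leftover characters.
        end = len(pattern) if s == n_segments - 1 else start + seg_len
        segment = pattern[start:end]
        pos = text.find(segment)  # stand-in for Boyer-Moore or an index lookup
        while pos != -1:
            offset = pos - start  # where the whole pattern would begin
            if 0 <= offset <= len(text) - len(pattern):
                window = text[offset:offset + len(pattern)]
                mismatches = sum(1 for a, b in zip(pattern, window) if a != b)
                if mismatches <= max_mismatches:
                    hits.add(offset)
            pos = text.find(segment, pos + 1)
    return sorted(hits)
```

For example, `approximate_match("ACGT", "ACGTACCTAAGT", 1)` returns `[0, 4, 8]`: an exact occurrence at 0 and one-mismatch occurrences at 4 ("ACCT") and 8 ("AAGT").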

cd4plus_tcells_model

The files in this repository constitute a multi-approach and multi-scale model of CD4+ T lymphocytes. Running the main.m file within the MATLAB environment will start a stochastic simulation of CD4+ T cells responding to the input signals defined in antigen.mat.

dcgan_facegeneration

The Jupyter notebook presents a DCGAN built for the purpose of generating new images of human faces, as well as the procedures for training it with a processed CelebA dataset.

facialkeypointdetection

The Jupyter notebooks present the procedures I followed to build a facial keypoint detection system.

greedy_shortest_common_superstring

The Python program main.py contains the shortest common superstring algorithm, the greedy shortest common superstring algorithm, and an accelerated version of the greedy shortest common superstring algorithm. They can be used to assemble a genome from its sequencing reads.
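
A minimal sketch of the greedy approach, assuming the textbook formulation rather than the code in main.py: repeatedly merge the pair of reads with the longest suffix-prefix overlap until no overlaps remain. The names `overlap` and `greedy_scs` and the `min_length` parameter are assumptions. The greedy result is a common superstring but is not guaranteed to be the shortest one.

```python
from itertools import permutations


def overlap(a: str, b: str, min_length: int = 1) -> int:
    """Length of the longest suffix of `a` that is a prefix of `b`,
    or 0 if no overlap of at least `min_length` exists."""
    start = 0
    while True:
        # Look for the first min_length characters of b inside a.
        start = a.find(b[:min_length], start)
        if start == -1:
            return 0
        if b.startswith(a[start:]):
            return len(a) - start
        start += 1


def greedy_scs(reads, min_length: int = 1) -> str:
    """Greedily merge the best-overlapping pair until none overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_len, best_a, best_b = 0, None, None
        for a, b in permutations(reads, 2):
            olen = overlap(a, b, min_length)
            if olen > best_len:
                best_len, best_a, best_b = olen, a, b
        if best_len == 0:
            break  # nothing left to merge
        reads.remove(best_a)
        reads.remove(best_b)
        reads.append(best_a + best_b[best_len:])
    # Any reads that never overlapped are simply concatenated.
    return "".join(reads)
```

For example, `greedy_scs(["ACGT", "GTAA"])` returns `"ACGTAA"`. Checking every ordered pair on each round is quadratic in the number of reads, which is the cost an accelerated version would aim to reduce.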

naive_exact_matching

The Python program checks the quality of DNA sequencing reads and aligns reads with a genome using the naive exact matching algorithm and modified versions of it.
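
For reference, the basic naive exact matching algorithm can be sketched as follows. This is a generic illustration, not the repository's code; the function name `naive_exact_match` is an assumption, and the modified versions mentioned above are not shown.

```python
def naive_exact_match(pattern: str, text: str) -> list[int]:
    """Return every offset at which `pattern` occurs exactly in `text`."""
    occurrences = []
    # Try every alignment of the pattern against the text.
    for start in range(len(text) - len(pattern) + 1):
        # Compare character by character; all() stops at the first mismatch.
        if all(text[start + k] == pattern[k] for k in range(len(pattern))):
            occurrences.append(start)
    return occurrences
```

For example, `naive_exact_match("AG", "AGCTTAGATAGC")` returns `[0, 5, 9]`.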

neuroblastoma_cellmodel

This repository contains the first multicellular model of neuroblastoma, built for the PRIMAGE project.

nlp_attentionmodels

These notebooks present a series of attention models for natural language processing, including an encoder-decoder model (translator) comprising LSTMs and the scaled dot-product attention mechanism, a transformer decoder (text summariser), a transformer encoder (question answering) based on the BERT model, and a Reformer (chatbot).

nlp_fundamentals

These notebooks introduce the fundamentals of natural language processing, including text preprocessing techniques, logistic regression, Naive Bayes classification, word vectors (embeddings), principal component analysis, the bag-of-words model, the k-nearest neighbours algorithm, and locality-sensitive hashing.

nlp_probabilisticmodels

These notebooks present a hidden Markov model (part-of-speech tagging), an n-gram language model (auto-complete system), and a CBOW model (computing word embeddings), as well as supporting techniques such as dynamic programming (autocorrect system), the Viterbi algorithm, and the perplexity score.

nlp_sequencemodels

These notebooks present a series of deep neural networks for natural language processing, including a feedforward network, an RNN with GRUs, an LSTM, and a Siamese network. Their respective applications are sentiment analysis, next-character prediction, named entity recognition, and duplicate question detection.

optimisechemoneuroblastoma

This repository contains two sets of code. One is used to simulate neuroblastoma's clonal evolution in the presence of vincristine and cyclophosphamide. The other is used to find the optimal chemotherapy schedule given an initial clonal composition.

paediatric_cancer_classifiers

This algorithm builds a series of paediatric cancer classifiers, including a decision tree, a naive Bayes classifier, support vector machines, an ensemble method (AdaBoost), and a multilayer perceptron. It also implements hierarchical clustering and principal component analysis.

personalcppprojects

This repository contains various C++ programs covering recursion, dynamic programming (memoisation and tabulation), and genetic algorithms (binary and continuous).

turing_pattern_analysis

These algorithms assess the potential of a multi-component system to form Turing patterns. The reaction-diffusion model describing the system is analytically intractable, so the algorithms explore the parameter space one point at a time.
