uio-bmi / machine_learning_in_comp_bio_exercises

These exercises are a part of the IN-BIOS5000 course at UiO, for the intro lectures on machine learning in computational biology.

Machine Learning in Computational Biology Exercises

This repository contains exercises for the machine learning part of the IN-BIOS5000/IN-BIOS9000 course at UiO. The notebooks can be run online using Binder or Google Colab. Links to open specific notebooks in either of these environments are given below.

Exercise 1: Transcription Factor Binding Prediction

Binder

In this exercise, we will use a dataset of DNA sequences, each labeled 1 if the transcription factor binds to it and 0 if it does not.

We will run the Exercise_1.ipynb notebook to train the models and then examine the results to understand how the models work.

The dataset for this exercise was downloaded from https://github.com/QData/DeepMotif.
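
Labeled sequences like those described above need a numeric representation before most models can be trained on them. The following is a minimal sketch of one common choice, one-hot encoding, written in plain Python. It is an illustration only: the notebook may encode the data differently, and the names `one_hot_encode`, `encode_dataset`, and `toy_data` (including its made-up sequences and labels) are assumptions, not taken from the repository.

```python
NUCLEOTIDES = "ACGT"


def one_hot_encode(sequence):
    """Return one 4-element row per base, in A, C, G, T order.

    Bases outside ACGT (for example N) become an all-zero row.
    """
    return [
        [1 if base == nucleotide else 0 for nucleotide in NUCLEOTIDES]
        for base in sequence.upper()
    ]


def encode_dataset(labeled_sequences):
    """Turn (sequence, label) pairs into flat feature vectors and a label list."""
    features = [
        [value for row in one_hot_encode(sequence) for value in row]
        for sequence, _ in labeled_sequences
    ]
    labels = [label for _, label in labeled_sequences]
    return features, labels


# Made-up toy data, for illustration only (not from the exercise dataset).
toy_data = [("ACGT", 1), ("TTNA", 0)]
features, labels = encode_dataset(toy_data)
print(features[0])
print(labels)
```

Each sequence of length n becomes a vector of length 4n, so all sequences must have the same length if the vectors are to be stacked into one feature matrix.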

Exercise 2: Transcription Factor Binding Prediction - selecting hyperparameters

Binder

In this exercise, we will use the same dataset as in Exercise 1, again with the aim of building a good predictive model. To that end, we will use cross-validation (CV) to explore different hyperparameters and models. The exercise is in the Exercise_2.ipynb notebook.
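
As a rough illustration of how cross-validation can be used to choose a hyperparameter, the sketch below runs k-fold CV over candidate values for the number of neighbors of a small nearest-neighbor classifier that compares sequences by Hamming distance. This is not the method used in the notebook: the classifier, the function names, and the toy sequences are all assumptions chosen to keep the example dependency-free.

```python
from collections import Counter


def k_fold_indices(n_samples, n_folds):
    """Yield (train, validation) index lists; sample i goes to fold i % n_folds."""
    for fold in range(n_folds):
        validation = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        yield train, validation


def hamming(a, b):
    """Count mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_seqs, train_labels, query, n_neighbors):
    """Predict the majority label among the n_neighbors closest training sequences."""
    ranked = sorted(range(len(train_seqs)), key=lambda i: hamming(train_seqs[i], query))
    votes = Counter(train_labels[i] for i in ranked[:n_neighbors])
    return votes.most_common(1)[0][0]


def cv_accuracy(seqs, labels, n_neighbors, n_folds=3):
    """Mean validation accuracy over the folds for one hyperparameter value."""
    fold_scores = []
    for train, validation in k_fold_indices(len(seqs), n_folds):
        train_seqs = [seqs[i] for i in train]
        train_labels = [labels[i] for i in train]
        correct = sum(
            knn_predict(train_seqs, train_labels, seqs[i], n_neighbors) == labels[i]
            for i in validation
        )
        fold_scores.append(correct / len(validation))
    return sum(fold_scores) / len(fold_scores)


def select_n_neighbors(seqs, labels, candidates, n_folds=3):
    """Return the candidate with the highest mean CV accuracy, plus all scores."""
    scores = {k: cv_accuracy(seqs, labels, k, n_folds) for k in candidates}
    best = max(scores, key=scores.get)
    return best, scores


# Made-up toy sequences, for illustration only.
toy_seqs = ["ACGTACGT", "ACGTACGA", "ACGTACGC", "TTTTGGGG", "TTTTGGGA", "TTTTGGGC"]
toy_labels = [1, 1, 1, 0, 0, 0]
best, scores = select_n_neighbors(toy_seqs, toy_labels, candidates=[1, 3])
print(best, scores)
```

The key point is that every candidate value is scored only on data that was held out from fitting, and the value with the best average score across folds is kept.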

Exercise 3: Predicting Disease States from Adaptive Immune Receptor Repertoires

Adaptive immune receptors bind to antigens in the body (such as parts of viruses or bacteria) and help neutralize the threat. A person's adaptive immune receptor repertoire, which includes all receptors in the body, contains approximately 10^8 unique receptors, most of which recognize different threats. By using machine learning, it might be possible to predict from repertoire data whether a person has a given immune-related disease.

In this exercise, we will use a public immuneML Galaxy tool to build an ML model that distinguishes repertoires of healthy individuals from those of diseased individuals.

Steps:

  1. Go to https://galaxy.immuneml.uiocloud.no
  2. From the top menu, open the shared history: Shared Data -> Histories -> Quickstart Data
  3. Click the plus sign in the top right corner to import the history. The data then appears in the right sidebar and can be examined by clicking the eye icon.
  4. From the menu on the left, select immuneML tools -> Create immuneML dataset tool, fill in the required fields with the imported data, and click the Execute button to create the dataset. The dataset appears in the right sidebar and turns green when the tool has finished.
  5. From the menu on the left, select immuneML tools -> Train immune repertoire classifiers (simplified interface) tool and choose the parameters from the lists suggested in the tool. When the tool has finished, click the eye icon of the Summary: repertoire classification element to examine the results.

The results show the performance of the algorithms and encodings selected in step 5, estimated using nested cross-validation. Examine the results of both the inner cross-validation loop (selection) and the outer loop (assessment), and compare them.
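
To make the distinction between the selection and assessment loops concrete, here is a minimal plain-Python sketch of nested cross-validation. It is not immuneML's implementation and does not use its API. It assumes each repertoire has already been reduced to a single hypothetical numeric feature, uses a small nearest-neighbor classifier as a stand-in model, and runs on made-up values; all function and variable names are illustrative.

```python
from collections import Counter


def k_fold_indices(indices, n_folds):
    """Yield (train, test) lists; the item at position p goes to fold p % n_folds."""
    for fold in range(n_folds):
        test = [idx for pos, idx in enumerate(indices) if pos % n_folds == fold]
        train = [idx for pos, idx in enumerate(indices) if pos % n_folds != fold]
        yield train, test


def knn_accuracy(features, labels, train, test, n_neighbors):
    """Accuracy on `test` of a nearest-neighbor vote over the `train` samples."""
    correct = 0
    for i in test:
        nearest = sorted(train, key=lambda j: abs(features[j] - features[i]))[:n_neighbors]
        predicted = Counter(labels[j] for j in nearest).most_common(1)[0][0]
        correct += predicted == labels[i]
    return correct / len(test)


def nested_cv(features, labels, candidates, outer_folds=3, inner_folds=2):
    """Select a hyperparameter in the inner loop, then assess it in the outer loop."""
    results = []
    all_indices = list(range(len(features)))
    for outer_train, outer_test in k_fold_indices(all_indices, outer_folds):
        # Inner loop (selection): uses only the outer training data.
        inner_scores = {}
        for k in candidates:
            scores = [
                knn_accuracy(features, labels, train, validation, k)
                for train, validation in k_fold_indices(outer_train, inner_folds)
            ]
            inner_scores[k] = sum(scores) / len(scores)
        best = max(inner_scores, key=inner_scores.get)
        # Outer loop (assessment): the outer test fold was never seen during selection.
        results.append({
            "selected": best,
            "selection_score": inner_scores[best],
            "assessment_score": knn_accuracy(features, labels, outer_train, outer_test, best),
        })
    return results


# Made-up feature values (0 = healthy, 1 = diseased), for illustration only.
toy_features = [0.10, 0.80, 0.15, 0.85, 0.20, 0.90, 0.12, 0.82, 0.18, 0.88, 0.22, 0.78]
toy_labels = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
for row in nested_cv(toy_features, toy_labels, candidates=[1, 3]):
    print(row)
```

Because the selection score is computed on data that influenced the choice of hyperparameter, it tends to be optimistic; the assessment score on the untouched outer fold is the fairer estimate, which is why the two are worth comparing.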
