
This project is a fork of ibm/pytorchpipe.


PyTorchPipe (PTP) is a component-oriented framework for rapid prototyping and training of computational pipelines combining vision and language

License: Apache License 2.0



PyTorchPipe


Description

PyTorchPipe (PTP) is a component-oriented framework that facilitates development of computational multi-modal pipelines and comparison of diverse neural network-based models.

PTP frames training and testing procedures as pipelines consisting of many components communicating through data streams. Each such pipeline can consist of several components, including a single task instance (providing batches of data), any number of trainable components (models), and additional components providing the required transformations and computations.

As a result, training and testing procedures are no longer tied to a specific task or model, while built-in mechanisms for compatibility checking (handshaking), configuration and global-variable management, and statistics collection facilitate rapid development of complex pipelines and running diverse experiments.
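The handshaking idea mentioned above can be illustrated with a minimal sketch: each component declares the stream names it consumes and produces, and the pipeline verifies that every input is produced by some upstream component before any data flows. The class and function names below are hypothetical, chosen for illustration; they are not PTP's actual API.

```python
# Minimal sketch of stream handshaking between pipeline components.
# Names (Component, handshake) are illustrative, not PTP's API.

class Component:
    def __init__(self, name, input_streams, output_streams):
        self.name = name
        self.input_streams = set(input_streams)
        self.output_streams = set(output_streams)

def handshake(components):
    """Check that every input stream is produced by an earlier component."""
    available = set()
    for comp in components:
        missing = comp.input_streams - available
        if missing:
            raise ValueError(f"{comp.name}: missing input streams {sorted(missing)}")
        available |= comp.output_streams
    return available

# A toy pipeline: task -> model -> loss.
task = Component("mnist_task", [], ["images", "targets"])
model = Component("lenet5", ["images"], ["predictions"])
loss = Component("nll_loss", ["predictions", "targets"], ["loss"])

streams = handshake([task, model, loss])
```

Running the handshake before training catches wiring mistakes (a missing or misnamed stream) immediately rather than mid-epoch.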

At its core, PTP relies on PyTorch to accelerate computation and extensively uses its mechanisms for distributing computations across CPUs/GPUs, including multi-process data loaders and multi-GPU data parallelism. The models are agnostic to those operations; one enables them in configuration files (data loaders) or by passing the appropriate run-time arguments (--gpu).
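The run-time GPU switch mentioned above is typically just a command-line flag that selects the device models are moved to. The sketch below shows the general pattern with Python's standard argparse; it illustrates the idea only, and is not PTP's actual worker code.

```python
# Sketch of a --gpu run-time switch, as commonly used by training scripts.
# Illustrative only; PTP's actual argument handling may differ.
import argparse

def parse_device(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--gpu", action="store_true",
                        help="run computations on GPU instead of CPU")
    args = parser.parse_args(argv)
    # A worker would pass this device on to its tensors/models.
    return "cuda" if args.gpu else "cpu"

print(parse_device(["--gpu"]))  # -> cuda
print(parse_device([]))         # -> cpu
```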

Datasets: PTP focuses on multi-modal reasoning combining vision and language. It currently offers Tasks from the following domains:

  • CLEVR, GQA, ImageCLEF VQA-Med 2019 (Visual Question Answering)
  • MNIST, CIFAR-100 (Image Classification)
  • WiLY (Language Identification)
  • WikiText-2 / WikiText-103 (Language Modelling)
  • ANKI (Machine Translation)

Aside from providing batches of samples, the Task class automatically downloads the files associated with a given dataset (as long as the dataset is publicly available). The diversity of these tasks (and associated models) demonstrates the flexibility of the framework; we are working on incorporating new ones into PTP.
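The download-on-demand behaviour described above typically reduces to a check-then-fetch helper: download a file only when it is not already cached locally. The sketch below shows this general pattern; the function and parameter names are hypothetical, not PTP's actual Task code.

```python
# Sketch of dataset auto-download logic: fetch a file only if it is
# not already present locally. Names are illustrative, not PTP's API.
import os
import urllib.request

def maybe_download(url, target_path):
    """Download url to target_path unless the file already exists.

    Returns True if a download happened, False if the file was cached.
    """
    if os.path.isfile(target_path):
        return False  # already cached, nothing to do
    os.makedirs(os.path.dirname(target_path) or ".", exist_ok=True)
    urllib.request.urlretrieve(url, target_path)
    return True
```

A Task would call such a helper for each file of its dataset before constructing samples, so that repeated runs skip the download.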

Pipelines: What people typically call a model is framed in PTP as a pipeline consisting of many inter-connected components, with one or more Models containing the trainable elements. The components are loosely coupled and care only about the input streams they retrieve and the output streams they produce. The framework offers full flexibility: it is up to the programmer to choose the granularity of their components/models/pipelines. Such a decomposition makes it easy to combine many components and models into pipelines, while the framework supports loading pretrained models, freezing them during training, saving them to checkpoints, etc.
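The loose coupling described above can be sketched as components that only read and write named entries of a shared data dictionary, with the pipeline simply applying them in order. Again, this is a hypothetical sketch for illustration, not PTP's actual classes.

```python
# Sketch of loosely coupled components passing a data dict through a pipeline.
# Class names and the __call__ convention are illustrative, not PTP's API.

class Doubler:
    """Reads the 'inputs' stream, writes the 'doubled' stream."""
    def __call__(self, data):
        data["doubled"] = [2 * x for x in data["inputs"]]

class Summer:
    """Reads the 'doubled' stream, writes the 'total' stream."""
    def __call__(self, data):
        data["total"] = sum(data["doubled"])

def run_pipeline(components, batch):
    data = dict(batch)
    for component in components:
        component(data)  # each component only touches its own streams
    return data

result = run_pipeline([Doubler(), Summer()], {"inputs": [1, 2, 3]})
print(result["total"])  # -> 12
```

Because components never reference each other directly, swapping one out (say, a different model) only requires that the stream names still line up.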

Model/Component Zoo: PTP provides several ready-to-use, out-of-the-box components, ranging from general-purpose ones to very specialized ones:

  • Feed Forward Network (Fully Connected layers with activation functions and dropout, variable number of hidden layers, general usage)
  • Torch Vision Wrapper (wrapping several models from Torch Vision, e.g. VGG-16, ResNet-50, ResNet-152, DenseNet-121, general usage)
  • Convnet Encoder (CNNs with ReLU and MaxPooling, can work with different sizes of images)
  • LeNet-5 (classical baseline)
  • Recurrent Neural Network (different kernels with activation functions and dropout, a single model can work both as encoder or decoder, general usage)
  • Seq2Seq (Sequence to Sequence model, classical baseline)
  • Attention Decoder (RNN-based decoder implementing Bahdanau-style attention, classical baseline)
  • Sentence Embeddings (encodes words using an embedding layer, general usage)

Currently PTP offers the following models useful for multi-modal fusion and reasoning:

  • VQA Attention (simple question-driven attention over the image)
  • Element Wise Multiplication (Multi-modal Low-rank Bilinear pooling, MLB)
  • Multimodal Compact Bilinear Pooling (MCB)
  • Multimodal Factorized Bilinear Pooling
  • Relational Networks

The framework also offers several components useful when working with text:

  • Sentence Tokenizer
  • Sentence Indexer
  • Sentence One Hot Encoder
  • Label Indexer
  • BoW Encoder
  • Word Decoder

and several general-purpose components, from tensor transformations (List to Tensor, Reshape Tensor, Reduce Tensor, Concatenate Tensor), through components calculating losses (NLL Loss) and statistics (Accuracy Statistics, Precision/Recall Statistics, BLEU Statistics, etc.), to viewers (Stream Viewer, Stream File Exporter, etc.).
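As an illustration of the text-processing components listed above, a Bag-of-Words encoder can be sketched in a few lines. This is a toy reimplementation for clarity, not PTP's actual BoW Encoder.

```python
# Toy Bag-of-Words encoder: maps a tokenized sentence to a count vector
# over a fixed vocabulary. Illustrative only; not PTP's implementation.

def bow_encode(tokens, vocab):
    """Return a list of counts, one per vocabulary word (in vocab order)."""
    index = {word: i for i, word in enumerate(vocab)}
    vector = [0] * len(vocab)
    for token in tokens:
        if token in index:  # out-of-vocabulary tokens are ignored
            vector[index[token]] += 1
    return vector

vocab = ["the", "cat", "sat", "mat"]
print(bow_encode(["the", "cat", "sat", "on", "the", "mat"], vocab))  # -> [2, 1, 1, 1]
```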

Workers: PTP workers are Python scripts that are agnostic to the tasks/models/pipelines they are supposed to work with. Currently the framework offers three workers:

  • ptp-offline-trainer (a trainer relying on the classical methodology of interlacing training and validation at the end of every epoch; it creates separate instances of training and validation tasks and trains the models by feeding the created pipeline with batches of data, relying on the notion of an epoch)

  • ptp-online-trainer (a flexible trainer that likewise creates separate instances of training and validation tasks and trains the models by feeding the created pipeline with batches of data, relying on the notion of an episode)

  • ptp-processor (performs a single pass over all the samples returned by a given task instance; useful for collecting scores on a test set, generating answers for competition submissions, etc.)
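The single-pass behaviour of a processor-style worker can be sketched as a plain loop over task batches, accumulating a statistic such as accuracy. The names below are hypothetical; PTP's real worker additionally handles configuration, logging, and checkpoints.

```python
# Sketch of a processor-style worker: one pass over all batches of a task,
# accumulating a statistic. Names are illustrative, not PTP's worker code.

def process(task_batches, pipeline):
    """Run each batch through the pipeline once; return overall accuracy."""
    correct = total = 0
    for batch in task_batches:
        data = dict(batch)
        for component in pipeline:
            component(data)
        correct += sum(p == t for p, t in zip(data["predictions"], data["targets"]))
        total += len(data["targets"])
    return correct / total

# Toy task and a 'model' component that predicts the input unchanged.
batches = [{"inputs": [0, 1], "targets": [0, 0]},
           {"inputs": [1, 1], "targets": [1, 1]}]

def identity_model(data):
    data["predictions"] = list(data["inputs"])

accuracy = process(batches, [identity_model])
print(accuracy)  # -> 0.75
```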

Installation

PTP relies on PyTorch, so you need to install it first. Please refer to the official installation guide for details. It is easily installable via conda, or you can compile it from source to optimize it for your machine.

PTP is not (yet) available as a pip package or on conda. However, we provide a setup.py script and recommend using it for installation. First, clone the project repository:

git clone git@github.com:IBM/pytorchpipe.git
cd pytorchpipe/

Next, install the dependencies by running:

  python setup.py develop

This command installs all dependencies via pip, while still enabling you to change the code of the existing components/workers and run them by calling the associated ptp-* commands. More on this subject can be found in the following blog post on dev_mode.

Maintainers

A project of the Machine Intelligence team, IBM Research, Almaden.


Contributors

tkornuta-ibm, aasseman, cshivade, stevemart

