Picture of the Space of Learnable Tasks (ICML 23)

License: MIT License

icml-2023 information-geometry pretraining representation-learning


Picture of the Space of Learnable Tasks

This repository includes code to reproduce the results of the paper titled A picture of the space of typical learnable tasks by Rahul Ramesh, Jialin Mao, Itay Griniasty, Rubing Yang, Han Kheng Teoh, Mark K. Transtrum, James P. Sethna and Pratik Chaudhari that was presented at ICML 2023.

We developed information-geometric techniques to understand the representations learned by supervised, transfer, meta-, semi- and self-supervised learning as they learn different kinds of tasks. We found that the manifold of probabilistic models trained on different tasks using different representation learning methods is effectively low-dimensional: for example, trajectories of the predictions of deep networks trained on different subsets of ImageNet can be embedded in a three-dimensional space. This suggests that typical tasks share a very strong structure. The structure of the space of visual classification tasks revealed by our techniques is consistent with the ontology of WordNet (which was created using natural language semantics). The paper also studies the behavior of different representation learning algorithms and finds, for example, that (a) episodic meta-learning algorithms and supervised learning traverse different trajectories during training but eventually fit similar models, and (b) contrastive and semi-supervised learning methods traverse trajectories similar to those of supervised learning.

The above paper builds upon mathematical techniques developed in parallel in a paper titled The Training Process of Many Deep Networks Explores the Same Low-Dimensional Manifold by Jialin Mao, Itay Griniasty, Han Kheng Teoh, Rahul Ramesh, Rubing Yang, Mark K. Transtrum, James P. Sethna and Pratik Chaudhari (under review). That paper models the representation of a deep network as a high-dimensional probabilistic model and shows that the training trajectories of networks with a wide range of architectures and sizes, trained using different optimization methods, regularization techniques, data-augmentation schemes and weight initializations, lie on the same manifold. This manifold is also extremely low-dimensional: the top three dimensions faithfully capture the geometry of the space of predictions of more than 150,000 networks with different configurations.

Starter Colab notebook to understand information geometry: a good place to start is this Jupyter notebook. It gives a brief overview of the techniques and takes only a few minutes to run on Google Colab.

As an example of the kinds of results one can obtain using these techniques, consider the figure below. We have plotted the representations learned by networks trained on different subsets of ImageNet. Interestingly, the trajectories of the representations resemble the WordNet phylogenetic tree, which was built using only natural-language semantics.

We can use these techniques to compare representations learned from different datasets and with different methods, which makes them applicable across many settings. We study phenomena related to supervised, meta- and contrastive learning, and to fine-tuning, by analyzing networks in prediction space.
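Concretely, "prediction space" means representing a trained network by its predicted class probabilities on a fixed probe set, so that two networks can be compared via a divergence between those probability distributions. The sketch below is illustrative, not the repository's exact implementation: the helper names are made up here, and the Bhattacharyya distance is one natural choice of divergence in this setting.

```python
import numpy as np

def prediction_representation(logits):
    """Represent a network by its softmax outputs on a fixed probe set.

    logits: (num_samples, num_classes) raw network outputs.
    Returns per-sample class probabilities.
    """
    z = logits - logits.max(axis=1, keepdims=True)  # subtract max for stability
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)

def bhattacharyya(p, q, eps=1e-12):
    """Average per-sample Bhattacharyya distance between two models.

    p, q: (num_samples, num_classes) predicted probabilities.
    """
    affinity = np.sum(np.sqrt(p * q), axis=1)  # sum_y sqrt(p(y) q(y)) per sample
    return float(np.mean(-np.log(affinity + eps)))
```

Two networks with identical predictions are at distance (approximately) zero, and the distance grows as their predictions diverge, regardless of how different their weights or architectures are.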

Setup

We recommend micromamba, a lightweight and fast alternative to Anaconda. To install the dependencies, run:

micromamba create -y -f env.yml
micromamba activate picture

Usage

The steps below can be used to reproduce the supervised learning results on ImageNet. Feel free to send us an email if you want code to reproduce the other results.

Step 1: Generate network trajectories. We first train networks and store their predictions at different points along the training trajectory. The folder supervised_imagenet describes how to train a network on ImageNet.

You can skip this step and instead download the ImageNet trajectories from this link; move the downloaded files to the predictions/ folder.

Step 2: Analyze the trajectories. The folder info_geometry contains code to generate InPCA embeddings and compare different trajectories.

cd info_geometry
python inpca.py
python trajectory.py
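For orientation, the core of an InPCA embedding can be sketched in a few lines: given a matrix of pairwise divergences between models, double-center it and eigendecompose; unlike classical PCA/MDS, the eigenvalues may be negative (the embedding space has a Minkowski-like signature), so components are ranked by eigenvalue magnitude. This is an illustrative sketch under those assumptions; the actual inpca.py may differ in details such as the choice of divergence and normalization.

```python
import numpy as np

def inpca_embedding(D, k=3):
    """Embed n models given an (n, n) matrix of pairwise divergences.

    Double-centers D and eigendecomposes the result. Components are
    ranked by |eigenvalue| since InPCA eigenvalues can be negative.
    Returns (coords, evals): (n, k) coordinates and the top-k eigenvalues.
    """
    n = D.shape[0]
    L = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    W = -0.5 * L @ D @ L                     # double-centered "Gram" matrix
    evals, evecs = np.linalg.eigh(W)
    order = np.argsort(-np.abs(evals))[:k]   # top components by magnitude
    evals, evecs = evals[order], evecs[:, order]
    # scale by sqrt(|eigenvalue|); the sign of the eigenvalue records
    # whether that direction contributes positively or negatively to distances
    return evecs * np.sqrt(np.abs(evals)), evals
```

When D contains squared Euclidean distances, this reduces to classical multidimensional scaling and recovers the original geometry exactly; with information-geometric divergences between probabilistic models, the same machinery produces the low-dimensional pictures of prediction space shown above.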

Directory Structure

.
├── picture_of_tasks_tutorial.ipynb
├── README.md
├── LICENSE
├── env.yml
├── info_geometry          
│   ├── inpca.py           # Compute InPCA embedding on data
│   └── trajectory.py      # Analyze training trajectories
├── supervised_imagenet    
│   ├── 01_training        # Train models and store trajectory weights
│   └── 02_imprinting      # Imprint models and store predictions
├── predictions            # Folder to store trajectories
└── plots
