
bigscience-gsoc-ideas's Introduction

Hi there 👋

bigscience-gsoc-ideas's People

Contributors

oserikov


bigscience-gsoc-ideas's Issues

Implement the unified activations interpretation API for similar models

difficulty: scalable; can be either 170 or 340 hours
mentor: @oserikov, TBD
requirements:

  1. PyTorch
  2. scikit-learn
  3. Python engineering skills (OOP, etc.)
  4. experience with Transformer language models

useful links:

  • NeuroX codebase
  • BERT Rediscovers the Classical NLP Pipeline
  • Captum

Idea Description:

While HuggingFace quickly became the standard way to publish language models, several architectural trade-offs were made to support the rapid growth of the model zoo. As a result, several theoretically similar models were implemented by different teams, so that, e.g., several alternative implementations of self-attentive transformers arose. Since refactoring the whole model zoo is hardly feasible, the interpretability community is forced to provide unification wrappers to handle such dissimilarities between similar models. The task is to find a reasonable trade-off between refactoring the crucial models and providing unified wrappers, and thus bring a unified interpretability API to the crucial HuggingFace models.

We can view this task from two perspectives. First, one could unify the interpretability API of sibling models such as BERT and RoBERTa. Second, one could bring a unified interface for interpreting and comparing encoder models with, e.g., encoder-decoder ones, allowing one to study the similarities and differences in their behavior.
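As a sketch of what a unified activations API could look like, the snippet below records intermediate activations from arbitrary named submodules via PyTorch forward hooks. The `ActivationRecorder` class and the toy model are hypothetical stand-ins; a real wrapper would map architecture-specific names (e.g. `bert.encoder.layer.3.output` vs. `roberta.encoder.layer.3.output`) to shared logical layer names.

```python
import torch
import torch.nn as nn

class ActivationRecorder:
    """Model-agnostic activation extractor: attaches forward hooks to
    named submodules of any torch module, hiding architectural
    differences behind one interface. Hypothetical sketch, not a
    real library API."""

    def __init__(self, model, layer_names):
        self.activations = {}
        self.handles = []
        modules = dict(model.named_modules())
        for name in layer_names:
            handle = modules[name].register_forward_hook(self._make_hook(name))
            self.handles.append(handle)

    def _make_hook(self, name):
        def hook(module, inputs, output):
            # store a detached copy so the graph is not kept alive
            self.activations[name] = output.detach()
        return hook

    def close(self):
        for h in self.handles:
            h.remove()

# Demonstrated on a toy stack instead of a real HuggingFace model.
toy = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
rec = ActivationRecorder(toy, ["0", "2"])
_ = toy(torch.randn(2, 8))
print(sorted(rec.activations))     # ['0', '2']
print(rec.activations["0"].shape)  # torch.Size([2, 16])
rec.close()
```

The same recorder would work unchanged on BERT and RoBERTa, since forward hooks only depend on module names, not on the model's internal code paths.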

Coding Challenge

WIP

Implement the unified attention interpretation API for similar models

difficulty: scalable; can be either 170 or 340 hours
mentor: @oserikov, TBD
requirements:

  1. PyTorch
  2. scikit-learn
  3. Python engineering skills (OOP, etc.)
  4. experience with Transformer language models

useful links:

  • NeuroX codebase
  • BERT Rediscovers the Classical NLP Pipeline
  • Captum

Idea Description:

While HuggingFace quickly became the standard way to publish language models, several architectural trade-offs were made to support the rapid growth of the model zoo. As a result, several theoretically similar models were implemented by different teams, so that, e.g., several alternative implementations of self-attentive transformers arose. Since refactoring the whole model zoo is hardly feasible, the interpretability community is forced to provide unification wrappers to handle such dissimilarities between similar models. The task is to find a reasonable trade-off between refactoring the crucial models and providing unified wrappers, and thus bring a unified interpretability API to the crucial HuggingFace models.

We can view this task from two perspectives. First, one could unify the interpretability API of sibling models such as BERT and RoBERTa. Second, one could bring a unified interface for interpreting and comparing encoder models with, e.g., encoder-decoder ones, allowing one to study the similarities and differences in their behavior.
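One possible shape for a unified attention API is an adapter that returns attention maps in a single agreed-upon layout regardless of the underlying implementation. The sketch below (the `unified_attention` function is a hypothetical name) demonstrates the idea on PyTorch's built-in `nn.MultiheadAttention`, normalizing to a `(batch, heads, tgt_len, src_len)` tensor; further adapters for other implementations would be registered alongside it.

```python
import torch
import torch.nn as nn

def unified_attention(module, query, key, value):
    """Hypothetical adapter: return attention weights in a single
    convention, (batch, num_heads, tgt_len, src_len), whatever the
    underlying attention implementation emits."""
    if isinstance(module, nn.MultiheadAttention):
        # average_attn_weights=False keeps the per-head attention maps
        _, weights = module(query, key, value,
                            need_weights=True, average_attn_weights=False)
        return weights  # already (batch, heads, tgt, src) with batch_first
    raise TypeError(f"no adapter registered for {type(module).__name__}")

attn = nn.MultiheadAttention(embed_dim=16, num_heads=4, batch_first=True)
x = torch.randn(2, 5, 16)  # (batch, seq, dim)
w = unified_attention(attn, x, x, x)
print(w.shape)  # torch.Size([2, 4, 5, 5])
```

Each row of the returned maps sums to one, so downstream analysis code can rely on a single probability-per-source-token convention across all wrapped models.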

Coding Challenge

WIP

Implement tests for abstract structures such as those in the Circuits thread

difficulty: scalable; can be either 170 or 340 hours
mentor: @oserikov, TBD
requirements:

  1. PyTorch
  2. scikit-learn
  3. experience with reusing academic code
  4. experience with Transformer language models

useful links:

Idea Description:

In the Circuits thread, several abstract structures found in CV models were summarized. The Branch Specialization tendency of CV neural networks, as well as the Weight Banding property of NNs' final layers, have not been directly studied in LLMs, though the findings of several papers (1, 2) could be related.

The task is to study how well these abstract structures are represented in CV and NLP models by applying the same inspection techniques to both groups of models.
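As an illustration of applying one inspection technique across both model families, the hypothetical probe below flattens any weight tensor to 2D and computes its normalized singular-value spectrum, so convolutional kernels and transformer linear layers can be examined with exactly the same code. This is purely a sketch of the "same probe, both families" idea, not the specific technique used in the Circuits articles.

```python
import numpy as np

def weight_spectrum(weight):
    """Architecture-agnostic inspection sketch: flatten any weight
    tensor (conv kernels or transformer linear layers) to 2D and
    return its normalized singular-value spectrum."""
    w = np.asarray(weight, dtype=np.float64)
    w2d = w.reshape(w.shape[0], -1)           # (out_channels, everything_else)
    s = np.linalg.svd(w2d, compute_uv=False)
    return s / s.sum()                        # normalize for comparability

conv_like = np.random.randn(64, 3, 7, 7)      # stand-in for a CV conv kernel
linear_like = np.random.randn(64, 256)        # stand-in for a transformer layer
for name, w in [("conv", conv_like), ("linear", linear_like)]:
    spec = weight_spectrum(w)
    print(name, len(spec), round(spec.sum(), 6))
```

Comparing how concentrated these spectra are in final layers of CV versus NLP models would be one cheap starting point for a Weight-Banding-style cross-family study.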

Coding Challenge

Reproduce the Branch Specialization test on some CV model; reproduce the Individual Neurons analysis on a BERT model.

Implement and perform an interpretability analysis of the BigScience models

difficulty: scalable; can be either 170 or 340 hours
mentor: @oserikov, TBD
requirements:

  1. PyTorch
  2. scikit-learn
  3. experience with reusing academic code
  4. experience with Transformer language models

useful links:

Idea Description:

During the 2021/22 season, the BigScience team reached several crucial milestones by producing large-scale transformer language models. Some of these models even come with archived training checkpoints, allowing one to study the emergence of structures in language models. In this task, we propose to supplement the released models with interpretability information by applying the classical XAI and probing methods described in the attached papers.

Coding Challenge

To get a feel for what interpretability work looks like, we ask you to perform a diagnostic classification study of a GPT-like language model using the SentEval data. Reach out to the mentors as soon as possible to discuss the analysis results.
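A minimal sketch of such a diagnostic classification (probing) study is below, using synthetic activations in place of real GPT hidden states and an invented binary property in place of an actual SentEval task: a simple classifier is trained to predict the property from frozen activations, and its test accuracy measures how linearly decodable the property is.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: in the real challenge, `acts` would be a
# GPT-like model's hidden states for SentEval probing-task sentences,
# and `labels` the corresponding linguistic property.
rng = np.random.default_rng(0)
n, dim = 1000, 64
labels = rng.integers(0, 2, size=n)
acts = rng.normal(size=(n, dim))
acts[:, 0] += 2.0 * labels        # plant a linearly decodable signal

X_tr, X_te, y_tr, y_te = train_test_split(acts, labels, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
score = probe.score(X_te, y_te)
print(f"probe accuracy: {score:.2f}")  # well above the 0.5 chance level
```

In a real study one would repeat this per layer and compare accuracies against a random-embedding baseline, since above-chance probe accuracy alone does not prove the model uses the property.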
