
bigscience-gsoc-ideas's Introduction

Hi there 👋

bigscience-gsoc-ideas's People

Contributors

oserikov


bigscience-gsoc-ideas's Issues

Implement the unified activations interpretation API for similar models

difficulty: scalable; can be either 170 or 340 hours
mentor: @oserikov, TBD
requirements:

  1. PyTorch
  2. scikit-learn
  3. Python engineering skills (OOP, etc.)
  4. experience with Transformer language models

useful links:

  • NeuroX codebase
  • BERT Rediscovers the Classical NLP Pipeline
  • Captum

Idea Description:

While HuggingFace quickly became the standard way to publish language models, several architectural trade-offs were made to support the rapid growth of the model zoo. As a result, several theoretically similar models were implemented by different teams, so that, e.g., several alternative implementations of self-attentive transformers arose. Since refactoring the whole model zoo is hardly feasible, the interpretability community is forced to provide unification wrappers to handle such dissimilarities between similar models. The task is to find a reasonable trade-off between refactoring the crucial models and providing unified wrappers, and thus bring a unified interpretability API to the crucial HuggingFace models.

We can view this task from two perspectives. First, one could unify the interpretability API of sibling models such as BERT and RoBERTa. Second, one could bring a unified interface for interpreting and comparing encoder models with, e.g., encoder-decoder ones, allowing one to study the similarities and differences in their behavior.
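As a sketch of what a unified activations API could look like, the snippet below records intermediate activations from arbitrary named submodules via PyTorch forward hooks. The `ActivationRecorder` class and the toy model are hypothetical stand-ins; a real wrapper would map architecture-specific names (e.g. `bert.encoder.layer.3.output` vs. `roberta.encoder.layer.3.output`) to shared logical layer names.

```python
import torch
import torch.nn as nn

class ActivationRecorder:
    """Model-agnostic activation extractor: attaches forward hooks to
    named submodules of any torch module, hiding architectural
    differences behind one interface. Hypothetical sketch, not a
    real library API."""

    def __init__(self, model, layer_names):
        self.activations = {}
        self.handles = []
        modules = dict(model.named_modules())
        for name in layer_names:
            handle = modules[name].register_forward_hook(self._make_hook(name))
            self.handles.append(handle)

    def _make_hook(self, name):
        def hook(module, inputs, output):
            # store a detached copy so the graph is not kept alive
            self.activations[name] = output.detach()
        return hook

    def close(self):
        for h in self.handles:
            h.remove()

# Demonstrated on a toy stack instead of a real HuggingFace model.
toy = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
rec = ActivationRecorder(toy, ["0", "2"])
_ = toy(torch.randn(2, 8))
print(sorted(rec.activations))     # ['0', '2']
print(rec.activations["0"].shape)  # torch.Size([2, 16])
rec.close()
```

The same recorder would work unchanged on BERT and RoBERTa, since forward hooks only depend on module names, not on the model's internal code paths.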

Coding Challenge

WIP

Implement the unified attention interpretation API for similar models

difficulty: scalable; can be either 170 or 340 hours
mentor: @oserikov, TBD
requirements:

  1. PyTorch
  2. scikit-learn
  3. Python engineering skills (OOP, etc.)
  4. experience with Transformer language models

useful links:

  • NeuroX codebase
  • BERT Rediscovers the Classical NLP Pipeline
  • Captum

Idea Description:

While HuggingFace quickly became the standard way to publish language models, several architectural trade-offs were made to support the rapid growth of the model zoo. As a result, several theoretically similar models were implemented by different teams, so that, e.g., several alternative implementations of self-attentive transformers arose. Since refactoring the whole model zoo is hardly feasible, the interpretability community is forced to provide unification wrappers to handle such dissimilarities between similar models. The task is to find a reasonable trade-off between refactoring the crucial models and providing unified wrappers, and thus bring a unified interpretability API to the crucial HuggingFace models.

We can view this task from two perspectives. First, one could unify the interpretability API of sibling models such as BERT and RoBERTa. Second, one could bring a unified interface for interpreting and comparing encoder models with, e.g., encoder-decoder ones, allowing one to study the similarities and differences in their behavior.
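One possible shape for a unified attention API is an adapter that returns attention maps in a single agreed-upon layout regardless of the underlying implementation. The sketch below (the `unified_attention` function is a hypothetical name) demonstrates the idea on PyTorch's built-in `nn.MultiheadAttention`, normalizing to a `(batch, heads, tgt_len, src_len)` tensor; further adapters for other implementations would be registered alongside it.

```python
import torch
import torch.nn as nn

def unified_attention(module, query, key, value):
    """Hypothetical adapter: return attention weights in a single
    convention, (batch, num_heads, tgt_len, src_len), whatever the
    underlying attention implementation emits."""
    if isinstance(module, nn.MultiheadAttention):
        # average_attn_weights=False keeps the per-head attention maps
        _, weights = module(query, key, value,
                            need_weights=True, average_attn_weights=False)
        return weights  # already (batch, heads, tgt, src) with batch_first
    raise TypeError(f"no adapter registered for {type(module).__name__}")

attn = nn.MultiheadAttention(embed_dim=16, num_heads=4, batch_first=True)
x = torch.randn(2, 5, 16)  # (batch, seq, dim)
w = unified_attention(attn, x, x, x)
print(w.shape)  # torch.Size([2, 4, 5, 5])
```

Each row of the returned maps sums to one, so downstream analysis code can rely on a single probability-per-source-token convention across all wrapped models.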

Coding Challenge

WIP

Implement tests for abstract structures such as those in the Circuits thread

difficulty: scalable; can be either 170 or 340 hours
mentor: @oserikov, TBD
requirements:

  1. PyTorch
  2. scikit-learn
  3. experience with reusing academic code
  4. experience with Transformer language models

useful links:

Idea Description:

In the Circuits thread, several abstract structures found in CV models were summarized. The Branch Specialization tendency of CV neural networks, as well as the Weight Banding property of NNs' final layers, have not been directly studied in LLMs, though the findings of several papers (1, 2) could be related.

The task is to study how well these abstract structures are represented in CV and NLP models by applying the same inspection techniques to both groups of models.
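As an illustration of applying one inspection technique across both model families, the hypothetical probe below flattens any weight tensor to 2D and computes its normalized singular-value spectrum, so convolutional kernels and transformer linear layers can be examined with exactly the same code. This is purely a sketch of the "same probe, both families" idea, not the specific technique used in the Circuits articles.

```python
import numpy as np

def weight_spectrum(weight):
    """Architecture-agnostic inspection sketch: flatten any weight
    tensor (conv kernels or transformer linear layers) to 2D and
    return its normalized singular-value spectrum."""
    w = np.asarray(weight, dtype=np.float64)
    w2d = w.reshape(w.shape[0], -1)           # (out_channels, everything_else)
    s = np.linalg.svd(w2d, compute_uv=False)
    return s / s.sum()                        # normalize for comparability

conv_like = np.random.randn(64, 3, 7, 7)      # stand-in for a CV conv kernel
linear_like = np.random.randn(64, 256)        # stand-in for a transformer layer
for name, w in [("conv", conv_like), ("linear", linear_like)]:
    spec = weight_spectrum(w)
    print(name, len(spec), round(spec.sum(), 6))
```

Comparing how concentrated these spectra are in final layers of CV versus NLP models would be one cheap starting point for a Weight-Banding-style cross-family study.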

Coding Challenge

Reproduce the Branch Specialization test on some CV model; reproduce the Individual Neurons analysis on a BERT model.

Implement and perform an interpretability analysis of the BigScience models

difficulty: scalable; can be either 170 or 340 hours
mentor: @oserikov, TBD
requirements:

  1. PyTorch
  2. scikit-learn
  3. experience with reusing academic code
  4. experience with Transformer language models

useful links:

Idea Description:

During the 2021/22 season, the BigScience team reached several crucial milestones by producing large-scale transformer language models. Some of these models even come with archived training checkpoints, allowing one to study the emergence of structures in language models. In this task, we propose to supplement the released models with interpretability information by applying the classical XAI and probing methods described in the attached papers.

Coding Challenge

To get a feel for what interpretability work looks like, we ask you to perform a diagnostic classification study of a GPT-like language model using the SentEval data. Reach out to the mentors as soon as possible to discuss the analysis results.
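A minimal sketch of such a diagnostic classification (probing) study is below, using synthetic activations in place of real GPT hidden states and an invented binary property in place of an actual SentEval task: a simple classifier is trained to predict the property from frozen activations, and its test accuracy measures how linearly decodable the property is.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: in the real challenge, `acts` would be a
# GPT-like model's hidden states for SentEval probing-task sentences,
# and `labels` the corresponding linguistic property.
rng = np.random.default_rng(0)
n, dim = 1000, 64
labels = rng.integers(0, 2, size=n)
acts = rng.normal(size=(n, dim))
acts[:, 0] += 2.0 * labels        # plant a linearly decodable signal

X_tr, X_te, y_tr, y_te = train_test_split(acts, labels, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
score = probe.score(X_te, y_te)
print(f"probe accuracy: {score:.2f}")  # well above the 0.5 chance level
```

In a real study one would repeat this per layer and compare accuracies against a random-embedding baseline, since above-chance probe accuracy alone does not prove the model uses the property.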
