techthiyanes / teaching

This project forked from sebastian-hofstaetter/teaching

Open-Source Information Retrieval Courses @ TU Wien

License: GNU General Public License v3.0

Python 78.80% C# 4.54% Jupyter Notebook 16.66%

teaching's Introduction

Hi there 👋 Welcome to my teaching materials!

I'm working on Information Retrieval at the Vienna University of Technology (TU Wien), mainly focusing on the award-winning master-level Advanced Information Retrieval course. I try to create engaging, fun, and informative lectures and exercises – both in-person and online!

Please feel free to open up an issue or a pull request if you want to add something, find a mistake, or think something should be explained better!

Contents


Advanced Information Retrieval 2021 & 2022

🏆 Won the Best Distance Learning Award 2021 @ TU Wien

Information Retrieval is the science behind search technology. Certainly, the most visible instances are the large Web Search engines, the likes of Google and Bing, but information retrieval appears everywhere we have to deal with unstructured data (e.g. free text).

A paradigm shift. Starting in 2019, the Information Retrieval research field began an enormous paradigm shift towards BERT-based language models. Used in various forms and trained on large-scale data, these models brought huge leaps in the quality of search results. This course aims to showcase a slice of these advances in state-of-the-art IR research towards the next generation of search engines.


New in 2022: Use GitHub Discussions to ask questions about the lecture!


Syllabus

[Figure: The AIR syllabus overview]

Lectures

In the following we provide links to recordings, slides, and closed captions for our lectures. Here is a complete playlist on YouTube.

| Topic | Description | Recordings | Slides | Text |
|---|---|---|---|---|
| 0: Introduction 2022 | Infos on requirements, topics, organization | YouTube | PDF | Transcript |
| 1: Crash Course IR Fundamentals | We explore two fundamental building blocks of IR: indexing and ranked retrieval | YouTube | PDF | Transcript |
| 2: Crash Course IR Evaluation | We explore how we evaluate ranked retrieval results and common IR metrics (MRR, MAP, NDCG) | YouTube | PDF | Transcript |
| 3: Crash Course IR Test Collections | We get to know existing IR test collections, look at how to create your own, and survey potential biases & their effect in the data | YouTube | PDF | Transcript |
| 4: Word Representation Learning | We take a look at word representations and basic word embeddings, including a usage example in Information Retrieval | YouTube | PDF | Transcript |
| 5: Sequence Modelling | We look at CNNs and RNNs for sequence modelling, including the basics of the attention mechanism. | YouTube | PDF | Transcript |
| 6: Transformer & BERT | We study the Transformer architecture; pre-training with BERT; the HuggingFace ecosystem, where the community can share models; and give an overview of Extractive Question Answering (QA). | YouTube | PDF | Transcript |
| 7: Introduction to Neural Re‑Ranking | We look at the workflow (including training and evaluation) of neural re-ranking models and some basic neural re-ranking architectures. | YouTube | PDF | Transcript |
| 8: Transformer Contextualized Re‑Ranking | We learn how to use Transformers (and the pre-trained BERT model) for neural re-ranking - for the best possible results and more efficient approaches, where we trade off quality for performance. | YouTube | PDF | Transcript |
| 9: Domain Specific Applications (guest lecture by @sophiaalthammer) | We learn about different task settings, challenges, and solutions in domains other than web search. | YouTube | PDF | Transcript |
| 10: Dense Retrieval ❤ Knowledge Distillation | We learn about the (potential) future of search: dense retrieval. We study the setup, specific models, and how to train DR models. Then we look at how knowledge distillation greatly improves the training of DR models and at topic aware sampling to get state-of-the-art results. | YouTube | PDF | Transcript |
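
As a small illustration of the metrics named in lecture 2 (MRR, MAP, NDCG), here is a minimal Python sketch. It is not taken from the course materials. The function names are hypothetical, and it makes two simplifying assumptions: each ranked list already contains every relevant document for its query, and DCG uses the linear-gain form `rel / log2(rank + 1)` (another common variant uses `2^rel - 1` as gain).

```python
import math


def reciprocal_rank(relevance):
    """Return 1/rank of the first relevant result, or 0.0 if none is relevant."""
    for rank, rel in enumerate(relevance, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average the reciprocal rank over several queries (one list per query)."""
    return sum(reciprocal_rank(r) for r in runs) / len(runs)


def average_precision(relevance):
    """Average of precision@rank at each relevant position.

    Assumption: all relevant documents appear in the list, so we divide
    by the number of relevant documents found in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel > 0:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def dcg(relevance, k):
    """Discounted cumulative gain over the top k results (linear gain)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevance[:k], start=1)
    )


def ndcg(relevance, k):
    """DCG normalized by the DCG of the ideal ordering of the same labels."""
    ideal = dcg(sorted(relevance, reverse=True), k)
    return dcg(relevance, k) / ideal if ideal > 0 else 0.0


# Each list holds the graded relevance labels of one ranked result list.
print(mean_reciprocal_rank([[0, 1, 0], [1, 0, 0]]))  # 0.75
print(ndcg([3, 2, 1], k=3))                          # 1.0
```

MAP is then the mean of `average_precision` over all queries, in the same way that `mean_reciprocal_rank` averages `reciprocal_rank`.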

Neural IR & Extractive QA Exercise

In this exercise, your group implements neural network re-ranking models, uses pre-trained extractive QA models, and analyzes their behavior with respect to our FiRA data.
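
To give a rough idea of what re-ranking means, here is a minimal Python sketch of the general pattern: a first-stage retriever supplies candidates, and a scoring function re-orders them. This is not the assignment's code. The names `rerank` and `term_overlap_score` are hypothetical, and the term-overlap scorer is only a stand-in for a trained neural model.

```python
from typing import Callable, List, Optional, Tuple


def term_overlap_score(query: str, text: str) -> float:
    """Placeholder scorer: fraction of query terms that occur in the text.

    A neural re-ranker would replace this with a model's relevance score.
    """
    query_terms = set(query.lower().split())
    doc_terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


def rerank(
    query: str,
    candidates: List[Tuple[str, str]],
    score_fn: Callable[[str, str], float],
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Score every (doc_id, text) candidate and sort by descending score."""
    scored = [(doc_id, score_fn(query, text)) for doc_id, text in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_k is None else scored[:top_k]


# Hypothetical candidates, as if returned by a first-stage retriever.
candidates = [
    ("d1", "cats and dogs"),
    ("d2", "information retrieval with neural models"),
    ("d3", "neural networks"),
]
for doc_id, score in rerank("neural information retrieval", candidates, term_overlap_score):
    print(doc_id, round(score, 3))
```

Keeping the scorer as a parameter means the same loop works unchanged when the placeholder is swapped for a model-based scoring function.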

📃 To the 2021 assignment

📃 To the 2022 assignment


Our Time-Optimized Content Creation Workflow for Remote Teaching

Our workflow creates an engaging remote learning experience for a university course, while minimizing the post-production time of the educators. We make use of ubiquitous and commonly free services and platforms, so that our workflow is inclusive for all educators and provides polished experiences for students. Our learning materials provide for each lecture: 1) a recorded video, uploaded on YouTube, with exact slide timestamp indices, which enables an enhanced navigation UI; and 2) a high-quality flow-text automated transcript of the narration with proper punctuation and capitalization, improved with a student participation workflow on GitHub. We automate the transformation and post-production between raw narrated slides and our published materials with custom tools.
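
To illustrate the idea of a slide timestamp index, here is a minimal Python sketch that turns slide start times into text lines of the form `M:SS Title`. This is not one of our actual tools. The function names and the input format (a list of `(start_seconds, title)` pairs) are assumptions made for this example.

```python
from typing import List, Tuple


def format_timestamp(seconds: int) -> str:
    """Format a number of seconds as M:SS, or H:MM:SS for an hour or more."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def slide_index(slides: List[Tuple[int, str]]) -> str:
    """Build one 'timestamp title' line per slide, in the given order."""
    return "\n".join(
        f"{format_timestamp(start)} {title}" for start, title in slides
    )


# Hypothetical slide start times (in seconds) and titles.
print(slide_index([(0, "Intro"), (95, "Indexing"), (3661, "Summary")]))
```

With the example input above, this prints the lines `0:00 Intro`, `1:35 Indexing`, and `1:01:01 Summary`.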

Workflow Overview

Head over to our workflow folder for more information and our custom Python-based transformation tools. Or check out our full paper, published at the SIGCSE Technical Symposium 2022, for an in-depth evaluation of our methods:

A Time-Optimized Content Creation Workflow for Remote Teaching. Sebastian Hofstätter, Sophia Althammer, Mete Sertkan and Allan Hanbury. https://arxiv.org/abs/2110.05601

teaching's People

Contributors

1nternerd, amartzloff, anasasilva, annabelre, bernhard-steindl, bernyweiss, coalae, dependabot[bot], deryeger, fambrogi, francescoanello, granigd, guille21alaman, heholord, jbogensperger, joaza, laurenzv, magdalenafritz, matthag, mcmolti, oh-e-dialog, omnidan, paulsattlegger, sadushzeqiri, sebastian-hofstaetter, simmac, sophiaalthammer, sschwantler, tobiaspk
