MLLMArxivTalk
A study group on the latest MLLM research, usually held in the evening. We learn from a variety of materials: papers, lectures, code, news, blogs, and more.
Topics: MLLM, LLM, NLG, Dialogue, Reinforcement Learning, Distillation, Efficiency, Sentence Similarity, Multitask, Multimodal, Stable Diffusion, TTS, Text-To-Video, All-To-All, space, life, intelligence, ethics, regulation, law, aging, medicine, investment, development, infrastructure, design, management, etc.
Participants include C-level executives at promising startups, top-tier researchers from Korea and abroad, students and graduates of top-tier universities and graduate schools, distinguished scholars, and professors, who study the latest papers and lectures together and run projects.
The main session is every Wednesday at 7:30 PM. No advance preparation: up to 20 minutes of on-the-spot paper reading, then up to 40 minutes of discussion. Each session covers 1 to 10 papers, lectures, etc.; so far it has always been 3. Paper selection is open to anyone. We plan to produce top-tier conference papers and projects.
Additional study sessions run almost every day, including weekends. Feel free to join mid-session and leave mid-session, attending only when a topic interests you or your schedule allows. All rules are negotiable. Offline meetups are also planned. Participation is voluntary.
Progress + Upcoming
2023-02-16 23:30 ~ 24:45 염기웅, 강수진, 고현웅
- GPT Understands, Too
- P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
- Do Prompt-Based Models Really Understand the Meaning of their Prompts?
2023-02-19 23:30 ~ 24:30 염기웅, 박상준, 강수진
- ∞-former: Infinite Memory Transformer
- Improving language models by retrieving from trillions of tokens
- Augmented Language Models: a Survey
2023-02-22 19:30 ~ 21:00 염기웅, 박상준, 이웅기, 이현제
- BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
- Structure and Content-Guided Video Synthesis with Diffusion Models
- MusicLM: Generating Music From Text
2023-02-23 23:00 ~ 24:00 염기웅, 박상준, 황명하
- InstructGPT: Training language models to follow instructions with human feedback
- BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining
2023-02-24 17:00 ~ 19:00 염기웅
- Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
- Constitutional AI: Harmlessness from AI Feedback
- Provable Copyright Protection for Generative Models
- What learning algorithm is in-context learning? Investigations with linear models
- A Path Towards Autonomous Machine Intelligence
- PAL: Program-aided Language Models
- Toolformer: Language Models Can Teach Themselves to Use Tools
2023-03-01 20:30 ~ 21:40 염기웅, 이대환
- LLaMA: Open and Efficient Foundation Language Models
- Improving alignment of dialogue agents via targeted human judgements
- Training Compute-Optimal Large Language Models
2023-03-04 22:00 ~ 23:30 염기웅, 황명하
- LLaMA-based ChatGPT training, ChatLLaMA
- RLHF: Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
- BaGuaLu: Targeting Brain Scale Pretrained Models with over 37 Million Cores
2023-03-05 21:00 ~ 21:30 염기웅
2023-03-08 00:00 ~ 01:00 염기웅, 김미담
- Language Is Not All You Need: Aligning Perception with Language Models
- Flamingo: a Visual Language Model for Few-Shot Learning, Blog
- Multimodal Chain-of-Thought Reasoning in Language Models
2023-03-08 19:30 ~ 20:30 염기웅, 최재훈, 황지현, 김혜인
2023-03-09 20:00 ~ 22:00 염기웅, 윤상현, 신승욱
- Competition-Level Code Generation with AlphaCode
- Scaling Language Models: Methods, Analysis & Insights from Training Gopher
- GPU requirements and training methods for KoChatLlaMA fine-tuning
- Advantages and Problems of UForm
2023-03-10 21:00 ~ 22:20 염기웅, 나요한, 최재훈, and 5 auditors
- GPT-4 is coming next week – and it will be multimodal, says Microsoft Germany
- MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
- Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
- PaLM-E: An Embodied Multimodal Language Model
2023-03-10 20:00 ~ 21:00 염기웅, 황지현, 이대환, 나요한
- Language Is Not All You Need: Aligning Perception with Language Models
- Multimodal Chain-of-Thought Reasoning in Language Models
NEXT
- Tightly-Integrated Generative Encoder-Decoder Representation
- Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
- PaLM: Scaling Language Modeling with Pathways
- SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks
- LoRA: Low-Rank Adaptation of Large Language Models
- Language Models are Few-Shot Learners
- Low-rank Adaptation for Fast Text-to-Image Diffusion Fine-tuning
- huggingface-projects/diffusers-gallery
- huggingface-projects/diffusers-gallery-bot
Rules
- Don't use English exclusively. Korean is the primary language; use English for technical terms.
- Study at least 2 papers per week; 10 or more if you can.
- Read each paper on the spot for 3 to 20 minutes, then discuss for 5 to 30 minutes.
- In a one-hour session, you may leave at any time. Joining for under 10 minutes is also fine. The format is flexible; two-hour daily sessions are possible too.
- Recognize that everyone excels at something. Everyone here is impressive, so ask lots of questions and share information often.
- At minimum, do what you committed to. Saying you will do something and then not doing it burdens everyone.
- Sessions are recorded and shared internally by default.
- Don't keep information to yourself; share it so everyone knows.
- If you leave the study group for personal reasons, post a farewell note in your self-introduction.
- Adopt good rules from other organizations.
- If you judge that it helps the team, ignore all of the above rules and act.
- More to be added.
Candidates
Papers, code, lectures, etc. worth covering in the future.
paper
- Improving language models by retrieving from trillions of tokens
- T0: Multitask Prompted Training Enables Zero-Shot Task Generalization
- The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
- The Wisdom of Hindsight Makes Language Models Better Instruction Followers
- Exploring the Benefits of Training Expert Language Models over Instruction Tuning
- Unsupervised Imputation of Non-ignorably Missing Data Using Importance-Weighted Autoencoders
- Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
- Do Prompt-Based Models Really Understand the Meaning of their Prompts?
- Muse: Text-To-Image Generation via Masked Generative Transformers
- Structure and Content-Guided Video Synthesis with Diffusion Models
- Accurate global machine learning force fields for molecules with hundreds of atoms
- Algorithms with More Granular Differential Privacy Guarantees
- Anomaly Clustering: Grouping Images into Coherent Clusters of Anomaly Types
- Are we cobblers without shoes? Making Computer Science data FAIR
- Creating, Calibrating, and Validating Large-Scale Microscopic Traffic Simulation
- Increasing Impact of Mobile Health Programs: SAHELI for Maternal and Child Care
- Designing Responsible AI: Adaptations of UX Practice to Meet Responsible AI Challenges
- Developer Productivity for Humans: A Human-Centered Approach to Developer Productivity
- Development of a Machine Learning Model for Sonographic Assessment of Gestational Age
- Estimates of broadband upwelling irradiance from GOES-16 ABI
- Flexible Budgets in Restless Bandits: A Primal-Dual Algorithm for Efficient Budget Allocation
- Helpful Neighbors: Leveraging Neighbors in Geographic Feature Pronunciation
- High-Performance GPU-to-CPU Transpilation and Optimization via High-Level Parallel Constructs
- KwikBucks: Correlation Clustering with Cheap-Weak and Expensive-Strong Signals
- Machine Learning for Healthcare: A Bibliometric Study of Contributions from Africa
- Propeller: A Profile Guided, Relinking Optimizer for Warehouse-Scale Applications
- Deepmind: Improving language models by retrieving from trillions of tokens
- Deepmind: Mastering Stratego, the classic game of imperfect information
- Deepmind: AlphaFold reveals the structure of the protein universe
- Deepmind: Tackling multiple tasks with a single visual language model
- Deepmind: Exploring the beauty of pure mathematics in novel ways
- Deepmind: Putting the power of AlphaFold into the world’s hands
- Google Research: Deciphering clinical abbreviations with privacy protecting ML
- Google Research: Google Research, 2022 & beyond: Language, vision and generative models
- Google Research: Google Research, 2022 & beyond: Responsible AI
- Google Research: Google Research, 2022 & beyond: ML & computer systems
- Google Research: Real-time tracking of wildfire boundaries using satellite imagery
- Google Research: DiffQG: Generating Questions on Paired Sentences
- Google Research: Assessment of Security Defense of Native Programs Against Software Faults
- Google Research: Adaptive mixing of auxiliary losses in supervised learning
github
youtube
- Study Playlist
- Improving Language Models by Retrieving from Trillions of Tokens | NLP Journal Club
- ECMLPKDD2021: WuDao: Pretrain the World, Keynote speaker talk by Jie Tang
- StrictlyVC in conversation with Sam Altman, part two (OpenAI)
- Are Bigger Language Models Better? | DeepMind Gopher and RETRO
- The Illustrated Retrieval Transformer