A study group on recent LLM papers. Always held in the afternoon. Topics: LLM, NLG, Dialogue, Reinforcement Learning, Distillation, Efficiency, Sentence Similarity, Multiple Tasks, Multimodal, Stable Diffusion, TTS, Text-To-Video, All-To-All, etc.
- No English.
- No foreigners.
- At least 2 papers per week.
- 10 or more if you can manage it.
- Up to 20 minutes reading the paper on the spot.
- Up to 40 minutes of discussion.
- Once the session reaches one hour, you are free to leave.
- Keep it casual.
- Paste in the ModuLabs (모두연) rules.
- Everyone here is impressive, so ask plenty of questions.
- Share often.
- Recognize that each person excels at something different.
- Stay humble, keep trying, do well.
- No going dark (don't disappear without notice).
2023-02-16 11:30 ~ 12:45 염기웅, 강수진, 고현웅
- GPT Understands, Too
- P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
- Do Prompt-Based Models Really Understand the Meaning of their Prompts?
2023-02-18 7:30 ~ 8:30 염기웅, 박상준, 강수진
2023-02-19 11:30 ~ 12:30 염기웅, 박상준, 강수진, 김찬란
2023-02-22 7:30 ~ 8:30 염기웅, 박상준, 강수진, 고현웅, 이현제
Candidate papers, code, lectures, etc. for future sessions.
- Improving language models by retrieving from trillions of tokens
- FLAN: Finetuned Language Models Are Zero-Shot Learners
- T0: Multitask Prompted Training Enables Zero-Shot Task Generalization
- The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
- The Wisdom of Hindsight Makes Language Models Better Instruction Followers
- Exploring the Benefits of Training Expert Language Models over Instruction Tuning
- Unsupervised Imputation of Non-ignorably Missing Data Using Importance-Weighted Autoencoders
- The Power of Scale for Parameter-Efficient Prompt Tuning
- Constitutional AI: Harmlessness from AI Feedback
- Deep reinforcement learning from human preferences
- Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
- Large Language Models with Controllable Working Memory
- Do Prompt-Based Models Really Understand the Meaning of their Prompts?
- Muse: Text-To-Image Generation via Masked Generative Transformers
- Structure and Content-Guided Video Synthesis with Diffusion Models
- Generative Pretraining from Pixels
- A hunt for the Snark: Annotator Diversity in Data Practices
- Accurate global machine learning force fields for molecules with hundreds of atoms
- Algorithms with More Granular Differential Privacy Guarantees
- Anomaly Clustering: Grouping Images into Coherent Clusters of Anomaly Types
- Are we cobblers without shoes? Making Computer Science data FAIR
- Code Generation for In-Place Stencils
- Creating, Calibrating, and Validating Large-Scale Microscopic Traffic Simulation
- Increasing Impact of Mobile Health Programs: SAHELI for Maternal and Child Care
- Designing Responsible AI: Adaptations of UX Practice to Meet Responsible AI Challenges
- Developer Productivity for Humans: A Human-Centered Approach to Developer Productivity
- Development of a Machine Learning Model for Sonographic Assessment of Gestational Age
- Drug Design on Quantum Computers
- Estimates of broadband upwelling irradiance from GOES-16 ABI
- Flake Aware Culprit Finding
- Flexible Budgets in Restless Bandits: A Primal-Dual Algorithm for Efficient Budget Allocation
- Helpful Neighbors: Leveraging Neighbors in Geographic Feature Pronunciation
- High-Performance GPU-to-CPU Transpilation and Optimization via High-Level Parallel Constructs
- Infrastructuring Care: How Trans and Non-Binary People Meet Health and Well-Being Needs through Technology
- KwikBucks: Correlation Clustering with Cheap-Weak and Expensive-Strong Signals
- Learning to Bid in Contextual First Price Auctions
- Machine Learning for Healthcare: A Bibliometric Study of Contributions from Africa
- Scalable Decision-Focused Learning in Restless Multi-Armed Bandits with Application to Maternal and Child Health
- Robust Planning over Restless Groups: Engagement Interventions for a Large-Scale Maternal Telehealth Program
- Recitation-Augmented Language Models
- RL4ReAl: Reinforcement Learning for Register Allocation
- Quantum Simulation of Exact Electron Dynamics can be more Efficient than Classical Mean-Field Methods
- Propeller: A Profile Guided, Relinking Optimizer for Warehouse-Scale Applications
- Large Language Models are Zero-Shot Reasoners
- DistilKoBiLSTM
- Donut 🍩 : Document Understanding Transformer
- The RWKV Language Model (and my LM tricks)
- Improving Language Models by Retrieving from Trillions of Tokens | NLP Journal Club
- ECMLPKDD2021: WuDao: Pretrain the World, Keynote speaker talk by Jie Tang
- StrictlyVC in conversation with Sam Altman, part two (OpenAI)
- Are Bigger Language Models Better? | DeepMind Gopher and RETRO
- Deepmind: Improving language models by retrieving from trillions of tokens
- Deepmind: Building safer dialogue agents
- Deepmind: Competitive programming with AlphaCode
- Deepmind: Mastering Stratego, the classic game of imperfect information
- Deepmind: DeepMind’s latest research at NeurIPS 2022
- Deepmind: Building interactive agents in video game worlds
- Deepmind: Discovering novel algorithms with AlphaTensor
- Deepmind: AlphaFold reveals the structure of the protein universe
- Deepmind: Tackling multiple tasks with a single visual language model
- Deepmind: Exploring the beauty of pure mathematics in novel ways
- Deepmind: Nowcasting the next hour of rain
- Deepmind: Putting the power of AlphaFold into the world’s hands
- Google Research: Deciphering clinical abbreviations with privacy protecting ML
- Google Research: Google Research, 2022 & beyond: Language, vision and generative models
- Google Research: Google Research, 2022 & beyond: Responsible AI
- Google Research: Learning with queried hints
- Google Research: Open Source Vizier: Towards reliable and flexible hyperparameter and blackbox optimization
- Google Research: Google Research, 2022 & beyond: ML & computer systems
- Google Research: Real-time tracking of wildfire boundaries using satellite imagery
- Google Research: Breaching the 2 LMP Approximation Barrier for Facility Location with Applications to k-Median
- Google Research: Chimane-Mosetén
- Google Research: Differentially Private All-Pairs Shortest Path Distances: Improved Algorithms and Lower Bounds
- Google Research: Differentially Private Fair Division
- Google Research: DiffQG: Generating Questions on Paired Sentences
- Google Research: Assessment of Security Defense of Native Programs Against Software Faults
- Google Research: Adaptive mixing of auxiliary losses in supervised learning
- OpenAI: Multimodal Neurons in Artificial Neural Networks
- OpenAI: DALL·E: Creating Images from Text
- OpenAI: CLIP: Connecting Text and Images
- OpenAI: Image GPT
- OpenAI: Jukebox
- OpenAI: Solving Rubik’s Cube with a Robot Hand
- OpenAI: MuseNet
- OpenAI: Emergent Tool Use from Multi-Agent Interaction
- CS224U: Natural Language Understanding
- Gen-1: The Next Step Forward for Generative AI
- DIFF-SVC FOR VOCAL SYNTH USERS
- Chat GPT detector by ZeroGPT: detect OpenAI text
- 염기웅: I'm the one who gathered everyone here, and I'm writing a book titled 프로메우스와 바드의 꿈 (The Dream of Prometheus and Bard). I have interest and experience in LLM, Dialogue, and Distillation! I'm also interested in model compression, MLOps, serving, and multimodal multi-task models. [email protected] https://github.com/gyunggyung
- 강수진:
- 고현웅:
- 박상준:
- 김찬란:
- 이현제: I research natural language processing at Samsung SDS. I'm interested in instruction finetuning and sentence representation. [email protected]
- 김기현:
Recommendations from Google Scholar
[PDF] Stabilized In-Context Learning with Pre-trained Language Models for Few Shot Dialogue State Tracking D Chen, K Qian, Z Yu - arXiv preprint arXiv:2302.05932, 2023 Prompt-based methods with large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks. These models improve even further with the addition of a few labeled in-context exemplars to guide output …
[PDF] Decoupling the Skeleton Parsing and Schema Linking for Text-to-SQL H Li, J Zhang, C Li, H Chen - arXiv preprint arXiv:2302.05965, 2023 One of the recent best attempts at Text-to-SQL is the pre-trained language model. Due to the structural property of the SQL queries, the seq2seq model takes the responsibility of parsing both the schema items (ie, tables and columns) and the …
[PDF] What do Language Models know about word senses? Zero-Shot WSD with Language Models and Domain Inventories O Sainz, OL de Lacalle, E Agirre, G Rigau - arXiv preprint arXiv:2302.03353, 2023 Language Models are the core for almost any Natural Language Processing system nowadays. One of their particularities is their contextualized representations, a game changer feature when a disambiguation between word senses is necessary. In this …
[PDF] The Wisdom of Hindsight Makes Language Models Better Instruction Followers T Zhang, F Liu, J Wong, P Abbeel, JE Gonzalez - arXiv preprint arXiv:2302.05206, 2023 Reinforcement learning has seen wide success in finetuning large language models to better align with instructions via human feedback. The so-called algorithm, Reinforcement Learning with Human Feedback (RLHF) demonstrates impressive …
[PDF] Task-Specific Skill Localization in Fine-tuned Language Models A Panigrahi, N Saunshi, H Zhao, S Arora - arXiv preprint arXiv:2302.06600, 2023 Pre-trained language models can be fine-tuned to solve diverse NLP tasks, including in few-shot settings. Thus fine-tuning allows the model to quickly pick up task-specific skills, but there has been limited study of where these newly-learnt skills …
[PDF] Discourse Structure Extraction from Pre-Trained and Fine-Tuned Language Models in Dialogues C Li, P Huber, W Xiao, M Amblard, C Braud, G Carenini - arXiv preprint arXiv …, 2023 Discourse processing suffers from data sparsity, especially for dialogues. As a result, we explore approaches to build discourse structures for dialogues, based on attention matrices from Pre-trained Language Models (PLMs). We investigate …
[PDF] LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization K Krishna, E Bransom, B Kuehl, M Iyyer, P Dasigi… - arXiv preprint arXiv …, 2023 While human evaluation remains best practice for accurately judging the faithfulness of automatically-generated summaries, few solutions exist to address the increased difficulty and workload when evaluating long-form summaries. Through a survey of …
[PDF] Prompting Large Language Model for Machine Translation: A Case Study B Zhang, B Haddow, A Birch - arXiv preprint arXiv:2301.07069, 2023 Research on prompting has shown excellent performance with little or even no supervised training across many tasks. However, prompting for machine translation is still under-explored in the literature. We fill this gap by offering a systematic study …
[PDF] Analyzing the Effectiveness of the Underlying Reasoning Tasks in Multi-hop Question Answering X Ho, AKD Nguyen, S Sugawara, A Aizawa - arXiv preprint arXiv:2302.05963, 2023 To explain the predicted answers and evaluate the reasoning abilities of models, several studies have utilized underlying reasoning (UR) tasks in multi-hop question answering (QA) datasets. However, it remains an open question as to how effective …
[PDF] Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information YT Lin, A Papangelis, S Kim, S Lee, D Hazarika… - arXiv preprint arXiv …, 2023 This work focuses on in-context data augmentation for intent detection. Having found that augmentation via in-context prompting of large pre-trained language models (PLMs) alone does not improve performance, we introduce a novel approach based …
[PDF] Knowledge is a Region in Weight Space for Fine-tuned Language Models A Gueta, E Venezian, C Raffel, N Slonim, Y Katz… - arXiv preprint arXiv …, 2023 Research on neural networks has largely focused on understanding a single model trained on a single dataset. However, relatively little is known about the relationships between different models, especially those trained or tested on different datasets. We …
[PDF] MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency in Summarization P Manakul, A Liusie, MJF Gales - arXiv preprint arXiv:2301.12307, 2023 State-of-the-art summarization systems can generate highly fluent summaries. These summaries, however, may contain factual inconsistencies and/or information not present in the source. Hence, an important component of assessing the quality of …
[PDF] Probing Out-of-Distribution Robustness of Language Models with Parameter-Efficient Transfer Learning Methods H Cho, C Park, J Kim, HJ Kim, KM Yoo, S Lee - arXiv preprint arXiv:2301.11660, 2023 As the size of the pre-trained language model (PLM) continues to increase, numerous parameter-efficient transfer learning methods have been proposed recently to compensate for the tremendous cost of fine-tuning. Despite the impressive …
[PDF] Progressive Prompts: Continual Learning for Language Models A Razdaibiedina, Y Mao, R Hou, M Khabsa, M Lewis… - arXiv preprint arXiv …, 2023 We introduce Progressive Prompts - a simple and efficient approach for continual learning in language models. Our method allows forward transfer and resists catastrophic forgetting, without relying on data replay or a large number of task …
[PDF] Improved Knowledge Distillation for Pre-trained Language Models via Knowledge Selection C Wang, Y Lu, Y Mu, Y Hu, T Xiao, J Zhu - arXiv preprint arXiv:2302.00444, 2023 Knowledge distillation addresses the problem of transferring knowledge from a teacher model to a student model. In this process, we typically have multiple types of knowledge extracted from the teacher model. The problem is to make full use of them …
Multi-stage transfer learning with BERTology-based language models for question answering system in vietnamese K Van Nguyen, PNT Do, ND Nguyen, AGT Nguyen… - International Journal of …, 2023 With the fast growth of information science and engineering, a large number of textual data generated are valuable for natural language processing and its applications. Particularly, finding correct answers to natural language questions or …
[PDF] Debiased Fine-Tuning for Vision-language Models by Prompt Regularization B Zhu, Y Niu, S Lee, M Hur, H Zhang - arXiv preprint arXiv:2301.12429, 2023 We present a new paradigm for fine-tuning large-scale vision-language pre-trained models on downstream task, dubbed Prompt Regularization (ProReg). Different from traditional fine-tuning which easily overfits to the downstream task data, ProReg uses …
[PDF] Understanding Finetuning for Factual Knowledge Extraction from Language Models M Kazemi, S Mittal, D Ramachandran - arXiv preprint arXiv:2301.11293, 2023 Language models (LMs) pretrained on large corpora of text from the web have been observed to contain large amounts of various types of knowledge about the world. This observation has led to a new and exciting paradigm in knowledge graph …
[PDF] Few-Shot Table-to-Text Generation with Prompt Planning and Knowledge Memorization Z Guo, M Yan, J Qi, J Zhou, Z He, Z Lin, G Zheng… - arXiv preprint arXiv …, 2023 Pre-trained language models (PLM) have achieved remarkable advancement in table-to-text generation tasks. However, the lack of labeled domain-specific knowledge and the topology gap between tabular data and text make it difficult for …
[PDF] Parameter-Efficient Low-Resource Dialogue State Tracking by Prompt Tuning MD Ma, JY Kao, S Gao, A Gupta, D Jin, T Chung… - arXiv preprint arXiv …, 2023 Dialogue state tracking (DST) is an important step in dialogue management to keep track of users' beliefs. Existing works fine-tune all language model (LM) parameters to tackle the DST task, which requires significant data and computing resources for …
Grounded language-image pre-training LH Li, P Zhang, H Zhang, J Yang, C Li, Y Zhong, L Wang, L Yuan, L Zhang, JN … Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Vinvl: Revisiting visual representations in vision-language models P Zhang, X Li, X Hu, J Yang, L Zhang, L Wang, Y Choi, J Gao Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021
Unified vision-language pre-training for image captioning and vqa L Zhou, H Palangi, L Zhang, H Hu, J Corso, J Gao Proceedings of the AAAI conference on artificial intelligence, 2020