I completed my master's (2019) and bachelor's (2016) degrees at Renmin University of China, under the guidance of Prof. Zhicheng Dou and Prof. Ji-Rong Wen, working on a range of NLP problems.
Research interests: Retrieval-augmented generation, large language models for information retrieval, session-based document ranking
2024.5: We released a new toolkit, FlashRAG, which helps implement RAG methods quickly! See more details.
2024.5: Three of our papers have been accepted by ACL 2024!
2024.4: We wrote a new survey on generative information retrieval. See more details.
2024.1: We proposed a new instruction-tuning dataset (INTERS) for unlocking the power of LLMs on search tasks. See more details.
2023.11: We analyzed the risk of data leakage in LLM pre-training and wrote a new paper to raise awareness of this problem. See more details.
2023.8: We wrote a new survey on applying large language models to information retrieval. See more details.
2023.8: We released a new version of YuLan-Chat. It achieves better performance than the official LLaMA-2 and LLaMA-2-Chat on the MMLU, C-Eval, and AGI-Gaokao benchmarks!
Hello! I'm very interested in your brilliant work, but I'm a little confused about some details.
I noticed that the original dataset has no knowledge labels, so how did you calculate the knowledge accuracy during testing? I couldn't find this calculation in the code.
I was wondering if you measure the accuracy using the weak labels you proposed in the paper. If so, how do you ensure their reliability?