shreyasi2002 / nli4ct

Safe Biomedical Natural Language Inference for Clinical Trials based on SemEval 2024 Task 2

License: GNU General Public License v3.0

IITK at SemEval-2024 Task 2: Exploring the Capabilities of LLMs for Safe Biomedical Natural Language Inference for Clinical Trials

Official code implementation

View Paper · Report Bug · Request Feature

Table of Contents
  1. About
  2. Usage Instructions
  3. Results
  4. Citation

About

Large Language Models (LLMs) have demonstrated state-of-the-art performance in various natural language processing (NLP) tasks across multiple domains, yet they are prone to shortcut learning and factual inconsistencies. This research investigates the robustness, consistency, and faithful reasoning of LLMs when performing Natural Language Inference (NLI) on breast cancer Clinical Trial Reports (CTRs) in the context of SemEval 2024 Task 2: Safe Biomedical Natural Language Inference for Clinical Trials. We examine the reasoning capabilities of LLMs and their adeptness at logical problem-solving. We conduct a comparative analysis of pre-trained language models (PLMs), GPT-3.5, and Gemini Pro under zero-shot settings using a Retrieval-Augmented Generation (RAG) framework that integrates various reasoning chains.

Usage Instructions

Project Structure

📂 NLI4CT
|_📁 Gemini
  |_📄 run-gemini-chain.py   # Multi-turn conversation using Gemini Pro
  |_📄 prep_results.py       # Converting the labels to Entailment/Contradiction
  |_📄 Gemini_results.json   # Output of Gemini Pro - explanations and labels
  |_📄 results.json          # Final labels
|_📁 GPT-3.5                 # Experimentation with GPT-3.5
  |_📄 GPT3.5.py
  |_📄 ChatGPT_results.json
|_📁 training-data           # Training data - Clinical Trial Reports (CTRs)
|_📁 Experiments             # Experimentation with other models - Flan T5 and Pre-trained Language Models (PLMs)
  |_📄 flant5-label.ipynb
  |_📄 PLMs.ipynb
|_📄 Makefile                # Creating conda environment and installing dependencies
|_📄 LICENSE
|_📄 requirements.txt
|_📄 .gitignore

Install dependencies

Run the following command:

make

This creates a new Anaconda environment and installs the required dependencies. If you do not use Anaconda, install the dependencies with pip instead:

pip install -r requirements.txt

Get API Keys

Create a .env file in the main directory. Obtain API keys for GPT-3.5 and Gemini Pro and add them to the .env file as follows:

GOOGLE_API_KEY = "..."
OPENAI_API_KEY = "..."
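As a minimal sketch of how a script could read these keys, the snippet below parses a .env file in the format shown above using only the Python standard library. The helper names load_env_file and require_key are hypothetical and are not taken from the repository, which may load the keys differently (for example with a third-party package).

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY = "value" lines and export them through os.environ.

    Hypothetical helper for illustration; not part of the repository.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY = value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        loaded[key] = value
        # Variables already set in the shell take precedence over the file.
        os.environ.setdefault(key, value)
    return loaded


def require_key(name):
    """Return the named key or fail early with a readable message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to the .env file")
    return value
```

A script would call load_env_file() once at startup and then require_key("GOOGLE_API_KEY") or require_key("OPENAI_API_KEY") before creating a client, so a missing key fails immediately rather than partway through a run.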

Run Gemini Pro

Run the multi-turn conversation chain (the script is located in the Gemini directory) using the following command:

python run-gemini-chain.py

Gemini Pro will generate an explanation and a label (Yes/No) for each statement in the dataset.
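The project structure lists prep_results.py as the step that converts these labels to Entailment/Contradiction. The sketch below illustrates one way such a conversion could work. It assumes that Yes maps to Entailment and No maps to Contradiction, and that the raw output is a dictionary keyed by statement ID with a "label" field. Both the mapping and the JSON layout are assumptions for illustration, not the repository's actual code or file format.

```python
import re

# Assumed mapping: "Yes" means the statement is entailed by the CTR.
LABEL_MAP = {"yes": "Entailment", "no": "Contradiction"}


def to_nli_label(answer):
    """Map a free-text Yes/No answer to an NLI label (hypothetical helper)."""
    match = re.search(r"\b(yes|no)\b", answer.strip().lower())
    if match is None:
        raise ValueError(f"no Yes/No answer found in: {answer!r}")
    return LABEL_MAP[match.group(1)]


def convert_results(raw):
    """Convert {id: {"label": "Yes"/"No", ...}} into {id: {"Prediction": ...}}.

    The input and output layouts are assumptions made for this sketch.
    """
    return {
        statement_id: {"Prediction": to_nli_label(entry["label"])}
        for statement_id, entry in raw.items()
    }
```

Raising an error on answers that contain neither Yes nor No is a deliberate choice in this sketch: silently defaulting to one label would hide generation failures in the final predictions.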

Results

The zero-shot evaluation of Gemini Pro yielded an F1 score of 0.69, a consistency score of 0.71, and a faithfulness score of 0.90 on the official test dataset. Our system ranked fifth on faithfulness, sixteenth on consistency, and twenty-first on F1 score. Gemini Pro outperforms GPT-3.5 in F1 score by +1.9% while maintaining a nearly identical consistency score. The faithfulness score of Gemini Pro is also +3.5% higher than that of GPT-3.5.
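For readers unfamiliar with the F1 metric reported above, the following sketch computes a binary F1 score from gold and predicted labels. It assumes Entailment is treated as the positive class. The official task scorer may compute the metric differently, and the consistency and faithfulness metrics are not covered here.

```python
def f1_score(gold, predicted, positive="Entailment"):
    """Binary F1 with `positive` as the positive class (assumed here)."""
    pairs = list(zip(gold, predicted))
    true_pos = sum(g == positive and p == positive for g, p in pairs)
    false_pos = sum(g != positive and p == positive for g, p in pairs)
    false_neg = sum(g == positive and p != positive for g, p in pairs)
    if true_pos == 0:
        # Precision or recall is zero (or undefined), so F1 is zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)
```

F1 is the harmonic mean of precision and recall, so a model cannot score well by predicting a single label for every statement.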

Citation

