laceymalarky / nlp_question_answer

Developed a generative large language model fine-tuned on Stack Overflow data for question answering.

Python 0.41% Jupyter Notebook 99.59%

nlp_question_answer's Introduction

Generative Question and Answer Large Language Model

Project Overview

DataSpeak, one of the industry's largest providers of predictive analytics solutions, needed a proof-of-concept machine learning model that could automatically generate answers to user-input questions.

Machine Learning Skills/Technologies

Text2TextGeneration, Transformers, Tokenizers, PyTorch, Hugging Face, Flan-T5 LLM, spaCy, Streamlit, Render, GPU, BeautifulSoup, Google Colab

Project Conclusions

Developed a generative language model using google/flan-t5-base, fine-tuned on Stack Overflow data.
Ran a cosine-similarity search over a database of generated vector embeddings to retrieve the 5 questions in the dataset most similar to each user-input question.
Developed a web application with a chatbot UI that returns the model's generated answer plus 5 alternative answers retrieved by cosine similarity, each with a percent similarity score.
Improved training set quality by pre-processing and normalizing raw text data.
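
The similarity lookup described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the toy 3-dimensional vectors, and the example questions are all hypothetical, and a real embedding database would hold much longer vectors produced by an embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def top_k_similar(query_vec, database, k=5):
    """Return the k (question, percent_similarity) pairs closest to query_vec."""
    scored = [
        (question, 100.0 * cosine_similarity(query_vec, vec))
        for question, vec in database
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings, for illustration only.
toy_database = [
    ("How do I reverse a list?", [1.0, 0.0, 0.0]),
    ("How do I sort a dict by value?", [0.0, 1.0, 0.0]),
    ("How do I reverse a string?", [0.9, 0.1, 0.0]),
]

results = top_k_similar([1.0, 0.0, 0.0], toy_database, k=2)
for question, percent in results:
    print(f"{percent:.1f}%  {question}")
```

With a large dataset, the same ranking is usually computed as one matrix product over normalized embeddings rather than a Python loop.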

Screenshot of Web Application UI

Performance & Evaluation

Achieved a 19% ROUGE-1 score and an average perplexity of 1.96.
Kept response times under 15 seconds.
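
For readers unfamiliar with the two metrics, here is a minimal sketch of how they are commonly defined. It assumes whitespace tokenization and lowercase matching, and it reports ROUGE-1 precision, recall, and F1 because the README does not say which variant the 19% refers to. The function names are hypothetical, and the project may well have used a library implementation instead.

```python
import math
from collections import Counter


def rouge1(reference, candidate):
    """Unigram-overlap precision, recall, and F1 between two strings."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both.
    overlap = sum((ref_counts & cand_counts).values())
    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    if overlap == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def perplexity(token_log_probs):
    """exp of the mean negative log-likelihood (natural log) per token."""
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


precision, recall, f1 = rouge1("the cat sat", "the cat ran")
ppl = perplexity([math.log(0.5)] * 4)
```

A perplexity near 1 means the model assigns high probability to each reference token; in the example above, a model that gives every token probability 0.5 has a perplexity of 2.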

Requirements

Python libraries: pandas, numpy, matplotlib, seaborn, scikit-learn, nltk, transformers, spacy, torch

Data Description:

Python Questions from Stack Overflow
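
Stack Overflow posts are typically stored as HTML, so the pre-processing mentioned under Project Conclusions plausibly includes stripping markup and normalizing whitespace. The sketch below is an assumption about what such a step could look like, not the project's code: it uses only the Python standard library (the project lists BeautifulSoup, which would serve the same purpose), and `clean_post` is a hypothetical helper name.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text content of an HTML fragment, dropping the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def clean_post(raw_html):
    """Strip HTML tags, decode entities, and collapse runs of whitespace."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()  # flush any text still buffered by the parser
    text = " ".join(parser.parts)
    return re.sub(r"\s+", " ", text).strip()


sample = "<p>How do I  <b>reverse</b> a list?</p>\n<pre><code>x[::-1]</code></pre>"
cleaned = clean_post(sample)
print(cleaned)
```

Joining fragments with a space keeps words in adjacent tags from fusing together, at the cost of splitting a word that is only partly wrapped in a tag.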
