laceymalarky / nlp_question_answer

Developed a generative large language model fine-tuned on Stack Overflow data for question answering.

Home Page: https://huggingface.co/lmalarky/flan-t5-base-finetuned-python_qa

Languages: Jupyter Notebook 99.59%, Python 0.41%
Topics: huggingface, machine-learning, nlp, transformers, flan-t5, pytorch, streamlit-webapp, tokenizer

nlp_question_answer's Introduction

Generative Question and Answer Large Language Model

Project Overview

DataSpeak, one of the industry's largest providers of predictive analytics solutions, needed a proof-of-concept machine learning model that automatically generates answers to user-input questions.

Machine Learning Skills/Technologies

Text2TextGeneration, Transformers, Tokenizers, PyTorch, Hugging Face, Flan-T5 LLM, spaCy, Streamlit, Render, GPU, BeautifulSoup, Google Colab

Project Conclusions

  • Developed a generative language model using google/flan-t5-base, fine-tuned on Stack Overflow data.
  • Ran a cosine-similarity search over a database of generated vector embeddings to identify the 5 questions in the dataset most similar to each user-input question.
  • Developed a web application with a chatbot UI that returns the model's generated answer plus 5 alternative answers selected by cosine similarity, each shown with its percent similarity score.
  • Improved training set quality by pre-processing and normalizing raw text data.
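
The similarity search described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function names `cosine_similarity` and `top_k_similar` are hypothetical, the vectors are toy 3-dimensional stand-ins for real embeddings, and it uses only the standard library so it runs without the model.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(
    query_vec: Sequence[float],
    corpus_vecs: Sequence[Sequence[float]],
    k: int = 5,
) -> List[Tuple[int, float]]:
    """Return (row index, percent similarity) for the k most similar corpus vectors."""
    scored = [
        (i, 100.0 * cosine_similarity(query_vec, vec))
        for i, vec in enumerate(corpus_vecs)
    ]
    # Highest similarity first; ties keep their original order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings" (assumption): a real system would get these from an embedding model.
corpus = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.0],
    [0.3, 0.0, 0.9],
]
query = [1.0, 0.0, 0.0]

for index, percent in top_k_similar(query, corpus, k=5):
    print(f"question #{index}: {percent:.1f}% similar")
```

With a real embeddings database, the row indices would map back to the stored questions and their accepted answers; a vectorized library would be the natural choice once the corpus grows beyond a toy size.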

Screenshot of Web Application UI

[Screenshot of the web application UI, captured 2023-11-01]

Performance & Evaluation

  • Achieved a 19% ROUGE-1 score and an average perplexity of 1.96.
  • Returned responses in under 15 seconds.
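
For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them. It is an illustration under assumptions, not the evaluation code used in this project: the README does not say which ROUGE-1 variant was reported, so the F1 form with whitespace tokenization is assumed here, and `rouge1_f1` and `perplexity` are hypothetical helper names.

```python
import math
from collections import Counter
from typing import Sequence


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a generated answer and a reference answer."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection clips each word's count to the smaller of the two.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


def perplexity(token_log_probs: Sequence[float]) -> float:
    """exp of the mean negative natural-log probability the model gave each token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Made-up strings and probabilities, for illustration only.
score = rouge1_f1("use a list comprehension", "use a for loop")
ppl = perplexity([math.log(0.5)] * 4)
print(f"ROUGE-1 F1: {score:.2f}, perplexity: {ppl:.2f}")
```

A perplexity close to 1 means the model assigned high probability to the reference tokens, while ROUGE-1 only measures word overlap, so a correct answer phrased differently from the reference can still score low.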

Requirements

Python libraries: pandas, numpy, matplotlib, seaborn, scikit-learn, nltk, transformers, spacy, torch

Data Description:

Python Questions from Stack Overflow
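
The pre-processing and normalization mentioned in the project conclusions could look roughly like the sketch below. It assumes the raw post bodies contain HTML markup, which the README does not state explicitly. The project lists BeautifulSoup among its tools; this example swaps in the standard library's `html.parser` so that it runs without extra installs, and `clean_post_body` is a hypothetical helper name.

```python
import re
from html.parser import HTMLParser

# Tags treated as block-level, so text on either side is kept apart (assumption).
BLOCK_TAGS = {"p", "div", "br", "li", "pre", "blockquote"}


class _TextExtractor(HTMLParser):
    """Collects the text content of an HTML fragment, dropping the tags."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def clean_post_body(raw_html: str) -> str:
    """Strip HTML tags, decode entities, and collapse runs of whitespace."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()  # flush any buffered text
    text = "".join(parser.parts)
    return re.sub(r"\s+", " ", text).strip()


raw = "<p>How do I  reverse a <code>list</code>?</p>\n<p>Thanks &amp; regards</p>"
print(clean_post_body(raw))
```

Whether to also lowercase the text, remove code blocks, or filter by answer score is a separate design choice that depends on what the model should learn to generate.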
