This project develops a Semantic Search and Recommendation System on Quora Question-Answer data using Sentence Transformers. Semantic search is an information retrieval technique that takes the meaning of the query into account, rather than matching only its literal words, when looking up results. The dataset used for this project is available at the links below:
Quora Question-Answer Dataset Link: https://huggingface.co/datasets/toughdata/quora-question-answer-dataset?row=0
Parquet file format of dataset link: https://huggingface.co/datasets/quora/tree/refs%2Fconvert%2Fparquet/default/train
The dataset is preprocessed, and the Sentence Transformer model distiluse-base-multilingual-cased-v1 is used to create embeddings of the corpus. Given input text from the user, the model returns semantically similar words or sentences. Likewise, given an input question, the recommendation system returns similar questions from the corpus.
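The embedding and lookup steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names embed, cosine_similarity and top_k_similar are hypothetical, the ranking uses plain cosine similarity as an assumed similarity measure, and the sentence-transformers package must be installed for embed to work.

```python
import math


def embed(texts, model_name="distiluse-base-multilingual-cased-v1"):
    """Encode a list of strings into embedding vectors.

    Assumption: the sentence-transformers package is installed. The import is
    kept inside the function so the ranking helpers below work without it.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    # encode() returns a NumPy array; convert it to plain Python lists.
    return model.encode(texts).tolist()


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query_vec, corpus_vecs, corpus_texts, k=3):
    """Return the k corpus texts most similar to the query as (text, score) pairs."""
    scored = [
        (text, cosine_similarity(query_vec, vec))
        for text, vec in zip(corpus_texts, corpus_vecs)
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In use, the corpus questions would be embedded once with embed, the user's query embedded the same way, and top_k_similar called with the query vector to obtain the closest questions. For a large corpus, a vectorized or indexed search would likely be preferable to this pure-Python loop.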
Input Query:
Semantic Results: