This is a natural language processing project whose goal is to build a chatbot with a Transformer model. BERT embeddings are also applied to the Transformer to examine how the chatbot's responses change.
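As a rough illustration of the idea, pretrained embeddings can replace a Transformer's learned token-embedding layer. The sketch below uses PyTorch; the random weight matrix is only a stand-in for real BERT embedding weights (which would be loaded from a pretrained checkpoint), and the vocabulary and hidden sizes are illustrative, not the project's actual values.

```python
import torch
import torch.nn as nn

# Illustrative sizes, not the project's real configuration.
VOCAB_SIZE, HIDDEN = 1000, 64

# Stand-in for a pretrained BERT embedding matrix; in practice this would be
# copied from a pretrained checkpoint rather than drawn at random.
pretrained_weights = torch.randn(VOCAB_SIZE, HIDDEN)

# Freeze the pretrained embeddings so only the Transformer layers are trained.
embedding = nn.Embedding.from_pretrained(pretrained_weights, freeze=True)

encoder_layer = nn.TransformerEncoderLayer(d_model=HIDDEN, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

tokens = torch.randint(0, VOCAB_SIZE, (2, 10))  # (batch, seq_len)
out = encoder(embedding(tokens))                # (batch, seq_len, HIDDEN)
```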
The dataset consists of JSON files of package inquiry and refund conversations extracted from AIHub's "purpose-oriented dialogue data" collection.
The experiments ran on a GeForce RTX 2060 SUPER GPU, which limits how much data can be trained, so only the "refund/exchange" dataset is used for training.
The dataset contains 1,910 QA pairs in total.
Each Question is fed to the chatbot as input, and the corresponding Answer is used as its label.
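A minimal sketch of how such (Question, Answer) pairs could be extracted from the dialogue JSON. The `"speaker"`/`"utterance"` field names and the inline sample are assumptions for illustration, not the actual AIHub schema:

```python
import json

# Inline sample standing in for one dialogue from the AIHub JSON files;
# the real schema and field names may differ.
sample = json.loads("""
[
  {"speaker": "Q", "utterance": "I want to exchange the shoes I ordered."},
  {"speaker": "A", "utterance": "Sure, may I have your order number?"},
  {"speaker": "Q", "utterance": "It is 12345."},
  {"speaker": "A", "utterance": "Thank you. The exchange has been registered."}
]
""")

def to_qa_pairs(turns):
    """Pair each Q utterance (model input) with the A utterance that follows it (label)."""
    pairs = []
    for prev, cur in zip(turns, turns[1:]):
        if prev["speaker"] == "Q" and cur["speaker"] == "A":
            pairs.append((prev["utterance"], cur["utterance"]))
    return pairs

pairs = to_qa_pairs(sample)
```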
However, one consideration came up while refining the dataset: if the dialogue data is split only into speakers Q and A, can the model still understand the question-answer structure of the conversation? Since the desired chatbot answers will not always end within a single sentence, the Answer side of the dataset was therefore also built so that multi-sentence answers are learned as part of the training set.
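One way this could be realized is to merge consecutive utterances by the same speaker before pairing, so that a multi-sentence answer becomes a single training label. The field names below are again assumptions, not the actual AIHub schema:

```python
def merge_consecutive(turns):
    """Merge consecutive utterances by the same speaker into one turn, so a
    multi-sentence answer becomes a single label instead of separate turns."""
    merged = []
    for turn in turns:
        if merged and merged[-1]["speaker"] == turn["speaker"]:
            merged[-1]["utterance"] += " " + turn["utterance"]
        else:
            merged.append(dict(turn))  # copy so the input list is not mutated
    return merged

turns = [
    {"speaker": "Q", "utterance": "Where is my package?"},
    {"speaker": "A", "utterance": "Let me check."},
    {"speaker": "A", "utterance": "It will arrive tomorrow."},
]
merged = merge_consecutive(turns)
```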
This repository is based on the following works:
@misc{vaswani2023attention,
  title={Attention Is All You Need},
  author={Ashish Vaswani and Noam Shazeer and Niki Parmar and Jakob Uszkoreit and Llion Jones and Aidan N. Gomez and Lukasz Kaiser and Illia Polosukhin},
  year={2023},
  eprint={1706.03762},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
@misc{devlin2019bert,
  title={BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding},
  author={Jacob Devlin and Ming-Wei Chang and Kenton Lee and Kristina Toutanova},
  year={2019},
  eprint={1810.04805},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}