This project covers topic modeling and sentiment analysis of product reviews from a popular online shopping website in Thailand. The dataset was scraped from that store and consists of the following attributes:
- Product title
- Product rating by user
- Purchase date
- Comments about product delivery
- Product option
This notebook mainly focuses on topic modeling using Latent Dirichlet Allocation (LDA) and on label generation for sentiment analysis using unsupervised learning (K-Means clustering).
Required libraries
- NLTK for preprocessing
- Numpy and Pandas
- TextBlob and Gensim (for LDA)
- TF-IDF Vectorizer (for feature extraction using an n-gram model)
- K-Means clustering from scikit-learn
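
As a rough illustration of the label-generation step, the sketch below turns a few review comments into n-gram TF-IDF features and groups them with K-Means. It is only a minimal example under stated assumptions: the `reviews` list is hypothetical sample text rather than the scraped dataset, and the n-gram range, English stop-word list, and choice of two clusters are illustrative guesses, not settings taken from this notebook.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical sample comments; the notebook itself works on the scraped reviews.
reviews = [
    "fast delivery and great quality",
    "great quality, fast delivery",
    "very fast delivery, great product",
    "late delivery and broken item",
    "item arrived broken, late delivery",
    "broken item, very late",
]

# Unigram and bigram TF-IDF features (the n-gram range is an assumption).
vectorizer = TfidfVectorizer(ngram_range=(1, 2), stop_words="english")
features = vectorizer.fit_transform(reviews)

# Two clusters stand in for two sentiment groups (k=2 is an assumption).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(features)

# Each review now carries a cluster id that can serve as a generated label.
for text, label in zip(reviews, labels):
    print(label, text)
```

The cluster ids are arbitrary integers, so they still need to be inspected and mapped to sentiment names by hand before being used as labels.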