Project for classifying tweets of the company Limportant into 13 categories
- Language: Python
- It uses CountVectorizer and TfidfTransformer to pre-process the text data
- Machine learning package: sklearn
- It uses GridSearch to find the hyperparameters of the best model
- Ensemble modeling with voting
- F1-score, which combines precision and recall, is 0.68 out of 1.00
- Training size: 18 000 tweets
- Input (training): text, category_id
- Output (training): category_id
- Input (prediction): text
- Output (prediction): category_id
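
The steps listed above could be wired together roughly as in the following minimal sketch. It is not the project's actual code: the toy texts and labels, the two base classifiers (MultinomialNB and LogisticRegression), the parameter grid, hard voting, and macro-averaged F1 are all assumptions chosen for illustration, and the grid search is done with sklearn's `GridSearchCV`.

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy data: 3 categories, 4 texts each.
# The real project uses about 18,000 tweets and 13 categories.
texts = [
    "my order arrived late", "delivery is delayed again",
    "package still not delivered", "late shipping once more",
    "love the new app update", "great app design today",
    "the app update is awesome", "really nice new app features",
    "cannot log in to my account", "password reset does not work",
    "account login keeps failing", "locked out of my account",
]
labels = [0] * 4 + [1] * 4 + [2] * 4  # stands in for category_id

# Pre-processing (counts, then TF-IDF) followed by a voting ensemble.
pipeline = Pipeline([
    ("counts", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("vote", VotingClassifier(
        estimators=[
            ("nb", MultinomialNB()),
            ("lr", LogisticRegression(max_iter=1000)),
        ],
        voting="hard",  # assumption: majority vote on predicted labels
    )),
])

# Illustrative grid only; the project's real search space is not documented.
param_grid = {
    "counts__ngram_range": [(1, 1), (1, 2)],
    "vote__nb__alpha": [0.1, 1.0],
}

# Grid search with cross-validation, scored by macro-averaged F1 (assumption).
search = GridSearchCV(pipeline, param_grid, scoring="f1_macro", cv=2)
search.fit(texts, labels)

# Prediction: text in, category_id out.
predicted = search.predict(["where is my delivery"])

# F1 on the training texts, shown only to illustrate the metric call.
score = f1_score(labels, search.predict(texts), average="macro")
print(search.best_params_, predicted, score)
```

In a real run the F1-score would be computed on held-out tweets rather than on the training texts, and the choice of averaging (macro, micro, or weighted) changes the reported number for a 13-class problem.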