In this mini project I have learned:
- How to handle and manipulate Datatable.
- How text is represented in NLP:
  - Index-based encoding pipeline (corpus / text -> text normalization -> vectorization -> new text representation).
  - Text normalization (lowercasing, punctuation removal, dictionary creation, ...)
- Data crawling:
  - Using the Selenium package to crawl data from Vietnam.net (a Vietnamese article website).
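
For reference, the index-based encoding pipeline listed above can be sketched roughly as follows. This is only a minimal illustration, not the project's actual code: the function names (`normalize`, `build_dictionary`, `vectorize`), the `<unk>` token reserved at index 0, the whitespace tokenization, and the toy corpus are all assumptions made for the example.

```python
import string


def normalize(text):
    """Lowercase the text and remove ASCII punctuation."""
    text = text.lower()
    return text.translate(str.maketrans("", "", string.punctuation))


def build_dictionary(corpus):
    """Assign an integer index to each distinct token.

    Index 0 is reserved for unknown tokens (an assumption of this sketch).
    """
    vocab = {"<unk>": 0}
    for doc in corpus:
        for token in normalize(doc).split():
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def vectorize(text, vocab):
    """Replace each token with its index; unseen tokens map to <unk>."""
    return [vocab.get(token, vocab["<unk>"]) for token in normalize(text).split()]


# Toy corpus, made up for illustration only.
corpus = ["Hello, world!", "Hello NLP."]
vocab = build_dictionary(corpus)
encoded = vectorize("Hello, brave world!", vocab)
print(vocab)    # {'<unk>': 0, 'hello': 1, 'world': 2, 'nlp': 3}
print(encoded)  # [1, 0, 2]
```

Note that `string.punctuation` only covers ASCII punctuation, so real Vietnamese articles would likely need a broader normalization step and a proper word tokenizer.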