Mintlemon, a Turkish natural language processing library, was developed as part of the Teknofest Turkish Natural Language Processing Competition. Competing as Team Nane&Limon, we finished first in the 2023 Turkish NLP Competition.
This document outlines tasks for improving how the mintlemon normalizer module processes Turkish text. The work covers refining existing functions, adding a new function for correcting spelling errors, and making sure the module covers a comprehensive range of text normalization needs.
Tasks
1. Existing Functions
Ensure all current functions are optimized for performance and accuracy. The functions to be reviewed and refined include:
Normalize Turkish Chars
Remove Numerical Expressions
Remove Stopwords
Removing Accent Marks
Removing Punctuations
Turkish Text Lowercasing
Turkish Text Deasciification
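As an illustration of the kind of accuracy issue a review of Turkish Text Lowercasing should check, Python's built-in str.lower() maps "I" to "i" instead of "ı", and maps "İ" to "i" followed by a combining dot (two code points). The sketch below shows one common workaround. The name turkish_lower is hypothetical and is not taken from mintlemon's API.

```python
def turkish_lower(text: str) -> str:
    """Lowercase text using Turkish rules for dotted and dotless I.

    Hypothetical helper for illustration; not mintlemon's actual implementation.
    """
    # Handle the two capitals that str.lower() gets wrong for Turkish,
    # then let str.lower() take care of every other character.
    return text.replace("İ", "i").replace("I", "ı").lower()


if __name__ == "__main__":
    print(turkish_lower("ISPARTA İLİ"))
```

Replacing the two special capitals before calling str.lower() keeps the function short and avoids a locale dependency, at the cost of treating every "I" as Turkish, which is wrong for embedded foreign words.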
2. Rename "Convert Text Numbers" Function
Rename the "Convert Text Numbers" function to a more intuitive name that clearly describes what it does, such as "Convert Numbers to Words" or "NumberToText". This will make the function's purpose clearer to users.
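If existing callers need to keep working after the rename, one option is to keep the old name as a thin deprecated alias. The sketch below assumes the snake_case names convert_text_numbers (old) and convert_numbers_to_words (new); both names and the stand-in body are assumptions for illustration, not mintlemon's actual code.

```python
import warnings


def convert_numbers_to_words(text: str) -> str:
    """New, more descriptive name (hypothetical).

    The real conversion logic is omitted; this stand-in returns the text unchanged.
    """
    return text


def convert_text_numbers(text: str) -> str:
    """Deprecated alias (hypothetical) that forwards to the new name."""
    warnings.warn(
        "convert_text_numbers is deprecated; use convert_numbers_to_words instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this wrapper
    )
    return convert_numbers_to_words(text)
```

The alias can be removed in a later release once users have had time to migrate.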
3. Testing and Validation
Create comprehensive test cases for all new and updated functions to ensure they work as expected across a wide range of inputs, including edge cases.
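A table-driven test is one way to cover typical inputs and edge cases together. The sketch below uses the standard unittest module. The remove_punctuation function is a hypothetical stand-in that strips ASCII punctuation only; in practice it would be replaced by an import of the real function under test.

```python
import string
import unittest


def remove_punctuation(text: str) -> str:
    """Hypothetical stand-in for the function under test (ASCII punctuation only)."""
    return text.translate(str.maketrans("", "", string.punctuation))


class RemovePunctuationTests(unittest.TestCase):
    def test_table_of_cases(self):
        cases = [
            ("", ""),                              # empty input
            ("   ", "   "),                        # whitespace only
            ("Merhaba, dünya!", "Merhaba dünya"),  # typical sentence
            ("çğıöşü ÇĞİÖŞÜ", "çğıöşü ÇĞİÖŞÜ"),    # Turkish letters must survive
            ("...!?", ""),                         # punctuation only
        ]
        for raw, expected in cases:
            # subTest reports each failing case separately instead of stopping at the first.
            with self.subTest(raw=raw):
                self.assertEqual(remove_punctuation(raw), expected)
```

Saved in a test file, this can be run with python -m unittest. Adding a case is a one-line change to the table.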
Conclusion
Completing these tasks will make the module more useful for developers and researchers working on Turkish NLP projects.