suman101112 / hate-speech-detection-on-code-mixed-dataset-using-a-fusion-of-custom-and-pre-trained-models-with-pro
As user-generated content on social media grows, so does the volume of hate speech and offensive language. Detecting such content automatically is an active problem in computer science, and the natural language processing community has begun to address it through automated hate speech and offensive content detection. Because most of this content originates on social media, detection is complicated by non-standard spelling and grammatical variation. In multilingual communities the content is often code-mixed, which makes the task harder still. In this article, we propose a model for code-mixed hate speech detection. The model combines knowledge from both user-trained and multilingual pre-trained models. The proposed method also builds a profanity word list and augments it. Experimental results on code-mixed hate speech and offensive language detection benchmarks show that our method outperforms the existing baselines.
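The fusion idea described above can be illustrated with a minimal sketch. It is not the repository's implementation: it assumes the two models' representations are fused by simple concatenation and that the profanity word list contributes two count-based features. The names `custom_encoder`, `pretrained_encoder`, `profanity_features`, `fuse`, and the placeholder word list are all hypothetical, and the two encoders are toy stand-ins for a user-trained model and a multilingual pre-trained model.

```python
from typing import Callable, List, Sequence, Set

Vector = List[float]
Encoder = Callable[[str], Vector]


def profanity_features(text: str, profanity: Set[str]) -> Vector:
    """Return [hit count, hit ratio] for tokens found in the profanity list."""
    tokens = text.lower().split()
    hits = sum(1 for tok in tokens if tok in profanity)
    ratio = hits / len(tokens) if tokens else 0.0
    return [float(hits), ratio]


def fuse(text: str, encoders: Sequence[Encoder], profanity: Set[str]) -> Vector:
    """Concatenate each encoder's output, then append the profanity features.

    Concatenation is an assumption made for illustration; the source does
    not state how the representations are combined.
    """
    fused: Vector = []
    for encode in encoders:
        fused.extend(encode(text))
    fused.extend(profanity_features(text, profanity))
    return fused


def custom_encoder(text: str) -> Vector:
    """Toy stand-in for a user-trained model: token count and total characters."""
    tokens = text.split()
    return [float(len(tokens)), float(sum(len(t) for t in tokens))]


def pretrained_encoder(text: str) -> Vector:
    """Toy stand-in for a multilingual pre-trained model: length and space count."""
    return [float(len(text)), float(text.count(" "))]


# Placeholder entries; a real list would be built from data and augmented.
PROFANITY = {"slur1", "slur2"}

features = fuse(
    "tum ekdum slur1 ho yaar",
    [custom_encoder, pretrained_encoder],
    PROFANITY,
)
print(features)
```

In a real pipeline the fused vector would feed a classifier, and the stand-in encoders would be replaced by the actual models' sentence representations.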