Coder Social home page Coder Social logo

nave's Introduction

News Authentication Via Emotion (NAVE)

Alexander Epstein

Northeastern University

Description

A method for utilizing a large language model (LLM) as a way to generate intermediate interpretable features for the downstream task of fake or real news classification. The LLM generates fine grained emotion sentiment analysis on both the title and each sentence of the article from which the features are exracted. Then these features are processed through a simpler classifier to determine the authenticity of the sample news article. Results of the final classifier indicate a test accuracy of ~89% suggesting the practical usage of this model in authenticating news articles.

Motivation

Machine Learning advancements have enabled large performance improvements in various problems spanning both scientific research and business. However one of the pitfalls of complex architectures that have the ability to perform complex tasks are rarely interpretable and even less so in Natural Language Processing. It is desired to provide insight into the way that these models perform these tasks so that the outputs need not be taken immediately at face value. This interpretability is instrumental in the adoption of Artificial intelligence in the mainstream and ensures we have both fair and ethical intelligent systems.

In the Age of Information it is shown that retractions in the scientific community are detrimental to the societies understanding of the topics as retractions rarely garner the same interest as the original misinformation. It is not a large leap to say that this phenomenon would extend to news as well which can lead to disinformation spreading rapidly across our society. If we seek to stop the spread of this misinformation we need to intervene at the moment of either production or consumption of this information. This work seeks to tackle this problem from the standpoint of consumption as this does not require any integration with the publishers and can be applied more generically.

In this vein we saught to authenticate the trustworthiness of a news article utilizing interpetable features. This has a two-fold impact, it allows us to attempt to provide information to the consumer that grades the trustworthiness of the information possibly preventing the spread of information, while attempting to make the models reasoning more interpretable, allowing for the consumer to decide if they trust the output of the algorithm.

Pretrained Distill BERT Model

To be able to run the jupyter notbebook efficiently long computations outputs have been serialized to just load them upon new executions. To avoid having to train the Distill BERT model a pretrained model is provided for your conviencence. This model can be downloaded here. The model must be placed in the model_outputs_distill directory to be used in the notebook.

Datasets

GoEmotions Dataset

Link

https://github.com/google-research/google-research/tree/master/goemotions

License

Apache License 2.0

Acknowledgements

@inproceedings{demszky2020goemotions, author = {Demszky, Dorottya and Movshovitz-Attias, Dana and Ko, Jeongwoo and Cowen, Alan and Nemade, Gaurav and Ravi, Sujith}, booktitle = {58th Annual Meeting of the Association for Computational Linguistics (ACL)}, title = {{GoEmotions: A Dataset of Fine-Grained Emotions}}, year = {2020} }

Fake or Real News Dataset

Link

https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset

License

Unknown

Acknowledgements

Ahmed H, Traore I, Saad S. “Detecting opinion spams and fake news using text classification”, Journal of Security and Privacy, Volume 1, Issue 1, Wiley, January/February 2018. Ahmed H, Traore I, Saad S. (2017) “Detection of Online Fake News Using N-Gram Analysis and Machine Learning Techniques. In: Traore I., Woungang I., Awad A. (eds) Intelligent, Secure, and Dependable Systems in Distributed and Cloud Environments. ISDDC 2017. Lecture Notes in Computer Science, vol 10618. Springer, Cham (pp. 127-138).

License

MIT License

nave's People

Contributors

alexanderepstein avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.