
A tool to detect whether numerals present in Financial Texts are in-claim or out-of-claim

License: MIT License


FiNCAT: Financial Numeral Claim Analysis Tool

A tool to detect whether numerals in financial texts are in-claim or out-of-claim. It was accepted at FinWeb@TheWebConf-2022 (formerly ACM-WWW; CORE rank: A*) and is available as a pre-print.
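To make the task concrete: before a numeral can be labelled in-claim or out-of-claim, each numeral in the text has to be located along with some surrounding context. The following is a minimal sketch of that first step only, not FiNCAT's actual code. The regular expression, the function name `find_numerals`, and the `window` parameter are assumptions made for illustration; the in-claim/out-of-claim decision itself is left to the trained model.

```python
import re

# Matches integers, decimals, and comma-grouped numbers such as 1,250 or 12.5.
# Illustrative pattern only; it is not taken from the FiNCAT source.
NUMERAL_PATTERN = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?")


def find_numerals(text, window=30):
    """Return each numeral with its character span and surrounding context."""
    results = []
    for match in NUMERAL_PATTERN.finditer(text):
        start, end = match.span()
        results.append({
            "numeral": match.group(),
            "span": (start, end),
            # Up to `window` characters on each side of the numeral.
            "context": text[max(0, start - window):end + window],
        })
    return results
```

Each returned entry could then be passed, together with its context, to a classifier that assigns the in-claim or out-of-claim label.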


Architecture

[Figure: FiNCAT architecture]

How to use?

Use it directly from HuggingFace Spaces or Google Colab


The API is available here.

For re-training or re-using the tool locally, please refer to requirements.txt for the versions of the Python libraries used while developing this tool.
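One way to check a local environment against those versions is sketched below. It assumes that requirements.txt contains simple `name==version` pins, which has not been verified here; the function name `check_pinned_requirements` is hypothetical. Lines with other specifiers or environment markers are not handled.

```python
from importlib import metadata


def check_pinned_requirements(lines):
    """Compare 'name==version' pins against the installed package versions.

    Returns a dict mapping package name to (pinned, installed), where
    installed is None if the package is not installed.
    """
    report = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and entries without an exact pin
        name, pinned = (part.strip() for part in line.split("==", 1))
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (pinned, installed)
    return report
```

Passing an open file works because the function only iterates over lines, for example `check_pinned_requirements(open("requirements.txt"))`. Any entry whose two values differ points to a package worth reinstalling at the pinned version.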


Training
For training, execute the FiNCAT_training.ipynb notebook in the training folder. It needs fincat_utils.py from the main folder and the embeddings and labels, which are stored as .csv files in the training folder. X_train_df.zip must be unzipped to obtain X_train_df.csv. The raw data can be obtained from here.
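The unzip step can also be done from Python. The sketch below uses only the standard library; the function names `extract_training_csv` and `peek_csv` are hypothetical, and the archive is assumed to contain the .csv file directly.

```python
import csv
import zipfile
from pathlib import Path


def extract_training_csv(zip_path, out_dir):
    """Unzip the archive and return the path of the first .csv inside it."""
    out_dir = Path(out_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
        csv_names = [n for n in archive.namelist() if n.endswith(".csv")]
    if not csv_names:
        raise FileNotFoundError(f"no .csv file found in {zip_path}")
    return out_dir / csv_names[0]


def peek_csv(csv_path, n_rows=3):
    """Read the header and the first few rows without loading the whole file."""
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows
```

For example, `extract_training_csv("training/X_train_df.zip", "training")` would place X_train_df.csv next to the notebook, and `peek_csv` gives a quick look at the columns before running the training cells.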


Using the tool locally
To use the tool locally, you do not need to train it, as the model artifacts are already provided. Simply execute the FiNCAT_tool_enhanced_UI.ipynb notebook. More details are provided in the tools folder.

FiNCAT (with enhanced UI)

[Figure: FiNCAT with enhanced UI]

FiNCAT Video Demonstration (on YouTube)

Video Demonstration

References

This tool has been built using Google Colab and Gradio. It has been hosted using 🤗 HuggingFace Spaces.

Tool citation:

@inproceedings{ghosh-fiNCAT,
    title = "FiNCAT: Financial Numeral Claim Analysis Tool",
    author = "Sohom Ghosh and Sudip Kumar Naskar",
    year = "2022",
    booktitle = "Companion Proceedings of the Web Conference 2022 (WWW ’22 Companion)",
    url = "https://arxiv.org/abs/2202.00631",
    doi = "10.1145/3487553.3524635"
}
@article{fincat2,
title = {FiNCAT-2: An enhanced Financial Numeral Claim Analysis Tool},
journal = {Software Impacts},
volume = {},
pages = {},
year = {2022},
issn = {2665-9638},
doi = {10.1016/j.simpa.2022.100288},
url = {https://www.sciencedirect.com/science/article/pii/S2665963822000367},
author = {Sohom Ghosh and Sudip Kumar Naskar},
}

Dataset and shared task citation:

@inproceedings{finum3,
  title={Overview of the NTCIR-16 FinNum-3 Task: Investor’s and Manager’s Fine-grained Claim Detection},
  author={Chen, Chung-Chi and Huang, Hen-Hsen and Huang, Yu-Lieh and Takamura, Hiroya and Chen, Hsin-Hsi},
  booktitle={Proceedings of the 16th NTCIR Conference on Evaluation of Information Access Technologies, Tokyo, Japan},
  year={2022}
}
@inproceedings{numclaim,
author = {Chen, Chung-Chi and Huang, Hen-Hsen and Chen, Hsin-Hsi},
title = {NumClaim: Investor's Fine-Grained Claim Detection},
year = {2020},
isbn = {9781450368599},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3340531.3412100},
booktitle = {Proceedings of the 29th ACM International Conference on Information & Knowledge Management},
pages = {1973–1976},
numpages = {4}
}

Blog by Arushi Prakash

NOTE:
This tool is released under the MIT license.
The embeddings and labels are released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.
