jhihruei / gpt-investar

This project forked from uditgupta10/gpt-investar

Enhancing Stock Investment Strategies through Annual Report Analysis with Large Language Models

License: MIT License

Python 5.30% Jupyter Notebook 94.70%

gpt-investar's Introduction

GPT-InvestAR

Enhancing Stock Investment Strategies through Annual Report Analysis with Large Language Models

This repository contains a set of tools and scripts designed to enhance stock investment strategies through the analysis of annual reports using Large Language Models. The components in this repository are organized as follows:

  1. download_10k.py: This Python script downloads 10-K filings of companies from the SEC website, which contain crucial financial information.

  2. convert_html_to_pdf.py: Converts HTML files to PDF files. PDFs are preferred due to their token efficiency for further analysis.

  3. make_targets.py: Generates a DataFrame of stock tickers with target values of different time resolutions, which can be used as investment targets for a Machine Learning model.

  4. embeddings_save.py: Generates embeddings of the PDF files and saves them using ChromaDB. These embeddings are numerical representations of the textual content in annual reports.

  5. gpt_scores_as_features.py: Utilizes saved embeddings to query all questions for each annual report using a Large Language Model (LLM) such as GPT-3.5, and uses the scores or answers as features.

  6. modeling_and_return_estimation.ipynb: This Jupyter Notebook contains the core modeling process. It uses machine learning techniques, specifically Linear Regression, to model the dataset and estimate returns. The goal is to create a portfolio of top-k predicted stocks and compare their returns with the S&P 500 index.
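The top-k portfolio comparison described in step 6 can be sketched as follows. This is a minimal illustration, not the notebook's actual code: it assumes the predictions already come from a fitted regression model, and the tickers, the numbers, and the helper name `top_k_portfolio_return` are all hypothetical.

```python
from statistics import mean


def top_k_portfolio_return(predicted, realized, k):
    """Pick the k tickers with the highest predicted return and
    report their equal-weighted realized return."""
    if not 0 < k <= len(predicted):
        raise ValueError("k must be between 1 and the number of tickers")
    # Rank tickers by predicted return, highest first.
    ranked = sorted(predicted, key=predicted.get, reverse=True)
    chosen = ranked[:k]
    return chosen, mean(realized[ticker] for ticker in chosen)


# Hypothetical values, for illustration only.
predicted = {"AAA": 0.08, "BBB": 0.02, "CCC": 0.11, "DDD": -0.01}
realized = {"AAA": 0.05, "BBB": 0.01, "CCC": 0.09, "DDD": -0.03}
benchmark_return = 0.04  # stand-in for the S&P 500 return over the same period

chosen, portfolio_return = top_k_portfolio_return(predicted, realized, k=2)
excess = portfolio_return - benchmark_return
print(chosen, round(portfolio_return, 4), round(excess, 4))
```

Equal weighting is an assumption made here for simplicity; the notebook may weight the selected stocks differently.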

By following the sequence of these components, you can analyze annual reports, generate embeddings, and build predictive models to potentially enhance stock investment strategies.
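As one illustration of an early step in this sequence, the target generation in step 3 can be thought of as computing forward returns over several horizons. The sketch below assumes a plain list of prices and horizons measured in periods; the function name `forward_returns`, the `target_<h>` keys, and the sample prices are hypothetical and are not taken from make_targets.py.

```python
def forward_returns(prices, horizon):
    """Return prices[t + horizon] / prices[t] - 1 for each t.

    Positions whose future price is not available yet are set to None.
    """
    if horizon <= 0:
        raise ValueError("horizon must be a positive number of periods")
    returns = []
    for t, price in enumerate(prices):
        future = t + horizon
        if future < len(prices):
            returns.append(prices[future] / price - 1)
        else:
            returns.append(None)
    return returns


# Hypothetical price series, for illustration only.
prices = [100.0, 102.0, 101.0, 105.0, 110.0]

# One target column per horizon, keyed by a hypothetical column name.
targets = {f"target_{h}": forward_returns(prices, h) for h in (1, 2)}
print(targets)
```

Leaving the trailing positions as None keeps every target column the same length as the price series, so rows without a known future return can be dropped before modeling.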

Feel free to explore each component for more details and usage instructions.

Dependencies

  1. LlamaIndex (and related dependencies)

  2. OpenBB (and related dependencies)

  3. Scikit-Learn

  4. PDFKit (and related dependencies)

We recommend installing LlamaIndex and OpenBB (items 1 and 2) in separate virtual (conda) environments. The Python scripts above do not require both libraries to be installed in the same environment.
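When working with separate environments, it can help to check which of the dependencies are importable in the currently active one before running a script. The sketch below uses only the standard library; the import names `llama_index`, `openbb`, `sklearn`, and `pdfkit` are assumed to be the usual module names for the packages listed above, and the helper `available_modules` is hypothetical.

```python
from importlib.util import find_spec


def available_modules(names):
    """Map each top-level module name to True if it can be imported here."""
    return {name: find_spec(name) is not None for name in names}


# Assumed import names for the dependencies listed above.
status = available_modules(["llama_index", "openbb", "sklearn", "pdfkit"])
for name, installed in status.items():
    print(f"{name}: {'installed' if installed else 'missing'}")
```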

Citation

If you use the code or find this repository helpful, please consider citing the paper:

GPT-InvestAR: Enhancing Stock Investment Strategies through Annual Report Analysis with Large Language Models
Udit Gupta
Publication Links:

  1. arXiv link: https://arxiv.org/abs/2309.03079
  2. SSRN link

@article{GPT-InvestAR,
  author = {Udit Gupta},
  title = {GPT-InvestAR: Enhancing Stock Investment Strategies through Annual Report Analysis with Large Language Models},
  journal = {arXiv e-prints},
  year = {2023},
  eprint = {arXiv:2309.03079},
  url = {https://arxiv.org/abs/2309.03079},
}
