
corp-filings-analysis's Introduction

Financial Analysis in Retail: A Dive into Pandemic Impact

Introduction

For this project, I focused on four major US retail corporations: Target, Home Depot, Costco, and Lowe's.

The initial motivation was to extract information from SEC 10-K filings and analyze financial trends before, during, and after the pandemic.

However, the current state of the project may not fully reflect those initial intentions. Extracting financial metric values from the filings proved difficult because of limited computing power and time.

Nevertheless, the project lays a solid groundwork for further work in this direction.

Why these insights?

Entire economies were caught off guard by the COVID-19 pandemic and faced serious financial consequences, and it is reasonable to assume that pandemics will recur.

Thus, studying financial trends in retail companies is particularly significant due to their direct connection to consumer behavior, economic trends, and market dynamics.

The performance of retail companies provides insights into both microeconomic factors affecting individual businesses and macroeconomic trends shaping the overall economy, making it a crucial indicator for investors, policymakers, and analysts.

Tools and Technologies Used:

  • Programming Language: Python is my preferred development language. It also suits this project because of its rich ecosystem of libraries for data analysis and visualization, and because it interfaces easily with various LLM APIs.
  • Deployment Platform: I used Streamlit primarily because it allows very quick and easy deployment. I initially considered building a dashboard with Plotly Dash and hosting the project on a free cloud hosting service, but that would have required more time than was available.
  • API for 10-K Filing Retrieval: sec-edgar-downloader, as recommended, was initially my choice, but it proved computationally expensive because it downloaded all documents, some of which were hundreds of MBs. Handling storage and deleting temporary files became complicated. Then I tried SEC API, which seemed very promising at first, but I realized the next day that only the first 100 calls were free. Finally, I found sec-edgar-api (a wrapper on sec-edgar-downloader), which allowed storing files in memory instead of downloading them to disk. That proved very helpful, and it is the one I ultimately used.
  • LLM Inference API: Initially, I experimented with numerous Hugging Face models, such as Facebook BART CNN, Roberta Base Squad 2, and Distilbart CNN, for summarization and question-answering tasks. However, the results were suboptimal, largely because the models weren't trained on finance-specific data. Other Hugging Face models that were pretrained on financial data either had a very small context window or were better suited to sentiment analysis applications (which wasn't my objective). Next, I attempted to use the OpenAI API, but encountered issues with my API key, which likely required purchasing credits. Eventually, I turned to the Google Gemini API, which yielded good results, and I didn't look back.
  • Tools for Text Preprocessing: Text preprocessing was laborious, involving extensive regex, BeautifulSoup, and parsing to extract the required section items from filings and convert them from HTML to text.
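
The preprocessing step described above (HTML to text, then pulling out one section item) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: it uses the standard library's `html.parser` in place of BeautifulSoup so that it runs without extra dependencies, and the helper names `html_to_text` and `extract_item` are hypothetical. The "longest span wins" rule is an assumed heuristic for skipping the table-of-contents entry that repeats each item heading.

```python
import re
from html.parser import HTMLParser

# Tags after which a space is inserted so adjacent blocks do not run together.
_BLOCK_TAGS = {"p", "div", "br", "tr", "td", "th", "li",
               "h1", "h2", "h3", "h4", "h5", "h6"}
_SKIP_TAGS = {"script", "style"}


class _TextExtractor(HTMLParser):
    """Collect visible text, ignoring <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in _SKIP_TAGS:
            self._skip_depth += 1
        elif tag in _BLOCK_TAGS:
            self.chunks.append(" ")

    def handle_endtag(self, tag):
        if tag in _SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in _BLOCK_TAGS:
            self.chunks.append(" ")

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html):
    """Convert an HTML string to plain text with collapsed whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # \s also matches the non-breaking spaces produced by &nbsp;
    return re.sub(r"\s+", " ", "".join(parser.chunks)).strip()


def extract_item(text, item, next_item):
    """Return the text between 'Item <item>' and 'Item <next_item>'.

    Headings usually appear twice (table of contents and body), so the
    longest candidate span is kept. Returns "" if the item is not found.
    """
    start_pat = re.compile(rf"\bItem\s+{re.escape(item)}\b\.?", re.IGNORECASE)
    end_pat = re.compile(rf"\bItem\s+{re.escape(next_item)}\b\.?", re.IGNORECASE)
    best = ""
    for start in start_pat.finditer(text):
        end = end_pat.search(text, start.end())
        stop = end.start() if end else len(text)
        candidate = text[start.end():stop].strip()
        if len(candidate) > len(best):
            best = candidate
    return best
```

For example, `extract_item(html_to_text(filing_html), "7", "7A")` would aim to return the Management's Discussion and Analysis section, assuming `filing_html` holds the filing's HTML as a string. Real filings vary a lot in how headings are written, so the patterns would likely need tuning per company.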

Screen recording of the project

(The page takes around 3 to 4 minutes to load completely; this video has been sped up in certain places.)

corp-filings-analysis_screencast.mp4

Challenges Encountered:

  • Insight Extraction: Extracting useful information, especially financial metric figures and pandemic-related trends, proved to be a challenge. LLMs pretrained on relevant data, combined with better text preprocessing (such as extraction of tabular data), would give better results.
  • Computational Resource Limitations: Since I was relying on free credits and compute power, processing large datasets and running complex analyses was challenging. The page loading time could be reduced by making several API calls in parallel rather than one after another.
  • Time Constraints: The time I could dedicate to the project was limited, which affected the quality of the analysis.
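
The parallel API calls suggested above could look roughly like the sketch below, which uses a thread pool from the standard library. It is an illustration under assumptions, not the project's code: `summarize_section` is a hypothetical stand-in that sleeps instead of calling a real LLM API, and `summarize_all` and `max_workers=4` are illustrative choices.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def summarize_section(section_text):
    """Hypothetical stand-in for one LLM API request (network-bound)."""
    time.sleep(0.1)  # simulates waiting on the remote API
    return f"summary of {len(section_text)} chars"


def summarize_all(sections, call=summarize_section, max_workers=4):
    """Run `call` on every section concurrently.

    `sections` maps a label (e.g. company and year) to the section text.
    Results come back under the same labels, in the same order.
    """
    names = list(sections)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order even if calls finish out of order
        results = list(pool.map(call, (sections[name] for name in names)))
    return dict(zip(names, results))


if __name__ == "__main__":
    demo = {
        "Target 2019": "x" * 10,
        "Target 2020": "x" * 20,
        "Costco 2019": "x" * 30,
        "Costco 2020": "x" * 40,
    }
    for label, summary in summarize_all(demo).items():
        print(label, "->", summary)
```

Threads fit here because the work is mostly waiting on network responses. A hosted API may enforce rate limits, so the worker count would probably need to stay small or be paired with retries.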

Future Work

Although the current state of the project does not fully reflect the initial objectives, there is significant potential for future work in this direction. With additional resources and time, further analysis could provide valuable insights into the financial performance of these retail giants across different periods, especially in response to significant events like the COVID-19 pandemic.

corp-filings-analysis's People

Contributors

navyagarwal
