volkovacodes / block_codes


This repository uses SEC EDGAR Schedule 13D and Schedule 13G filings to find all positions above 5% in all US stocks between 1994 and 2018.

Home Page: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3621939

R 100.00%
finance financial-data ownership-stock sec-edgar 13-d 13-g corporate-finance

block_codes's Introduction

Block_Codes

This GitHub page describes the construction of the data in the paper "Is Blockholder Diversity Detrimental?" by Miriam Schwartz-Ziv and Ekaterina Volkova (2020).

The most recent version of the paper is available on SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3621939

Step 1. Download Files.

  • download_forms.R downloads SC 13D and SC 13G files and their amendments and puts them into an SQL database.
  • The script downloads the list of all forms for each year from the SEC website; the only things you need to specify are the range of years in the loop and the working directory.
  • The code is slow and takes up to several hours to complete. To make sure that I get all possible files, I download each file twice: once from the master file for the filer and once from the master file for the subject.
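
The form-list step can be illustrated with a minimal sketch. It is written in Python for illustration only and is not taken from download_forms.R. It assumes the yearly form lists come from the EDGAR quarterly `master.idx` files, which are pipe-delimited (CIK, company name, form type, date filed, file name), and that the relevant form types are `SC 13D`, `SC 13G` and their `/A` amendments. The function names are hypothetical. Unlike the script described above, the sketch groups the filer and subject rows of one filing under its accession file name instead of downloading both copies.

```python
# Form types assumed to correspond to "SC 13D and SC 13G files and their amendments".
BLOCK_FORMS = {"SC 13D", "SC 13D/A", "SC 13G", "SC 13G/A"}


def index_url(year, quarter):
    """URL of the EDGAR quarterly master index for one year and quarter."""
    return (
        "https://www.sec.gov/Archives/edgar/full-index/"
        f"{year}/QTR{quarter}/master.idx"
    )


def parse_master_index(text):
    """Return one record per distinct 13D/13G filing found in a master.idx text.

    The same filing is listed once per related CIK (filer and subject),
    so rows are grouped by the accession file name at the end of the path.
    """
    filings = {}
    for line in text.splitlines():
        parts = line.split("|")
        if len(parts) != 5 or parts[2] not in BLOCK_FORMS:
            continue  # skips header lines and all other form types
        cik, _company, form, date_filed, path = parts
        accession = path.rsplit("/", 1)[-1]
        record = filings.setdefault(
            accession, {"form": form, "date": date_filed, "ciks": [], "path": path}
        )
        record["ciks"].append(cik)
    return filings
```

Looping `index_url` over the chosen years and quarters 1 to 4 gives the list of index files to fetch; the downloaded text of each one can then be passed to `parse_master_index`.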

Step 2. Extract and Convert Main Filings.

  • extract_body_form.R extracts the main filing from the complete submission files and converts .htm to plain text where needed.
  • The output goes into another SQL database.
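
As a rough illustration of this step (a Python sketch, not the code in extract_body_form.R), the function below pulls the body of the first `<DOCUMENT>` block out of a complete submission file and strips HTML markup when the body looks like HTML. Treating the first document as the main filing and the simple HTML detection heuristic are both assumptions; the function and class names are hypothetical.

```python
import re
from html.parser import HTMLParser

DOC_RE = re.compile(r"<DOCUMENT>(.*?)</DOCUMENT>", re.S | re.I)
TEXT_RE = re.compile(r"<TEXT>(.*?)</TEXT>", re.S | re.I)
HTML_HINT_RE = re.compile(r"<html|<body|<p[ >]|<div", re.I)


class _TextOnly(HTMLParser):
    """Collects text nodes and inserts line breaks at common block tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("p", "br", "div", "tr"):
            self.chunks.append("\n")

    def handle_data(self, data):
        self.chunks.append(data)


def main_filing_text(submission):
    """Return the first document's body as plain text, or None if absent."""
    doc = DOC_RE.search(submission)
    if doc is None:
        return None
    body = TEXT_RE.search(doc.group(1))
    if body is None:
        return None
    text = body.group(1)
    if HTML_HINT_RE.search(text):  # assumed heuristic for .htm bodies
        parser = _TextOnly()
        parser.feed(text)
        parser.close()
        text = "".join(parser.chunks)
    return text.strip()
```

Plain-text filings pass through unchanged apart from trimming, so the same function can be applied to every submission before storing the result.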

Step 3. Parse SEC Header.
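
No script is named for this step, so the following is only a sketch of what header parsing can look like, written in Python. It assumes the usual `<SEC-HEADER>` layout of a complete submission file, in which `SUBJECT COMPANY:` and `FILED BY:` sections each contain `COMPANY CONFORMED NAME:` and `CENTRAL INDEX KEY:` fields. The function name and the output layout are hypothetical, and only the first filer of a group filing is kept.

```python
import re

# "KEY:   value" lines; keys are upper-case words separated by spaces.
FIELD_RE = re.compile(r"^\s*([A-Z][A-Z ]+?):\s*(.*?)\s*$")


def parse_sec_header(header):
    """Extract form type, filing date, and subject/filer name and CIK."""
    out = {"form": None, "filed": None, "subject": {}, "filer": {}}
    section = None
    for line in header.splitlines():
        match = FIELD_RE.match(line)
        if match is None:
            continue
        key, value = match.group(1), match.group(2)
        if key == "SUBJECT COMPANY":
            section = "subject"
        elif key == "FILED BY":
            section = "filer"
        elif key == "CONFORMED SUBMISSION TYPE":
            out["form"] = value
        elif key == "FILED AS OF DATE":
            out["filed"] = value
        elif section and key == "COMPANY CONFORMED NAME":
            out[section].setdefault("name", value)  # keep the first one seen
        elif section and key == "CENTRAL INDEX KEY":
            out[section].setdefault("cik", value)
    return out
```

The subject CIK identifies the company whose shares are held, and the filer CIK identifies the blockholder, which is the pairing the later steps rely on.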

Step 4. Extract CUSIP from the filings.

  • extract_CUSIP.R returns the six- and eight-digit CUSIPs from SEC filings.
  • The output of this step is a CIK-CUSIP map, which can be downloaded in .csv format from my website (www.evolkova.info).
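
One way to picture CUSIP extraction is the Python sketch below. It is an illustration under stated assumptions, not the logic of extract_CUSIP.R: it looks for nine-character candidates within a fixed window around the word "CUSIP", accepts optional spaces or hyphens inside the candidate, and keeps only candidates whose ninth character matches the standard CUSIP check digit. All-zero values are rejected explicitly because they pass the check digit test. The function names and the window size are hypothetical.

```python
import re

# Six-character issuer code, two-character issue code, one check digit.
TOKEN_RE = re.compile(r"\b([0-9A-Z]{6})[ -]?([0-9A-Z]{2})[ -]?([0-9])\b")


def cusip_check_digit(base8):
    """Check digit for the first eight CUSIP characters (digits and A-Z)."""
    total = 0
    for i, ch in enumerate(base8.upper()):
        value = int(ch) if ch.isdigit() else ord(ch) - ord("A") + 10
        if i % 2 == 1:  # every second character is doubled
            value *= 2
        total += value // 10 + value % 10
    return (10 - total % 10) % 10


def is_valid_cusip(cusip9):
    """True for a nine-character CUSIP with a matching check digit."""
    return (
        len(cusip9) == 9
        and cusip9[8].isdigit()
        and cusip9[:8] != "00000000"
        and cusip_check_digit(cusip9[:8]) == int(cusip9[8])
    )


def extract_cusip(text, window=100):
    """Return (CUSIP6, CUSIP8) for the first valid CUSIP near the word 'CUSIP'."""
    upper = text.upper()
    for hit in re.finditer("CUSIP", upper):
        chunk = upper[max(0, hit.start() - window):hit.end() + window]
        for token in TOKEN_RE.finditer(chunk):
            cusip9 = "".join(token.groups())
            if is_valid_cusip(cusip9):
                return cusip9[:6], cusip9[:8]
    return None
```

The check digit filter is a design choice for this sketch: it discards most fragments of ordinary words and form names that happen to sit next to the "CUSIP" label, at the cost of missing filings that print only six or eight characters.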

Step 5. Extract size of the block position.

  • parsing_prc_position.R extracts the aggregate block size from the filing.
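
A minimal Python sketch of percentage extraction follows. It is not the logic of parsing_prc_position.R. It assumes the block size appears on the cover page after the phrase "PERCENT OF CLASS REPRESENTED BY AMOUNT IN ROW", followed by a row number and then a figure with a percent sign. Values outside (0, 100] are dropped, and when several cover pages report a figure the largest is returned; both rules are assumptions made for illustration, and the function names are hypothetical.

```python
import re

PCT_RE = re.compile(
    r"PERCENT\s+OF\s+CLASS\s+REPRESENTED\s+BY\s+AMOUNT\s+IN\s+ROW"
    r"\s*\(?\s*\d+\s*\)?"                      # row number, e.g. (11) or 9
    r"[^0-9%]{0,100}?(\d{1,3}(?:\.\d+)?)\s*%",  # first figure followed by %
    re.I,
)


def block_percentages(text):
    """All plausible cover-page percentages, in order of appearance."""
    values = []
    for match in PCT_RE.finditer(text):
        pct = float(match.group(1))
        if 0 < pct <= 100:  # ownership above 100% is treated as a parsing error
            values.append(pct)
    return values


def aggregate_block_size(text):
    """Largest plausible percentage in the filing, or None if none is found."""
    values = block_percentages(text)
    return max(values) if values else None
```

Returning `None` rather than a guessed number keeps unparsed filings visible, so they can be counted or reviewed by hand.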

Step 6. Extract identity of blockholders.

Step 7. Aggregate information into blockholder-company-year panel.

Step 8. Download insider ownership transactions.

  • Added in 2022 to improve data accuracy

Step 9. Add missing insider blocks.

  • Added in 2022 to improve data accuracy

block_codes's People

Contributors

psiphitheta, volkovacodes


block_codes's Issues

Duplicate CUSIP values

Not sure if this repo is still maintained but on the off chance that it is: why are there so many duplicate CUSIP values?

Some major offenders' CUSIP6 are:

  • 13GUND
  • 13GCUS
  • 000000
  • 549SCH
  • 20549S
  • 13DUND
  • 13G(RU

Cusip all zeros

Hello Ekaterina,

Thank you for the work, but may I know why there are so many observations with cusip as "00000000"?

Block ownership above 100

Thank you very much for your work. I find that some values of 'block_hold' are above 100, may I know how you handled this issue?
