
openownership / bodsanalysis


Notebooks and code for analysing data published to the Beneficial Ownership Data Standard

Home Page: https://www.openownership.org/en/publications/analysis-notebooks-and-dashboards-for-beneficial-ownership-data-standard-bods-data/

License: MIT License

Jupyter Notebook 44.66% Python 55.34%
beneficial-ownership open-data open-source python beneficial-ownership-data

bodsanalysis's People

Contributors

dependabot[bot], kathryn-ods, kd-ods, lgs85, stephenabbott


bodsanalysis's Issues

Identifying circular ownership in BODS data

Circular ownership structures involve two or more legal vehicles directly or indirectly owning each other. In the UK, such structures are prohibited because of the risks they pose.

In 2019, Global Witness found that 0.01% of companies registered in the UK in 2019 were involved in circular ownership schemes, in violation of UK law (see the Getting the UK's house in order report).

How can we best identify circular ownership in BODS data and provide an up-to-date query showing the number of companies which might exhibit circular ownership? See https://github.com/Global-Witness/uk-beneficial-ownership-analysis-2019-public
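One possible starting point is to treat ownership as a directed graph and look for strongly connected components, since any group of entities that directly or indirectly own each other forms one. The sketch below is not taken from qbods.py or the Global Witness repository. It assumes the ownership-or-control statements have already been reduced to (owner_id, owned_id) pairs, and the function name find_ownership_cycles is hypothetical.

```python
from collections import defaultdict


def find_ownership_cycles(edges):
    """Return groups of entities that directly or indirectly own each other.

    edges: iterable of (owner_id, owned_id) pairs (an assumed, simplified
    view of BODS ownership-or-control statements).
    """
    graph = defaultdict(set)    # owner -> entities it owns
    reverse = defaultdict(set)  # owned entity -> its owners
    nodes = set()
    for owner, owned in edges:
        graph[owner].add(owned)
        reverse[owned].add(owner)
        nodes.update((owner, owned))

    # Pass 1 (Kosaraju's algorithm): record nodes in order of DFS completion.
    visited, order = set(), []
    for start in nodes:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, children = stack[-1]
            for child in children:
                if child not in visited:
                    visited.add(child)
                    stack.append((child, iter(graph[child])))
                    break
            else:
                order.append(node)
                stack.pop()

    # Pass 2: walk the reversed graph in reverse completion order; each
    # walk collects exactly one strongly connected component.
    assigned, cycles = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, todo = [], [start]
        while todo:
            node = todo.pop()
            component.append(node)
            for parent in reverse[node]:
                if parent not in assigned:
                    assigned.add(parent)
                    todo.append(parent)
        # Circular if several members own each other, or one entity owns itself.
        if len(component) > 1 or start in graph[start]:
            cycles.append(sorted(component))
    return sorted(cycles)
```

The number of companies that might exhibit circular ownership would then be the total number of identifiers across the returned groups. The walk is iterative rather than recursive so that long ownership chains do not hit Python's recursion limit.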

Add set of queries to capture trends in company dissolution

I'd like us to add a new set of queries to https://github.com/openownership/bodsanalysis/blob/main/qbods.py which show trends in the data we have on companies which have reported both an incorporation date and a dissolution date.

This idea is prompted by recent stories in the UK. As highlighted by Graham Barrow on Twitter and during a parliamentary select committee hearing on 8 November 2022 (see Guardian coverage), 'burner companies' are companies that exist only for a short period of time and can be linked to short-term fraudulent activity.

Proposed queries:

  1. Number of companies in a BODS dataset with a dissolutionDate
  2. Average lifespan of a company in a whole BODS dataset from foundingDate to dissolutionDate
  3. Plot annual trend in average lifespan of a company in a BODS dataset
  4. Average lifespan of a company in the last year from a BODS dataset
  5. Map the addresses of companies dissolved in the last year to see if there is any geographic pattern

These should be added alongside other queries aligned with the 'Up to date and auditable' principle.
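As a rough illustration of queries 1 to 4, the sketch below computes the count, the overall average lifespan and the average lifespan per dissolution year. It is not the qbods.py implementation. It assumes entity statements are available as dicts carrying full ISO dates (YYYY-MM-DD) in foundingDate and dissolutionDate, and the function name lifespan_summary is hypothetical. Partial dates would need extra handling, and the mapping in query 5 is not covered.

```python
from collections import defaultdict
from datetime import date


def lifespan_summary(entities):
    """Summarise company lifespans for entities that report both dates.

    entities: iterable of dicts with optional 'foundingDate' and
    'dissolutionDate' strings in YYYY-MM-DD form (an assumption).
    """
    dissolved_count = 0
    lifespans_by_year = defaultdict(list)  # dissolution year -> lifespans in days
    for entity in entities:
        dissolved = entity.get("dissolutionDate")
        if not dissolved:
            continue
        dissolved_count += 1
        founded = entity.get("foundingDate")
        if not founded:
            continue  # counted as dissolved, but no lifespan can be derived
        # Keep only the date part in case a full timestamp is supplied.
        start = date.fromisoformat(founded[:10])
        end = date.fromisoformat(dissolved[:10])
        lifespans_by_year[end.year].append((end - start).days)

    all_days = [d for days in lifespans_by_year.values() for d in days]
    return {
        "dissolved_count": dissolved_count,
        "mean_lifespan_days": sum(all_days) / len(all_days) if all_days else None,
        "mean_lifespan_days_by_year": {
            year: sum(days) / len(days)
            for year, days in sorted(lifespans_by_year.items())
        },
    }
```

The per-year dictionary can be passed straight to a plotting library for the annual trend, and its most recent entry gives the average lifespan for companies dissolved in the last year.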

Sort out query names

The numbered query names in qbods.py aren't suited to a generalised module, and we should come up with a better system for naming functions.

Consider how to deal with very large datasets

At present, the latvia_demo uses a dataset that is small enough to be downloaded onto a local machine or Deepnote server, and for its tables to be read into memory. However, for very large BODS datasets we will start to encounter memory issues. We will need to consider how best to deal with this; most likely it will involve pointing the queries at a database.
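To illustrate what pointing a query at a database could look like, the sketch below pushes an aggregate down into SQL so that only the result is held in memory, not the whole table. It uses SQLite from the Python standard library purely for demonstration. The table name entity_statement, its columns and the function name count_dissolved are all assumptions, not the layout of any published BODS dataset.

```python
import sqlite3


def count_dissolved(conn):
    """Count entity rows with a dissolution date; the database does the scan."""
    row = conn.execute(
        "SELECT COUNT(*) FROM entity_statement "
        "WHERE dissolutionDate IS NOT NULL AND dissolutionDate != ''"
    ).fetchone()
    return row[0]


# Demonstration against an in-memory database with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entity_statement "
    "(statementID TEXT, foundingDate TEXT, dissolutionDate TEXT)"
)
conn.executemany(
    "INSERT INTO entity_statement VALUES (?, ?, ?)",
    [
        ("s1", "2020-01-01", "2020-06-01"),
        ("s2", "2019-03-01", None),
        ("s3", "2021-02-01", "2022-02-01"),
    ],
)
print(count_dissolved(conn))
```

The same function would work unchanged against an on-disk SQLite file, and a similar pattern applies to other databases through their own DB-API drivers.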

Check if a company has a BO under the age of 20 or over 80

Consider an amendment to q224, inspired by Transcrime research, that counts the companies registered in any jurisdiction where at least one BO is under 20 or over 80:

Bosisio et al. (2021) found that 3% of companies registered in Lombardy (Italy) had at least one BO or director displaying this anomaly (being under 20 years old or over 80 years old).

https://www.transcrime.it/en/publications/datacros/
https://www.transcrime.it/en/publications/the-changes-in-ownership-of-italian-companies-during-the-covid-19-emergency/
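A minimal sketch of such a check follows. It is not the existing q224 logic. It assumes each beneficial owner has been joined to a company as a (company_id, birth_date) pair with a full ISO birth date, and the function names age_on and companies_with_age_anomaly are hypothetical. "Under 20" and "over 80" are read strictly, so owners aged exactly 20 or 80 are not flagged.

```python
from datetime import date


def age_on(birth_date, as_of):
    """Age in whole years on the given date."""
    years = as_of.year - birth_date.year
    # Subtract a year if the birthday has not yet occurred in as_of's year.
    if (as_of.month, as_of.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def companies_with_age_anomaly(owners, as_of, min_age=20, max_age=80):
    """Return ids of companies with at least one BO under min_age or over max_age.

    owners: iterable of (company_id, birth_date_iso) pairs, where the birth
    date is a YYYY-MM-DD string or None (an assumed, pre-joined input).
    """
    flagged = set()
    for company_id, birth_iso in owners:
        if not birth_iso:
            continue  # no birth date reported, so nothing to check
        age = age_on(date.fromisoformat(birth_iso[:10]), as_of)
        if age < min_age or age > max_age:
            flagged.add(company_id)
    return flagged
```

Dividing the size of the returned set by the number of companies with at least one dated BO would give a percentage that can be set alongside the 3% figure reported for Lombardy, keeping in mind that figure also covers directors.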

Develop generalised functions

The structure of qbods.py is largely one query == one function. This results in a lot of redundant, inefficient and ugly code. We should reorganise the module and develop a set of generalised functions that are common to multiple queries.
