Crunchbase Scraper and Parser

About the software

A Python 3 tool that scrapes Crunchbase profiles (organization or person) and parses the scraped pages into structured data.

Prerequisites

  • Python 3.7
  • pip
  • selenium
  • webdriver_manager
  • beautifulsoup4

Installation

Install the required packages from requirements.txt:

pip install -r requirements.txt

Alternatively, install the libraries individually with pip:

pip install selenium
pip install webdriver_manager
pip install beautifulsoup4
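
To confirm the installation worked, a short check along these lines may help. It is only a sketch and not part of the repository: the helper name `missing_modules` is hypothetical, and the version check assumes that Python 3.7 or newer is acceptable, which the prerequisites do not state explicitly. Note that beautifulsoup4 is imported as `bs4`.

```python
import importlib.util
import sys

# Import names, which can differ from the pip package names:
# beautifulsoup4 is imported as "bs4".
REQUIRED_MODULES = ("selenium", "webdriver_manager", "bs4")


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names in `modules` that this interpreter cannot find."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: 3.7 is treated as a minimum version, not an exact requirement.
    if sys.version_info < (3, 7):
        print("Python 3.7 or newer is expected, found", sys.version.split()[0])
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If any module is reported missing, rerunning the matching `pip install` command above should resolve it.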

Usage

from code.crunchbase import Crunchbase
import json

# Some Crunchbase profiles (both organization and person) to scrape 
crunchbase_urls = {
    "Google": "https://www.crunchbase.com/organization/google",
    "Larry Page": "https://www.crunchbase.com/person/larry-page",
}

crunchbase = Crunchbase()

# Log in to Crunchbase Pro if Pro information needs to be scraped and parsed.
# Uncomment the lines below and fill in your credentials.
# email = 'XXXXXX'
# password = 'XXXXXX'
# crunchbase.login(email=email, password=password)

# List to store the parsed data
crunchbase_data = list()

# Iterates through the Crunchbase urls to scrape the data
for name, url in crunchbase_urls.items():
    # Set pro=True if logged in to Crunchbase Pro
    data = crunchbase.process_profile(pro=False, name=name, url=url)
    if data is not None:
        crunchbase_data.append(data)

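    # Rewrite the JSON file after each profile with everything scraped so far
    # (the data/crunchbase/ directory must already exist)
    with open('data/crunchbase/demo_crunchbase_data.json', 'w', newline='', encoding='utf-8') as json_file:
        json.dump(crunchbase_data, fp=json_file, indent=3, ensure_ascii=False)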
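
The usage script above hardcodes the login credentials and assumes the output directory already exists. The following sketch shows one way to work around both. It is not part of the repository: the helper names `load_credentials` and `save_profiles` and the environment variable names `CRUNCHBASE_EMAIL` and `CRUNCHBASE_PASSWORD` are hypothetical, chosen only for illustration.

```python
import json
import os
from pathlib import Path


def load_credentials(email_var="CRUNCHBASE_EMAIL", password_var="CRUNCHBASE_PASSWORD"):
    """Return (email, password) from the environment, or None if either is unset."""
    email = os.environ.get(email_var)
    password = os.environ.get(password_var)
    if not email or not password:
        return None
    return email, password


def save_profiles(profiles, path):
    """Write the list of parsed profiles to `path` as UTF-8 JSON."""
    path = Path(path)
    # Create the parent directory (e.g. data/crunchbase/) if it is missing.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as json_file:
        json.dump(profiles, json_file, indent=3, ensure_ascii=False)
```

In the script above, the result of `load_credentials()` could be unpacked and passed to `crunchbase.login(email=..., password=...)` when it is not `None`, and `save_profiles(crunchbase_data, 'data/crunchbase/demo_crunchbase_data.json')` could stand in for the `with open(...)` block. Writing with an explicit UTF-8 encoding matters because `ensure_ascii=False` emits non-ASCII characters as-is, which can fail on platforms whose default encoding is not UTF-8.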

Citation

@inproceedings{thirupathi2021machine,
  title = {A Machine Learning Approach to Detect Early Signs of Startup Success},
  author = {Thirupathi, Abhinav Nadh and Alhanai, Tuka and Ghassemi, Mohammad M},
  booktitle = {Proceedings of the Second ACM International Conference on AI in Finance},
  pages = {1--8},
  year = {2021},
  publisher = {Association for Computing Machinery},
  doi = {10.1145/3490354.3494374},
  series = {ICAIF '21}
}
