suhailroushan13 / techcrunch-api

Home Page: https://www.npmjs.com/package/techcrunch-api

TechCrunch API

TechCrunch API is a Node.js package that scrapes articles from TechCrunch by category or tag. It is designed for Ubuntu and other Debian-based distributions that support sudo commands, and it uses Puppeteer to navigate and scrape content in a headless Chromium browser.

Features

  • Scrape by Category: Automatically retrieve all articles under a specified category.
  • Scrape by Tag: Collect articles that are tagged with a specific keyword.
  • Headless Browser Support: Runs Chromium in headless mode to scrape dynamic content.
  • Optimized for Ubuntu: Includes installation instructions specifically for Ubuntu, but compatible with other Linux distributions.

Prerequisites

Before installing the TechCrunch API package, make sure your system has the following installed:

  • Node.js (Version 14 or later recommended)
  • Puppeteer
  • Dependencies required for Puppeteer and headless Chromium
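
The Node.js requirement above can be checked from a script before anything else runs. The following is a minimal sketch, not part of the package: `MIN_NODE_MAJOR` and `isSupportedNode` are hypothetical names introduced here for illustration, and the threshold of 14 simply mirrors the recommendation in the list.

```javascript
// Minimum major version, taken from the prerequisites list (an assumption of this sketch).
const MIN_NODE_MAJOR = 14;

// Returns true when a version string such as "18.17.0" or "v18.17.0"
// has a major version at or above the minimum.
function isSupportedNode(version, minMajor = MIN_NODE_MAJOR) {
  const major = Number.parseInt(String(version).replace(/^v/, "").split(".")[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

// Warn (rather than exit) so the caller can decide how strict to be.
if (!isSupportedNode(process.versions.node)) {
  console.warn(
    `Node.js ${process.versions.node} is older than the recommended ${MIN_NODE_MAJOR}.x`
  );
}
```

The check only warns, because the README lists version 14 as a recommendation rather than a hard requirement.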

Installation

Follow these steps to set up the TechCrunch API package:

Step 1: Install System Dependencies

Open a terminal and execute the following commands to install necessary libraries:

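# Install Puppeteer in your project
npm install puppeteer
# Refresh the package lists, then install the libraries headless Chromium needs
sudo apt-get update
sudo apt-get install -y libgbm-dev xvfb chromium-browser libvpx7 libevent-2.1-7 libharfbuzz-icu0 libwebpdemux2 libenchant-2-2 libsecret-1-0 libmanette-0.2-0 libflite1 libgles2-mesa
# Start a virtual display in the background and point applications at it
Xvfb :99 -screen 0 1920x1080x24 &
export DISPLAY=:99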

Step 2: Install TechCrunch API Package

Install the package via npm with the following command:

npm install techcrunch-api

Usage

After installation, you can use the package in your Node.js scripts as follows:

ES6 Syntax

import { getByCategory, getByTag } from "techcrunch-api";

// Fetch articles by category using async/await
// Valid categories/tags for fetching articles (must be used in lowercase):
// 1. media-entertainment
// 2. transportation
// 3. cryptocurrency
// 4. security
// 5. artificial-intelligence
// 6. apps
// 7. fintech
// 8. startups
// 9. venture
// 10. hardware

const fetchArticles = async () => {
  try {
    const articles = await getByCategory("security"); 
    console.log(articles);
  } catch (error) {
    console.error("Error fetching articles:", error);
  }
};

fetchArticles();

// Fetch articles by tag using async/await
const fetchTag = async () => {
  try {
    const taggedArticles = await getByTag("apis");
    console.log(taggedArticles);
  } catch (error) {
    console.error("Error fetching articles by tag:", error);
  }
};

fetchTag();
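
Since category names must be lowercase and come from the list in the comments above, it can help to validate input before calling the scraper. The sketch below is illustrative only: `VALID_CATEGORIES`, `normalizeCategory`, and `fetchCategories` are hypothetical helpers, not part of techcrunch-api, and the fetch function is passed in as an argument so the snippet makes no assumption about the shape of the data the package returns.

```javascript
// Categories copied from the usage comments above.
const VALID_CATEGORIES = new Set([
  "media-entertainment", "transportation", "cryptocurrency", "security",
  "artificial-intelligence", "apps", "fintech", "startups", "venture", "hardware",
]);

// Trims and lowercases the input; throws for anything outside the list.
function normalizeCategory(input) {
  const category = String(input).trim().toLowerCase();
  if (!VALID_CATEGORIES.has(category)) {
    throw new Error(`Unknown category: ${input}`);
  }
  return category;
}

// Hypothetical helper: fetches several categories one after another.
// `fetchCategory` is expected to be getByCategory from techcrunch-api.
async function fetchCategories(names, fetchCategory) {
  const results = {};
  for (const name of names) {
    const category = normalizeCategory(name);
    results[category] = await fetchCategory(category);
  }
  return results;
}
```

With the package imported as shown earlier, this could be called as `fetchCategories(["Security", "fintech"], getByCategory).then(console.log)`. The requests run sequentially on the assumption that each call drives its own headless browser session, which would make many parallel calls expensive.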

Running the Scraper

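Save the example above as app.js and run it with Node.js. Because the example uses ES module import syntax, set "type": "module" in package.json or name the file app.mjs.

node app.js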
