This project uses Scrapy, a web scraping framework for Python, to extract data from websites.
Scrapy is an open-source, collaboratively developed web crawling framework. It provides tools for crawling websites, extracting data, and saving it in a structured format. It is highly extensible and widely used for scraping large amounts of data efficiently.
Prerequisites:
- Python (>=3.6)
- pip (Python package installer)
Fork the repository:
- Click on the "Fork" button at the top right of the repository page on GitHub.
Clone the forked repository:
git clone https://github.com/Harshavardhan-Yaddalapuri/WebScrappingUsingScrapy.git
cd WebScrappingUsingScrapy
To run the spider:
scrapy crawl bookspider
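To keep the scraped items, Scrapy's `-o` option writes them to a file, for example `scrapy crawl bookspider -o books.jl`. The file name `books.jl` is an assumption for illustration; this repository does not define it. A minimal sketch for reading such a JSON Lines export back into Python, where `load_items` is a hypothetical helper and not part of this project:

```python
import json
from pathlib import Path


def load_items(path):
    """Read a JSON Lines export (one JSON object per line) into a list of dicts."""
    items = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                items.append(json.loads(line))
    return items
```

Calling `load_items("books.jl")` after a crawl should return one dictionary per scraped item, with whatever fields the spider yields.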
Create your own spider:
scrapy genspider <YourSpiderName> <website-to-scrape.com>