This project queries a website that publishes technology news. To do this, it uses data scraping, a technique for collecting data from online platforms. The data is captured from the content that the pages generate, by programs that "scrape" the information. After the scraping is finished, the data is saved in a database.
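As a rough illustration of the scraping step, the sketch below extracts a title, date, and tags from one page using only the standard library. The HTML structure, the class and attribute names, the `NewsParser` class, and the in-memory `database` list are all assumptions made for this example; the project's actual scraper and database layer may look quite different.

```python
from html.parser import HTMLParser

# Hypothetical page markup; the real site's structure is not described here.
SAMPLE_HTML = """
<html><body>
  <article>
    <h1 class="entry-title">New chip announced</h1>
    <time datetime="2023-01-15">15 Jan 2023</time>
    <a rel="tag">hardware</a>
    <a rel="tag">chips</a>
  </article>
</body></html>
"""


class NewsParser(HTMLParser):
    """Collects the title, date and tags of one news page."""

    def __init__(self):
        super().__init__()
        self.news = {"title": "", "date": "", "tags": []}
        self._field = None  # field that the next text node belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h1" and attrs.get("class") == "entry-title":
            self._field = "title"
        elif tag == "time":
            # The date is read from the attribute, not from the visible text.
            self.news["date"] = attrs.get("datetime", "")
        elif tag == "a" and attrs.get("rel") == "tag":
            self._field = "tag"

    def handle_data(self, data):
        text = data.strip()
        if not text or self._field is None:
            return
        if self._field == "title":
            self.news["title"] = text
        elif self._field == "tag":
            self.news["tags"].append(text)
        self._field = None


def scrape_news(html):
    """Parse one page and return its data as a dictionary."""
    parser = NewsParser()
    parser.feed(html)
    return parser.news


database = []  # in-memory stand-in for the real database
database.append(scrape_news(SAMPLE_HTML))
```

In a real run the HTML would come from an HTTP request to the news site, and the resulting dictionary would be inserted into the database instead of a list.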
With the data saved and structured, the program lets you search the news by title, date, tags, and category.
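The four searches can be sketched as simple filters. This is a minimal example over an in-memory list rather than the real database; the field names, the `YYYY-MM-DD` date format, the case-insensitive matching, and the function names are assumptions for illustration only.

```python
import re
from datetime import datetime

# Hypothetical records, shaped like the output of a scraper.
NEWS = [
    {
        "title": "New chip announced",
        "date": "2023-01-15",
        "tags": ["hardware", "chips"],
        "category": "Hardware",
    },
    {
        "title": "Python release notes",
        "date": "2023-02-01",
        "tags": ["python", "release"],
        "category": "Software",
    },
]


def search_by_title(news, title):
    """Case-insensitive substring match on the title."""
    pattern = re.compile(re.escape(title), re.IGNORECASE)
    return [n["title"] for n in news if pattern.search(n["title"])]


def search_by_date(news, date):
    """Exact match on a YYYY-MM-DD date; raises ValueError if malformed."""
    datetime.strptime(date, "%Y-%m-%d")
    return [n["title"] for n in news if n["date"] == date]


def search_by_tag(news, tag):
    """Case-insensitive exact match against any of the tags."""
    wanted = tag.lower()
    return [
        n["title"] for n in news if wanted in (t.lower() for t in n["tags"])
    ]


def search_by_category(news, category):
    """Case-insensitive exact match on the category."""
    wanted = category.lower()
    return [n["title"] for n in news if n["category"].lower() == wanted]
```

For example, `search_by_tag(NEWS, "Chips")` matches the first record despite the different capitalization, and `search_by_date(NEWS, "15/01/2023")` raises `ValueError` because the date is not in the assumed format.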
An interactive menu is available so that the user can run these operations more easily.
Clone the application using the `git clone` command. After that, enter the project folder using the command `cd tech-news`.
- Create the virtual environment for the project
python3 -m venv .venv && source .venv/bin/activate
- Install the dependencies
python3 -m pip install -r dev-requirements.txt
In the root folder of the project, use the command `docker-compose up -d mongodb`.
- In the terminal, use the command:
python3 -m tech_news.menu
This command opens the menu, which offers several options for viewing the data collected by the scrape.
If this is your first time using the application, start with option `0` on the menu to populate the database.
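
A menu like this is often built as a table that maps each option to a function. The sketch below shows that idea only; apart from option 0 populating the database, the option numbers, labels, messages, and function names are hypothetical and are not taken from `tech_news.menu`.

```python
def populate_database():
    # Placeholder: the real option would run the scraper and save the news.
    return "database populated"


def search_by_title():
    # Placeholder: the real option would ask for a title and query the data.
    return "searching by title"


# Maps the text typed by the user to a label and the function to run.
OPTIONS = {
    "0": ("Populate the database", populate_database),
    "1": ("Search by title", search_by_title),
}


def render_menu(options):
    """Build the text shown to the user."""
    lines = [f" {key} - {label}" for key, (label, _) in options.items()]
    return "Select one of the options:\n" + "\n".join(lines)


def dispatch(options, choice):
    """Run the function registered for `choice`, or report an invalid one."""
    if choice not in options:
        return "Invalid option"
    _, action = options[choice]
    return action()


def main():
    """Show the menu once and run the selected option."""
    print(render_menu(OPTIONS))
    print(dispatch(OPTIONS, input("> ").strip()))
```

Keeping `dispatch` separate from the `input` call makes the option handling easy to test without a terminal; calling `main()` runs the interactive version.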