This Python module provides simple functionality for (a) retrieving HTML data from web resources using BeautifulSoup and (b) storing the retrieved data in local storage.
You'll find the following files:
- datamanagement.py: The module itself, which contains the two classes "DataSource" and "DataStorage".
- projectconfig.py: Contains configuration data, such as the names of the directories to which data is written.
- pipeline.py: Runs the process of retrieving data from URLs and storing it locally.
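
The retrieve-then-store flow described above could look roughly like the following. This is a minimal sketch, not the actual contents of datamanagement.py: only the class names "DataSource" and "DataStorage" come from this README, while the method names (fetch, prettify, save), the OUTPUT_DIR setting, and the run helper are assumptions made for illustration. It also assumes the page is downloaded with the standard library and that BeautifulSoup is used for parsing.

```python
from pathlib import Path
from urllib.request import urlopen

# Hypothetical stand-in for a directory name defined in projectconfig.py.
OUTPUT_DIR = Path("data")


class DataSource:
    """Illustrative only; the real class in datamanagement.py may differ."""

    def __init__(self, url):
        self.url = url

    def fetch(self):
        # Download the raw HTML of the page as text.
        with urlopen(self.url) as response:
            return response.read().decode("utf-8", errors="replace")

    @staticmethod
    def prettify(html):
        # Imported lazily so the storage half works without bs4 installed.
        from bs4 import BeautifulSoup

        # Parse the markup and return a consistently indented version.
        return BeautifulSoup(html, "html.parser").prettify()


class DataStorage:
    """Illustrative only; writes text files below a target directory."""

    def __init__(self, directory=OUTPUT_DIR):
        self.directory = Path(directory)

    def save(self, name, html):
        # Create the target directory on first use, then write the file.
        self.directory.mkdir(parents=True, exist_ok=True)
        target = self.directory / name
        target.write_text(html, encoding="utf-8")
        return target


def run(url, name, directory=OUTPUT_DIR):
    # Hypothetical pipeline step: fetch, tidy up, store; returns the file path.
    source = DataSource(url)
    html = DataSource.prettify(source.fetch())
    return DataStorage(directory).save(name, html)
```

Under these assumptions, calling run("https://example.com", "example.html") would write the parsed page to data/example.html. Keeping the directory name in one place mirrors the role this README gives to projectconfig.py.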