The application opens the Craigslist site for the San Francisco Bay Area, goes through the job listings (software, QA, DBA, etc.), and scrapes the title, date posted, neighborhood, URL, job description, and compensation (if available)
of each listing using Puppeteer and Cheerio, an implementation of core jQuery for the server.
The data is then stored in a MongoDB database after the program is stopped. I used mLab to set up the MongoDB database.
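The extraction step described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the CSS selectors (`li.result-row`, `a.result-title`, `.result-hood`) and the helper names `normalizeListing` and `parseListings` are assumptions made for the example, and the real Craigslist markup may differ. It assumes the page HTML has already been fetched (for instance with Puppeteer's `page.content()`).

```javascript
// Sketch only: selectors and helper names are hypothetical, not taken from the project.

// Tidy up one scraped record: collapse whitespace, strip the parentheses
// around the neighborhood, and use null when an optional field is missing.
function normalizeListing(raw) {
  const clean = (s) => (s || '').replace(/\s+/g, ' ').trim();
  return {
    title: clean(raw.title),
    datePosted: clean(raw.datePosted),
    neighborhood: clean(raw.neighborhood).replace(/^\(|\)$/g, ''),
    url: clean(raw.url),
    description: clean(raw.description) || null,
    compensation: clean(raw.compensation) || null,
  };
}

// Parse a listing page's HTML with Cheerio and return an array of records.
function parseListings(html) {
  // Loaded lazily so normalizeListing can be used without Cheerio installed.
  const cheerio = require('cheerio');
  const $ = cheerio.load(html);
  return $('li.result-row')
    .map((_, el) => {
      const row = $(el);
      return normalizeListing({
        title: row.find('a.result-title').text(),
        datePosted: row.find('time').attr('datetime'),
        neighborhood: row.find('.result-hood').text(),
        url: row.find('a.result-title').attr('href'),
      });
    })
    .get();
}
```

In this sketch the description and compensation stay `null` because they would come from each listing's own page rather than from the list view.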
I also implemented a few other Puppeteer use cases in the `examples`
directory, such as taking a screenshot of a website or scraping data from a table. I intend to reuse them as small features in other projects.
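As an illustration of the screenshot use case, a minimal sketch might look like the following. The function names `screenshotName` and `takeScreenshot` and the `networkidle2` wait condition are assumptions for this example and are not necessarily what the files in the examples directory do.

```javascript
// Sketch only: function names and options are hypothetical, not the project's own.

// Derive a file name such as "example-com.png" from a URL's host name.
function screenshotName(url) {
  const host = new URL(url).hostname;
  return host.replace(/[^a-z0-9]+/gi, '-') + '.png';
}

// Open the page in headless Chrome and save a full-page screenshot.
async function takeScreenshot(url, path = screenshotName(url)) {
  // Loaded lazily; requires `npm install puppeteer`.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle2' });
    await page.screenshot({ path, fullPage: true });
  } finally {
    // Always close the browser, even if navigation fails.
    await browser.close();
  }
  return path;
}
```

Calling `takeScreenshot('https://example.com')` would resolve to the path of the saved image once the browser has closed.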
- Run `npm install` to install the dependencies.
- Run `npm start` to start the scraper application.
- Or check `package.json` for other scripts that run the other Puppeteer examples.