coursecrawler's Introduction

uOttawa Course Crawler

This application collects data for TimetablePlanner by crawling the University of Ottawa website for sessions, courses and timeslots. It exports three files:

Session.json (name, identifier)
Course.json (code, name, sessions)
Timeslot.json (sessionIdentifier, courseCode, section, activity, place, professor, day)
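As a rough illustration of the three exports, the records could be modeled as Codable structs. The field names come from the list above; everything else is an assumption made for this sketch and not taken from Spider.swift: every field type, the idea that `sessions` holds session identifiers, the `exportJSON` helper, and the placeholder values.

```swift
import Foundation

// Field names come from the list above; every type here is an assumption.
struct Session: Codable, Equatable {
    let name: String
    let identifier: String
}

struct Course: Codable, Equatable {
    let code: String
    let name: String
    let sessions: [String]  // assumed: identifiers of the sessions offering the course
}

struct Timeslot: Codable, Equatable {
    let sessionIdentifier: String
    let courseCode: String
    let section: String
    let activity: String
    let place: String
    let professor: String
    let day: String
}

// Hypothetical helper: encodes an array of records as the contents of one JSON file.
func exportJSON<T: Encodable>(_ records: [T]) throws -> Data {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.prettyPrinted, .sortedKeys]
    return try encoder.encode(records)
}

// Placeholder values, not real crawl output.
let sampleSessions = [Session(name: "Example Session", identifier: "example-id")]
let sessionData = try exportJSON(sampleSessions)

// Decoding the bytes again confirms the encoded form is valid JSON for this model.
let decodedSessions = try JSONDecoder().decode([Session].self, from: sessionData)
```

The resulting `Data` could then be written to Session.json with `Data.write(to:)`. The exact layout the Parse Dashboard expects on import is not covered here.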

This data can then be imported into Parse via the Parse Dashboard. Originally I wanted the entire process to be automated, but that isn't practical because of Parse's limit of 1,800 queries per minute. Over 20,000 objects need to be saved. I could write a script to save them in batches, but at that rate it would take over 10 minutes to run (20,000 / 1,800 is roughly 11 minutes) even if every request succeeded. The problem is that there is no way to find out how many queries remain until the limit is reached. In addition, requests from the TimetablePlanner application count towards the same limit. Given the complexity of the problem, I decided that it was much simpler and faster to export the data to JSON and upload it manually.
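The batching estimate can be checked with a little arithmetic. This is only a sketch under stated assumptions: exactly 20,000 objects (the real count is higher), one request per saved object, and no competing traffic from TimetablePlanner. The `batches` helper is hypothetical and does not appear in the project.

```swift
// Assumptions: exactly 20,000 objects, one request per object,
// and the whole per-minute quota available to the upload script.
let objectCount = 20_000
let requestsPerMinute = 1_800

// Number of one-minute windows needed, rounded up.
let windowsNeeded = (objectCount + requestsPerMinute - 1) / requestsPerMinute

// Lower bound on run time in minutes if the limit is hit exactly.
let minimumMinutes = Double(objectCount) / Double(requestsPerMinute)

// Hypothetical helper: splits items into consecutive batches of at most `size`.
func batches<T>(of items: [T], size: Int) -> [[T]] {
    precondition(size > 0, "batch size must be positive")
    return stride(from: 0, to: items.count, by: size).map { start in
        Array(items[start..<min(start + size, items.count)])
    }
}

let uploadBatches = batches(of: Array(0..<objectCount), size: requestsPerMinute)
print("windows: \(windowsNeeded), minimum minutes: \(minimumMinutes)")
print("batches: \(uploadBatches.count), last batch size: \(uploadBatches.last?.count ?? 0)")
```

Under these assumptions the upload needs 12 batches, 11 full ones and a final batch of 200 objects, which matches the "over 10 minutes" estimate. Any requests made by TimetablePlanner during the run would shrink the usable batch size by an amount the script cannot observe.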

All the logic of the web crawler is in Spider.swift. To run it, call Spider.crawl() and watch it do its thing. There is no GUI yet; all updates are posted to the console.

coledunsby / coursecrawler
