pengwei715 / spot

Spatial parking optimized tracking system to avoid parking ticket

Home Page: http://datapipeline.online

License: MIT License

Scala 34.94% Shell 12.31% Python 26.35% JavaScript 11.44% HTML 14.96%
spark spark-sql flask googlemaps scala javascript python airflow sbt sbt-assembly

Spot!

This is a project I completed during the Insight Data Engineering program (Boston, Summer 2020). Visit datapipeline.online to see it in action (or watch it here).

Table of Contents

  1. Usage
  2. System
  3. Data Source
  4. Setup
  5. Run the system
  6. Contact Information

Usage

This project aims to tell drivers whether a location has a higher-than-average rate of parking citations.

Red means that the number of parking citations is more than 1.5x the average within the 250 m x 250 m spatial buffer for the chosen time unit. Yellow means that the number of parking citations is between 0.8x and 1.5x the average. Green means that the number of parking citations is less than 0.8x the average.
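The thresholds above can be sketched as a small function. This is only an illustration, not the project's actual code: the name `classify_risk`, its parameters, and the handling of values exactly at 0.8x or 1.5x are assumptions, since the text does not specify them.

```python
def classify_risk(cell_count: float, average_count: float) -> str:
    """Map a citation count to a colour using the 0.8x / 1.5x thresholds.

    Hypothetical helper: the name and the boundary handling are assumptions.
    """
    if average_count <= 0:
        raise ValueError("average_count must be positive")
    ratio = cell_count / average_count
    if ratio > 1.5:
        return "red"      # more than 1.5x the average
    if ratio >= 0.8:
        return "yellow"   # between 0.8x and 1.5x the average
    return "green"        # less than 0.8x the average
```

For instance, a cell with 20 citations against an average of 10 would be classified as red under these assumptions.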

The system requires three inputs.

  • Timestamp, in the format "yyyy/mm/dd hh:mm:ss".
  • Time unit (hour, weekday, week of month, or day of month).
  • Address
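As a rough sketch of how the timestamp input could be turned into the four time units, the snippet below parses the "yyyy/mm/dd hh:mm:ss" format with the standard library. The function name `extract_time_units`, the weekday numbering, and the definition of "week of month" (days 1 to 7 count as week 1) are assumptions made for illustration; the project may define them differently.

```python
from datetime import datetime

# strptime pattern matching "yyyy/mm/dd hh:mm:ss"
TIMESTAMP_FORMAT = "%Y/%m/%d %H:%M:%S"


def extract_time_units(timestamp: str) -> dict:
    """Parse a timestamp string and derive the four time units.

    Hypothetical helper; the conventions below are assumptions.
    """
    moment = datetime.strptime(timestamp, TIMESTAMP_FORMAT)
    return {
        "hour": moment.hour,                          # 0-23
        "week_day": moment.isoweekday(),              # 1 = Monday ... 7 = Sunday
        "week_of_month": (moment.day - 1) // 7 + 1,   # assumed: days 1-7 are week 1
        "day_of_month": moment.day,                   # 1-31
    }
```

A malformed timestamp raises `ValueError` from `strptime`, which a web form handler could catch to reject the input.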

Demo_gif

For example, the first query above shows that parking near the University of Chicago at 1 pm is more likely to result in a parking ticket than parking at other hours.


System

The parking ticket data is stored in an S3 bucket. Spark fetches the data, adds the spatial index, extracts the useful time information, and then aggregates the data by spatial and temporal buffers. The result is stored in PostgreSQL.
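The aggregation step described above can be illustrated in plain Python on a small in-memory list. This is not the Spark job itself: the square-grid spatial index, the degree-to-metre approximation, the reference latitude, the record layout, and the names `grid_cell` and `aggregate_by_cell_and_hour` are all assumptions chosen to keep the sketch self-contained.

```python
import math
from collections import Counter
from datetime import datetime

CELL_SIZE_M = 250.0                 # matches the 250 m x 250 m buffer
METERS_PER_DEGREE_LAT = 111_320.0   # rough approximation
REFERENCE_LAT = 41.88               # assumed: approximate latitude of Chicago


def grid_cell(lat: float, lon: float) -> tuple:
    """Assign a coordinate to a 250 m grid cell (assumed spatial index)."""
    meters_per_degree_lon = METERS_PER_DEGREE_LAT * math.cos(math.radians(REFERENCE_LAT))
    row = math.floor(lat * METERS_PER_DEGREE_LAT / CELL_SIZE_M)
    col = math.floor(lon * meters_per_degree_lon / CELL_SIZE_M)
    return row, col


def aggregate_by_cell_and_hour(tickets) -> Counter:
    """Count tickets per (grid cell, hour of day).

    Each ticket is a hypothetical (timestamp, lat, lon) tuple.
    """
    counts = Counter()
    for issued_at, lat, lon in tickets:
        moment = datetime.strptime(issued_at, "%Y/%m/%d %H:%M:%S")
        counts[(grid_cell(lat, lon), moment.hour)] += 1
    return counts
```

The same grouping key, a cell identifier plus a time unit, is what a Spark `groupBy` would use at scale; the per-cell counts can then be compared with the average to pick a colour.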

system_png


Data Source

Chicago parking tickets


Setup

Install and configure AWS CLI and Pegasus on your local machine, and clone this repository.

Cluster Structure:

  • (4 nodes) Spark Cluster - Batch & Airflow
  • (1 node) PostgreSQL
  • (1 node) Flask

peg up ./cluster_configs/spark/master.yml
peg up ./cluster_configs/spark/worker.yml
peg up ./cluster_configs/post_node.yml
peg up ./cluster_configs/flask_node.yml

For each cluster, install the services.

Spark cluster

peg service install spark_cluster aws
peg service install spark_cluster environment
peg service install spark_cluster hadoop
peg service install spark_cluster spark

Install Airflow on the leader node of the Spark cluster

sudo apt-get install python3-pip
sudo python3 -m pip install apache-airflow

Configure the Spark cluster and sync the Hadoop and Spark configs across the nodes.

bash ./cluster_configs/sync_scripts/sync_h.sh
bash ./cluster_configs/sync_scripts/sync_s.sh

PostgreSQL node & Flask node

peg service install post_node aws
peg service install post_node environment
peg service install flask_node aws
peg service install flask_node environment

On the PostgreSQL node, install PostgreSQL

sudo apt-get update && sudo apt-get -y upgrade
sudo apt-get install postgresql postgresql-contrib

On the Flask node, install Flask

sudo apt-get install python3-pip
sudo python3 -m pip install Flask

Run the system

Compile the Scala project

Generate the fat JAR using sbt.

cd spark_batch
sbt clean
sbt compile
sbt assembly

Run the Spark job

After building the JAR file, submit the job to Spark.

bin/spark-submit --class com.spot.parking.tracking.Aggregateor --master yarn --deploy-mode client ~/Spot/parking-tracking/target/scala-2.11/parking-tracking-assembly-0.0.1.jar

Schedule the job

Running airflow/schedule.sh on the master node of the Spark cluster adds the batch job to the scheduler. The batch job is set to execute every 24 hours.

bash airflow/schedule.sh

Run the web app

On the Flask node:

sudo python3 flask/run.py

Contact Information
