Sample project that ingests publicly available food truck location data for San Francisco and presents the locations on Google Maps.
A deliberately simple, "monolithic" Lambda service that:
- is scheduled to fire every hour
- grabs the CSV from the public endpoint
- records the data in DynamoDB
- generates a simple Google Maps plot of active(?) food trucks
- presents the map as simple HTML in an S3 bucket with website hosting enabled
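
As a rough illustration of the parsing and mapping steps in the list above, here is a minimal Python sketch. It is not the project's actual code: the column names (`Status`, `Applicant`, `Latitude`, `Longitude`), the `APPROVED` status value used as a stand-in for "active", and both function names are assumptions to check against the real CSV. Fetching the file and writing to DynamoDB are left out so the example runs without network access or AWS credentials.

```python
import csv
import io
import json

# Assumed column names; check them against the header of the CSV you download.
STATUS_COL, NAME_COL, LAT_COL, LON_COL = "Status", "Applicant", "Latitude", "Longitude"


def parse_active_trucks(csv_text, active_status="APPROVED"):
    """Return one marker dict per row with a matching status and usable coordinates."""
    markers = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if (row.get(STATUS_COL) or "").strip().upper() != active_status:
            continue
        try:
            lat, lng = float(row[LAT_COL]), float(row[LON_COL])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or malformed coordinates
        if lat == 0.0 and lng == 0.0:
            continue  # assumption: (0, 0) means no location was recorded
        markers.append({"name": row.get(NAME_COL) or "", "lat": lat, "lng": lng})
    return markers


def render_map_html(markers, api_key):
    """Build a standalone HTML page that plots the markers with the Maps JavaScript API."""
    # Escape "</" so a truck name cannot close the <script> element early.
    data = json.dumps(markers).replace("</", "<\\/")
    return f"""<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>Food trucks</title></head>
<body>
<div id="map" style="height: 100vh"></div>
<script>
const trucks = {data};
function initMap() {{
  const map = new google.maps.Map(document.getElementById("map"), {{
    center: {{lat: 37.7749, lng: -122.4194}},
    zoom: 12
  }});
  for (const t of trucks) {{
    new google.maps.Marker({{position: {{lat: t.lat, lng: t.lng}}, map: map, title: t.name}});
  }}
}}
</script>
<script async src="https://maps.googleapis.com/maps/api/js?key={api_key}&callback=initMap"></script>
</body>
</html>"""
```

Keeping parsing and rendering as separate pure functions makes each step easy to test on its own, even while everything still runs inside a single Lambda handler.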
All infrastructure is created and managed with Pulumi, using Python.
Although this exercise requires Pulumi Cloud, you should stay well within the free tier for managed resources.
- a running Docker environment to build images
- Python 3.10+
- a Pulumi account with an API key/token
- the Pulumi binary (install and setup notes)
- an AWS account with API credentials
- a Linux environment is assumed, but the steps are easy to adapt to macOS
- a Google Maps API key
git clone REPO
cd REPO
export AWS_ACCESS_KEY_ID=<MY_REAL_KEY_ID_HERE>
export AWS_SECRET_ACCESS_KEY=<MY_REAL_SECRET_KEY_HERE>
export GOOGLE_MAPS_API_KEY=<MY_REAL_GOOGLE_MAPS_KEY_HERE>
pip install -r requirements.txt
pulumi login # this will prompt you for your pulumi api key
pulumi up -y
If all goes well, you should see your S3 bucket URL at the end of the Pulumi output:
Outputs:
bucket_name : "food-truck-mapr-of-excellence"
bucket_website: "food-truck-mapr-of-excellence.s3-website-us-east-1.amazonaws.com"
The Lambda service triggers every hour, but if you want to see the map right away, log into the AWS console and trigger the Lambda with any test event; the event content does not matter.
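
If you prefer a script to the console, the same manual trigger can be sketched in Python. This assumes boto3 is installed and AWS credentials are configured; `trigger_refresh` is a hypothetical helper, and the function name is a placeholder because the deployed Lambda's name is not listed in the outputs above.

```python
import json


def trigger_refresh(lambda_client, function_name):
    """Invoke the function synchronously with an empty event and return the HTTP status code."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the run to finish
        Payload=json.dumps({}).encode("utf-8"),  # the event content does not matter
    )
    return response["StatusCode"]


# Usage (placeholder function name; look up the real one in the AWS console):
#   import boto3
#   trigger_refresh(boto3.client("lambda"), "<YOUR_FUNCTION_NAME>")
```

A status code of 200 means the synchronous invocation completed; the refreshed map should then be available at the bucket website URL.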
- how do we handle stale data?
- separate data retrieval, storage, and mapping functionality
- better organize the infra-as-code
- need to be more restrictive with our lambda execution policy
- use secrets manager instead of loading in from env vars
- CI/CD the pulumi run
- add tests!!!
- observability: how well does our app perform throughout?