Polku Proof-of-Concept

NOTE: This is still work in progress. Please come back in a few days!

This proof of concept implements the following streaming architecture:

[Architecture diagram]

The architecture above implements an app that does the following:

  • Ingests events from two sources: server and client.
  • Translates the user IDs in all incoming events into user names.
  • Delivers all client events to a table called polku_poc.client in Redshift.
  • Delivers all server events to a table called polku_poc.server in Redshift.
  • Logs a warning when a user is named for the first time.
  • Forwards all warning log events to Slack.
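The behaviour listed above can be illustrated with a small in-memory sketch. This is not the PoC's actual Lambda code: the function process_events, the USER_NAMES lookup table, and the "source" and "user_id" event fields are all hypothetical names chosen for illustration, and the sketch assumes the translated name is added to the event next to the original ID.

```python
import logging

logger = logging.getLogger("polku_poc_sketch")

# Hypothetical lookup table; the PoC's real ID-to-name source is not shown here.
USER_NAMES = {"u1": "alice", "u2": "bob"}

# Target tables named in the description above, keyed by event source.
TABLES = {"client": "polku_poc.client", "server": "polku_poc.server"}


def process_events(events, seen_names=None):
    """Translate user IDs into names and group events by target table.

    Returns the routed events and the warning messages that were logged.
    """
    seen = set() if seen_names is None else seen_names
    routed = {table: [] for table in TABLES.values()}
    warnings = []
    for event in events:
        source = event["source"]
        if source not in TABLES:
            raise ValueError(f"unknown event source: {source!r}")
        # Assumption: unknown IDs map to a placeholder name.
        name = USER_NAMES.get(event["user_id"], "unknown")
        translated = dict(event, user_name=name)
        if name not in seen:
            # First time this name appears: log a warning. In the PoC such
            # warnings are what gets forwarded to Slack.
            seen.add(name)
            message = f"User named for the first time: {name}"
            logger.warning(message)
            warnings.append(message)
        routed[TABLES[source]].append(translated)
    return routed, warnings
```

Calling process_events with a list of dictionaries such as {"source": "client", "user_id": "u1"} returns the events grouped by target table, plus one warning per newly seen user name.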

Prerequisites

Configure the AWS CLI

You should have configured the AWS CLI with a profile that has enough rights to deploy all the components of this infrastructure environment. You will also need an S3 bucket where that same CLI profile has read and write access.

Python 3.6

This PoC uses the Python 3.6 runtime in AWS Lambda, so to run the tests below you need that Python version installed on your local machine. I recommend pyenv for installing and managing multiple Python versions on your system. With pyenv, installing the required Python version is as easy as:

pyenv install 3.6.0b2
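To confirm that the active interpreter matches the Lambda runtime before running the tests, a check along these lines may help. It is only a sketch: the helper name matches_lambda_runtime is hypothetical and not part of this repository.

```python
import sys

# Major and minor version of the Python 3.6 Lambda runtime used by this PoC.
REQUIRED_VERSION = (3, 6)


def matches_lambda_runtime(version_info=None):
    """Return True if the given (or current) interpreter is Python 3.6.x."""
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) == REQUIRED_VERSION
```

Calling matches_lambda_runtime() with no arguments checks the interpreter that is currently running.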

Install the deployment dependencies

make install
. .env/bin/activate
humilis configure --local

Environment variables

The following environment variables are needed to deploy all the features of this PoC:

Variable            Description
HUMILIS_BUCKET      An S3 bucket for deployment artifacts
HUMILIS_AWS_REGION  The AWS region, e.g. eu-west-1
REDSHIFT_HOST       The hostname of your Redshift cluster master node
REDSHIFT_PORT       The port where the Redshift master node is listening
REDSHIFT_DB         The name of the Redshift database
REDSHIFT_USER       The Redshift username
REDSHIFT_PWD        The Redshift password
SENTRY_DSN          The Sentry DSN
SLACK_TOKEN         The token to access Slack's web API
SLACK_CHANNEL       The name of the channel where messages will be posted

Note that you will need to manually create the HUMILIS_BUCKET S3 bucket before attempting to deploy this Polku PoC.
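Before deploying, it can be useful to check that every variable in the table above is set. The following is a minimal sketch, not part of this repository; the helper name missing_variables is hypothetical.

```python
import os

# Variable names taken from the table above.
REQUIRED_VARIABLES = (
    "HUMILIS_BUCKET",
    "HUMILIS_AWS_REGION",
    "REDSHIFT_HOST",
    "REDSHIFT_PORT",
    "REDSHIFT_DB",
    "REDSHIFT_USER",
    "REDSHIFT_PWD",
    "SENTRY_DSN",
    "SLACK_TOKEN",
    "SLACK_CHANNEL",
)


def missing_variables(environ=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]
```

An empty list from missing_variables() means all ten variables are present in the current environment. The check does not verify that the values themselves are valid.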

Deployment

I have extracted the most important deployment parameters into a parameters.yaml.j2 file. A brief explanation of the purpose of each parameter can be found in the comments embedded in the parameters file. You can edit the deployment parameters as you see fit. Then:

polkupoc --stage DEV apply

The command above will deploy to a stage named DEV. You can have multiple parallel (identical) deployments by using a different stage name for each one.

Once the deployment has completed you will find the deployment outputs (things such as the name of the S3 bucket where events are delivered) in a file called polkupoc-[STAGE]-outputs.yaml.
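As a small illustration of the naming pattern above, the outputs file for a given stage could be located like this. The helper name outputs_path is hypothetical, and the sketch assumes the file is written to the current working directory with the stage name substituted exactly as passed on the command line.

```python
from pathlib import Path


def outputs_path(stage, directory="."):
    """Build the path of the outputs file for a deployment stage.

    Assumption: the file lives in `directory` (the current working
    directory by default), which this README does not state explicitly.
    """
    return Path(directory) / f"polkupoc-{stage}-outputs.yaml"
```

For example, outputs_path("DEV").exists() reports whether the DEV deployment has produced its outputs file in the assumed location.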

Testing

Unit Test:

make test

Integration Test:

make testi

Redshift migration

There is one last step you need to take to have a fully functional app. You need to create the target tables in Redshift so that Firehose can deliver the relevant events to them. You do that by editing the models in polku_poc/models/polkupoc.py and then using Alembic to generate a migration script for you:

polkupoc --stage DEV alembic -- revision --autogenerate

Check that the migration script is correct, then apply the migration:

polkupoc --stage DEV alembic -- upgrade head

Contact

If you have questions, bug reports, suggestions, etc., please create an issue on the GitHub project page.
