bloomtech-labs / pt15-cityspire-c-ds

A one-stop resource for users to receive the most accurate city information.

Home Page: http://cityspire00n.eba-diy2emuk.us-east-1.elasticbeanstalk.com/

License: MIT License

Python 1.95% Jupyter Notebook 98.03% Dockerfile 0.02%
recommend-cities livability-score tfidf-vectoring nearest-neighbors

pt15-cityspire-c-ds's Introduction

CitySpire - Data Science

Docs

Mission

Be a one-stop resource for users to receive the most accurate city information.

Description

An app that analyzes city data such as population, cost of living, rental rates, crime rates, parks and walkability (walk score), and many other social and economic factors that matter when deciding where to live. The app presents this data in an intuitive, easy-to-understand interface.

Use data to find a place right for you to live.


Data Engineering

FastAPI app, deployed to AWS, provides 3 primary routes:

  • /cityspire is a GET route that provides all of the data in the database in a table format.
  • /locations is a GET route that provides a list of all cities in the database.
  • /location/data is a POST route that takes a request of location in the form of "City, State" and returns all of the data about that location.

| Type | Endpoint | Required Parameters | Returns |
| --- | --- | --- | --- |
| GET | /cityspire | none | "[[0, 0, "Akron, Ohio", 197597.0, 678.0, 1782.0, 27.0, 181.0, 328.0, 1246.0, 6568.0, 1686.0, 4305.0, 577.0, 65.0, 8484.440553247267, 46, 46, 90.8, 7972.779227752733], ...]" |
| GET | /locations | none | { "locations": ["Akron, Ohio", "Albany, New York", ...] } |
| POST | /location/data/ | "location": "City, State" | { "city_name": "El Paso, Texas", "population": 681728, "rent_per_month": 990, "walk_score": 41, "livability_score": 12687 } |
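
As a rough illustration of calling the POST route, here is a minimal client sketch using only the Python standard library. It assumes the route accepts a JSON body of the form {"location": "City, State"}, which the table suggests but does not spell out. The base URL and the helper names `build_location_request` and `fetch_location_data` are hypothetical.

```python
import json
import urllib.request

# Assumption: point this at your own local or deployed instance.
BASE_URL = "http://localhost:8000"


def build_location_request(location: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for /location/data with a JSON body."""
    payload = json.dumps({"location": location}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/location/data",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_location_data(location: str, base_url: str = BASE_URL) -> dict:
    """Send the request and decode the JSON response (needs a running API)."""
    request = build_location_request(location, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the app running locally, `fetch_location_data("El Paso, Texas")` should return a dictionary shaped like the example in the table, provided the body format assumed here matches the route.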

[TODO] More details about the API endpoints can be found at the ReDoc interface or by exploring the interactive SwaggerUI.

Machine Learning

Nearest Neighbors ML Model for CitySpire City Living Recommendations

The data wrangling and merging can be found in the wrangling.ipynb notebook. The tokenizing and TFIDF vectorizing of text, creation of the Nearest Neighbors model, training on the tokenized and vectorized text, and pickling of the Nearest Neighbors model and TFIDF Vectorizer can all be found in the rec_modeling.ipynb notebook. Both notebooks are in the notebooks directory.
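
To make the idea concrete, the sketch below implements TF-IDF vectorizing and a cosine nearest-neighbor lookup with the Python standard library only. It is a stand-in for illustration, not the notebook's code: the notebook presumably relies on a library for both steps. The city descriptions are made-up placeholders, the function names are hypothetical, and the smoothed IDF formula is one common variant.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def transform(text: str, idf: dict[str, float]) -> dict[str, float]:
    """Turn text into an L2-normalized sparse TF-IDF vector."""
    counts = Counter(term for term in tokenize(text) if term in idf)
    weights = {term: count * idf[term] for term, count in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {term: w / norm for term, w in weights.items()} if norm else {}


def fit_tfidf(docs: list[str]) -> tuple[dict[str, float], list[dict[str, float]]]:
    """Learn smoothed IDF weights and vectorize every document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(tokenize(doc)))
    idf = {term: math.log((1 + n_docs) / (1 + df)) + 1.0 for term, df in doc_freq.items()}
    return idf, [transform(doc, idf) for doc in docs]


def nearest(query: str, idf: dict[str, float], vectors: list[dict[str, float]], k: int = 2) -> list[int]:
    """Return indices of the k documents most similar to the query."""
    q = transform(query, idf)

    def cosine(vec: dict[str, float]) -> float:
        # Both vectors are unit length, so the dot product is the cosine.
        return sum(weight * vec.get(term, 0.0) for term, weight in q.items())

    ranked = sorted(range(len(vectors)), key=lambda i: cosine(vectors[i]), reverse=True)
    return ranked[:k]


# Made-up placeholder descriptions for three hypothetical cities (A, B, C).
city_docs = [
    "small population low rent low crime",
    "large population high rent high walkability",
    "medium population low rent high walkability",
]
idf, vectors = fit_tfidf(city_docs)
closest = nearest("large population high rent", idf, vectors, k=2)
```

Here `closest` holds document indices ordered from most to least similar, so the first entry points at the description that best matches the query text.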

The Nearest Neighbors and TFIDF Vectorizer pickles can be found in the pickles directory.

The pickled Nearest Neighbors model and TFIDF Vectorizer are imported into recommend.py in the app directory. There, a recommend function in the Data Engineering API uses them to recommend cities to users based on desired population, rental rate, crime rate, walkability score, cost of living index, and livability score.
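
The save-then-load pattern described above can be sketched as follows. This is only an illustration of the pickle round trip feeding a recommend-style function. The artifact here is a plain dictionary of made-up tags for hypothetical cities, and the scoring is a toy overlap count, not the project's Nearest Neighbors model. The names `save_artifact`, `load_artifact`, and this `recommend` signature are all assumptions.

```python
import pickle
import tempfile
from pathlib import Path


def save_artifact(obj, path: Path) -> None:
    """Serialize a fitted object (here a plain dict) to disk."""
    with path.open("wb") as f:
        pickle.dump(obj, f)


def load_artifact(path: Path):
    """Load a pickled object. Only unpickle files you created yourself."""
    with path.open("rb") as f:
        return pickle.load(f)


def recommend(desired: set[str], city_tags: dict[str, set[str]], k: int = 1) -> list[str]:
    """Rank cities by how many desired tags they share (toy scoring)."""
    ranked = sorted(city_tags, key=lambda city: len(desired & city_tags[city]), reverse=True)
    return ranked[:k]


# Round trip through a temporary directory standing in for the pickles directory.
with tempfile.TemporaryDirectory() as tmp:
    artifact_path = Path(tmp) / "city_tags.pkl"
    save_artifact({"City A": {"low_rent", "walkable"}, "City B": {"low_crime"}}, artifact_path)
    city_tags = load_artifact(artifact_path)

top = recommend({"low_rent", "walkable"}, city_tags)
```

In an API module, the load step would typically run once at import time so that each request reuses the already loaded objects.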

Deployment

The CitySpire API is backed by a Postgres DB in AWS RDS. The data was uploaded to the DB using the df_to_sql.py script in the notebooks directory.

After you create your own PG DB on AWS RDS you need to add the DB URL to a .env file:

DATABASE_URL=postgresql://DBusername:DBpassword@DBhost/dbname
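
A small sketch of how application code might read and sanity-check this variable is shown below. It assumes DATABASE_URL has already been loaded into the process environment (pipenv loads a .env file when you run pipenv shell). The function name `read_database_url` and the default port fallback are assumptions for illustration, not part of this repo.

```python
import os
from urllib.parse import urlsplit


def read_database_url(env=os.environ) -> dict:
    """Validate DATABASE_URL and return its non-secret parts."""
    raw = env.get("DATABASE_URL")
    if not raw:
        raise RuntimeError("DATABASE_URL is not set; add it to .env or the environment")
    parts = urlsplit(raw)
    if parts.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port or 5432,  # Postgres default when the URL omits a port
        "dbname": parts.path.lstrip("/"),
    }
```

Failing fast with a clear message is usually easier to debug than a connection error deep inside a request handler. The password is deliberately left out of the returned dictionary so it cannot end up in logs by accident.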

Commands to deploy locally:

Create virtual environment in root directory of project: pipenv shell

Install project dependencies in virtual environment: pipenv install --dev

Launch app locally: uvicorn app.main:app --reload

Launch app locally on different port: uvicorn app.main:app --reload --port 8080

The API app is deployed to AWS Elastic Beanstalk using a Dockerfile. It is crucial to organize all of the app directories into the app directory because the Dockerfile copies the app structure from the app directory, not the root directory of this repo.

Documentation on how to set up AWS and EB CLI

Commands to deploy to Elastic Beanstalk:

Commit your work:

    git add --all
    git commit -m "Your commit message"

Then use these EB CLI (Elastic Beanstalk command line interface) commands to deploy. Replace CHOOSE-YOUR-NAME with your own name.

    eb init --platform docker --region us-east-1 CHOOSE-YOUR-NAME
    eb create --region us-east-1 CHOOSE-YOUR-NAME

Do you have environment variables? Then configure environment variables in the Elastic Beanstalk console.

Now you can open your deployed app! 🎉

    eb open

Commands to redeploy to Elastic Beanstalk:

Commit your work:

    git add --all
    git commit -m "Your commit message"

Then use these EB CLI (Elastic Beanstalk command line interface) commands to redeploy:

    eb deploy
    eb open

It is also possible to redeploy without committing your work by staging it instead:

    git add .
    eb deploy --staged

Data Sources

Population Data - https://www2.census.gov/programs-surveys/popest/tables/2010-2019/cities/totals/SUB-IP-EST2019-ANNRES.xlsx

Rental Rates - https://files.zillowstatic.com/research/public_v2/zori/Zip_ZORI_AllHomesPlusMultifamily_SSA.csv

Crime Rates - https://ucr.fbi.gov/crime-in-the-u.s/2019/crime-in-the-u.s.-2019/tables/table-8/table-8.xls/view

Walk Scores - https://www.walkscore.com/cities-and-neighborhoods/

Cost of Living Index - https://advisorsmith.com/data/coli/

Contributors

John Dailey Neha Kumari Theda Mickey Wells
Data Scientist Data Scientist Data Scientist Data Scientist
