The bikeshare from dssg

Delete Unused IPython Notebooks

Just to make things neater!

Add weather as feature in GLM_Model script

Estimate model parameters

Once we have the model we want, we need to estimate its parameters by crunching historical data of number of bikes and docks at each station.

So we need a parameter_estimation.py script that:

Pulls all the historical data we need for that estimation from the database using a SQL query and puts it in a dataframe.
Estimates the parameters using this dataframe
Spits out the parameters as a vector so our model can use them in our prediction script.
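
A minimal sketch of those three steps, under loud assumptions: the table name `station_status` and its columns `station_id`, `ts`, and `bikes` are hypothetical, the model is a plain AR(1) with intercept fit by ordinary least squares, and SQLite with plain lists stands in for the real database and dataframe so that it runs on the standard library alone.

```python
import sqlite3


def load_history(conn, station_id):
    """Pull the bike counts for one station, oldest first.

    Assumes a hypothetical table station_status(station_id, ts, bikes).
    """
    rows = conn.execute(
        "SELECT bikes FROM station_status WHERE station_id = ? ORDER BY ts",
        (station_id,),
    ).fetchall()
    return [row[0] for row in rows]


def estimate_ar1(series):
    """Fit bikes[t] = a + b * bikes[t-1] by ordinary least squares."""
    x, y = series[:-1], series[1:]
    n = len(x)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("series is constant; slope is not identifiable")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return [a, b]  # the parameter vector handed to the prediction script
```

The real script would swap in whichever model we settle on; the point here is only the shape: query, estimate, return a vector.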

Generalize Model Validation code to accommodate Binomial GLM with additional predictors

Reengineer web app to make (Poisson) simulation calls asynchronously

In Model Validation script, predict based on previously predicted values instead of actual values at previous time points
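
A rough sketch of what that change means, assuming an AR(1) model with a hypothetical parameter vector `[a, b]` (our actual models have more terms). Each step feeds the previous forecast back in, so errors compound the way they would in a real 60-minute prediction.

```python
def recursive_forecast(last_observed, params, steps):
    """Forecast `steps` points ahead with an AR(1) model.

    Every forecast is computed from the previous forecast, not from the
    actual value observed at that time point.
    """
    a, b = params
    forecasts = []
    previous = last_observed
    for _ in range(steps):
        previous = a + b * previous  # uses the prior prediction as input
        forecasts.append(previous)
    return forecasts
```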

Update Requirements.txt

Make sure that requirements.txt contains every package for the project.

Modularize Model Validation code into a function

Parameters should include things like:
- model being validated
- number of time points into the future to predict
- minimum number of time points with which to fit the model
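
One possible shape for that function, as a sketch only. The names `validate_model`, `fit`, `forecast`, `horizon`, and `min_train` are made up for illustration, and the "persistence" model at the bottom is just a stand-in so the example runs on its own. It uses rolling-origin validation: fit on everything up to an origin, predict `horizon` points ahead, compare with what actually happened, then move the origin forward.

```python
def validate_model(series, fit, forecast, horizon, min_train):
    """Rolling-origin validation; returns mean absolute error at `horizon`.

    fit(train) -> params
    forecast(last_value, params, steps) -> list of `steps` predictions
    """
    errors = []
    for origin in range(min_train, len(series) - horizon + 1):
        train = series[:origin]
        params = fit(train)
        predicted = forecast(train[-1], params, horizon)[-1]
        actual = series[origin + horizon - 1]
        errors.append(abs(predicted - actual))
    if not errors:
        raise ValueError("series too short for this horizon and min_train")
    return sum(errors) / len(errors)


# Stand-in model for the demo: predict that nothing changes.
def fit_persistence(train):
    return None


def forecast_persistence(last_value, params, steps):
    return [last_value] * steps
```

Passing the model in as a pair of functions keeps the validation loop the same whether we are checking the Poisson or the Binomial GLM.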

Make stupid simple web app to show model results

Here are the fields we want to show in a table on a page:

station name
percent full right now
how long station has been full/empty as of now
percent full one hour from now
how long it will have been full/empty 60 minutes from now
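
A sketch of how the "right now" fields could be computed, assuming each reading is a `(bikes, docks)` pair taken at a fixed interval (the interval and the function names are assumptions, not something already in the repo). The same functions would work on predicted readings for the 60-minute columns.

```python
def percent_full(bikes, docks):
    """Share of the station's total capacity currently holding a bike."""
    capacity = bikes + docks
    return 100.0 * bikes / capacity if capacity else 0.0


def minutes_in_current_state(history, interval_minutes):
    """How long the station has been full or empty as of the latest reading.

    history is a list of (bikes, docks) readings, oldest first.
    Returns (state, minutes); state is None when the station is neither
    full nor empty. Minutes are measured from the first reading of the
    current run, so the result is a lower bound.
    """
    def state_of(bikes, docks):
        if bikes == 0:
            return "empty"
        if docks == 0:
            return "full"
        return None

    if not history:
        return None, 0
    current = state_of(*history[-1])
    if current is None:
        return None, 0
    run = 0
    for reading in reversed(history):
        if state_of(*reading) != current:
            break
        run += 1
    return current, (run - 1) * interval_minutes
```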

This involves figuring out the best way to use some data as a test set. For example, do we randomly leave out data points and subsequent data points (based on the "degree" of the AR model we're using), train the model, and then see how we did on our left out points?

Using a year of data, make a visualization of average bikes over the course of a day for every station
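
The aggregation behind that plot could look something like this. It is a sketch that assumes readings arrive as `(station_id, timestamp, bikes)` tuples and that averaging by hour of day is fine-grained enough; the plotting itself is left out.

```python
from collections import defaultdict
from datetime import datetime


def average_bikes_by_hour(readings):
    """readings: iterable of (station_id, datetime, bikes).

    Returns {station_id: {hour_of_day: mean bikes}}, one curve per station.
    """
    totals = defaultdict(lambda: defaultdict(lambda: [0.0, 0]))
    for station_id, ts, bikes in readings:
        cell = totals[station_id][ts.hour]
        cell[0] += bikes  # running sum
        cell[1] += 1      # number of readings in this hour bucket
    return {
        station: {
            hour: total / count
            for hour, (total, count) in sorted(hours.items())
        }
        for station, hours in totals.items()
    }
```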

Get live-updating weather data into database.

Add month to GLM_Models script

Add more time points into GLM_Models script once issue #6 is closed

Wiki

REQUIRED WIKI SECTIONS

Homepage

Intro: have a sentence or two about the project: the problem it's solving, the partner, and how you're solving it. Also say that "this wiki is the central place to learn about the social problem we worked on, the data we used, the methods we used to solve it, and our findings" so people know what they're looking at.
List of pages in the wiki
Problem

An in-depth description of your partner organization, the problem you're trying to solve, and any relevant domain knowledge. Feel free to copy from blog posts and posters, if relevant.
Data

Describe the dataset(s) you used in the project as well as your database. Walk people through the data model (tables are handy for this), and include a (fake) sample of each dataset.
If you scraped data, this is the place to document that.
Methodology

An in-depth, technical write-up of the method(s) you used on your project. Use LaTeX equations, walk people through algorithms and models, and link out to relevant documentation when possible.
Results

Discuss what metrics you're using to evaluate performance (if applicable), and what your final findings were.
Future work

Discuss what you would like to do / what is in progress.

OPTIONAL WIKI SECTIONS - if it fits your project

Analysis

If you did exploratory data analysis, this is the place to put it and explain your findings. Explain each finding and what you learned from it / how it motivated the methods you used. Put this between the "Data" and "Methodology" sections. Feel free to lift content from relevant blog posts, if any.
Resources

Resources for domain knowledge, methods, and tech. Whatever pieces of paper you used to learn what you know.
Tool
- API Documentation
- Web app Documentation

Defines the ARMA model we're using.
Takes the parameter estimates output by parameter_estimation.py and uses them for our model
Query the database to get the current number of bikes at the station, time, day of week, month, and whatever other inputs the model needs.
Throw these inputs into the model and spit out a prediction of the number of bikes that will be present at each station in 60 minutes. This predicted output will eventually be shown on a simple webpage.
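
A bare-bones sketch of the last step, with heavy assumptions: an AR(1) with parameter vector `[a, b]` stands in for the ARMA model, readings are assumed to arrive every `interval_minutes` (2 is a placeholder), and the time, day-of-week, and month inputs are left out. The only extra logic is stepping forward to 60 minutes and keeping the answer between zero and the station's capacity.

```python
def predict_bikes_in_60_minutes(current_bikes, capacity, params, interval_minutes=2):
    """Step an AR(1) model forward to 60 minutes and clamp to a valid count.

    params is the [a, b] vector produced by the estimation script
    (hypothetical format); capacity is the station's total number of docks.
    """
    a, b = params
    steps = 60 // interval_minutes
    value = float(current_bikes)
    for _ in range(steps):
        value = a + b * value
    # A station cannot hold fewer than 0 bikes or more than its capacity.
    return min(max(round(value), 0), capacity)
```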

*V2 [JSON] Data

Directory Structure

Clean, well-named set of directories. Examples include webapp, database, and models.
No random files in the root.
Explanation of each directory in README.
Sub-READMEs in appropriate folders
No directories named after DSSG-specific info (e.g., people's names)
Should your team have more than one project, each should have its own repo.
In your data or database folder, provide a way to re-create your database from scratch. .sql files are often appropriate for this.
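
As one way to do that last item, a small helper can replay a schema file into a fresh database. This is only a sketch: SQLite stands in for whatever database the project uses, and the function name and the idea of passing the file's contents as a string are assumptions.

```python
import sqlite3


def recreate_database(db_path, schema_sql):
    """Create a fresh database by running the statements from a .sql file.

    schema_sql is the text of the schema file, e.g. the result of
    open("database/schema.sql").read() (hypothetical path).
    """
    conn = sqlite3.connect(db_path)
    conn.executescript(schema_sql)  # runs every statement in the script
    conn.commit()
    return conn
```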

README

Links to appropriate sections in wiki. See wiki issue for more info.
Answers: What have you built? In a few sentences.
Answers: How do you install it?
Answers: What needs to be done/How can I help?
Has some sort of Contact Info
Open source license

Config

No public-facing config info - make sure never to hardcode the database URL, password, etc.
Description of how you hide config info, e.g., YAML, environment variables, etc.
config.example files
Requirements.txt or similar file.
Relative links in any HTML.
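
For the environment-variable route, a sketch along these lines would keep credentials out of the repo. The variable name `BIKESHARE_DATABASE_URL` is made up for illustration; use whatever the team's config.example documents.

```python
import os


def load_database_url(environ=os.environ):
    """Read the connection string from the environment, never from source code.

    BIKESHARE_DATABASE_URL is a hypothetical variable name.
    """
    url = environ.get("BIKESHARE_DATABASE_URL")
    if not url:
        raise RuntimeError(
            "BIKESHARE_DATABASE_URL is not set; see config.example for the expected format"
        )
    return url
```

Failing loudly when the variable is missing is a deliberate choice here: it is easier to debug than a silent fallback to a default connection string.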
