
cal-itp / eligibility-server


Server implementation of the Eligibility Verification API

Home Page: https://docs.calitp.org/eligibility-server

License: GNU Affero General Public License v3.0

Languages: Dockerfile 1.42%, Python 68.80%, Shell 2.77%, HCL 27.01%

eligibility-server's People

Contributors

afeld, angela-tran, dependabot[bot], machikoyasuda, pre-commit-ci[bot], thekaveman


eligibility-server's Issues

Refactor app.py

app.py is one giant file with a lot going on. As this project starts to mature, it makes sense to refactor the existing file into a few different modules that can grow and be maintained a little more independently.

Wait until #25 is closed to make it easier!

Refactor eligibility verification into reusable package

This is a corollary to cal-itp/benefits#141, with which this work must coordinate. See that issue for the general overview.

Server specifics

eligibility_server/verify.py is where the server-side implementation lives.

Some of the code there sets up the flask_restful resource and deals with HTTP request processing; that would stay. Anything concerned with creating and processing the Eligibility Request and Response objects could be moved into the new package, as in the sketch below.
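A rough sketch of that split, assuming a hypothetical eligibility_api package (process_request and create_response are placeholder names, as is the _get_token helper):

from flask_restful import Resource

from eligibility_api import process_request, create_response  # hypothetical package API


class Verify(Resource):
    def get(self):
        # stays in the server: flask_restful wiring and HTTP request handling
        token = self._get_token()  # placeholder for HTTP header parsing

        # moves to the package: Eligibility Request/Response object handling
        eligibility_request = process_request(token)
        return create_response(eligibility_request)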

Refactor repo with .devcontainer

  • Move Dockerfile, devcontainer.json, docker-compose.yml, bin/pre-commit, .env.sample into a new top-level directory, /.devcontainer
  • Update configuration file paths as necessary
  • Update docs as necessary

Why? To align this repository's folder architecture with benefits, eligibility-api, and current best practices.

Implement hashed lookup

Building on #30 and cal-itp/hashfields, we want to support (at least) two distinct modes of verification lookup from our source:

  1. Normal lookup: does (id, name) exist on the list?
  2. Hashed lookup: does (hash(id), hash(name)) exist on the list?

(Note that in either case, the Eligibility Request coming into the server will contain the un-hashed data.)

We'll want to be able to configure:

  • What kind of lookup (normal or hashed)
  • The hash function to use (sha256, sha384, sha512, others?)
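A minimal sketch of the two modes, using hashlib (the function and parameter names here are placeholders, not existing server code):

import hashlib


def hash_value(value: str, hash_name: str = "sha256") -> str:
    """Hash one field of the incoming (un-hashed) Eligibility Request."""
    return hashlib.new(hash_name, value.encode("utf-8")).hexdigest()


def is_on_list(user_id, name, users, mode="normal", hash_name="sha256"):
    """Check whether (id, name) exists on the list, in normal or hashed mode."""
    if mode == "hashed":
        user_id, name = hash_value(user_id, hash_name), hash_value(name, hash_name)
    return (user_id, name) in users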

Add GitHub Action to build/publish Docker image

Background

The docker-compose.yml over in benefits defines the server service and uses a local build path when the image doesn't yet exist on the machine:

server:
  build: ./server #<-- build the contents of the local ./server directory (which contains a Dockerfile)
  image: eligibility_verification_server:dev

Once the server code is refactored out of that repository and into this one, a local build path won't work anymore.

Proposal

GitHub Container Registry can be used to host Docker images, which can be built and published automatically via GitHub Actions.

  • Set up a publishing workflow for this repository that builds and publishes the server app in a Docker image to GitHub

Run setup.py on devcontainer attach

This is the corollary to #100, but for the devcontainer.

This will make the experience of getting the application running just a little faster by pre-initializing the database.

  • Rename .devcontainer/pre-commit.sh to .devcontainer/postAttach.sh (also update the devcontainer.json)
  • Add python setup.py to the post attach script

Set up Dependabot

First we'll need to log in to pyup.io as the cal-itp-bot account and link the repo. We can pair on this piece.

This will generate and commit a .pyup.yml file in the repo, and we'll probably need to adjust it to our needs. See the config from benefits: https://github.com/cal-itp/benefits/blob/dev/.pyup.yml

--

Update: use Dependabot instead, and research its configuration.

Refactor configuration data from Database

The Database class (and server.json) currently contain configuration data, separate from the user data needed for verification. This configuration should be split out as a separate concern.

Refactor database class

As we move into becoming production-ready, there are a few items of cleanup in the Database class that should happen.

cc @angela-tran who discussed this with me earlier today and informed this ticket.
cc @machikoyasuda we held off on some of this to make #116 more straightforward, documenting here instead.

Method feedback:

  • __init__(): let's not query and store all users; we may have a large list of users, and don't necessarily want to hold them all in memory
  • __init__(): log the given hash option as DEBUG
  • check_user(): break up the large if(...) statement checking the various states into if/elif/else etc.
  • check_user(): query for the user here
  • check_user(): separate calculation of the return value from the return statement
  • check_user(): add logging throughout the above refactors

Let's also verify the tests are providing adequate coverage and scoping.
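A rough sketch of the check_user() feedback above, assuming a Flask-SQLAlchemy-style User model (the query and field names are placeholders):

import logging

logger = logging.getLogger(__name__)


def check_user(self, key, user, types):
    # query for just this user, rather than holding all users in memory
    record = User.query.filter_by(key=key).first()  # placeholder model/query

    # one branch per state instead of a single large if(...), with logging
    if record is None:
        logger.debug("No user found for key")
        eligible = False
    elif record.name != user:
        logger.debug("User name does not match")
        eligible = False
    elif not set(types) & set(record.types):
        logger.debug("No matching eligibility types")
        eligible = False
    else:
        eligible = True

    # the return value is calculated above, separate from the return statement
    return eligible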

Add logging to server code and setup/teardown scripts

We should configure logging for the project as a whole and add relevant statements throughout the code.

Additional context

Maybe a follow-up: figure out how to get the Python logging framework configured and replace the print statements with more appropriate log-level statements.

Originally posted by @thekaveman in #104 (review)
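A minimal sketch of project-wide configuration with the Python logging framework (the level and format shown are assumptions):

import logging

logging.basicConfig(
    level=logging.INFO,
    format="[%(asctime)s] %(levelname)s %(name)s: %(message)s",
)

logger = logging.getLogger(__name__)
logger.info("Creating table...")  # replaces print("Creating table...")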

Initial testing

Unit Tests: Set Up

  • Add pytest, coverage dependencies
  • Set up test scripts

Unit Tests: Actual test

  • Write unit tests
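A minimal sketch of a first test, assuming the Flask app is importable from eligibility_server.app (the healthcheck endpoint is a placeholder):

import pytest

from eligibility_server.app import app  # assumed import path


@pytest.fixture
def client():
    app.config["TESTING"] = True
    with app.test_client() as client:
        yield client


def test_healthcheck(client):
    response = client.get("/healthcheck")  # placeholder endpoint
    assert response.status_code == 200

Coverage can then run the suite via coverage run -m pytest.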

Local development bug: Pre-commit install error

I am unable to get pre-commit running in my local devcontainer. This is the error log I get:

Pre-commit log

version information

pre-commit version: 2.19.0
git --version: git version 2.30.2
sys.version:
    3.9.13 (main, May 18 2022, 04:17:41) 
    [GCC 10.2.1 20210110]
sys.executable: /usr/local/bin/python
os.name: posix
sys.platform: linux

error information

An unexpected error has occurred: CalledProcessError: command: ('/root/.cache/pre-commit/repofu2cxgzn/py_env-python3.9/bin/python', '-mpip', 'install', '.')
return code: 1
expected return code: 0
stdout:
    Processing /root/.cache/pre-commit/repofu2cxgzn
      Preparing metadata (setup.py): started
      Preparing metadata (setup.py): finished with status 'done'
    Collecting ruamel.yaml>=0.15
      Using cached ruamel.yaml-0.17.21-py3-none-any.whl (109 kB)
    Collecting toml
      Using cached toml-0.10.2-py2.py3-none-any.whl (16 kB)
    Collecting ruamel.yaml.clib>=0.2.6
      Using cached ruamel.yaml.clib-0.2.6.tar.gz (180 kB)
      Preparing metadata (setup.py): started
      Preparing metadata (setup.py): finished with status 'error'
    
stderr:
      error: subprocess-exited-with-error
      
      × python setup.py egg_info did not run successfully.
      │ exit code: 1
      ╰─> [3 lines of output]
          sys.argv ['/tmp/pip-install-jxph4ctq/ruamel-yaml-clib_c3f62e16a11e4867877ec98577ee7f67/setup.py', 'egg_info', '--egg-base', '/tmp/pip-pip-egg-info-bdkc5a4p']
          test compiling /tmp/tmp_ruamel_n_ndon63/test_ruamel_yaml.c -> test_ruamel_yaml compile error: /tmp/tmp_ruamel_n_ndon63/test_ruamel_yaml.c
          Exception: command 'gcc' failed: No such file or directory
          [end of output]
      
      note: This error originates from a subprocess, and is likely not a problem with pip.
    error: metadata-generation-failed
    
    × Encountered error while generating package metadata.
    ╰─> See above for output.
    
    note: This is an issue with the package mentioned above, not pip.
    hint: See above for details.
    
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/pre_commit/error_handler.py", line 73, in error_handler
    yield
  File "/usr/local/lib/python3.9/site-packages/pre_commit/main.py", line 389, in main
    return run(args.config, store, args)
  File "/usr/local/lib/python3.9/site-packages/pre_commit/commands/run.py", line 414, in run
    install_hook_envs(to_install, store)
  File "/usr/local/lib/python3.9/site-packages/pre_commit/repository.py", line 223, in install_hook_envs
    _hook_install(hook)
  File "/usr/local/lib/python3.9/site-packages/pre_commit/repository.py", line 79, in _hook_install
    lang.install_environment(
  File "/usr/local/lib/python3.9/site-packages/pre_commit/languages/python.py", line 221, in install_environment
    helpers.run_setup_cmd(prefix, install_cmd)
  File "/usr/local/lib/python3.9/site-packages/pre_commit/languages/helpers.py", line 51, in run_setup_cmd
    cmd_output_b(*cmd, cwd=prefix.prefix_dir, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/pre_commit/util.py", line 146, in cmd_output_b
    raise CalledProcessError(returncode, cmd, retcode, stdout_b, stderr_b)
pre_commit.util.CalledProcessError: command: ('/root/.cache/pre-commit/repofu2cxgzn/py_env-python3.9/bin/python', '-mpip', 'install', '.')
return code: 1
expected return code: 0
stdout:
    (identical to the stdout above)
stderr:
    (identical to the stderr above)

Add GitHub Action for linting

Pre-Commit & Linting

  • GitHub Actions: pre-commit
  • .vscode/settings.json (for Python linting)
  • .pre-commit-config.yaml
  • .flake8
  • bin/pre-commit.sh
  • .devcontainer.json

Version and release strategy

Tracking a todo in this repository based on a number of related issues:

We currently push the server image tagged with main to GHCR for every commit to the main branch, and tagged with the commit SHA and latest for every Release.

As we're seeing elsewhere, a more structured version/release strategy could help bring clarity to the codebase. We should pick a strategy here that aligns across the Benefits portfolio (as in the linked issues).

Refactor server code into subdirectory

The app.py module currently sits at the root of the repository. This is fine for a single file, but can cause trouble when trying to import objects from the module (see e.g. #10), and it won't scale as the code grows beyond a single file.

Recommendation is to create a subdirectory, named eligibility_server, and place both the app.py and an (empty) __init__.py file in there.

Elsewhere (e.g. in tests) you could then:

from eligibility_server import app

Duplicate CodeQL checks

I think maybe our CodeQL workflow is defined in a way that causes duplicate checks on pull requests:

(screenshot: the same CodeQL check listed twice on a pull request)

Research and decide on hosting

Background

The eligibility server needs to be hosted somewhere such that it can receive and respond to HTTP requests from a Benefits client instance.

The owner of the hosting environment matters less than the agreement(s) in place that allow sharing (hashed) Courtesy Card data with that hosting environment.

Decision needed

The main decision point for hosting of the Eligibility Server:

Will the server be co-located with the Benefits client app deployment (CDT/Azure), or will the server be deployed to a new environment?

Co-locating with Benefits client app

The server would just be another Azure App Service.

  • Simplest approach, no new cloud accounts / environments to build out
  • Reuse deployment strategy from Benefits app
  • Need CDT hosting contract signed
  • Need CDT approval to store hashed data

Deploy in a new cloud environment

  • Adds 1-2 weeks of build/configure time
  • Compiler would set this up and manage it (likely GCP; Google Cloud Endpoints looks like one option)
  • Could also set up Azure on behalf of MST
  • Should probably have an agreement between Compiler <> MST

Initializing repo checklist

  • Be able to build container, run container
  • Be able to launch app in Debug mode in Devcontainer

Initializing repository

Required

  • README
  • LICENSE
  • .gitignore

Python

  • requirements.txt
  • app.py (transfer all files)
  • .pyup.yml
  • pyproject.toml

Docker

  • Dockerfile
  • Docker Compose file
  • .devcontainer
  • .vscode/launch.json (for Dev Container debugging)

Add GitHub action for docs + Docs

Setting up Docs

  • GitHub Actions: mkdocs
  • .aws
  • mkdocs.yml
  • docs/ *

Writing docs

  • Document how to get server itself running
  • Document how to get server running with benefits app

Document database setup process

With #58 (+#60) the server now supports reading from different file formats.

We should update the docs site to reflect these changes and new processes:

  • The environment file and configuring for different file type(s)
  • Expected file format(s)
  • Setup and teardown process for the database

Refactor Database to allow other kinds of data formats

from @thekaveman in #30

The current Database class is a simple wrapper around the server.json file.

The public API check_user(key, user, types) method is great from a consumer point of view. But the internals are tied too closely to the server.json file structure.

The database may proxy to a number of data sources:

  • server.json like file with nested data
  • .csv file of users in a flat table
  • either of the above, but from an S3 or GCP bucket
  • real database connection
  • etc.

We should refactor to allow different use cases, perhaps through inheritance or some plug-in mechanism.

Things to figure out

  1. Establish a database abstraction: where should the data source file be specified and saved? (Assume there's no data.json file; how can we add enough abstraction that the code talks to something separate from the underlying data?)
  2. Get CSV/JSON/etc. data into this abstraction: load the data into the new database format (to be specified by item 1). A rough sketch of one approach follows.
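One possible shape for the abstraction, via inheritance (class names, file layouts, and fields here are placeholders, not a settled design):

import csv
import json
from abc import ABC, abstractmethod


class Database(ABC):
    """The public check_user API stays the same regardless of data source."""

    @abstractmethod
    def check_user(self, key, user, types) -> bool:
        ...


class JsonDatabase(Database):
    """server.json-like file with nested data."""

    def __init__(self, path):
        with open(path, encoding="utf-8") as f:
            self._users = json.load(f)["users"]

    def check_user(self, key, user, types):
        record = self._users.get(key)
        return bool(record and record["name"] == user and set(types) & set(record["types"]))


class CsvDatabase(Database):
    """Flat table of users; a bucket or real database could wrap similarly."""

    def __init__(self, path):
        with open(path, newline="", encoding="utf-8") as f:
            self._rows = {(row[0], row[1]): row[2:] for row in csv.reader(f)}

    def check_user(self, key, user, types):
        eligibility = self._rows.get((key, user), [])
        return bool(set(types) & set(eligibility))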

Make CSV options configurable

Currently when setting up from a CSV input file, some settings are assumed, but we likely want them to be configurable:

if settings.IMPORT_FILE_FORMAT == "csv":
    with open(settings.IMPORT_FILE_PATH, newline="", encoding="utf-8") as file:
        data = csv.reader(file, delimiter=";", quotechar="", quoting=csv.QUOTE_NONE)

  • the delimiter may not be a semicolon (;)
  • values may be quoted
  • we may want to handle newlines differently depending on the file origin

Let's make these settings that can be configured from the environment.
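A minimal sketch of environment-driven settings (these setting names are assumptions, not existing configuration):

import csv
import os

CSV_DELIMITER = os.environ.get("CSV_DELIMITER", ",")
CSV_NEWLINE = os.environ.get("CSV_NEWLINE", "")
CSV_QUOTING = getattr(csv, os.environ.get("CSV_QUOTING", "QUOTE_MINIMAL"))
CSV_QUOTECHAR = os.environ.get("CSV_QUOTECHAR", '"')

with open(os.environ["IMPORT_FILE_PATH"], newline=CSV_NEWLINE, encoding="utf-8") as file:
    data = csv.reader(
        file, delimiter=CSV_DELIMITER, quotechar=CSV_QUOTECHAR, quoting=CSV_QUOTING
    )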

Add CodeQL

Add CodeQL GitHub Action for Eligibility Server

Remove "Merchant" mocks

These were added when this code was part of the benefits project, to help test other API integrations outside of the core Eligibility Verification API.

Remove:

  • /static/ folder and contents
  • MerchantAuthToken
  • Database.check_merchant and related code
  • From server.json: config.request_access and merchants data

Use Flask configuration framework to load settings

Instead of getting config manually from the settings module like:

app = Flask(__name__)
app.name = settings.APP_NAME
app.config["SQLALCHEMY_DATABASE_URI"] = settings.SQLALCHEMY_DATABASE_URI
# etc.

Flask has built-in support for loading configuration from a number of sources. Since our settings already live in a Python module, the object-based format would look something like:

app = Flask(__name__)
app.config.from_object("eligibility_server.settings")

We can also load from a number of supported file types:

import json
app.config.from_file("config.json", load=json.load)

Or directly from certain environment variables:

app.config.from_prefixed_env()

Let's use the built-in framework for loading configuration data.

See also: Flask Configuration Best Practices

Configure Docker image for production

The current Docker image (via bin/start.sh) starts the Flask development server.

This has been fine as we've only been using the server for testing. As we move to deploy the server for Courtesy Cards, we'll want to run Flask using production best-practices.

See more at: Flask - Deploying to Production.

In general, we can follow a similar pattern as in benefits:

  • nginx is the reverse proxy that accepts traffic coming into the container, and routes app traffic along
  • gunicorn is the WSGI application server, receiving app traffic from nginx and forwarding to the app (Flask)
  • bin/start.sh starts the production setup
  • From within the devcontainer, the development setup is used
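Since gunicorn configuration files are themselves Python, a minimal sketch of one might look like the following (all values are assumptions, not the benefits project's actual settings):

# gunicorn.conf.py (sketch)
import multiprocessing

wsgi_app = "eligibility_server.app:app"  # placeholder module:variable path
bind = "0.0.0.0:8000"  # nginx proxies app traffic to this port
workers = multiprocessing.cpu_count() * 2 + 1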

Audit current docs and correct any errors

Outstanding bugs:

  • From the main directory, run coverage run -m pytests -> should be coverage run -m pytest
  • The whole tests section should be updated to add the teardown/setup and marks information

Make database setup and teardown idempotent

Currently, an attempt to restart a server Docker container will result in an IntegrityError:

+ python setup.py
Creating table...
Table created.
Importing users from /.devcontainer/server/data.csv
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1819, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
sqlite3.IntegrityError: UNIQUE constraint failed: user.key


A separate but similar issue: an attempt to tear down the database when it has already been torn down results in an OperationalError:

+ python teardown.py
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1819, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
sqlite3.OperationalError: no such table: user

We should make the setup and teardown scripts idempotent to prevent these errors.

Additional context

#101 made it so the app container initializes the database on startup.
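A minimal sketch of an idempotent approach, assuming a Flask-SQLAlchemy db object and User model (import_users is a placeholder for the existing CSV import):

def setup():
    db.create_all()  # no-op when the tables already exist
    if User.query.count() == 0:  # skip the import when users are already loaded
        import_users()


def teardown():
    db.drop_all()  # no-op when the tables are already gone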
