
cal-itp / eligibility-server


Server implementation of the Eligibility Verification API

Home Page: https://docs.calitp.org/eligibility-server

License: GNU Affero General Public License v3.0

Languages: Dockerfile 1.42%, Python 68.80%, Shell 2.77%, HCL 27.01%

eligibility-server's People

Contributors

afeld, angela-tran, dependabot[bot], machikoyasuda, pre-commit-ci[bot], thekaveman


eligibility-server's Issues

Refactor app.py

app.py is one giant file with a lot going on. As this project starts to mature, it makes sense to refactor the existing file into a few different modules that can grow and be maintained a little more independently.

Wait until #25 is closed to make it easier!

Refactor eligibility verification into reusable package

This is a corollary to cal-itp/benefits#141, with which this work must coordinate. See that issue for the general overview.

Server specifics

eligibility_server/verify.py is where the server-side implementation lives.

Some of the code there sets up the flask_restful resource and deals with HTTP request processing; that would stay. Anything concerned with creating and processing the Eligibility Request and Response objects could be moved into the new package, as in the sketch below.
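A rough sketch of that split, assuming a hypothetical eligibility_api package (process_request and create_response are placeholder names, as is the _get_token helper):

from flask_restful import Resource

from eligibility_api import process_request, create_response  # hypothetical package API


class Verify(Resource):
    def get(self):
        # stays in the server: flask_restful wiring and HTTP request handling
        token = self._get_token()  # placeholder for HTTP header parsing

        # moves to the package: Eligibility Request/Response object handling
        eligibility_request = process_request(token)
        return create_response(eligibility_request)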

Refactor repo with .devcontainer

  • Move Dockerfile, devcontainer.json, docker-compose.yml, bin/pre-commit, .env.sample into a new top-level directory, /.devcontainer
  • Update configuration file paths as necessary
  • Update docs as necessary

Why? To align this repository's folder architecture with benefits, eligibility-api, and current best practices.

Implement hashed lookup

Building on #30 and cal-itp/hashfields, we want to support (at least) two distinct modes of verification lookup from our source:

  1. Normal lookup: does (id, name) exist on the list?
  2. Hashed lookup: does (hash(id), hash(name)) exist on the list?

(Note that in either case, the Eligibility Request coming into the server will contain the un-hashed data.)

We'll want to be able to configure:

  • What kind of lookup (normal or hashed)
  • The hash function to use (sha256, sha384, sha512, others?)
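A minimal sketch of the two modes, using hashlib (the function and parameter names here are placeholders, not existing server code):

import hashlib


def hash_value(value: str, hash_name: str = "sha256") -> str:
    """Hash one field of the incoming (un-hashed) Eligibility Request."""
    return hashlib.new(hash_name, value.encode("utf-8")).hexdigest()


def is_on_list(user_id, name, users, mode="normal", hash_name="sha256"):
    """Check whether (id, name) exists on the list, in normal or hashed mode."""
    if mode == "hashed":
        user_id, name = hash_value(user_id, hash_name), hash_value(name, hash_name)
    return (user_id, name) in users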

Add GitHub Action to build/publish Docker image

Background

The docker-compose.yml over in benefits defines the server service and uses a local build path when the image doesn't yet exist on the machine:

server:
  build: ./server #<-- build the contents of the local ./server directory (which contains a Dockerfile)
  image: eligibility_verification_server:dev

Once the server code is refactored out of that repository and into this one, a local build path won't work anymore.

Proposal

GitHub Container Registry can be used to host Docker images, which can be built and published automatically via GitHub Actions.

  • Set up a publishing workflow for this repository that builds and publishes the server app in a Docker image to GitHub

Run setup.py on devcontainer attach

This is the corollary to #100, but for the devcontainer.

This will make the experience of getting the application running just a little faster by pre-initializing the database.

  • Rename .devcontainer/pre-commit.sh to .devcontainer/postAttach.sh (also update the devcontainer.json)
  • Add python setup.py to the post attach script

Set up Dependabot

First we'll need to log in to pyup.io as the cal-itp-bot account and link the repo. We can pair on this piece.

This will generate and commit a .pyup.yml file in the repo, and we'll probably need to adjust it to our needs. See the config from benefits: https://github.com/cal-itp/benefits/blob/dev/.pyup.yml

--

Update: use Dependabot instead, and research its configuration.

Refactor configuration data from Database

The Database class (and server.json) currently contain configuration data, separate from the user data needed for verification. This configuration should be split out as a separate concern.

Refactor database class

As we move into becoming production-ready, there are a few items of cleanup in the Database class that should happen.

cc @angela-tran who discussed this with me earlier today and informed this ticket.
cc @machikoyasuda we held off on some of this to make #116 more straightforward, documenting here instead.

Method feedback:

  • __init__(): let's not query and store all users; we may have a large list of users, and don't necessarily want to hold them all in memory
  • __init__(): log the given hash option as DEBUG
  • check_user(): break up the large if(...) statement checking the various states into if/elif/else etc.
  • check_user(): query for the user here
  • check_user(): separate calculation of the return value from the return statement
  • check_user(): add logging throughout the above refactors

Let's also verify the tests are providing adequate coverage and scoping.
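A rough sketch of the check_user() feedback above, assuming a Flask-SQLAlchemy-style User model (the query and field names are placeholders):

import logging

logger = logging.getLogger(__name__)


def check_user(self, key, user, types):
    # query for just this user, rather than holding all users in memory
    record = User.query.filter_by(key=key).first()  # placeholder model/query

    # one branch per state instead of a single large if(...), with logging
    if record is None:
        logger.debug("No user found for key")
        eligible = False
    elif record.name != user:
        logger.debug("User name does not match")
        eligible = False
    elif not set(types) & set(record.types):
        logger.debug("No matching eligibility types")
        eligible = False
    else:
        eligible = True

    # the return value is calculated above, separate from the return statement
    return eligible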

Add logging to server code and setup/teardown scripts

We should configure logging for the project as a whole and add relevant statements throughout the code.

Additional context

Maybe a follow-up: figure out how to get the Python logging framework configured and replace the print statements with more appropriate log-level statements.

Originally posted by @thekaveman in #104 (review)
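A minimal sketch of project-wide configuration with the Python logging framework (the level and format shown are assumptions):

import logging

logging.basicConfig(
    level=logging.INFO,
    format="[%(asctime)s] %(levelname)s %(name)s: %(message)s",
)

logger = logging.getLogger(__name__)
logger.info("Creating table...")  # replaces print("Creating table...")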

Initial testing

Unit Tests: Set Up

  • Add pytest, coverage dependencies
  • Set up test scripts

Unit Tests: Actual test

  • Write unit tests
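A minimal sketch of a first test, assuming the Flask app is importable from eligibility_server.app (the healthcheck endpoint is a placeholder):

import pytest

from eligibility_server.app import app  # assumed import path


@pytest.fixture
def client():
    app.config["TESTING"] = True
    with app.test_client() as client:
        yield client


def test_healthcheck(client):
    response = client.get("/healthcheck")  # placeholder endpoint
    assert response.status_code == 200

Coverage can then run the suite via coverage run -m pytest.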

Local development bug: Pre-commit install error

I am unable to get pre-commit running in my local devcontainer. This is the error log I get:

Pre-commit log

version information

pre-commit version: 2.19.0
git --version: git version 2.30.2
sys.version:
    3.9.13 (main, May 18 2022, 04:17:41) 
    [GCC 10.2.1 20210110]
sys.executable: /usr/local/bin/python
os.name: posix
sys.platform: linux

error information

An unexpected error has occurred: CalledProcessError: command: ('/root/.cache/pre-commit/repofu2cxgzn/py_env-python3.9/bin/python', '-mpip', 'install', '.')
return code: 1
expected return code: 0
stdout:
    Processing /root/.cache/pre-commit/repofu2cxgzn
      Preparing metadata (setup.py): started
      Preparing metadata (setup.py): finished with status 'done'
    Collecting ruamel.yaml>=0.15
      Using cached ruamel.yaml-0.17.21-py3-none-any.whl (109 kB)
    Collecting toml
      Using cached toml-0.10.2-py2.py3-none-any.whl (16 kB)
    Collecting ruamel.yaml.clib>=0.2.6
      Using cached ruamel.yaml.clib-0.2.6.tar.gz (180 kB)
      Preparing metadata (setup.py): started
      Preparing metadata (setup.py): finished with status 'error'
    
stderr:
      error: subprocess-exited-with-error
      
      × python setup.py egg_info did not run successfully.
      │ exit code: 1
      ╰─> [3 lines of output]
          sys.argv ['/tmp/pip-install-jxph4ctq/ruamel-yaml-clib_c3f62e16a11e4867877ec98577ee7f67/setup.py', 'egg_info', '--egg-base', '/tmp/pip-pip-egg-info-bdkc5a4p']
          test compiling /tmp/tmp_ruamel_n_ndon63/test_ruamel_yaml.c -> test_ruamel_yaml compile error: /tmp/tmp_ruamel_n_ndon63/test_ruamel_yaml.c
          Exception: command 'gcc' failed: No such file or directory
          [end of output]
      
      note: This error originates from a subprocess, and is likely not a problem with pip.
    error: metadata-generation-failed
    
    × Encountered error while generating package metadata.
    ╰─> See above for output.
    
    note: This is an issue with the package mentioned above, not pip.
    hint: See above for details.
    
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/pre_commit/error_handler.py", line 73, in error_handler
    yield
  File "/usr/local/lib/python3.9/site-packages/pre_commit/main.py", line 389, in main
    return run(args.config, store, args)
  File "/usr/local/lib/python3.9/site-packages/pre_commit/commands/run.py", line 414, in run
    install_hook_envs(to_install, store)
  File "/usr/local/lib/python3.9/site-packages/pre_commit/repository.py", line 223, in install_hook_envs
    _hook_install(hook)
  File "/usr/local/lib/python3.9/site-packages/pre_commit/repository.py", line 79, in _hook_install
    lang.install_environment(
  File "/usr/local/lib/python3.9/site-packages/pre_commit/languages/python.py", line 221, in install_environment
    helpers.run_setup_cmd(prefix, install_cmd)
  File "/usr/local/lib/python3.9/site-packages/pre_commit/languages/helpers.py", line 51, in run_setup_cmd
    cmd_output_b(*cmd, cwd=prefix.prefix_dir, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/pre_commit/util.py", line 146, in cmd_output_b
    raise CalledProcessError(returncode, cmd, retcode, stdout_b, stderr_b)
pre_commit.util.CalledProcessError: command: ('/root/.cache/pre-commit/repofu2cxgzn/py_env-python3.9/bin/python', '-mpip', 'install', '.')
return code: 1
expected return code: 0
stdout:
    (identical to the stdout above)
stderr:
    (identical to the stderr above)

Add GitHub Action for linting

Pre-Commit & Linting

  • GitHub Actions: pre-commit
  • .vscode/settings.json (for Python linting)
  • .pre-commit-config.yaml
  • .flake8
  • bin/pre-commit.sh
  • .devcontainer.json

Version and release strategy

Tracking a todo in this repository based on a number of related issues:

We currently push the server image tagged with main to GHCR for every commit to the main branch, and tagged with the commit SHA and latest for every Release.

As we're seeing elsewhere, a more structured version/release strategy could help bring clarity to the codebase. We should pick a strategy here that aligns across the Benefits portfolio (as in the linked issues).

Refactor server code into subdirectory

The app.py module currently sits at the root of the repository. This is fine for a single file, but can cause trouble when trying to import objects from the module (see e.g. #10), and it won't scale as the code grows beyond a single file.

Recommendation is to create a subdirectory, named eligibility_server, and place both the app.py and an (empty) __init__.py file in there.

Elsewhere (e.g. in tests) you could then:

from eligibility_server import app

Duplicate CodeQL checks

I think maybe our CodeQL workflow is defined in a way that causes duplicate checks on pull requests:

(screenshot: the same CodeQL check listed twice on a pull request)

Research and decide on hosting

Background

The eligibility server needs to be hosted somewhere such that it can receive and respond to HTTP requests from a Benefits client instance.

The owner of the hosting environment matters less than the agreement(s) in place that allow sharing (hashed) Courtesy Card data with that hosting environment.

Decision needed

The main decision point for hosting of the Eligibility Server:

Will the server be co-located with the Benefits client app deployment (CDT/Azure), or will the server be deployed to a new environment?

Co-locating with Benefits client app

The server would just be another Azure App Service.

  • Simplest approach, no new cloud accounts / environments to build out
  • Reuse deployment strategy from Benefits app
  • Need CDT hosting contract signed
  • Need CDT approval to store hashed data

Deploy in a new cloud environment

  • Adds 1-2 weeks of build/configure time
  • Compiler would set this up and manage it (likely GCP; Google Cloud Endpoints looks like one option)
  • Could also set up Azure on behalf of MST
  • Should probably have an agreement between Compiler <> MST

Initializing repo checklist

  • Be able to build container, run container
  • Be able to launch app in Debug mode in Devcontainer

Initializing repository

Required

  • README
  • LICENSE
  • .gitignore

Python

  • requirements.txt
  • app.py (transfer all files)
  • .pyup.yml
  • pyproject.toml

Docker

  • Dockerfile
  • Docker Compose file
  • .devcontainer
  • .vscode/launch.json (for Dev Container debugging)

Add GitHub action for docs + Docs

Setting up Docs

  • GitHub Actions: mkdocs
  • .aws
  • mkdocs.yml
  • docs/ *

Writing docs

  • Document how to get server itself running
  • Document how to get server running with benefits app

Document database setup process

With #58 (+#60) the server now supports reading from different file formats.

We should update the docs site to reflect these changes and new processes:

  • The environment file and configuring for different file type(s)
  • Expected file format(s)
  • Setup and teardown process for the database

Refactor Database to allow other kinds of data formats

from @thekaveman in #30

The current Database class is a simple wrapper around the server.json file.

The public API check_user(key, user, types) method is great from a consumer point of view. But the internals are tied too closely to the server.json file structure.

The database may proxy to a number of data sources:

  • server.json like file with nested data
  • .csv file of users in a flat table
  • either of the above, but from an S3 or GCP bucket
  • real database connection
  • etc.

We should refactor to allow different use cases, perhaps through inheritance or some plug-in mechanism.

Things to figure out

  1. Establish a database abstraction: where should the data source file be specified and saved? (Assume there's no data.json file; how can we add enough abstraction that the code talks to something separate from the underlying data?)
  2. Get CSV/JSON/etc. data into this abstraction: load the data into the new database format (to be specified by item 1). A rough sketch of one approach follows.
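One possible shape for the abstraction, via inheritance (class names, file layouts, and fields here are placeholders, not a settled design):

import csv
import json
from abc import ABC, abstractmethod


class Database(ABC):
    """The public check_user API stays the same regardless of data source."""

    @abstractmethod
    def check_user(self, key, user, types) -> bool:
        ...


class JsonDatabase(Database):
    """server.json-like file with nested data."""

    def __init__(self, path):
        with open(path, encoding="utf-8") as f:
            self._users = json.load(f)["users"]

    def check_user(self, key, user, types):
        record = self._users.get(key)
        return bool(record and record["name"] == user and set(types) & set(record["types"]))


class CsvDatabase(Database):
    """Flat table of users; a bucket or real database could wrap similarly."""

    def __init__(self, path):
        with open(path, newline="", encoding="utf-8") as f:
            self._rows = {(row[0], row[1]): row[2:] for row in csv.reader(f)}

    def check_user(self, key, user, types):
        eligibility = self._rows.get((key, user), [])
        return bool(set(types) & set(eligibility))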

Make CSV options configurable

Currently when setting up from a CSV input file, some settings are assumed, but we likely want them to be configurable:

if settings.IMPORT_FILE_FORMAT == "csv":
    with open(settings.IMPORT_FILE_PATH, newline="", encoding="utf-8") as file:
        data = csv.reader(file, delimiter=";", quotechar="", quoting=csv.QUOTE_NONE)

  • the delimiter may not be a semicolon (;)
  • values may be quoted
  • we may want to handle newlines differently depending on the file origin

Let's make these settings that can be configured from the environment.
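A minimal sketch of environment-driven settings (these setting names are assumptions, not existing configuration):

import csv
import os

CSV_DELIMITER = os.environ.get("CSV_DELIMITER", ",")
CSV_NEWLINE = os.environ.get("CSV_NEWLINE", "")
CSV_QUOTING = getattr(csv, os.environ.get("CSV_QUOTING", "QUOTE_MINIMAL"))
CSV_QUOTECHAR = os.environ.get("CSV_QUOTECHAR", '"')

with open(os.environ["IMPORT_FILE_PATH"], newline=CSV_NEWLINE, encoding="utf-8") as file:
    data = csv.reader(
        file, delimiter=CSV_DELIMITER, quotechar=CSV_QUOTECHAR, quoting=CSV_QUOTING
    )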

Add CodeQL

Add CodeQL GitHub Action for Eligibility Server

Remove "Merchant" mocks

These were added when this code was part of the benefits project, to help test other API integrations outside of the core Eligibility Verification API.

Remove:

  • /static/ folder and contents
  • MerchantAuthToken
  • Database.check_merchant and related code
  • From server.json: config.request_access and merchants data

Use Flask configuration framework to load settings

Instead of getting config manually from the settings module like:

app = Flask(__name__)
app.name = settings.APP_NAME
app.config["SQLALCHEMY_DATABASE_URI"] = settings.SQLALCHEMY_DATABASE_URI
# etc.

Flask has built-in support for loading configuration from a number of sources. Since our settings already live in a Python module, the object-based format would look something like:

app = Flask(__name__)
app.config.from_object("eligibility_server.settings")

We can also load from a number of supported file types:

import json
app.config.from_file("config.json", load=json.load)

Or directly from certain environment variables:

app.config.from_prefixed_env()

Let's use the built-in framework for loading configuration data.

See also: Flask Configuration Best Practices

Configure Docker image for production

The current Docker image (via bin/start.sh) starts the Flask development server.

This has been fine as we've only been using the server for testing. As we move to deploy the server for Courtesy Cards, we'll want to run Flask using production best-practices.

See more at: Flask - Deploying to Production.

In general, we can follow a similar pattern as in benefits:

  • nginx is the reverse proxy that accepts traffic coming into the container, and routes app traffic along
  • gunicorn is the WSGI application server, receiving app traffic from nginx and forwarding to the app (Flask)
  • bin/start.sh starts the production setup
  • From within the devcontainer, the development setup is used
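Since gunicorn configuration files are themselves Python, a minimal sketch of one might look like the following (all values are assumptions, not the benefits project's actual settings):

# gunicorn.conf.py (sketch)
import multiprocessing

wsgi_app = "eligibility_server.app:app"  # placeholder module:variable path
bind = "0.0.0.0:8000"  # nginx proxies app traffic to this port
workers = multiprocessing.cpu_count() * 2 + 1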

Audit current docs and correct any errors

Outstanding bugs:

  • From the main directory, run coverage run -m pytests -> should be coverage run -m pytest
  • The whole tests section should be updated to add the teardown/setup and marks information

Make database setup and teardown idempotent

Currently, an attempt to restart a server Docker container will result in an IntegrityError:

+ python setup.py
Creating table...
Table created.
Importing users from /.devcontainer/server/data.csv
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1819, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
sqlite3.IntegrityError: UNIQUE constraint failed: user.key


A separate but similar issue: an attempt to tear down the database when it has already been torn down results in an OperationalError:

+ python teardown.py
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1819, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
sqlite3.OperationalError: no such table: user

We should make the setup and teardown scripts idempotent to prevent these errors.

Additional context

#101 made it so the app container initializes the database on startup.
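A minimal sketch of an idempotent approach, assuming a Flask-SQLAlchemy db object and User model (import_users is a placeholder for the existing CSV import):

def setup():
    db.create_all()  # no-op when the tables already exist
    if User.query.count() == 0:  # skip the import when users are already loaded
        import_users()


def teardown():
    db.drop_all()  # no-op when the tables are already gone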
