🍬 Saves a copy of each day's OCS messages to S3 for archiving and analysis.
## Setup

```sh
asdf install
npm install
```
## Development

- Test suite: `npm test`
- Watch mode: `npm test -- --watch`
- Format all files: `npm run format`
- Type checks: `npm run check`
- Linter checks: `npm run lint`
## Deployment

- Run the *Deploy to Dev* or *Deploy to Prod* workflow

Running `aws` commands requires setting up the AWS CLI. If your AWS account is in the TRC team group, you should have all required permissions.
- Build a Lambda package: `./build.sh <entrypoint>`
- Use `aws lambda update-function-code` to deploy the package (see the deploy action for exact options)
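As a sketch, a manual deploy of the processor to dev might look like the commands below. The package filename and the `ocs-saver-dev-processor` function name are assumptions (patterned on the `ocs-saver-dev-packager` name used elsewhere in this README); check `build.sh`'s actual output and the deploy action for the real values.

```shell
# Build the processor package (output filename is an assumption; check build.sh)
./build.sh processor

# Upload the package to the dev Lambda. The function name here is an
# assumption based on the ocs-saver-dev-packager naming pattern.
aws lambda update-function-code \
  --function-name ocs-saver-dev-processor \
  --zip-file fileb://processor.zip
```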
In the example below, replace `2022-01-01` with the date one day after the date of the package you want to generate. Use `prod` in place of `dev` to run the prod environment packager.
```sh
aws lambda invoke \
  --invocation-type Event \
  --function-name ocs-saver-dev-packager \
  --payload '{"time": "2022-01-01T12:00:00Z"}' \
  --cli-binary-format raw-in-base64-out \
  /dev/stdout
```
This example uses async invocation, since a successful run can take longer than the CLI's default read timeout. You'll have to use Splunk to monitor the status of the run (filter by `source=lambda:ocs-saver-dev-packager-logs`).
Note that by default the packager intentionally fails when its output file already exists. If you want to regenerate an output file, add the `overwrite` option to the payload:

```json
{"time": "2022-01-01T12:00:00Z", "detail": {"overwrite": true}}
```
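Putting the two together, a dev rerun that regenerates an already-existing package is the same invoke command with the expanded payload:

```shell
# Re-run the dev packager, overwriting the existing output file
aws lambda invoke \
  --invocation-type Event \
  --function-name ocs-saver-dev-packager \
  --payload '{"time": "2022-01-01T12:00:00Z", "detail": {"overwrite": true}}' \
  --cli-binary-format raw-in-base64-out \
  /dev/stdout
```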
## Architecture

This repo holds two separate scripts which are deployed to AWS Lambda:

- `processor` is the record processor for a Firehose delivery stream. Firehose receives records from the OCS messages Kinesis stream, runs them through the processor in batches, and delivers the "transformed" batches to an S3 bucket, one object per batch.
- `packager` is a standalone Lambda run once per day by an EventBridge rule (it can also be invoked manually). It downloads the previous day's Firehose output, concatenates it into a single file, and uploads it back to S3.
Since much of this behavior is defined in infrastructure rather than this repo, it may also help to reference the Terraform module.
```mermaid
flowchart LR
    subgraph EventBridge
        event[Daily Event Rule] -.->|invokes| packager{{Packager}}
    end
    subgraph Firehose
        Buffer -- record batches --> processor{{Processor}}
        processor -- raw OCS messages --> Delivery
    end
    subgraph S3
        failed(["/failed/[env]/[date]/…"])
        partial(["/partial/[env]/[date]/…"])
        package(["/[env]/[date].tar.gz"])
    end
    Kinesis -- records --> Buffer
    Delivery -- batches --> partial
    Delivery -- bad records --> failed
    partial -- prev. day batches --> packager
    packager --> package
```