Objective: Import data from a CSV file stored in an S3 bucket into an Amazon Redshift table.
Prerequisites
- AWS Account: An account with permissions to create and manage the S3, Step Functions, and Redshift resources used below.
- AWS CLI: Installed and configured.
- Git: Installed.
- Terraform: Installed.
- Amazon Redshift Cluster: Set up with necessary tables.
- Amazon S3 Bucket: Store the CSV files.
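Before continuing, it can help to confirm that the command-line tools from the list above are actually on your PATH. The following is a minimal sketch, not part of the sample repository; the helper name `missing_tools` is hypothetical, and the executable names `aws`, `git`, and `terraform` are assumed to match your installation.

```python
import shutil

# Command-line tools named in the prerequisites list (assumed executable names).
REQUIRED_TOOLS = ("aws", "git", "terraform")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH, in the order given."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All prerequisite tools were found on PATH.")
```

This only checks that the executables exist; it does not verify that the AWS CLI is configured with valid credentials.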
Set Up Infrastructure
- Clone Repository:
git clone https://github.com/aws-samples/step-functions-workflows-collection
cd step-functions-workflows-collection/distributed-map-csv-iterator-tf
- Initialize Terraform:
terraform init
- Apply Terraform Configuration:
terraform apply
- When prompted, confirm by entering yes.
Upload CSV File to S3
- Upload your CSV file to the designated S3 bucket.
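The upload can also be scripted. The sketch below assumes boto3 is installed and that your AWS credentials are already configured; the helper names `build_object_key` and `upload_csv` are hypothetical, and the bucket name, prefix, and file name are placeholders taken from the examples in this guide.

```python
import posixpath


def build_object_key(prefix, filename):
    """Join an optional prefix and a file name into an S3 object key."""
    prefix = prefix.strip("/")
    return posixpath.join(prefix, filename) if prefix else filename


def upload_csv(local_path, bucket, key):
    """Upload a local CSV file to s3://bucket/key and return that URI."""
    import boto3  # imported lazily so build_object_key works without boto3

    s3 = boto3.client("s3")
    s3.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"


if __name__ == "__main__":
    # Placeholder values; replace them with your own bucket and file.
    key = build_object_key("PREFIX", "metrics.csv")
    print(upload_csv("metrics.csv", "YOUR-BUCKET-NAME", key))
```

Keeping the key construction separate from the upload makes it easy to reuse the same key later as the FileKey value in the execution input.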
Modify State Machine
- Update the state machine definition to include steps to load data into Amazon Redshift.
- Example State Machine Definition (replace the YOUR-... placeholders with your own values):
{
  "Comment": "A description of my state machine",
  "StartAt": "ProcessCSV",
  "States": {
    "ProcessCSV": {
      "Type": "Map",
      "ItemProcessor": {
        "ProcessorConfig": {
          "Mode": "DISTRIBUTED",
          "ExecutionType": "STANDARD"
        },
        "StartAt": "LoadToRedshift",
        "States": {
          "LoadToRedshift": {
            "Type": "Task",
            "Resource": "arn:aws:states:::aws-sdk:redshiftdata:executeStatement",
            "Parameters": {
              "ClusterIdentifier": "YOUR-REDSHIFT-CLUSTER-ID",
              "Database": "YOUR-REDSHIFT-DB",
              "Sql": "COPY YOUR-TABLE-NAME FROM 's3://YOUR-BUCKET-NAME/PREFIX/metrics.csv' CREDENTIALS 'aws_iam_role=YOUR-REDSHIFT-IAM-ROLE' CSV;"
            },
            "End": true
          }
        }
      },
      "End": true
    }
  }
}
- The Redshift Data API call returns as soon as the statement is submitted; it does not wait for the COPY to finish.
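If you generate the definition from a script, the COPY statement and the task parameters can be assembled programmatically. This is an illustrative sketch only: `build_copy_sql` and `build_task_parameters` are hypothetical helpers, and the table, bucket, key, and role values are the placeholders from the definition above.

```python
import json


def build_copy_sql(table, bucket, key, iam_role_arn):
    """Return a Redshift COPY statement that loads one CSV object from S3."""
    return (
        f"COPY {table} FROM 's3://{bucket}/{key}' "
        f"CREDENTIALS 'aws_iam_role={iam_role_arn}' CSV;"
    )


def build_task_parameters(cluster_id, database, sql):
    """Return the Parameters object for the LoadToRedshift task state."""
    return {"ClusterIdentifier": cluster_id, "Database": database, "Sql": sql}


if __name__ == "__main__":
    # Placeholder values; replace them before using the output.
    sql = build_copy_sql(
        "YOUR-TABLE-NAME",
        "YOUR-BUCKET-NAME",
        "PREFIX/metrics.csv",
        "YOUR-REDSHIFT-IAM-ROLE",
    )
    params = build_task_parameters(
        "YOUR-REDSHIFT-CLUSTER-ID", "YOUR-REDSHIFT-DB", sql
    )
    print(json.dumps(params, indent=2))
```

The helpers do no escaping, so they are only suitable for trusted values such as your own table and bucket names.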
Trigger State Machine
- Start Execution:
- Use the following input:
{ "BucketName": "YOUR-BUCKET-NAME", "FileKey": "PREFIX/metrics.csv" }
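The execution can be started from the console, the AWS CLI, or a script. The sketch below uses boto3 and assumes your credentials are configured; `build_execution_input` and `start_execution` are hypothetical helper names, and the state machine ARN is a placeholder you would take from the Terraform output or the Step Functions console.

```python
import json


def build_execution_input(bucket, key):
    """Serialize the execution input shown above as a JSON string."""
    return json.dumps({"BucketName": bucket, "FileKey": key})


def start_execution(state_machine_arn, bucket, key):
    """Start one execution and return its ARN."""
    import boto3  # imported lazily so build_execution_input works without boto3

    sfn = boto3.client("stepfunctions")
    response = sfn.start_execution(
        stateMachineArn=state_machine_arn,
        input=build_execution_input(bucket, key),
    )
    return response["executionArn"]


if __name__ == "__main__":
    # Placeholder values; replace them with your own.
    arn = start_execution(
        "YOUR-STATE-MACHINE-ARN", "YOUR-BUCKET-NAME", "PREFIX/metrics.csv"
    )
    print("Started execution:", arn)
```

The returned execution ARN can be used to look up the execution's status in the Step Functions console.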
Verify Data in Redshift
- Connect to your Redshift cluster and query the table to verify the data has been loaded.
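One simple check is to count the rows in the target table. The following is a rough sketch using the Redshift Data API through boto3; it assumes temporary-credential access with a database user (the `db_user` argument), which may not match how your cluster is set up, and the helper names `build_count_sql` and `count_rows` are hypothetical.

```python
import re
import time

# Accepts "table" or "schema.table" made of plain identifiers only.
_TABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def build_count_sql(table):
    """Return a row-count query, rejecting names that are not plain identifiers."""
    if not _TABLE_NAME.fullmatch(table):
        raise ValueError(f"unexpected table name: {table!r}")
    return f"SELECT COUNT(*) FROM {table};"


def count_rows(cluster_id, database, db_user, table, poll_seconds=2):
    """Run the count query through the Redshift Data API and return the result."""
    import boto3  # imported lazily so build_count_sql works without boto3

    client = boto3.client("redshift-data")
    statement = client.execute_statement(
        ClusterIdentifier=cluster_id,
        Database=database,
        DbUser=db_user,
        Sql=build_count_sql(table),
    )
    # The Data API is asynchronous, so poll until the statement settles.
    while True:
        status = client.describe_statement(Id=statement["Id"])["Status"]
        if status in ("FINISHED", "FAILED", "ABORTED"):
            break
        time.sleep(poll_seconds)
    if status != "FINISHED":
        raise RuntimeError(f"query ended with status {status}")
    result = client.get_statement_result(Id=statement["Id"])
    return result["Records"][0][0]["longValue"]


if __name__ == "__main__":
    # Placeholder values; replace them with your own.
    print(count_rows("YOUR-REDSHIFT-CLUSTER-ID", "YOUR-REDSHIFT-DB",
                     "YOUR-DB-USER", "YOUR_TABLE_NAME"))
```

Comparing the returned count with the number of data rows in the CSV file gives a quick sanity check that the load completed.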
Cleanup Resources
- Destroy Resources:
terraform destroy
- When prompted, confirm by entering yes.