aws-athena-udfs-h3

This connector extends Amazon Athena's capability by adding UDFs (via Lambda) for selected [h3-java](https://github.com/uber/h3-java) functions to support geospatial indexing and queries with Uber's [H3](https://h3geo.org/). A Maven Site hosted on GitHub Pages holds the API documentation for this repository.

Deploy

Option 1: Deploy the app with the AWS Console

  1. Find the App in the AWS Serverless Application Repository
  2. Click 'Deploy'

Option 2: Deploy with the AWS SAM CLI

# build
mvn clean verify package -Dpublishing=true
# deploy
sam deploy \
  --resolve-s3 \
  --stack-name aws-athena-udfs-h3-stack \
  --template-file ./template.yaml \
  --capabilities CAPABILITY_IAM

Option 3: Deploy as an AWS SAM Resource

In your AWS SAM template.yaml file:

Resources:
  AwsAthenaUdfsH3:
    Type: AWS::Serverless::Application
    Properties:
      Location:
        ApplicationId: arn:aws:serverlessrepo:us-east-1:922535613973:applications/aws-athena-udfs-h3
        SemanticVersion: 1.0.0-rc7
      Parameters:
        # The name of the Lambda function, which calls the H3AthenaUDFHandler
        # LambdaFunctionName: 'h3-athena-udf-handler' # Uncomment to override default value
        # Lambda memory in MB
        # LambdaMemory: '3008' # Uncomment to override default value
        # Maximum Lambda invocation runtime in seconds
        # LambdaTimeout: '300' # Uncomment to override default value

Usage

The UDF API mirrors most of the h3-java API, with method names converted to snake case.

Index coordinates

USING EXTERNAL FUNCTION geo_to_h3(lat DOUBLE, lng DOUBLE, res INTEGER)
RETURNS BIGINT
LAMBDA 'h3-athena-udf-handler'
SELECT geo_to_h3(52.495999878401896, 13.414889023293945, 13) h3_index;
|h3_index          |
|------------------|
|635554602371582271|
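
The resolution passed to geo_to_h3 is stored inside the returned BIGINT, so a result can be sanity-checked client-side. The sketch below assumes the standard H3 index bit layout (4 mode bits at offset 59, 4 resolution bits at offset 52, 7 base-cell bits at offset 45). The function name describe_h3_index is hypothetical and not part of this project.

```python
def describe_h3_index(h3_index: int) -> dict:
    """Unpack the header fields of a 64-bit H3 index (standard H3 bit layout assumed)."""
    return {
        "mode": (h3_index >> 59) & 0xF,        # 1 means a cell (hexagon) index
        "resolution": (h3_index >> 52) & 0xF,  # 0 (coarsest) to 15 (finest)
        "base_cell": (h3_index >> 45) & 0x7F,  # one of the 122 resolution-0 cells
    }


print(describe_h3_index(635554602371582271))
# {'mode': 1, 'resolution': 13, 'base_cell': 15}
```

The decoded resolution of 13 matches the res argument used in the query above.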

Get the coordinates of an index

A GeoCoord from the h3-java API is returned as a well-known text (WKT) point, which is compatible with Athena geospatial functions. Note that WKT lists longitude first, then latitude.

USING EXTERNAL FUNCTION h3_to_geo(h3 BIGINT)
RETURNS VARCHAR
LAMBDA 'h3-athena-udf-handler'
SELECT h3_to_geo(635554602371582271) wkt_point;
|wkt_point                  |
|---------------------------|
|POINT (13.414849 52.496016)|
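
When the WKT result is consumed outside Athena, the coordinate order is easy to get wrong, because the point is written as longitude then latitude. A minimal sketch of a client-side parser follows. The helper name wkt_point_to_lat_lng is hypothetical, and the pattern only handles plain decimal coordinates like the ones shown above.

```python
import re
from typing import Tuple

# Matches "POINT (<x> <y>)" with plain decimal numbers only (no exponents, no Z/M values).
_WKT_POINT = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$"
)


def wkt_point_to_lat_lng(wkt: str) -> Tuple[float, float]:
    """Parse a WKT point and return (lat, lng); WKT itself stores x=lng, y=lat."""
    match = _WKT_POINT.match(wkt)
    if match is None:
        raise ValueError(f"not a simple WKT point: {wkt!r}")
    lng, lat = (float(group) for group in match.groups())
    return lat, lng


print(wkt_point_to_lat_lng("POINT (13.414849 52.496016)"))
# (52.496016, 13.414849)
```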

Get the string representation of an index

USING EXTERNAL FUNCTION h3_to_string(h3 BIGINT)
RETURNS VARCHAR
LAMBDA 'h3-athena-udf-handler'
SELECT h3_to_string(635554602371582271) h3_address;
|h3_address     |
|---------------|
|8d1f18b25b9093f|
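
Judging from the output above, the address is the lowercase hexadecimal form of the BIGINT index, so the conversion can also be done client-side in either direction. The helper names below are hypothetical local functions, not the UDFs themselves.

```python
def index_to_address(h3_index: int) -> str:
    """Format a 64-bit H3 index as its hexadecimal address string."""
    return format(h3_index, "x")


def address_to_index(h3_address: str) -> int:
    """Parse a hexadecimal H3 address back into the integer index."""
    return int(h3_address, 16)


print(index_to_address(635554602371582271))
# 8d1f18b25b9093f
print(address_to_index("8d1f18b25b9093f"))
# 635554602371582271
```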

More functions

See Querying with User Defined Functions in the Amazon Athena documentation.

In the Amazon Athena console, using a workgroup with Athena engine version 2 enabled, pick a udf_name (any public method of the H3AthenaUDFHandler) and declare its function signature like so:

USING EXTERNAL FUNCTION udf_name(variable1 data_type[, variable2 data_type][,...])
RETURNS data_type
LAMBDA 'lambda-function-name'  -- the LambdaFunctionName of the serverless app.
SELECT  [...] udf_name(expression) [...]
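
When queries are generated from code, the template above can be filled in with a small helper. This is a sketch only: build_udf_query is a hypothetical name, and it performs no escaping or validation, so it should only be given trusted values.

```python
from typing import Sequence, Tuple


def build_udf_query(
    udf_name: str,
    params: Sequence[Tuple[str, str]],
    returns: str,
    lambda_name: str,
    select_sql: str,
) -> str:
    """Prefix a SELECT statement with the USING EXTERNAL FUNCTION declaration."""
    signature = ", ".join(f"{name} {data_type}" for name, data_type in params)
    return (
        f"USING EXTERNAL FUNCTION {udf_name}({signature})\n"
        f"RETURNS {returns}\n"
        f"LAMBDA '{lambda_name}'\n"
        f"{select_sql}"
    )


query = build_udf_query(
    udf_name="geo_to_h3",
    params=[("lat", "DOUBLE"), ("lng", "DOUBLE"), ("res", "INTEGER")],
    returns="BIGINT",
    lambda_name="h3-athena-udf-handler",
    select_sql="SELECT geo_to_h3(52.495999878401896, 13.414889023293945, 13) h3_index",
)
print(query)
```

The printed statement is the same as the geo_to_h3 example in the Usage section and can be pasted into the Athena console.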

Known Limitations

Most h3-java API functions have an equivalent, snake-cased method in the H3AthenaUDFHandler API. Some do not.

  • Functions that return lists of lists in the h3-java API are not supported, because the UserDefinedFunctionHandler cannot serialize complex/nested types. These include:
    • kRings
    • kRingDistances
    • hexRange
  • Experimental I, J coordinate h3-java API functions are not supported.
  • The following UDFs do not work as expected, and should not be used:
    • get_res_0_indexes() RETURNS ARRAY<BIGINT>
      • Note: always throws NullPointerException
    • get_res_0_indexes_addresses() RETURNS ARRAY<VARCHAR>
      • Note: always throws NullPointerException
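
Until the two resolution-0 UDFs work, the 122 resolution-0 indexes can be generated client-side. The sketch below is a possible workaround, not something this project provides. It assumes the standard H3 bit layout (cell mode 1 at bit 59, base cell at bit 45, all 15 digits set to 7), and the name res_0_indexes is hypothetical. Compare the output against another H3 library before relying on it.

```python
from typing import List

NUM_BASE_CELLS = 122  # number of resolution-0 cells in H3


def res_0_indexes() -> List[int]:
    """Build every resolution-0 cell index from its base cell number."""
    cell_mode = 1 << 59             # mode 1 marks a cell index
    unused_digits = (1 << 45) - 1   # 15 digits of 3 bits each, all set to 7
    # The resolution bits stay 0, so only the base cell number varies.
    return [
        cell_mode | (base_cell << 45) | unused_digits
        for base_cell in range(NUM_BASE_CELLS)
    ]


indexes = res_0_indexes()
print(len(indexes), format(indexes[0], "x"), format(indexes[-1], "x"))
# 122 8001fffffffffff 80f3fffffffffff
```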

Examples

Data Sources

OpenStreetMap

In the Athena console, run the query in create_planet.sql to create some test data from the current OpenStreetMap database.

Then run test_udfs_planet.sql to test that the H3 functions available via this application are registered and working correctly.

Facebook High Resolution Population Density Estimates

In the Athena console, run create_hrsl.sql, and then run repair_hrsl.sql to create some test data from the Facebook Data For Good Population Density dataset.

Index Data Sources

In your SQL client, run the SQL script create_hrsl_h3.sql (or run each statement individually in the Athena console).

Then run create_planet_h3.sql.

The created tables have an H3 index at resolution 15.

Useful Example Query

Run restaurants_per_person.sql to get restaurants per person in Germany at H3 resolution 7. The query outputs the H3 index as a string for mapping with tools like Unfolded.ai.
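
Aggregating at resolution 7 from tables indexed at a finer resolution means mapping each index to its coarser ancestor. The sketch below shows how that mapping works at the bit level, assuming the standard H3 index layout. The function name parent_index is hypothetical, and this is not necessarily how restaurants_per_person.sql or the UDFs do it.

```python
def parent_index(h3_index: int, parent_res: int) -> int:
    """Return the ancestor of a cell index at a coarser resolution (standard H3 bit layout assumed)."""
    child_res = (h3_index >> 52) & 0xF
    if not 0 <= parent_res <= child_res:
        raise ValueError(f"parent_res must be between 0 and {child_res}")
    # Overwrite the 4 resolution bits with the parent resolution.
    parent = (h3_index & ~(0xF << 52)) | (parent_res << 52)
    # Digits finer than the parent resolution are set to 7 (binary 111), meaning "unused".
    parent |= (1 << (3 * (15 - parent_res))) - 1
    return parent


print(format(parent_index(635554602371582271, 7), "x"))
# 871f18b25ffffff
```

Here the resolution-13 index from the Usage section is reduced to its resolution-7 ancestor and printed as an address string.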

Go to the interactive map


Contributing

Formatting

Format your Java contributions with the spotless Maven plugin. This is done automatically when running mvn verify or mvn install. Modify pom.xml to change formatting rules.

mvn spotless:apply

GitHub Pages Site

The GitHub Pages Site is built with mvn site and is published manually. Change the contents of the site by modifying pom.xml and site.xml.

Build the site locally.

mvn -Preporting site site:stage
# Open the built site in your browser
open ./target/site/index.html

Publish the site to GitHub Pages.

mvn scm-publish:publish-scm

Publishing the UDFs to the AWS Serverless Application Repository

Publishing this code to the AWS Serverless Application Repository is done manually. New semantic versions should be published for new tagged commits in the main branch of this repository.

# build
mvn spotless:apply clean install -Dpublishing=true
# package
sam package \
  --resolve-s3 \
  --output-template-file ./target/packaged.yaml
# publish
sam publish \
  --template-file ./target/packaged.yaml \
  --semantic-version 1.0.0-rc7

More Examples

See the AWS blog post "Translate and analyze text using SQL functions with Amazon Athena, Amazon Translate, and Amazon Comprehend".

License

This project is licensed under the Apache-2.0 License.
