
Zero administration inference with AWS Lambda for 🤗

Note: This is not production code and is simply meant as a demo.

Hugging Face Transformers is a popular open-source project that provides pre-trained natural language processing (NLP) models for a wide variety of use cases. Customers with minimal machine learning experience can use these pre-trained models to quickly add NLP capabilities to their applications, including text classification, language translation, summarization, and question answering, to name a few.
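
For context, here is a minimal sketch of the Transformers pipeline API that the handlers in this solution are built on (assuming the transformers package and a PyTorch backend are installed):

from transformers import pipeline

# Downloads a default sentiment-analysis model on first use, then runs inference.
classifier = pipeline("sentiment-analysis")
print(classifier("This demo is great!"))
# [{'label': 'POSITIVE', 'score': 0.99...}]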

Overview

Our solution consists of an AWS Cloud Development Kit (AWS CDK) script that automatically provisions container image-based Lambda functions that perform ML inference using pre-trained Hugging Face models. This solution also includes Amazon Elastic File System (EFS) storage that is attached to the Lambda functions to cache the pre-trained models and reduce inference latency.

[Architecture diagram]

In this architectural diagram:

  1. Serverless inference is achieved by using Lambda functions that are based on container images.
  2. The container image is stored in an Amazon Elastic Container Registry (ECR) repository within your account.
  3. Pre-trained models are automatically downloaded from Hugging Face the first time the function is invoked.
  4. Pre-trained models are cached within Amazon Elastic File System storage to improve inference latency.

The solution includes Python scripts for two common NLP use cases:

  • Sentiment analysis: identifying whether a sentence indicates positive or negative sentiment. It uses a model fine-tuned on SST-2, which is a GLUE task.
  • Summarization: summarizing a body of text into a shorter, representative text. It uses a BART model that was fine-tuned on the CNN / Daily Mail dataset.

For simplicity, both of these use cases are implemented using Hugging Face pipelines.

Prerequisites

The following is required to run this example:

  • An AWS account with credentials configured for the AWS CLI and CDK
  • git
  • Python 3 with pip
  • The AWS CDK Toolkit (the cdk command)
  • Docker (the Lambda container images are built locally during deployment)

Deploying the example application

  1. Clone the project to your development environment:

git clone https://github.com/aws-samples/zero-administration-inference-with-aws-lambda-for-hugging-face.git

  2. Install the required dependencies:

pip install -r requirements.txt

  3. Bootstrap the CDK. This command provisions the initial resources needed by the CDK to perform deployments:

cdk bootstrap

  4. Deploy the CDK application to its environment. During the deployment, the toolkit outputs progress indications:

cdk deploy
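
Once the deployment completes, you can invoke a function to test it. The following is a minimal sketch (the generated function names vary per deployment; look them up in the Lambda console, and note that the first invocation is slow while the model downloads):

import json
import boto3

# The function name below is a placeholder; the CDK generates a unique name per script.
client = boto3.client('lambda')
response = client.invoke(
    FunctionName='<generated-sentiment-function-name>',
    Payload=json.dumps({'text': 'I love this product!'}),
)
print(json.load(response['Payload']))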

Understanding the code structure

The code is organized using the following structure:

├── inference
│   ├── Dockerfile
│   ├── sentiment.py
│   └── summarization.py
├── app.py
└── ...

The inference directory contains:

  • The Dockerfile used to build a custom image that can run PyTorch-based Hugging Face inference in Lambda functions
  • The Python scripts that perform the actual ML inference

The sentiment.py script shows how to use a Hugging Face Transformers model:

import json
from transformers import pipeline

nlp = pipeline("sentiment-analysis")

def handler(event, context):
    response = {
        "statusCode": 200,
        "body": nlp(event['text'])[0]
    }
    return response
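
Called directly (for example, in a local Python session with transformers and torch installed), the handler behaves as sketched below; the exact score will vary by model version:

# Hypothetical local smoke test for the handler above.
event = {"text": "I love this product!"}
print(handler(event, None))
# {'statusCode': 200, 'body': {'label': 'POSITIVE', 'score': 0.999...}}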

For each Python script in the inference directory, the CDK generates a Lambda function backed by a container image, using that script as the function's handler.

CDK script

The CDK script is named app.py in the solution's repository. The beginning of the script creates a virtual private cloud (VPC).

vpc = ec2.Vpc(self, 'Vpc', max_azs=2)

Next, it creates the EFS file system and an access point in EFS for the cached model:

fs = efs.FileSystem(self, 'FileSystem',
                    vpc=vpc,
                    removal_policy=RemovalPolicy.DESTROY)
access_point = fs.add_access_point('MLAccessPoint',
                                   create_acl=efs.Acl(
                                       owner_gid='1001', owner_uid='1001', permissions='750'),
                                   path="/export/models",
                                   posix_user=efs.PosixUser(gid="1001", uid="1001"))

It iterates through the Python files in the inference directory:

docker_folder = os.path.dirname(os.path.realpath(__file__)) + "/inference"
pathlist = Path(docker_folder).rglob('*.py')
for path in pathlist:

And then creates the Lambda function that serves the inference requests:

base = os.path.basename(path)
filename = os.path.splitext(base)[0]
# Lambda Function from docker image
function = lambda_.DockerImageFunction(
    self, filename,
    code=lambda_.DockerImageCode.from_image_asset(
        docker_folder,
        cmd=[filename + ".handler"]),
    memory_size=8096,
    timeout=Duration.seconds(600),
    vpc=vpc,
    filesystem=lambda_.FileSystem.from_efs_access_point(
        access_point, '/mnt/hf_models_cache'),
    environment={
        "TRANSFORMERS_CACHE": "/mnt/hf_models_cache"},
)
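
Pointing TRANSFORMERS_CACHE at the EFS mount is what makes the caching work: the first invocation downloads the model weights into /mnt/hf_models_cache, and later cold starts reuse them. Below is a minimal local sketch of the same mechanism (the /tmp path here is a stand-in for the EFS mount):

import os

# Must be set before transformers is imported; the Lambda functions use /mnt/hf_models_cache.
os.environ["TRANSFORMERS_CACHE"] = "/tmp/hf_models_cache"

from transformers import pipeline

# First call downloads the model into the cache directory; subsequent loads read from it.
nlp = pipeline("sentiment-analysis")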

Adding a translator

Optionally, you can add more models by adding Python scripts in the inference directory. For example, add the following code in a file called translate-en2fr.py:

import json
from transformers import pipeline

en_fr_translator = pipeline('translation_en_to_fr')

def handler(event, context):
    response = {
        "statusCode": 200,
        "body": en_fr_translator(event['text'])[0]
    }
    return response

Then run:

$ cdk synth
$ cdk deploy

This creates a new endpoint to perform English to French translation.
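
A sample test event for the new function, assuming the same {'text': ...} event shape used by the other handlers (the exact translation returned depends on the model version):

# Hypothetical test of the translate-en2fr handler.
event = {"text": "Machine learning is fun."}
# handler(event, None) returns a response of the form:
# {'statusCode': 200, 'body': {'translation_text': '...'}}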

Cleaning up

After you are finished experimenting with this project, run cdk destroy to remove all of the associated infrastructure.

License

This library is licensed under the MIT No Attribution License. See the LICENSE file.

Disclaimer: Deploying the demo applications contained in this repository will potentially cause your AWS account to be billed for services.


Issues

AttributeError: module 'aws_cdk' has no attribute 'cx_api' when running cdk bootstrap

I am following the instructions to deploy the model on AWS and get the following error when running cdk bootstrap:

 File "app.py", line 8, in <module>
    from aws_cdk import (
  File "/Users/alioskooei/opt/anaconda3/envs/nlp/lib/python3.7/site-packages/aws_cdk/__init__.py", line 22552, in <module>
    from . import aws_acmpca
  File "/Users/alioskooei/opt/anaconda3/envs/nlp/lib/python3.7/site-packages/aws_cdk/aws_acmpca/__init__.py", line 79, in <module>
    from ._jsii import *
  File "/Users/alioskooei/opt/anaconda3/envs/nlp/lib/python3.7/site-packages/aws_cdk/aws_acmpca/_jsii/__init__.py", line 11, in <module>
    import aws_cdk.core._jsii
  File "/Users/alioskooei/opt/anaconda3/envs/nlp/lib/python3.7/site-packages/aws_cdk/core/__init__.py", line 6643, in <module>
    class ConstructNode(metaclass=jsii.JSIIMeta, jsii_type="@aws-cdk/core.ConstructNode"):
  File "/Users/alioskooei/opt/anaconda3/envs/nlp/lib/python3.7/site-packages/aws_cdk/core/__init__.py", line 6694, in ConstructNode
    runtime_info: typing.Optional[aws_cdk.cx_api.RuntimeInfo] = None,
AttributeError: module 'aws_cdk' has no attribute 'cx_api'

Unfortunately, I could not find any tips online as to why I am seeing this error. I have installed the requirements according to the instructions and am using the following package versions:

CDK 2.12.0 (build c9786db)
Node v16.3.0
Python 3.7.11
aws-cli/2.4.16
npm 7.15.1

I would appreciate any tips on how to resolve this issue. Thank you.

Amazon Elastic Compute Cloud NatGateway costs

The tutorial creates costs under Amazon Elastic Compute Cloud NatGateway.

There were two unassigned Elastic IPs on my account, and I think it also builds a NAT Gateway, which isn't part of the free tier. Is there any way to run the tutorial without these costs?

Repeated Inferences with pipeline on lambda

Thanks for your response to the Q&A question in the other issue.
With regard to multiple inferences, is there any precaution to take?

I was hoping that I could just call the model repeatedly in a loop.

import json
from transformers import pipeline

question_answerer = pipeline("question-answering")

def handler(event, context):
    questionsetList = event['questionlist']
    answerlist = []
    for question in questionsetList:
        answer = question_answerer({'question': question, 'context': event['context']})
        answerlist.append(answer)
    return {"Result": answerlist}

I got the following error on a Lambda test event:
START RequestId: b06fd2cb-54df-4807-91c8-34ea7cfb614f Version: $LATEST
OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
/usr/local/lib/python3.6/dist-packages/joblib/_multiprocessing_helpers.py:45: UserWarning: [Errno 38] Function not implemented. joblib will operate in serial mode
  warnings.warn('%s. joblib will operate in serial mode' % (e,))
questions before splitting by ? mark
1. Why are you troubled?~ 2.Who is the person to blame? ~3. How long are you frustrated about this?
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/function/awslambdaric/__main__.py", line 20, in <module>
    main(sys.argv)
  File "/function/awslambdaric/__main__.py", line 16, in main
    bootstrap.run(app_root, handler, lambda_runtime_api_addr)
  File "/function/awslambdaric/bootstrap.py", line 415, in run
    log_sink,
  File "/function/awslambdaric/bootstrap.py", line 171, in handle_event_request
    log_error(error_result, log_sink)
  File "/function/awslambdaric/bootstrap.py", line 122, in log_error
    log_sink.log_error(error_message_lines)
  File "/function/awslambdaric/bootstrap.py", line 306, in log_error
    sys.stdout.write(error_message)
  File "/function/awslambdaric/bootstrap.py", line 283, in write
    self.stream.write(msg)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 79-80: ordinal not in range(128)
END RequestId: b06fd2cb-54df-4807-91c8-34ea7cfb614f
REPORT RequestId: b06fd2cb-54df-4807-91c8-34ea7cfb614f Duration: 22056.43 ms Billed Duration: 22057 ms Memory Size: 8096 MB Max Memory Used: 962 MB
RequestId: b06fd2cb-54df-4807-91c8-34ea7cfb614f Error: Runtime exited with error: exit status 1
Runtime.ExitError

It appears that I cannot call the model in a loop. In other implementations, without pipeline, I had used the model in a loop.

Please suggest whether any specific precaution, such as cleanup, is required before calling the model for a second question.

Thanks in advance.

Input Sample for Question Answering Pipeline

Hi,
This is a great project. It worked for the sentiment analysis example. However, my need is a question answering use case.

I created myquestionanswer.py as below:

import json
from transformers import pipeline

summarizer = pipeline("question-answering")

def handler(event, context):
    response = {
        "statusCode": 200,
        "body": summarizer(event['article'])[0]
    }
    return response

i.e., the only change I made is the string passed as the pipeline parameter; it is now 'question-answering'.

What is the JSON input format to be given in the Lambda test console? I tried the following; both failed:

  1. { "context": "My name is Rama. Sita is his wife", "question": " what is your name?"}
  2. {"context": questions": [ "What is the name?", "Who is his wife?"] }

I saw other Hugging Face examples, but they aren't applicable since they feed directly into the model.

Thanks in advance.
