Comments (6)
Hi @geraldstanje
Seems like you're trying to deploy a model using a SageMaker example. SageMaker uses TorchServe for model deployment, but the model artifact you're creating cannot be deployed with TorchServe directly. Is there a specific tutorial you're following?
I'm not too familiar with SageMaker itself, but the inference.py script you're providing looks like you're trying to deploy a SetFit model. You will need to integrate this into a TorchServe handler and package it with the model-archiver into a tar.gz file, which will add important meta information. Please have a look at our XGBoost example, which should be easily adaptable to your use case, as you can basically deploy any framework or library through this approach.
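As a rough illustration (an untested sketch; the file name setfit_handler.py, the use of the setfit package and SetFitModel.from_pretrained are my assumptions about your setup), such a handler could look like this:

# setfit_handler.py -- minimal sketch of a custom TorchServe handler for a SetFit model (untested)
from setfit import SetFitModel
from ts.torch_handler.base_handler import BaseHandler

class SetFitHandler(BaseHandler):
    def initialize(self, context):
        # model_dir is where TorchServe unpacks the files bundled into the archive
        model_dir = context.system_properties.get("model_dir")
        self.model = SetFitModel.from_pretrained(model_dir)
        self.initialized = True

    def preprocess(self, data):
        # TorchServe hands the handler a list of requests; each item carries its
        # payload under the "data" or "body" key
        texts = []
        for row in data:
            text = row.get("data") or row.get("body")
            if isinstance(text, (bytes, bytearray)):
                text = text.decode("utf-8")
            texts.append(text)
        return texts

    def inference(self, texts):
        return self.model.predict(texts)

    def postprocess(self, preds):
        # must return one entry per request in the batch
        return [str(p) for p in preds]

The archive created by the model-archiver then bundles this handler together with the model files.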
Let me know if you have further questions.
@mreso thanks for pointing that out - is there a simple way to convert it to run SetFit models with TorchServe? Can I copy the code I have into a BaseHandler and implement those functions?
Does SageMaker return the same data types / formats as the BaseHandler? What is required?
cc @namannandan
@geraldstanje yes, you basically follow the XGBoost example to create your own handler, or if your model is a HuggingFace model from their transformers library you can just follow one of these examples:
https://github.com/pytorch/serve/tree/master/examples/Huggingface_Transformers
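For orientation, the overall workflow in those examples boils down to something like this (a rough sketch; the model name, file names and the sample input are placeholders, and the exact --extra-files you need depend on what your handler loads):

torch-model-archiver --model-name setfit --version 1.0 \
  --serialized-file setfit-test-model/model.safetensors \
  --handler setfit_handler.py \
  --extra-files "setfit-test-model/config.json,setfit-test-model/tokenizer.json" \
  --export-path model_store

# start TorchServe with the resulting setfit.mar
torchserve --start --ncs --model-store model_store --models setfit=setfit.mar

# query the inference API
curl -X POST http://localhost:8080/predictions/setfit -d "some example text"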
Let me know if you're having problems converting your example.
@mreso thanks - how is SageMaker then able to use TorchServe if it doesn't implement ts.torch_handler.base_handler? Let's say I take this as an example: https://github.com/aws/amazon-sagemaker-examples/blob/main/frameworks/pytorch/get_started_mnist_deploy.ipynb
I looked at https://github.com/pytorch/serve/tree/master/examples/xgboost_classfication.
I have the trained SetFit model here: torchserve/setfit-test-model:
ls -la torchserve/setfit-test-model/1_Pooling/
total 12
drwx------ 2 ubuntu ubuntu 4096 May 13 03:52 .
drwx------ 4 ubuntu ubuntu 4096 May 13 03:52 ..
-rw------- 1 ubuntu ubuntu 296 May 13 03:52 config.json
ls -la torchserve/setfit-test-model/
total 89728
drwx------ 4 ubuntu ubuntu 4096 May 13 03:52 .
drwx------ 3 ubuntu ubuntu 4096 May 13 03:52 ..
drwx------ 2 ubuntu ubuntu 4096 May 13 03:52 1_Pooling
drwx------ 2 ubuntu ubuntu 4096 May 13 03:52 2_Normalize
-rw------- 1 ubuntu ubuntu 7586 May 13 03:52 README.md
-rw------- 1 ubuntu ubuntu 660 May 13 03:52 config.json
-rw------- 1 ubuntu ubuntu 164 May 13 03:52 config_sentence_transformers.json
-rw------- 1 ubuntu ubuntu 116 May 13 03:52 config_setfit.json
-rw------- 1 ubuntu ubuntu 90864192 May 13 03:52 model.safetensors
-rw------- 1 ubuntu ubuntu 13431 May 13 03:52 model_head.pkl
-rw------- 1 ubuntu ubuntu 349 May 13 03:52 modules.json
-rw------- 1 ubuntu ubuntu 53 May 13 03:52 sentence_bert_config.json
-rw------- 1 ubuntu ubuntu 695 May 13 03:52 special_tokens_map.json
-rw------- 1 ubuntu ubuntu 711649 May 13 03:52 tokenizer.json
-rw------- 1 ubuntu ubuntu 1433 May 13 03:52 tokenizer_config.json
-rw------- 1 ubuntu ubuntu 231508 May 13 03:52 vocab.txt
How can I create the model.pt for the torch-model-archiver?
torch-model-archiver --model-name SetFitModel --version 1.0 \
  --serialized-file torchserve/setfit-test-model/model.pt \
  --handler ./setfit_handler_generalized.py \
  --extra-files "./torchserve/setfit-test-model/config.json,./torchserve/setfit-test-model/config_sentence_transformers.json,./torchserve/setfit-test-model/config_setfit.json,./torchserve/setfit-test-model/model.safetensors,./torchserve/setfit-test-model/model_head.pkl,./torchserve/setfit-test-model/modules.json,./torchserve/setfit-test-model/sentence_bert_config.json,./torchserve/setfit-test-model/special_tokens_map.json,./torchserve/setfit-test-model/tokenizer.json,./torchserve/setfit-test-model/tokenizer_config.json,./torchserve/setfit-test-model/vocab.txt,./1_Pooling/config.json"
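One idea I had (untested, and only an assumption about how SetFit exposes its underlying model) is to serialize the SentenceTransformer body myself and point --serialized-file at that:

# create_model_pt.py -- untested sketch; assumes SetFitModel.model_body is the underlying SentenceTransformer
import torch
from setfit import SetFitModel

model = SetFitModel.from_pretrained("torchserve/setfit-test-model")
torch.save(model.model_body.state_dict(), "torchserve/setfit-test-model/model.pt")

Or, if the handler loads the whole directory with SetFitModel.from_pretrained anyway, could --serialized-file simply point at the existing model.safetensors instead of a model.pt?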
@geraldstanje to answer your question:
How is SageMaker then able to use TorchServe if it doesn't implement ts.torch_handler.base_handler?
The PyTorch inference containers that are compatible with SageMaker install a package called the SageMaker PyTorch Inference Toolkit, which provides a handler implementation that is compatible with TorchServe and plugs in the input_fn, predict_fn and output_fn that you provide in the inference.py script above. For reference, please see the toolkit's handler implementation.
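As a rough illustration (the SetFit-specific parts are only an assumption based on your script), an inference.py for the toolkit defines functions along these lines:

# inference.py -- sketch of the functions the SageMaker PyTorch Inference Toolkit plugs
# into its TorchServe-compatible handler (SetFit specifics are assumptions)
import json
from setfit import SetFitModel

def model_fn(model_dir):
    # model_dir is where SageMaker unpacks model.tar.gz
    return SetFitModel.from_pretrained(model_dir)

def input_fn(request_body, content_type):
    if content_type == "application/json":
        return json.loads(request_body)["inputs"]
    raise ValueError(f"Unsupported content type: {content_type}")

def predict_fn(inputs, model):
    return model.predict(inputs)

def output_fn(prediction, accept):
    return json.dumps({"predictions": [str(p) for p in prediction]})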
If you'd like to create a custom Docker container that is SageMaker compatible, I would suggest starting out with a SageMaker PyTorch Inference Container as the base image and building on top of it. For example: 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:2.2.0-gpu-py310-cu118-ubuntu20.04-sagemaker.
If you would like to use TorchServe natively on SageMaker, here is an example: https://github.com/aws/amazon-sagemaker-examples/blob/main/inference/torchserve/mme-gpu/torchserve_multi_model_endpoint.ipynb
Also, looking at the error logs, I see from the traceback that the model load failed because the handler was unable to find a necessary module:
ModuleNotFoundError: No module named '124636bff1364f2bbc7f71a8667110af'
Could you please check if all the required dependencies to load the model are either installed in the container or included in the model archive?
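In case a missing Python package is indeed the cause, one option is to ship the dependencies with the model archive (the packages listed below are an assumption; include whatever your handler imports):

# requirements.txt
setfit
sentence-transformers

# pass it to the archiver when building the .mar
torch-model-archiver ... --requirements-file requirements.txt

# and enable per-model dependency installation in config.properties
install_py_dep_per_model=true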