Comments (12)
Do you save your tokenizer as well in your training script? If not, this would explain why it cannot be found when deploying the model.
from sagemaker-huggingface-inference-toolkit.
Yes, I've tried saving the tokenizer too. But I still get the same client error message.
import argparse
import os

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--model-dir", type=str, default=os.environ["SM_MODEL_DIR"])
    ...
    # save both the model weights and the tokenizer into SM_MODEL_DIR
    trainer.model.save_pretrained(args.model_dir)
    trainer.tokenizer.save_pretrained(args.model_dir)
Can you please share the structure for your model.tar.gz?
The model.tar.gz structure is:
- checkpoint-43
- config.json
- pytorch_model.bin
- special_tokens_map.json
- tokenizer_config.json
- vocab.txt
This should work! After adding trainer.tokenizer.save_pretrained(args.model_dir), are you still seeing the same issue?
Yes, I still have the same issue.
I think it might be related to the path the endpoint uses when it tries to load the tokenizer with from_pretrained.
Also, all the code that creates the estimator, trains it, deploys it, and runs inference through the predictor is executed on my local machine. Although the model is trained, stored, and deployed successfully, the problem may be related to an error in the path used by the endpoint.
What do you mean by
Also, all the code that creates the estimator, trains it, deploys it, and uses the predictor for inference, is executed in my local machine.
Aren't you running the training on AWS? Is the model.tar.gz uploaded to S3?
Yes, the model is stored in the S3 bucket. The only difference is that I don't use a notebook instance in SageMaker to create, train, and deploy the estimator. But that should work too, since I make sure the model is stored in the correct S3 bucket.
Yeah, this shouldn't be an issue. Can you try creating a new model.tar.gz manually?
https://huggingface.co/docs/sagemaker/inference#creating-a-model-artifact-modeltargz-for-deployment
Replace step 1 with unzipping your current archive. You can also remove your checkpoint then.
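Assuming the current archive has already been unpacked into a local directory (step 1), a minimal Python sketch of re-packing it correctly might look like this; the directory and file names are illustrative, not taken from the thread:

```python
import os
import tarfile

def repackage(model_dir: str, out_path: str) -> None:
    """Pack the *contents* of model_dir (not the directory itself) into
    a gzipped tarball, skipping training checkpoints such as checkpoint-43."""
    with tarfile.open(out_path, "w:gz") as tar:
        for name in os.listdir(model_dir):
            if name.startswith("checkpoint-"):
                continue  # checkpoints are only needed for resuming training
            # arcname=name places each entry at the archive root, which is
            # where the inference toolkit expects to find config.json
            tar.add(os.path.join(model_dir, name), arcname=name)
```

The shell equivalent would be running tar -czf model.tar.gz . from inside the unpacked directory; either way, the key point is that config.json must end up at the root of the archive.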
@elozano98 Did you ever find the cause of the problem? I might be facing something similar
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "Can't load config for '/.sagemaker/mms/models/model'. Make sure that:
  - '/.sagemaker/mms/models/model' is a correct model identifier listed on 'https://huggingface.co/models'
    (make sure '/.sagemaker/mms/models/model' is not a path to a local directory with something else, in that case)
  - or '/.sagemaker/mms/models/model' is the correct path to a directory containing a config.json file"
}
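This error typically means the endpoint unpacked the archive but found no config.json at its root, for example because the files were archived inside a subdirectory. A quick, hypothetical sanity check one could run on the archive before uploading:

```python
import tarfile

def archive_has_root_config(path: str) -> bool:
    """Return True if config.json sits at the root of the tar.gz archive,
    which is where /.sagemaker/mms/models/model expects to find it."""
    with tarfile.open(path, "r:gz") as tar:
        # normalize "./config.json" and "config.json" to the same form
        names = {member.name.lstrip("./") for member in tar.getmembers()}
    return "config.json" in names
```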
@segments-tobias could you describe how you have created your model.tar.gz?
@philschmid I found the answer in another issue: I had also accidentally zipped the directory instead of just its contents.