Comments (12)
Do you save your tokenizer as well in your training script? If not, this would explain why it cannot be found when deploying the model.
from sagemaker-huggingface-inference-toolkit.
Yes, I've tried saving the tokenizer too. But I still get the same client error message.
import argparse
import os

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--model-dir", type=str, default=os.environ["SM_MODEL_DIR"])
    ...
    # save both the model weights and the tokenizer into SM_MODEL_DIR
    trainer.model.save_pretrained(args.model_dir)
    trainer.tokenizer.save_pretrained(args.model_dir)
Can you please share the structure for your model.tar.gz?
The model.tar.gz structure is:
- checkpoint-43
- config.json
- pytorch_model.bin
- special_tokens_map.json
- tokenizer_config.json
- vocab.txt
This should work! After adding trainer.tokenizer.save_pretrained(args.model_dir), are you still seeing the same issue?
Yes, I still have the same issue.
I think it might be related to the path the endpoint uses when it tries to load the tokenizer with from_pretrained.
Also, all the code that creates the estimator, trains it, deploys it, and runs inference through the predictor is executed on my local machine. Although the model is trained, stored, and deployed successfully, the problem may be related to an error in the path used by the endpoint.
What do you mean by
Also, all the code that creates the estimator, trains it, deploys it, and uses the predictor for inference, is executed in my local machine.
Aren't you running the training on AWS? Is the model.tar.gz uploaded to S3?
Yes, the model is stored in the S3 bucket. The only difference is that I don't use a notebook instance in SageMaker to create, train, and deploy the estimator. But that should work too, since I make sure the model is stored in the correct S3 bucket.
Yeah, this shouldn't be an issue. Can you try creating a new model.tar.gz manually?
https://huggingface.co/docs/sagemaker/inference#creating-a-model-artifact-modeltargz-for-deployment
Replace step 1 with unzipping your current archive. You can also remove your checkpoint then.
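Assuming the current archive has already been unpacked into a local directory (step 1), a minimal Python sketch of re-packing it correctly might look like this; the directory and file names are illustrative, not taken from the thread:

```python
import os
import tarfile

def repackage(model_dir: str, out_path: str) -> None:
    """Pack the *contents* of model_dir (not the directory itself) into
    a gzipped tarball, skipping training checkpoints such as checkpoint-43."""
    with tarfile.open(out_path, "w:gz") as tar:
        for name in os.listdir(model_dir):
            if name.startswith("checkpoint-"):
                continue  # checkpoints are only needed for resuming training
            # arcname=name places each entry at the archive root, which is
            # where the inference toolkit expects to find config.json
            tar.add(os.path.join(model_dir, name), arcname=name)
```

The shell equivalent would be running tar -czf model.tar.gz . from inside the unpacked directory; either way, the key point is that config.json must end up at the root of the archive.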
@elozano98 Did you ever find the cause of the problem? I might be facing something similar
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "Can't load config for '/.sagemaker/mms/models/model'. Make sure that:
  - '/.sagemaker/mms/models/model' is a correct model identifier listed on 'https://huggingface.co/models'
    (make sure '/.sagemaker/mms/models/model' is not a path to a local directory with something else, in that case)
  - or '/.sagemaker/mms/models/model' is the correct path to a directory containing a config.json file"
}
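This error typically means the endpoint unpacked the archive but found no config.json at its root, for example because the files were archived inside a subdirectory. A quick, hypothetical sanity check one could run on the archive before uploading:

```python
import tarfile

def archive_has_root_config(path: str) -> bool:
    """Return True if config.json sits at the root of the tar.gz archive,
    which is where /.sagemaker/mms/models/model expects to find it."""
    with tarfile.open(path, "r:gz") as tar:
        # normalize "./config.json" and "config.json" to the same form
        names = {member.name.lstrip("./") for member in tar.getmembers()}
    return "config.json" in names
```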
@segments-tobias could you describe how you have created your model.tar.gz?
@philschmid I found the answer in another issue: I had also accidentally zipped the directory instead of just its contents.