pytorch / serve Goto Github PK

View Code? Open in Web Editor NEW

4.0K 58.0 806.0 95.55 MB

Serve, optimize and scale PyTorch models in production

Home Page: https://pytorch.org/serve/

License: Apache License 2.0

Java 46.14% Python 41.65% Shell 2.34% Dockerfile 0.54% Mustache 0.07% Jupyter Notebook 1.94% CMake 0.46% C++ 6.86%

pytorch machine-learning mlops serving docker kubernetes optimization cpu gpu metrics

serve's Introduction

TorchServe

TorchServe is a flexible and easy-to-use tool for serving and scaling PyTorch models in production.

Requires python >= 3.8

curl http://127.0.0.1:8080/predictions/bert -T input.txt

🚀 Quick start with TorchServe

# Install dependencies
# cuda is optional
python ./ts_scripts/install_dependencies.py --cuda=cu121

# Latest release
pip install torchserve torch-model-archiver torch-workflow-archiver

# Nightly build
pip install torchserve-nightly torch-model-archiver-nightly torch-workflow-archiver-nightly

🚀 Quick start with TorchServe (conda)

# Install dependencies
# cuda is optional
python ./ts_scripts/install_dependencies.py --cuda=cu121

# Latest release
conda install -c pytorch torchserve torch-model-archiver torch-workflow-archiver

# Nightly build
conda install -c pytorch-nightly torchserve torch-model-archiver torch-workflow-archiver

Getting started guide

🐳 Quick Start with Docker

# Latest release
docker pull pytorch/torchserve

# Nightly build
docker pull pytorch/torchserve-nightly

Refer to torchserve docker for details.

⚡ Why TorchServe

Write once, run anywhere, on-prem, on-cloud, supports inference on CPUs, GPUs, AWS Inf1/Inf2/Trn1, Google Cloud TPUs, Nvidia MPS
Model Management API: multi model management with optimized worker to model allocation
Inference API: REST and gRPC support for batched inference
TorchServe Workflows: deploy complex DAGs with multiple interdependent models
Default way to serve PyTorch models in
- Sagemaker
- Vertex AI
- Kubernetes with support for autoscaling, session-affinity, monitoring using Grafana works on-prem, AWS EKS, Google GKE, Azure AKS
- Kserve: Supports both v1 and v2 API, autoscaling and canary deployments for A/B testing
- Kubeflow
- MLflow
Export your model for optimized inference. Torchscript out of the box, PyTorch Compiler preview, ORT and ONNX, IPEX, TensorRT, FasterTransformer, FlashAttention (Better Transformers)
Performance Guide: builtin support to optimize, benchmark, and profile PyTorch and TorchServe performance
Expressive handlers: An expressive handler architecture that makes it trivial to support inferencing for your use case with many supported out of the box
Metrics API: out-of-the-box support for system-level metrics with Prometheus exports, custom metrics,
Large Model Inference Guide: With support for GenAI, LLMs including
- SOTA GenAI performance using torch.compile
- Fast Kernels with FlashAttention v2, continuous batching and streaming response
- PyTorch Tensor Parallel preview, Pipeline Parallel
- Microsoft DeepSpeed, DeepSpeed-Mii
- Hugging Face Accelerate, Diffusers
- Running large models on AWS Sagemaker and Inferentia2
- Running Llama 2 Chatbot locally on Mac
Monitoring using Grafana and Datadog

🤔 How does TorchServe work

Model Server for PyTorch Documentation: Full documentation
TorchServe internals: How TorchServe was built
Contributing guide: How to contribute to TorchServe

🏆 Highlighted Examples

Serving Llama 2 with TorchServe
Chatbot with Llama 2 on Mac 🦙💬
🤗 HuggingFace Transformers with a Better Transformer Integration/ Flash Attention & Xformer Memory Efficient
Stable Diffusion
Model parallel inference
MultiModal models with MMF combining text, audio and video
Dual Neural Machine Translation for a complex workflow DAG
TorchServe Integrations
TorchServe Internals
TorchServe UseCases

For more examples

🛡️ TorchServe Security Policy

SECURITY.md

🤓 Learn More

https://pytorch.org/serve

🫂 Contributing

We welcome all contributions!

To learn more about how to contribute, see the contributor guide here.

📰 News

💖 All Contributors

Made with contrib.rocks.

⚖️ Disclaimer

This repository is jointly operated and maintained by Amazon, Meta and a number of individual contributors listed in the CONTRIBUTORS file. For questions directed at Meta, please send an email to [email protected]. For questions directed at Amazon, please send an email to [email protected]. For all other questions, please open up an issue in this repository here.

TorchServe acknowledges the Multi Model Server (MMS) project from which it was derived

serve's People

Contributors

Stargazers

Watchers

Forkers

github30 aaronmarkham shyamalschandra intellifora lilujunai sangkwun jeongukjae tomzhang harshbafna mycaster theofpa leo731121 saifrahmed yw3388 drockney mbrukman stjordanis seemethere jlertle ml-and-ai-repo trendingtechnology fundou shivamshriwas amirstudy youfeng243 qelover visenzeadam ultrons eslesar-aws vdantu notfound403 wuqiangch mfreidank zhangqianjin bleachzou3 pko89403 chanupen allensmile eshnil2000 muhyun maaquib otrewyi191 jamshaidsohail5 lrrm wookim3 jbryslaw hoangpq hurutoriya al-yakubovich alvations hephaex takp liuweiping2020 yxryxryxr3 dhaniram-kshirsagar lanpa mjpsl dopadak mgermain lichao88 francescosaveriozuppichini mdlglobal-atlassian-net dhruvaray deepakbabel23 prashantsail wynmew pandinosaurus yanghaha11514 linhduongtuan anuj-rathore abishekideas rishab-sharma htappen adk9 cgq0816 ashokei eliekawerk ssitb zeta1999 gunandrose4u deepakbabel mingfeima ohjho hamidshojanazeri goodhamgupta amchuz momongah amirunpri2018 dongpil therochvoices cuilunan markokostiv maheshambule sagarjounkani noorahmedds jojozep ssakhavi mengmeng96 rchavezj bibyutatsu

serve's Issues

Can't load two versions of same model from config file

I attempted to start a model server with two versions of the same model specified in the config options. Both model archives had a model name of "d161", with one of them having version number 1.0 and the other version 1.1. The two files were named d161_1_0.mar and d161_1_1.mar and they were both in the model store directory. My config.properties contained the line:

load_models=d161_1_0.mar,d161_1_1.mar

The expected outcome is that I should have both model versions available.

The actual outcome is that only the later one is loaded, according to the output of curl http://127.0.0.1:8081/models:

{
  "models": [
    {
      "modelName": "d161",
      "modelUrl": "d161_1_1.mar"
    }
  ]
}

Default image classfier model handler fails on black & white JPG

Repro: In the quick start, replace kitten.jpg with the attached file. There's a crash in image_classifier.py when it attempts to convert the image to a tensor, because it assumes 3 color channels.

Python version

Why do the instructions specify Python 2.7, when it's rapidly approaching EOL? https://pythonclock.org/

Update error message when attempting inference on model with 0 workers

On adding a model via the management-api, the default min/max workers for the model is set to 0. As a result when running prediction against the model after registering gives a 503 error with details as 'No worker is available to serve request: densenet161'. This will be confusing for user's trying to add models from the inference api.

Register model using:
curl -X POST "http://:/models?url=https://<s3_path>/densenet161.mar"
{
"status": "Model densenet161 registered"
}

Inference using:
curl -X POST http://:/predictions/densenet161 -T cutekit.jpeg
{
"code": 503,
"type": "ServiceUnavailableException",
"message": "No worker is available to serve request: densenet161"
}

Model details:
curl -X GET http://:/models/densenet161
[
{
"modelName": "densenet161",
"modelVersion": "1.0",
"modelUrl": "https:///densenet161.mar",
"runtime": "python",
"minWorkers": 0,
"maxWorkers": 0,
"batchSize": 1,
"maxBatchDelay": 100,
"loadedAtStartup": false,
"workers": []
}
]

Benchmarks have dependency on Mxnet

if [[ $1 = True ]]
then
        echo "Installing pip packages for GPU"
        sudo apt install -y nvidia-cuda-toolkit
        pip install future psutil mxnet-cu92 pillow --user
else
        echo "Installing pip packages for CPU"
        pip install future psutil mxnet pillow --user

Config changes are not preserved

Previous high-priority item from the last round of feedback:

[p0] For Operational ease, model management to support at a minimum, the ability to add new models dynamically via the API and being able to preserve those changes on restarting the server along with monitoring and tracing capabilities.

AFAICT, this is not happening.

Java concurrency crash when attempting batch processing

I have an endpoint configured thus:

[
  {
    "modelName": "d161good",
    "modelVersion": "1.0",
    "modelUrl": "d161good.mar",
    "runtime": "python",
    "minWorkers": 1,
    "maxWorkers": 1,
    "batchSize": 4,
    "maxBatchDelay": 5000,
...

When I pass in a single image, this behaves as expected. It takes about 5 seconds to return while it waits for a batch, gives up, and processes the request.

It fails consistently when passed multiple requests in rapid succession (much less than maxBatchDelay), e.g.:

curl -X POST http://127.0.0.1:8080/predictions/d161good -T kitten.jpg &
curl -X POST http://127.0.0.1:8080/predictions/d161good -T kitten2.jpg &
curl -X POST http://127.0.0.1:8080/predictions/d161good -T kitten3.jpg &
curl -X POST http://127.0.0.1:8080/predictions/d161good -T nickcage1.jpg &

The error from the logs:

2020-02-21 15:38:09,027 [INFO ] W-9000-d161good_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 343
2020-02-21 15:38:09,027 [INFO ] W-9000-d161good_1.0 ACCESS_LOG - /127.0.0.1:53172 "POST /predictions/d161good HTTP/1.1" 503 354
2020-02-21 15:38:09,027 [INFO ] W-9000-d161good_1.0 TS_METRICS - Requests5XX.Count:1|#Level:Host|#hostname:bradheintz-mbp,timestamp:null
2020-02-21 15:38:09,028 [DEBUG] W-9000-d161good_1.0 org.pytorch.serve.wlm.Job - Waiting time: 1, Inference time: 355
{
  "code": 503,
  "type": "InternalServerException",
  "message": "number of batch response mismatched"
}

[2]    57291 done       curl -X POST http://127.0.0.1:8080/predictions/d161good -T kitten2.jpg
2020-02-21 15:38:09,029 [WARN ] W-9000-d161good_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.util.ConcurrentModificationException
    at java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:719)
    at java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:742)
    at org.pytorch.serve.wlm.BatchAggregator.sendResponse(BatchAggregator.java:81)
    at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:134)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2020-02-21 15:38:09,030 [ERROR] W-9000-d161good_1.0 org.pytorch.serve.wlm.BatchAggregator - Unexpected job: f006ec22-7c7f-4f94-8c3a-2b53692081e8

The java.util.ConcurrentModificationException is a 100% repro for me, with as few as two requests.

On running server, workers die

I finally got the server to start, but when I do, I see a lot of this kind of thrash:

2019-12-19 12:01:32,130 [DEBUG] W-9006-densenet161 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died.
java.lang.InterruptedException
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
	at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
	at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:128)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
2019-12-19 12:01:32,130 [WARN ] W-9006-densenet161 org.pytorch.serve.wlm.BatchAggregator - Load model failed: densenet161, error: Worker died.
2019-12-19 12:01:32,130 [DEBUG] W-9006-densenet161 org.pytorch.serve.wlm.WorkerThread - W-9006-densenet161 State change WORKER_STARTED -> WORKER_STOPPED
2019-12-19 12:01:32,131 [INFO ] W-9006-densenet161 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9006 in 55 seconds.

Java 1.8.0 Python 3.6.9 macOS 10.14.6

text_classification example not working

The text_classification example is not working. Adding the model works. But on scaling the workers, one gets 500 error with 'failed to start workers".

After this error torchserve keeps trying to restart the workers and logs are flooded with errors, till one explicity scales back the workers for the model to 0.

Detailed error logs in console:
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - No module named 'text_classifier'
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Backend worker process die.
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Traceback (most recent call last):
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "/home/ubuntu/anaconda3/envs/serve/lib/python3.8/site-packages/ts/model_service_worker.py", line 163, in
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - worker.run_server()
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "/home/ubuntu/anaconda3/envs/serve/lib/python3.8/site-packages/ts/model_service_worker.py", line 141, in run_server
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - self.handle_connection(cl_socket)
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "/home/ubuntu/anaconda3/envs/serve/lib/python3.8/site-packages/ts/model_service_worker.py", line 105, in handle_connection
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - service, result, code = self.load_model(msg)
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "/home/ubuntu/anaconda3/envs/serve/lib/python3.8/site-packages/ts/model_service_worker.py", line 83, in load_model
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - service = model_loader.load(model_name, model_dir, handler, gpu, batch_size)
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "/home/ubuntu/anaconda3/envs/serve/lib/python3.8/site-packages/ts/model_loader.py", line 107, in load
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - entry_point(None, service.context)
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "/home/ubuntu/anaconda3/envs/serve/lib/python3.8/site-packages/ts/torch_handler/text_classifier.py", line 79, in handle
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - raise e
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "/home/ubuntu/anaconda3/envs/serve/lib/python3.8/site-packages/ts/torch_handler/text_classifier.py", line 68, in handle
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - _service.initialize(context)
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "/home/ubuntu/anaconda3/envs/serve/lib/python3.8/site-packages/ts/torch_handler/text_handler.py", line 20, in initialize
2020-02-16 02:35:53,352 [INFO ] epollEventLoopGroup-4-30 org.pytorch.serve.wlm.WorkerThread - 9007 Worker disconnected. WORKER_STARTED
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - self.source_vocab = torch.load(self.manifest['model']['sourceVocab'])
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "/home/ubuntu/anaconda3/envs/serve/lib/python3.8/site-packages/torch/serialization.py", line 525, in load
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - with _open_file_like(f, 'rb') as opened_file:
2020-02-16 02:35:53,352 [INFO ] W-9007-my_text_classifier_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "/home/ubuntu/anaconda3/envs/serve/lib/python3.8/site-packages/torch/serialization.py", line 212, in _open_file_like
2020-02-16 02:35:53,352 [DEBUG] W-9007-my_text_classifier_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died.
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:128)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Long wait times for first request from TorchScript model

I have two identical models, one in code + weights, the other in TorchScript. Doing inference with TorchScript takes far, far longer, which is surprising.

The setup:

The non-TorchScript model is just the DenseNet-161 model archive from the README.me quick start.

The TorchScript model is the same one, but exported to TorchScript thus:

import torch
import torchvision
d161 = torchvision.models.densenet161(pretrained=True)
tsd161 = torch.jit.script(d161)
tsd161.save('tsd161.pt')

It was then packaged with:

torch-model-archiver --model-name tsd161 --version 1.0 --serialized-file tsd161.pt --handler image_classifier

The server is started with:

torchserve --start --model-store model_store --models densenet161=densenet161.mar tsd161=tsd161.mar

This is the timing output from calling the regular model:

time curl -X POST http://127.0.0.1:8080/predictions/densenet161 -T kitten.jpg
[
  {
    "tiger_cat": 0.46933549642562866
  },
  {
    "tabby": 0.4633878469467163
  },
  {
    "Egyptian_cat": 0.06456148624420166
  },
  {
    "lynx": 0.0012828214094042778
  },
  {
    "plastic_bag": 0.00023323034110944718
  }
]
curl -X POST http://127.0.0.1:8080/predictions/densenet161 -T kitten.jpg  0.01s user 0.01s system 2% cpu 0.428 total

And from the TorchScript:

time curl -X POST http://127.0.0.1:8080/predictions/tsd161 -T kitten.jpg
[
  {
    "282": "0.46933549642562866"
  },
  {
    "281": "0.4633878469467163"
  },
  {
    "285": "0.06456148624420166"
  },
  {
    "287": "0.0012828214094042778"
  },
  {
    "728": "0.00023323034110944718"
  }
]curl -X POST http://127.0.0.1:8080/predictions/tsd161 -T kitten.jpg  0.01s user 0.01s system 0% cpu 1:16.54 total

The identical output between the two (except for the human-readable labels) shows we're dealing with the same model in both instances.

I'm marking this launch blocking, at least until we understand what's happening.

Restricting worker env var access should use whitelist, not blacklist

When restricting access to something with a functionally infinite address space - like the namespace of env vars - blacklisting is a poor practice. Every new env variable that the blacklist doesn't know about becomes a new potential vulnerability. It is impossible for the blacklist to capture all possible invalid inputs that it might want to restrict.

A better practice is to use whitelisting. It is possible to know a priori which env vars the worker needs, and hopefully a runtime failure will alert the user to an attempt to access a non-whitelisted variable.

Unable to install on GPU machine

pip install . command failed with errors when installing on a ubuntu 18.04 GPU server (aws p3.8xlarge). The gradle tests as part of the install are failing with many Backend worker monitoring thread interrupted or backend worker process died errors in the logs.

Error logs are attached.
error.txt

TorchServe fails to start multiple workers threads on multiple GPUs with large model.

On a c5.12xlarge instance, I was able to run 16 instances of the FairSeq English-to-German translation model, all simultaneously running translations. This model's weights take up about 2.5GB on disk (though its resident footprint in memory seems smaller).

Attempting a similar feat on a p3.8xlarge turned out to be impossible. I could get a single instance running, but if I attempted to get even 4 workers running, they crash repeatedly with OOMEs:

2020-02-26 02:46:34,454 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Torch worker started.
2020-02-26 02:46:34,454 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Python runtime: 3.6.6
2020-02-26 02:46:34,454 [DEBUG] W-9001-fairseq_model_1.0 org.pytorch.serve.wlm.WorkerThread - W-9001-fairseq_model_1.0 State change WORKER_STOPPED -> WORKER_STARTED
2020-02-26 02:46:34,454 [INFO ] W-9001-fairseq_model_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /tmp/.ts.sock.9001
2020-02-26 02:46:34,455 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Connection accepted: /tmp/.ts.sock.9001.
2020-02-26 02:46:38,734 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Backend worker process die.
2020-02-26 02:46:38,734 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Traceback (most recent call last):
2020-02-26 02:46:38,734 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/ts/model_service_worker.py", line 163, in <module>
2020-02-26 02:46:38,734 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     worker.run_server()
2020-02-26 02:46:38,734 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/ts/model_service_worker.py", line 141, in run_server
2020-02-26 02:46:38,734 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     self.handle_connection(cl_socket)
2020-02-26 02:46:38,734 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/ts/model_service_worker.py", line 105, in handle_connection
2020-02-26 02:46:38,734 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     service, result, code = self.load_model(msg)
2020-02-26 02:46:38,734 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/ts/model_service_worker.py", line 83, in load_model
2020-02-26 02:46:38,734 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     service = model_loader.load(model_name, model_dir, handler, gpu, batch_size)
2020-02-26 02:46:38,734 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/ts/model_loader.py", line 107, in load
2020-02-26 02:46:38,734 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     entry_point(None, service.context)
2020-02-26 02:46:38,734 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/tmp/models/fa2ad7fda70376da33595966d0cf3c38702ea6d1/fairseq_handler.py", line 120, in handle
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     raise e
2020-02-26 02:46:38,735 [INFO ] epollEventLoopGroup-4-15 org.pytorch.serve.wlm.WorkerThread - 9001 Worker disconnected. WORKER_STARTED
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/tmp/models/fa2ad7fda70376da33595966d0cf3c38702ea6d1/fairseq_handler.py", line 109, in handle
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     _service.initialize(context)
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/tmp/models/fa2ad7fda70376da33595966d0cf3c38702ea6d1/fairseq_handler.py", line 73, in initialize
2020-02-26 02:46:38,735 [DEBUG] W-9001-fairseq_model_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died.
java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
        at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
        at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:128)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     state_dict = torch.load(model_pt_path, map_location=self.device)
2020-02-26 02:46:38,735 [WARN ] W-9001-fairseq_model_1.0 org.pytorch.serve.wlm.BatchAggregator - Load model failed: fairseq_model, error: Worker died.
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 529, in load
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
2020-02-26 02:46:38,735 [DEBUG] W-9001-fairseq_model_1.0 org.pytorch.serve.wlm.WorkerThread - W-9001-fairseq_model_1.0 State change WORKER_STARTED -> WORKER_STOPPED
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 702, in _legacy_load
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     result = unpickler.load()
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 665, in persistent_load
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     deserialized_objects[root_key] = restore_location(obj, location)
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9001 in 21 seconds.
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 740, in restore_location
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     return default_restore_location(storage, str(map_location))
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 156, in default_restore_location
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     result = fn(storage, location)
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 136, in _cuda_deserialize
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     return storage_type(obj.size())
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 480, in _lazy_new
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     return super(_CudaBase, cls).__new__(cls, *args, **kwargs)
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - RuntimeError: CUDA out of memory. Tried to allocate 802.00 MiB (GPU 0; 15.75 GiB total capacity; 1.69 GiB already allocated; 605.12 MiB free; 1.69 GiB reserved in total by PyTorch)

On digging through the logs, it appears that it's attempting to start all workers on the same GPU. The following is the output of grep GPU ts_log.log:

2020-02-26 02:45:13,375 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 15.75 GiB total capacity; 3.03 GiB already allocated; 19.12 MiB free; 3.03 GiB reserved in total by PyTorch)
2020-02-26 02:45:13,375 [INFO ] W-9000-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 15.75 GiB total capacity; 3.16 GiB already allocated; 19.12 MiB free; 3.16 GiB reserved in total by PyTorch)
2020-02-26 02:45:13,383 [INFO ] W-9003-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 15.75 GiB total capacity; 2.60 GiB already allocated; 19.12 MiB free; 2.60 GiB reserved in total by PyTorch)
2020-02-26 02:45:26,382 [INFO ] W-9003-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - RuntimeError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 15.75 GiB total capacity; 3.05 GiB already allocated; 19.12 MiB free; 3.05 GiB reserved in total by PyTorch)
2020-02-26 02:45:26,519 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - RuntimeError: CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 15.75 GiB total capacity; 2.47 GiB already allocated; 19.12 MiB free; 2.47 GiB reserved in total by PyTorch)
2020-02-26 02:45:38,899 [INFO ] W-9003-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 15.75 GiB total capacity; 2.89 GiB already allocated; 7.12 MiB free; 2.89 GiB reserved in total by PyTorch)
2020-02-26 02:45:38,904 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 15.75 GiB total capacity; 2.65 GiB already allocated; 7.12 MiB free; 2.65 GiB reserved in total by PyTorch)
2020-02-26 02:45:52,201 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - RuntimeError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 15.75 GiB total capacity; 3.05 GiB already allocated; 19.12 MiB free; 3.05 GiB reserved in total by PyTorch)
2020-02-26 02:45:59,593 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - RuntimeError: CUDA out of memory. Tried to allocate 802.00 MiB (GPU 0; 15.75 GiB total capacity; 1.69 GiB already allocated; 605.12 MiB free; 1.69 GiB reserved in total by PyTorch)
2020-02-26 02:46:08,962 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - RuntimeError: CUDA out of memory. Tried to allocate 802.00 MiB (GPU 0; 15.75 GiB total capacity; 1.69 GiB already allocated; 605.12 MiB free; 1.69 GiB reserved in total by PyTorch)
2020-02-26 02:46:21,358 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - RuntimeError: CUDA out of memory. Tried to allocate 802.00 MiB (GPU 0; 15.75 GiB total capacity; 1.69 GiB already allocated; 605.12 MiB free; 1.69 GiB reserved in total by PyTorch)
2020-02-26 02:46:38,735 [INFO ] W-9001-fairseq_model_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - RuntimeError: CUDA out of memory. Tried to allocate 802.00 MiB (GPU 0; 15.75 GiB total capacity; 1.69 GiB already allocated; 605.12 MiB free; 1.69 GiB reserved in total by PyTorch)

Note that multiple workers (W-9000, W-9001, W-9003) are shown, but only one GPU turns up (GPU 0). The p3.8xlarge has 4 GPUs.

I attempted to use arguments of the Management API, such as number_gpu=4, to fix this, but nothing worked. Same result every time.

Conda preferable to pip

Anaconda is the preferred method for installing PyTorch & Torchvision - it would be good if the installation instructions mirrored that.

TorchServe ignores batch config properties

In my config.properties file, I have the lines:

batch_size=4
max_batch_delay=200

I started TorchServe with the command line:

torchserve --start --ts-config config.properties --models d161good=d161good.mar  --model-store model_store

When I query the status of the endpoint with curl http://127.0.0.1:8081/models/d161good, I get:

[
  {
    "modelName": "d161good",
    "modelVersion": "1.0",
    "modelUrl": "d161good.mar",
    "runtime": "python",
    "minWorkers": 12,
    "maxWorkers": 12,
    "batchSize": 1,
    "maxBatchDelay": 100,
    "loadedAtStartup": true,
...

Note the "batchSize" and "maxBatchDelay" entries.

Add handler for audio models with an example

Add new handler for audio models along with an example for pre / post processing of audio data.

TorchServe failing to batch multi-image requests

My endpoint is configured thus:

[
  {
    "modelName": "d161good",
    "modelVersion": "1.0",
    "modelUrl": "d161good.mar",
    "runtime": "python",
    "minWorkers": 1,
    "maxWorkers": 1,
    "batchSize": 4,
    "maxBatchDelay": 5000,

When I make a multi-file request, it processes both inputs correctly, but does them as separate batches. For example, when I use the following command line:

curl -X POST http://127.0.0.1:8080/predictions/d161good -T "{kitten.jpg,kitten2.jpg}"

I get the following log:

2020-02-21 16:09:24,419 [INFO ] W-9000-d161good_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 271
2020-02-21 16:09:24,419 [INFO ] W-9000-d161good_1.0-stdout MODEL_METRICS - PredictionTime.Milliseconds:270.15|#ModelName:d161good,Level:Model|#hostname:bradheintz-mbp,requestID:3cb6117f-bcf7-4195-b515-965f9ab45e73,timestamp:1582330164
2020-02-21 16:09:24,419 [INFO ] W-9000-d161good_1.0 ACCESS_LOG - /127.0.0.1:54914 "POST /predictions/d161good HTTP/1.1" 200 5275
2020-02-21 16:09:24,419 [INFO ] W-9000-d161good_1.0 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:bradheintz-mbp,timestamp:null
2020-02-21 16:09:24,419 [DEBUG] W-9000-d161good_1.0 org.pytorch.serve.wlm.Job - Waiting time: 5002, Backend time: 273
2020-02-21 16:09:29,704 [INFO ] W-9000-d161good_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 276
2020-02-21 16:09:29,704 [INFO ] W-9000-d161good_1.0-stdout MODEL_METRICS - PredictionTime.Milliseconds:275.3|#ModelName:d161good,Level:Model|#hostname:bradheintz-mbp,requestID:af4897b6-7b65-4b13-b650-52add0a75147,timestamp:1582330169
2020-02-21 16:09:29,704 [INFO ] W-9000-d161good_1.0 ACCESS_LOG - /127.0.0.1:54914 "POST /predictions/d161good HTTP/1.1" 200 5283
2020-02-21 16:09:29,705 [INFO ] W-9000-d161good_1.0 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:bradheintz-mbp,timestamp:null
2020-02-21 16:09:29,705 [DEBUG] W-9000-d161good_1.0 org.pytorch.serve.wlm.Job - Waiting time: 5004, Backend time: 278

Note that it is handling them as separate requests, and not batching them - it's waiting for the maxBatchDelay to run down before each file passed in the single request.

I've verified this up to 5 files.

Undocumented options for config.properties

The model_store and load_model are mentioned briefly in the config docs, but their syntax is not explicated. I had to dig in test config files for the information.

Question: Should new model version always become default?

If I put up an image classifier model with a version of 1.0, then register another with the same name and a version of 1.1, version 1.1 automatically becomes the default. Is this intentional?

Part of the point of versioning is to be able to test new versions without taking down the old version (or appearing to by taking over the default URL). Based on this, it would seem that the current behavior is broken.

One possible fix: Include a management API config flag that determines whether the new version should become default. I'm not even all that concerned with what that flag would default to, as long as I have a way to register v1.1 without taking "default" status away from v1.0.

Add example for Custom Service

Add a new example for usage of Custom Service. Given that most user's of TorchServe will have their own custom models, this will help in jump starting things for deploying their models into production.

Undocumented and legacy endpoints

Calling curl -X OPTIONS http://localhost:8080 (not 8443 as the docs originally had it) gives an API description for inference that has a number of undocumented endpoints, including /predict (which it says is "A legacy predict entry point for each model"), /invoke (not clear what this is supposed to do if it's not prediction).

I'm not opposed to having /predict be an alias for /predictions, but in that case, we should probably have it described as an alias and not legacy cruft. Or we could just cut it. Similarly, /invoke and related endpoints should probably be documented or removed from the API description.

TorchServe startup fails to honor valid combinations of config.properties and command line

I have models in my model_store folder, and the line model_store=model_store in my config.properties file.

The following two command lines work, with the first starting no endpoints and the second starting workers for the model:

torchserve --start --ts-config config_with_batch.properties
torchserve --start --ts-config config_with_batch.properties --models d161good=d161good.mar  --model-store model_store

In the first one, it takes the model store location from config; in the second, it ignores that in favor of the command line.

This command line fails (error message in the second line):

> torchserve --start --ts-config config_with_batch.properties --models d161good=d161good.mar
--model-store is required to load model locally.

Again the model store is specified, just not on the command line. TorchServe should check the config for missing parameters before rejecting a command line.

torch-model-archiver Installation

Despite what the instructions say, I don't see any installation instructions for torch-model-archiver - and it's needed for the quick start.

It was simple enough to cd to the right folder and run setup.py, but this should still be called out correctly in the quick start.

Can't log custom metrics

I attempted to log custom metrics according to the docs. I did the following:

At the top of my custom model handler:

import ts
from ts.metrics import dimension

In the inference method:

t = time.process_time()
dim1 = dimension('byzantine pernicious', t)
self.metrics.add_metric('byzantine pernicious', t + 1, dimensions=[dim1]) # test custom metrics

This fails with the error:

2020-02-21 00:24:26,537 [INFO ] W-9023-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Invoking custom service failed.
2020-02-21 00:24:26,537 [INFO ] W-9023-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Traceback (most recent call last):
2020-02-21 00:24:26,537 [INFO ] W-9023-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ts/service.py", line 100, in predict
2020-02-21 00:24:26,537 [INFO ] W-9023-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     ret = self._entry_point(input_batch, self.context)
2020-02-21 00:24:26,537 [INFO ] W-9023-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/tmp/models/d37a667068d56c2de0b1c0a38aec56aa152ca6ba/text_munger.py", line 334, in handle
2020-02-21 00:24:26,537 [INFO ] W-9023-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     raise e
2020-02-21 00:24:26,537 [INFO ] W-9023-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/tmp/models/d37a667068d56c2de0b1c0a38aec56aa152ca6ba/text_munger.py", line 329, in handle
2020-02-21 00:24:26,537 [INFO ] W-9023-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     data = _service.inference(data)
2020-02-21 00:24:26,537 [INFO ] W-9023-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/tmp/models/d37a667068d56c2de0b1c0a38aec56aa152ca6ba/text_munger.py", line 300, in inference
2020-02-21 00:24:26,537 [INFO ] W-9023-my_tc_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 67
2020-02-21 00:24:26,537 [INFO ] W-9023-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     dim1 = dimension('byzantine pernicious', t)
2020-02-21 00:24:26,537 [INFO ] W-9023-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - TypeError: 'module' object is not callable

It's not really clear to me what's wrong there. I tried using it without the allegedly optional dimensions argument, and got:

2020-02-21 00:34:22,018 [INFO ] W-9017-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Invoking custom service failed.
2020-02-21 00:34:22,018 [INFO ] W-9017-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Traceback (most recent call last):
2020-02-21 00:34:22,018 [INFO ] W-9017-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ts/service.py", line 100, in predict
2020-02-21 00:34:22,018 [INFO ] W-9017-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     ret = self._entry_point(input_batch, self.context)
2020-02-21 00:34:22,018 [INFO ] W-9017-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/tmp/models/829d527168600113f021d498f082c0e95ba63c6c/text_munger.py", line 333, in handle
2020-02-21 00:34:22,018 [INFO ] W-9017-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     raise e
2020-02-21 00:34:22,018 [INFO ] W-9017-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/tmp/models/829d527168600113f021d498f082c0e95ba63c6c/text_munger.py", line 328, in handle
2020-02-21 00:34:22,018 [INFO ] W-9017-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     data = _service.inference(data)
2020-02-21 00:34:22,018 [INFO ] W-9017-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/tmp/models/829d527168600113f021d498f082c0e95ba63c6c/text_munger.py", line 300, in inference
2020-02-21 00:34:22,018 [INFO ] W-9017-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     self.metrics.add_metric('byzantine pernicious', t) # test custom metrics
2020-02-21 00:34:22,018 [INFO ] W-9017-my_tc_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 67
2020-02-21 00:34:22,018 [INFO ] W-9017-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ts/metrics/metrics_store.py", line 201, in add_metric
2020-02-21 00:34:22,018 [INFO ] W-9017-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     self._add_or_update(name, value, req_id, unit, dimensions)
2020-02-21 00:34:22,018 [INFO ] W-9017-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ts/metrics/metrics_store.py", line 58, in _add_or_update
2020-02-21 00:34:22,018 [INFO ] W-9017-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     dim_str = '-'.join(dim_str)
2020-02-21 00:34:22,018 [INFO ] W-9017-my_tc_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - TypeError: sequence item 1: expected str instance, NoneType found

Examining the code made it clear that the dimensions arg was not actually optional.

Also, the documentation is ambiguous as te whether the correct object name is dimension or Dimension. It looks to bet he former.

IllegalMonitorState Exceptions on running benchmark tests

Seeing many illegalmonitorstate exceptions on running the benchmarks tests. Tested on p2.8xlarge
with the default configurations using the config.properties in the benchmark folder and adding model store path to it.

Output

(pytorch_p36) ubuntu@ip-172-31-41-247:~/serve/benchmarks$ python benchmark.py throughput --ts http://127.0.0.1:8080
Running benchmark throughput with model resnet-18
Processing jmeter output
Output available at /tmp/TSBenchmark/out/throughput/resnet-18
Report generated at /tmp/TSBenchmark/out/throughput/resnet-18/report/index.html
[{'throughput_resnet-18_Inference_Request_Average': 670,
'throughput_resnet-18_Inference_Request_Median': 526,
'throughput_resnet-18_Inference_Request_Throughput': 137.0,
'throughput_resnet-18_Inference_Request_aggregate_report_90_line': 681,
'throughput_resnet-18_Inference_Request_aggregate_report_99_line': 8585,
'throughput_resnet-18_Inference_Request_aggregate_report_error': '0.00%'}]

Jmeter reports: Don't show any errors
https://sagemaker-pt.s3-us-west-2.amazonaws.com/bench.zip

Even though hits/sec reaches 200/sec, in the metrics logged on the console there is never more than one active request at any time. So unclear if the metrics are getting logged properly. The Requests2xx.Count is always 1 as below:

2020-03-07 01:58:50,800 [INFO ] W-9015-resnet-18_1.0 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:ip-172-31-41-247,timestamp:null

Errors in TrochServe console logs
2020-03-07 01:58:51,109 [INFO ] W-9008-resnet-18_1.0 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:ip-172-31-41-247,timestamp:null
2020-03-07 01:58:51,109 [DEBUG] W-9008-resnet-18_1.0 org.pytorch.serve.wlm.Job - Waiting time: 0, Backend time: 38
2020-03-07 01:58:51,115 [INFO ] W-9023-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 40
2020-03-07 01:58:51,115 [INFO ] W-9023-resnet-18_1.0 ACCESS_LOG - /127.0.0.1:50756 "POST /predictions/resnet-18 HTTP/1.1" 200 50
2020-03-07 01:58:51,115 [INFO ] W-9023-resnet-18_1.0 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:ip-172-31-41-247,timestamp:null
2020-03-07 01:58:51,115 [DEBUG] W-9023-resnet-18_1.0 org.pytorch.serve.wlm.Job - Waiting time: 0, Backend time: 49
2020-03-07 01:58:51,115 [INFO ] W-9023-resnet-18_1.0-stdout MODEL_METRICS - PredictionTime.Milliseconds:36.98|#ModelName:resnet-18,Level:Model|#hostname:ip-172-31-41-247,requestID:63c3d128-d2ab-4e60-8ab7-fa9905b6de93,timestamp:1583546331
2020-03-07 01:58:51,130 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.ModelVersionedRefs - Removed model: resnet-18 version: 1.0
2020-03-07 01:58:51,131 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9031-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:51,133 [WARN ] W-9031-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:51,145 [INFO ] epollEventLoopGroup-4-32 org.pytorch.serve.wlm.WorkerThread - 9031 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:51,146 [DEBUG] W-9031-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9031-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:51,148 [DEBUG] W-9031-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:51,315 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9030-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:51,316 [WARN ] W-9030-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:51,317 [INFO ] epollEventLoopGroup-4-29 org.pytorch.serve.wlm.WorkerThread - 9030 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:51,317 [DEBUG] W-9030-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9030-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:51,318 [DEBUG] W-9030-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:51,584 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9029-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:51,585 [WARN ] W-9029-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:51,585 [INFO ] epollEventLoopGroup-4-31 org.pytorch.serve.wlm.WorkerThread - 9029 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:51,586 [DEBUG] W-9029-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9029-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:51,587 [DEBUG] W-9029-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:51,855 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9028-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:51,855 [WARN ] W-9028-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:51,856 [INFO ] epollEventLoopGroup-4-28 org.pytorch.serve.wlm.WorkerThread - 9028 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:51,857 [DEBUG] W-9028-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9028-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:51,858 [DEBUG] W-9028-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:52,124 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9027-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:52,125 [WARN ] W-9027-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:52,126 [INFO ] epollEventLoopGroup-4-30 org.pytorch.serve.wlm.WorkerThread - 9027 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:52,127 [DEBUG] W-9027-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9027-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:52,129 [DEBUG] W-9027-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:52,394 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9026-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:52,394 [WARN ] W-9026-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:52,395 [INFO ] epollEventLoopGroup-4-26 org.pytorch.serve.wlm.WorkerThread - 9026 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:52,395 [DEBUG] W-9026-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9026-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:52,396 [DEBUG] W-9026-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:52,664 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9025-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:52,664 [WARN ] W-9025-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:52,665 [INFO ] epollEventLoopGroup-4-27 org.pytorch.serve.wlm.WorkerThread - 9025 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:52,665 [DEBUG] W-9025-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9025-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:52,666 [DEBUG] W-9025-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:52,932 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9024-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:52,932 [WARN ] W-9024-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:52,933 [INFO ] epollEventLoopGroup-4-24 org.pytorch.serve.wlm.WorkerThread - 9024 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:52,933 [DEBUG] W-9024-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9024-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:52,934 [DEBUG] W-9024-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:53,202 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9023-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:53,202 [WARN ] W-9023-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:53,203 [INFO ] epollEventLoopGroup-4-25 org.pytorch.serve.wlm.WorkerThread - 9023 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:53,203 [DEBUG] W-9023-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9023-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:53,204 [DEBUG] W-9023-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:53,359 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9022-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:53,359 [WARN ] W-9022-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:53,360 [INFO ] epollEventLoopGroup-4-17 org.pytorch.serve.wlm.WorkerThread - 9022 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:53,360 [DEBUG] W-9022-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9022-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:53,361 [DEBUG] W-9022-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:53,626 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9021-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:53,626 [WARN ] W-9021-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:53,627 [INFO ] epollEventLoopGroup-4-19 org.pytorch.serve.wlm.WorkerThread - 9021 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:53,627 [DEBUG] W-9021-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9021-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:53,628 [DEBUG] W-9021-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:53,892 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9020-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:53,893 [WARN ] W-9020-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:53,893 [INFO ] epollEventLoopGroup-4-21 org.pytorch.serve.wlm.WorkerThread - 9020 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:53,893 [DEBUG] W-9020-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9020-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:53,895 [DEBUG] W-9020-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:54,158 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9019-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:54,158 [WARN ] W-9019-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:54,159 [INFO ] epollEventLoopGroup-4-18 org.pytorch.serve.wlm.WorkerThread - 9019 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:54,159 [DEBUG] W-9019-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9019-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:54,159 [DEBUG] W-9019-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:54,427 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9018-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:54,428 [INFO ] epollEventLoopGroup-4-4 org.pytorch.serve.wlm.WorkerThread - 9018 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:54,429 [DEBUG] W-9018-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Shutting down the thread .. Scaling down.
2020-03-07 01:58:54,429 [DEBUG] W-9018-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9018-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:54,429 [DEBUG] W-9018-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:54,696 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9017-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:54,697 [WARN ] W-9017-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:54,698 [INFO ] epollEventLoopGroup-4-9 org.pytorch.serve.wlm.WorkerThread - 9017 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:54,698 [DEBUG] W-9017-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9017-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:54,698 [DEBUG] W-9017-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:54,964 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9016-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:54,965 [WARN ] W-9016-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:54,965 [INFO ] epollEventLoopGroup-4-13 org.pytorch.serve.wlm.WorkerThread - 9016 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:54,965 [DEBUG] W-9016-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9016-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:54,965 [DEBUG] W-9016-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:55,235 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9015-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:55,235 [WARN ] W-9015-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:55,235 [INFO ] epollEventLoopGroup-4-22 org.pytorch.serve.wlm.WorkerThread - 9015 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:55,235 [DEBUG] W-9015-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9015-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:55,236 [DEBUG] W-9015-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:55,389 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9014-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:55,389 [WARN ] W-9014-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:55,390 [INFO ] epollEventLoopGroup-4-5 org.pytorch.serve.wlm.WorkerThread - 9014 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:55,390 [DEBUG] W-9014-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9014-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:55,390 [DEBUG] W-9014-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:55,654 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9013-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:55,654 [WARN ] W-9013-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:55,654 [INFO ] epollEventLoopGroup-4-8 org.pytorch.serve.wlm.WorkerThread - 9013 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:55,655 [DEBUG] W-9013-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9013-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:55,655 [DEBUG] W-9013-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:55,919 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9012-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:55,919 [WARN ] W-9012-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:55,920 [INFO ] epollEventLoopGroup-4-11 org.pytorch.serve.wlm.WorkerThread - 9012 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:55,920 [DEBUG] W-9012-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9012-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:55,920 [DEBUG] W-9012-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:56,185 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9011-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:56,185 [WARN ] W-9011-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:56,185 [INFO ] epollEventLoopGroup-4-14 org.pytorch.serve.wlm.WorkerThread - 9011 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:56,186 [DEBUG] W-9011-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9011-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:56,187 [DEBUG] W-9011-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:56,449 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9010-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:56,449 [WARN ] W-9010-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:56,449 [INFO ] epollEventLoopGroup-4-20 org.pytorch.serve.wlm.WorkerThread - 9010 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:56,450 [DEBUG] W-9010-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9010-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:56,450 [DEBUG] W-9010-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:56,712 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9009-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:56,712 [INFO ] epollEventLoopGroup-4-15 org.pytorch.serve.wlm.WorkerThread - 9009 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:56,712 [WARN ] W-9009-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:56,713 [DEBUG] W-9009-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9009-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:56,713 [DEBUG] W-9009-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:56,976 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9008-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:56,976 [INFO ] epollEventLoopGroup-4-1 org.pytorch.serve.wlm.WorkerThread - 9008 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:56,976 [WARN ] W-9008-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:56,977 [DEBUG] W-9008-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9008-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:56,977 [DEBUG] W-9008-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:57,240 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9007-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:57,240 [INFO ] epollEventLoopGroup-4-23 org.pytorch.serve.wlm.WorkerThread - 9007 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:57,240 [WARN ] W-9007-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:57,240 [DEBUG] W-9007-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9007-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:57,240 [DEBUG] W-9007-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:57,391 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9006-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:57,391 [INFO ] epollEventLoopGroup-4-2 org.pytorch.serve.wlm.WorkerThread - 9006 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:57,391 [WARN ] W-9006-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:57,391 [DEBUG] W-9006-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9006-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:57,392 [DEBUG] W-9006-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:57,655 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9005-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:57,656 [INFO ] epollEventLoopGroup-4-6 org.pytorch.serve.wlm.WorkerThread - 9005 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:57,656 [WARN ] W-9005-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:57,656 [DEBUG] W-9005-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9005-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:57,656 [DEBUG] W-9005-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:57,919 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9004-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:57,919 [INFO ] epollEventLoopGroup-4-16 org.pytorch.serve.wlm.WorkerThread - 9004 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:57,919 [DEBUG] W-9004-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Shutting down the thread .. Scaling down.
2020-03-07 01:58:57,920 [DEBUG] W-9004-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9004-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:57,920 [DEBUG] W-9004-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:58,182 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9003-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:58,182 [WARN ] W-9003-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:58,182 [INFO ] epollEventLoopGroup-4-7 org.pytorch.serve.wlm.WorkerThread - 9003 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:58,183 [DEBUG] W-9003-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9003-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:58,183 [DEBUG] W-9003-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:58,443 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9002-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:58,443 [WARN ] W-9002-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:58,443 [INFO ] epollEventLoopGroup-4-3 org.pytorch.serve.wlm.WorkerThread - 9002 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:58,444 [DEBUG] W-9002-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9002-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:58,444 [DEBUG] W-9002-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:58,706 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9001-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:58,706 [DEBUG] W-9001-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Shutting down the thread .. Scaling down.
2020-03-07 01:58:58,706 [INFO ] epollEventLoopGroup-4-10 org.pytorch.serve.wlm.WorkerThread - 9001 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:58,707 [DEBUG] W-9001-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9001-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:58,707 [DEBUG] W-9001-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:58,968 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9000-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:58,968 [DEBUG] W-9000-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Shutting down the thread .. Scaling down.
2020-03-07 01:58:58,968 [INFO ] epollEventLoopGroup-4-12 org.pytorch.serve.wlm.WorkerThread - 9000 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:58,968 [DEBUG] W-9000-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:58,969 [DEBUG] W-9000-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2020-03-07 01:58:59,344 [INFO ] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.ModelManager - Model resnet-18 unregistered.

Corresponding GPU / CPU utilizations via Cloudwatch (using the gpumon.py script from AWS)

Add integration tests into the CI build process for catching regression issues

Add basic integration tests into the CI build process for catching regression issues. Please ensure:

Basic install on a fresh machine or docker sandbox takes place as part of the sanity testing.
All API endpoints are verified for regression. It can be as simple as running a Postman script with all the endpoints and testing that models get deployed and inference is working for the examples bundled.

Java8 install steps using brew for MacOs does not work anymore

Java8 installation steps using brew for MacOs not working anymore. Please provide updated steps, It will be better to use OpenJDK instead for future proofing.

Following worked for me on Mac OS Catalina (10.15.1)

brew tap AdoptOpenJDK/openjdk
brew cask install adoptopenjdk8

Verify using:
java --version

If you get security error for installing from unverified source, add exception to the security settings from Settings-->Security & Privacy--> General -- "Allow apps download from"

NOTE: Torchserve install failed when using latest openjdk13 (gradle error), things worked for me only for jdk8. so please verify the supported version as well.

ResNet-152 example code does suboptimal things with tensors and loops

In examples/image_classifier/resnet_152_batch/resnet152_handler.py, in the preprocess() and postprocess() methods, tensors are looped over and subtensors/elements are handled individually. This processing will probably be faster and take less code using tensor broadcasting semantics and a small amount of refactoring. I'll handle it when I have a little time.

Problem loading image_classifier handler?

When I do an image classification task, I get the following error:

2020-02-18 23:29:48,871 [INFO ] W-9001-densenet161_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - No module named 'image_classifier'

This is consistent on Mac (latest MacOS) and Linux (Ubuntu 16.04 DL AMI v26). Latest master from the TorchServe repo. The thing still works and I get inference results, but that error message was a surprise.

If things are actually working correctly (which they mostly appear to be, we shouldn't be getting an error message. Conversely, if the error message is describing a legitimate issue, it should be fixed, and we should understand why things are working anyway.

Incorrect docs for --model-store option

Docs for the torchserve command line at https://github.com/pytorch/serve/blob/master/docs/server.md make the claim:

model-store: optional, A location where models are stored by default, all models in this location are loaded, the model name is same as archive or folder name.

This is in reference to the --model-store argument. This is incorrect; if it were true, I should be able to call torchserve --start --model-store model_store (where I have multiple models in the folder model_store) and get endpoints - but I get only 404s when trying to call them.

docker build failing on line 27

Docker build failing on line 27.

Error details:

Step 9/24 : ADD serve serve
ADD failed: stat /var/lib/docker/tmp/docker-builder277367370/serve: no such file or directory

Run AWS DeepLearning Benchmarks agains TS containers

AWS runs Deep Learning benchmarks that measure serving latency and throughputs against different models. Run these benchmarks against TorchServe Containers.

Here's the AWS Deep Learning Benchmark:
https://github.com/awslabs/deeplearning-benchmark/tree/master/mms

Some important files to get quick started are:
https://github.com/awslabs/deeplearning-benchmark/blob/master/benchmark_driver.py
https://github.com/awslabs/deeplearning-benchmark/blob/master/task_config_template.cfg#L360
https://github.com/awslabs/deeplearning-benchmark/blob/master/task_config_template.cfg#L384
https://github.com/awslabs/deeplearning-benchmark/blob/master/benchmark_runner.py

Document JDK version in install steps

Install stopped working after the latest changes. Failing with "FindBugs rule violations" for modelarchive.

Error details:

(myenv) ubuntu@ip-172-31-18-32:~/pt/serve$ pip install .
Processing /home/ubuntu/pt/serve
Requirement already satisfied: Pillow in /home/ubuntu/anaconda3/envs/myenv/lib/python3.8/site-packages (from torchserve==0.0.1b20200221) (7.0.0)
Requirement already satisfied: psutil in /home/ubuntu/anaconda3/envs/myenv/lib/python3.8/site-packages (from torchserve==0.0.1b20200221) (5.6.7)
Processing /home/ubuntu/.cache/pip/wheels/8e/70/28/3d6ccd6e315f65f245da085482a2e1c7d14b90b30f239e2cf4/future-0.18.2-py3-none-any.whl
Requirement already satisfied: torch in /home/ubuntu/anaconda3/envs/myenv/lib/python3.8/site-packages (from torchserve==0.0.1b20200221) (1.4.0)
Requirement already satisfied: torchvision in /home/ubuntu/anaconda3/envs/myenv/lib/python3.8/site-packages (from torchserve==0.0.1b20200221) (0.5.0)
Requirement already satisfied: torchtext in /home/ubuntu/anaconda3/envs/myenv/lib/python3.8/site-packages (from torchserve==0.0.1b20200221) (0.5.0)
Requirement already satisfied: numpy in /home/ubuntu/anaconda3/envs/myenv/lib/python3.8/site-packages (from torchvision->torchserve==0.0.1b20200221) (1.18.1)
Requirement already satisfied: six in /home/ubuntu/anaconda3/envs/myenv/lib/python3.8/site-packages (from torchvision->torchserve==0.0.1b20200221) (1.14.0)
Requirement already satisfied: tqdm in /home/ubuntu/anaconda3/envs/myenv/lib/python3.8/site-packages (from torchtext->torchserve==0.0.1b20200221) (4.42.1)
Requirement already satisfied: requests in /home/ubuntu/anaconda3/envs/myenv/lib/python3.8/site-packages (from torchtext->torchserve==0.0.1b20200221) (2.22.0)
Collecting sentencepiece
Using cached sentencepiece-0.1.85-cp38-cp38-manylinux1_x86_64.whl (1.0 MB)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /home/ubuntu/anaconda3/envs/myenv/lib/python3.8/site-packages (from requests->torchtext->torchserve==0.0.1b20200221) (1.25.8)
Requirement already satisfied: idna<2.9,>=2.5 in /home/ubuntu/anaconda3/envs/myenv/lib/python3.8/site-packages (from requests->torchtext->torchserve==0.0.1b20200221) (2.8)
Requirement already satisfied: certifi>=2017.4.17 in /home/ubuntu/anaconda3/envs/myenv/lib/python3.8/site-packages (from requests->torchtext->torchserve==0.0.1b20200221) (2019.11.28)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /home/ubuntu/anaconda3/envs/myenv/lib/python3.8/site-packages (from requests->torchtext->torchserve==0.0.1b20200221) (3.0.4)
Building wheels for collected packages: torchserve
Building wheel for torchserve (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: /home/ubuntu/anaconda3/envs/myenv/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-srpt9i6p/setup.py'"'"'; file='"'"'/tmp/pip-req-build-srpt9i6p/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-kibaudnl
cwd: /tmp/pip-req-build-srpt9i6p/
Complete output (135 lines):
running bdist_wheel
running build
running build_py
running build_frontend

Task :cts:clean
Task :modelarchive:clean

Task :server:killServer
No server running!

Task :server:clean
Task :cts:compileJava NO-SOURCE
Task :cts:processResources NO-SOURCE
Task :cts:classes UP-TO-DATE
Task :cts:jar
Task :cts:assemble
Task :cts:checkstyleMain NO-SOURCE
Task :cts:compileTestJava NO-SOURCE
Task :cts:processTestResources NO-SOURCE
Task :cts:testClasses UP-TO-DATE
Task :cts:checkstyleTest NO-SOURCE
Task :cts:findbugsMain NO-SOURCE
Task :cts:findbugsTest NO-SOURCE
Task :cts:test NO-SOURCE
Task :cts:jacocoTestCoverageVerification SKIPPED
Task :cts:jacocoTestReport SKIPPED
Task :cts:pmdMain NO-SOURCE
Task :cts:pmdTest SKIPPED
Task :cts:verifyJava
Task :cts:check
Task :cts:build
Task :modelarchive:compileJava
Task :modelarchive:processResources NO-SOURCE
Task :modelarchive:classes
Task :modelarchive:jar
Task :modelarchive:assemble
Task :modelarchive:checkstyleMain
Task :modelarchive:compileTestJava
Task :modelarchive:processTestResources
Task :modelarchive:testClasses
Task :modelarchive:checkstyleTest

Task :modelarchive:findbugsMain
The following classes needed for analysis were missing:
java.lang.Object
java.lang.Exception
java.lang.Enum
java.lang.String
java.lang.Throwable
java.nio.charset.StandardCharsets
java.io.File
java.lang.Class
java.io.InputStream
java.util.regex.Pattern
java.util.regex.Matcher
java.lang.StringBuilder
java.io.FileInputStream
java.net.URL
java.net.HttpURLConnection
java.lang.System
java.io.InputStreamReader
java.io.Reader
java.security.MessageDigest
java.lang.AssertionError
java.security.DigestInputStream
java.util.Map
java.util.LinkedHashMap
java.lang.IllegalArgumentException
java.util.zip.ZipOutputStream
java.io.FileOutputStream
java.util.zip.ZipInputStream
java.util.zip.ZipEntry
java.io.OutputStream
java.lang.NoSuchFieldError
java.lang.Error
java.lang.IllegalAccessError
java.io.Serializable
java.lang.NegativeArraySizeException
java.lang.IncompatibleClassChangeError
java.lang.AbstractMethodError
java.lang.UnsatisfiedLinkError
java.net.MalformedURLException
java.security.NoSuchAlgorithmException
java.net.SocketTimeoutException

Task :modelarchive:findbugsMain FAILED

FAILURE: Build failed with an exception.

What went wrong:
Execution failed for task ':modelarchive:findbugsMain'.

FindBugs rule violations were found. See the report at: file:///tmp/pip-req-build-srpt9i6p/frontend/modelarchive/build/reports/findbugs/main.html

Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.
Get more help at https://help.gradle.org

BUILD FAILED in 3s
13 actionable tasks: 13 executed
Traceback (most recent call last):
File "", line 1, in
File "/tmp/pip-req-build-srpt9i6p/setup.py", line 137, in
setup(
File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/site-packages/setuptools/init.py", line 144, in setup
return distutils.core.setup(**attrs)
File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/core.py", line 148, in setup
dist.run_commands()
File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 223, in run
self.run_command('build')
File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/tmp/pip-req-build-srpt9i6p/setup.py", line 98, in run
self.run_command('build_frontend')
File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/tmp/pip-req-build-srpt9i6p/setup.py", line 85, in run
subprocess.check_call('frontend/gradlew -p frontend clean build', shell=True)
File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/subprocess.py", line 364, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'frontend/gradlew -p frontend clean build' returned non-zero exit status 1.

ERROR: Failed building wheel for torchserve
Running setup.py clean for torchserve
Failed to build torchserve
Installing collected packages: future, torchserve, sentencepiece
Running setup.py install for torchserve ... error
ERROR: Command errored out with exit status 1:
command: /home/ubuntu/anaconda3/envs/myenv/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-srpt9i6p/setup.py'"'"'; file='"'"'/tmp/pip-req-build-srpt9i6p/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-mk_hdekw/install-record.txt --single-version-externally-managed --compile --install-headers /home/ubuntu/anaconda3/envs/myenv/include/python3.8/torchserve
cwd: /tmp/pip-req-build-srpt9i6p/
Complete output (135 lines):
running install
running build
running build_py
running build_frontend
> Task :cts:clean
> Task :modelarchive:clean

> Task :server:killServer
No server running!

> Task :server:clean UP-TO-DATE
> Task :cts:compileJava NO-SOURCE
> Task :cts:processResources NO-SOURCE
> Task :cts:classes UP-TO-DATE
> Task :cts:jar
> Task :cts:assemble
> Task :cts:checkstyleMain NO-SOURCE
> Task :cts:compileTestJava NO-SOURCE
> Task :cts:processTestResources NO-SOURCE
> Task :cts:testClasses UP-TO-DATE
> Task :cts:checkstyleTest NO-SOURCE
> Task :cts:findbugsMain NO-SOURCE
> Task :cts:findbugsTest NO-SOURCE
> Task :cts:test NO-SOURCE
> Task :cts:jacocoTestCoverageVerification SKIPPED
> Task :cts:jacocoTestReport SKIPPED
> Task :cts:pmdMain NO-SOURCE
> Task :cts:pmdTest SKIPPED
> Task :cts:verifyJava
> Task :cts:check
> Task :cts:build
> Task :modelarchive:compileJava
> Task :modelarchive:processResources NO-SOURCE
> Task :modelarchive:classes
> Task :modelarchive:jar
> Task :modelarchive:assemble
> Task :modelarchive:checkstyleMain
> Task :modelarchive:compileTestJava
> Task :modelarchive:processTestResources
> Task :modelarchive:testClasses
> Task :modelarchive:checkstyleTest

> Task :modelarchive:findbugsMain FAILED
The following classes needed for analysis were missing:
  java.lang.Object
  java.lang.Exception
  java.lang.Enum
  java.lang.String
  java.lang.Throwable
  java.nio.charset.StandardCharsets
  java.io.File
  java.lang.Class
  java.io.InputStream
  java.util.regex.Pattern
  java.util.regex.Matcher
  java.lang.StringBuilder
  java.io.FileInputStream
  java.net.URL
  java.net.HttpURLConnection
  java.lang.System
  java.io.InputStreamReader
  java.io.Reader
  java.security.MessageDigest
  java.lang.AssertionError
  java.security.DigestInputStream
  java.util.Map
  java.util.LinkedHashMap
  java.lang.IllegalArgumentException
  java.util.zip.ZipOutputStream
  java.io.FileOutputStream
  java.util.zip.ZipInputStream
  java.util.zip.ZipEntry
  java.io.OutputStream
  java.lang.NoSuchFieldError
  java.lang.Error
  java.lang.IllegalAccessError
  java.io.Serializable
  java.lang.NegativeArraySizeException
  java.lang.IncompatibleClassChangeError
  java.lang.AbstractMethodError
  java.lang.UnsatisfiedLinkError
  java.net.MalformedURLException
  java.security.NoSuchAlgorithmException
  java.net.SocketTimeoutException

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':modelarchive:findbugsMain'.
> FindBugs rule violations were found. See the report at: file:///tmp/pip-req-build-srpt9i6p/frontend/modelarchive/build/reports/findbugs/main.html

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.

* Get more help at https://help.gradle.org

BUILD FAILED in 3s
13 actionable tasks: 12 executed, 1 up-to-date
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-req-build-srpt9i6p/setup.py", line 137, in <module>
    setup(
  File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/site-packages/setuptools/__init__.py", line 144, in setup
    return distutils.core.setup(**attrs)
  File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/site-packages/setuptools/command/install.py", line 61, in run
    return orig.install.run(self)
  File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/command/install.py", line 545, in run
    self.run_command('build')
  File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/command/build.py", line 135, in run
    self.run_command(cmd_name)
  File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/tmp/pip-req-build-srpt9i6p/setup.py", line 98, in run
    self.run_command('build_frontend')
  File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/tmp/pip-req-build-srpt9i6p/setup.py", line 85, in run
    subprocess.check_call('frontend/gradlew -p frontend clean build', shell=True)
  File "/home/ubuntu/anaconda3/envs/myenv/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'frontend/gradlew -p frontend clean build' returned non-zero exit status 1.
----------------------------------------

ERROR: Command errored out with exit status 1: /home/ubuntu/anaconda3/envs/myenv/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-srpt9i6p/setup.py'"'"'; file='"'"'/tmp/pip-req-build-srpt9i6p/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-mk_hdekw/install-record.txt --single-version-externally-managed --compile --install-headers /home/ubuntu/anaconda3/envs/myenv/include/python3.8/torchserve Check the logs for full command output

Security issue loading model archives from arbitrary URL

There is a significant security issue with fetching new models via URL, in that model archives may contain arbitrary code, which will be run by TorchServe with no verification. Vulnerabilities include replacing a model at a target URL (with one containing malicious code), or registering such a model by getting access to the Management API, and active MITM attacks that insert their own model archives. (Other attacks are possible, such as inducing someone to pull a compromised model archive from the command line, e.g. via social engineering methods.) Model archive code has access to anything the TorchServe process does, including the model store directory, which may contain significant IP for the company running TorchServe. If the server is not locked down well, it could be a beachhead for a more aggressive network incursion.

Suggested mitigations (from security engineers and others within Facebook):

Only allow fetching model archives from a whitelist of URLs in the server config.
Require all such hosts to be https:// and perform proper cert validation and give the option for cert pinning
Checksum or code signing on model archives
Authentication on the Management API

This will not be a launch blocker for the experimental release, but will be for 1.0.

SageMaker support for gpu and cpu dockerfiles

Update cpu and gpu dockerfiles so that Docker Images work on SageMaker (eg add SageMaker inference toolkit)

Update doc/scripts with s3 url based model archive/mar files

Docs and few scripts are still pointing to local mar files i.e. at present, the user has to build mar files themselves. Need to update the docs/scripts with required S3 mar file urls for readymade mar files.

benchmark dependencies install script failing on fresh ubuntu 18.04

The 'install_dependencies.sh' script is not working on a fresh ubuntu 18.04 machine. Tested for both GPU and CPU installs.

Error details:

brew install jmeter --with-plugins
Usage: brew install [options] formula
....
Error: invalid option: --with-plugins
true
wget https://jmeter-plugins.org/get/ -O /home/ubuntu/.linuxbrew/Cellar/jmeter/5.2.1/libexec/lib/ext/jmeter-plugins-manager-1.3.jar
/home/ubuntu/.linuxbrew/Cellar/jmeter/5.2.1/libexec/lib/ext/jmeter-plugins-manager-1.3.jar: No such file or directory

Serve installation fails on SageMaker Notebook and Cloud9

Following steps in Readme and the following errors occur - environment is Cloud9 and Amazon SageMaker Notebooks.
`
(venv) chzar:~/environment/serve/serve-setup/serve (master) $ pip install .
Processing /home/ec2-user/environment/serve/serve-setup/serve
Requirement already satisfied: Pillow in /home/ec2-user/environment/serve/serve-setup/venv/lib/python3.6/dist-packages (from torchserve==0.0.1b20200308) (7.0.0)
Collecting psutil (from torchserve==0.0.1b20200308)
Downloading https://files.pythonhosted.org/packages/c4/b8/3512f0e93e0db23a71d82485ba256071ebef99b227351f0f5540f744af41/psutil-5.7.0.tar.gz (449kB)
100% |████████████████████████████████| 450kB 20.8MB/s
Collecting future (from torchserve==0.0.1b20200308)
Downloading https://files.pythonhosted.org/packages/45/0b/38b06fd9b92dc2b68d58b75f900e97884c45bedd2ff83203d933cf5851c9/future-0.18.2.tar.gz (829kB)
100% |████████████████████████████████| 829kB 21.7MB/s
Requirement already satisfied: torch in /home/ec2-user/environment/serve/serve-setup/venv/lib/python3.6/dist-packages (from torchserve==0.0.1b20200308) (1.4.0)
Requirement already satisfied: torchvision in /home/ec2-user/environment/serve/serve-setup/venv/lib/python3.6/dist-packages (from torchserve==0.0.1b20200308) (0.5.0)
Collecting torchtext (from torchserve==0.0.1b20200308)
Downloading https://files.pythonhosted.org/packages/79/ef/54b8da26f37787f5c670ae2199329e7dccf195c060b25628d99e587dac51/torchtext-0.5.0-py3-none-any.whl (73kB)
100% |████████████████████████████████| 81kB 23.4MB/s
Requirement already satisfied: six in /home/ec2-user/environment/serve/serve-setup/venv/lib/python3.6/dist-packages (from torchvision->torchserve==0.0.1b20200308) (1.14.0)
Requirement already satisfied: numpy in /home/ec2-user/environment/serve/serve-setup/venv/lib/python3.6/dist-packages (from torchvision->torchserve==0.0.1b20200308) (1.18.1)
Collecting sentencepiece (from torchtext->torchserve==0.0.1b20200308)
Downloading https://files.pythonhosted.org/packages/74/f4/2d5214cbf13d06e7cb2c20d84115ca25b53ea76fa1f0ade0e3c9749de214/sentencepiece-0.1.85-cp36-cp36m-manylinux1_x86_64.whl (1.0MB)
100% |████████████████████████████████| 1.0MB 21.5MB/s
Collecting requests (from torchtext->torchserve==0.0.1b20200308)
Downloading https://files.pythonhosted.org/packages/1a/70/1935c770cb3be6e3a8b78ced23d7e0f3b187f5cbfab4749523ed65d7c9b1/requests-2.23.0-py2.py3-none-any.whl (58kB)
100% |████████████████████████████████| 61kB 23.7MB/s
Collecting tqdm (from torchtext->torchserve==0.0.1b20200308)
Downloading https://files.pythonhosted.org/packages/47/55/fd9170ba08a1a64a18a7f8a18f088037316f2a41be04d2fe6ece5a653e8f/tqdm-4.43.0-py2.py3-none-any.whl (59kB)
100% |████████████████████████████████| 61kB 24.2MB/s
Collecting urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 (from requests->torchtext->torchserve==0.0.1b20200308)
Downloading https://files.pythonhosted.org/packages/e8/74/6e4f91745020f967d09332bb2b8b9b10090957334692eb88ea4afe91b77f/urllib3-1.25.8-py2.py3-none-any.whl (125kB)
100% |████████████████████████████████| 133kB 32.5MB/s
Collecting certifi>=2017.4.17 (from requests->torchtext->torchserve==0.0.1b20200308)
Downloading https://files.pythonhosted.org/packages/b9/63/df50cac98ea0d5b006c55a399c3bf1db9da7b5a24de7890bc9cfd5dd9e99/certifi-2019.11.28-py2.py3-none-any.whl (156kB)
100% |████████████████████████████████| 163kB 31.7MB/s
Collecting chardet<4,>=3.0.2 (from requests->torchtext->torchserve==0.0.1b20200308)
Downloading https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl (133kB)
100% |████████████████████████████████| 143kB 32.0MB/s
Collecting idna<3,>=2.5 (from requests->torchtext->torchserve==0.0.1b20200308)
Downloading https://files.pythonhosted.org/packages/89/e3/afebe61c546d18fb1709a61bee788254b40e736cff7271c7de5de2dc4128/idna-2.9-py2.py3-none-any.whl (58kB)
100% |████████████████████████████████| 61kB 26.6MB/s
Installing collected packages: psutil, future, sentencepiece, urllib3, certifi, chardet, idna, requests, tqdm, torchtext, torchserve
Running setup.py install for psutil ... done
Running setup.py install for future ... done
Running setup.py install for torchserve ... error
Complete output from command /home/ec2-user/environment/serve/serve-setup/venv/bin/python3 -u -c "import setuptools, tokenize;file='/tmp/pip-req-build-lpcdwxb3/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /tmp/pip-record-3aonh65v/install-record.txt --single-version-externally-managed --compile --install-headers /home/ec2-user/environment/serve/serve-setup/venv/include/site/python3.6/torchserve:
running install
running build
running build_py
running build_frontend
Downloading https://services.gradle.org/distributions/gradle-4.9-bin.zip
........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Unzipping /home/ec2-user/.gradle/wrapper/dists/gradle-4.9-bin/e9cinqnqvph59rr7g70qubb4t/gradle-4.9-bin.zip to /home/ec2-user/.gradle/wrapper/dists/gradle-4.9-bin/e9cinqnqvph59rr7g70qubb4t
Set executable permissions for: /home/ec2-user/.gradle/wrapper/dists/gradle-4.9-bin/e9cinqnqvph59rr7g70qubb4t/gradle-4.9/bin/gradle

Welcome to Gradle 4.9!

Here are the highlights of this release:
 - Experimental APIs for creating and configuring tasks lazily
 - Pass arguments to JavaExec via CLI
 - Auxiliary publication dependency support for multi-project builds
 - Improved dependency insight report

For more details see https://docs.gradle.org/4.9/release-notes.html

Starting a Gradle Daemon (subsequent builds will be faster)
Download https://plugins.gradle.org/m2/com/google/googlejavaformat/google-java-format/1.6/google-java-format-1.6.pom
Download https://plugins.gradle.org/m2/com/google/googlejavaformat/google-java-format-parent/1.6/google-java-format-parent-1.6.pom
Download https://plugins.gradle.org/m2/org/sonatype/oss/oss-parent/7/oss-parent-7.pom
Download https://plugins.gradle.org/m2/com/google/guava/guava/22.0/guava-22.0.pom
Download https://plugins.gradle.org/m2/com/google/guava/guava-parent/22.0/guava-parent-22.0.pom
Download https://plugins.gradle.org/m2/com/google/errorprone/javac-shaded/9+181-r4173-1/javac-shaded-9+181-r4173-1.pom
Download https://plugins.gradle.org/m2/com/google/code/findbugs/jsr305/1.3.9/jsr305-1.3.9.pom
Download https://plugins.gradle.org/m2/com/google/errorprone/error_prone_annotations/2.0.18/error_prone_annotations-2.0.18.pom
Download https://plugins.gradle.org/m2/org/codehaus/mojo/animal-sniffer-annotations/1.14/animal-sniffer-annotations-1.14.pom
Download https://plugins.gradle.org/m2/com/google/errorprone/error_prone_parent/2.0.18/error_prone_parent-2.0.18.pom
Download https://plugins.gradle.org/m2/org/codehaus/mojo/animal-sniffer-parent/1.14/animal-sniffer-parent-1.14.pom
Download https://plugins.gradle.org/m2/com/google/j2objc/j2objc-annotations/1.1/j2objc-annotations-1.1.pom
Download https://plugins.gradle.org/m2/org/codehaus/mojo/mojo-parent/34/mojo-parent-34.pom
Download https://plugins.gradle.org/m2/org/codehaus/codehaus-parent/4/codehaus-parent-4.pom
Download https://plugins.gradle.org/m2/com/google/guava/guava/22.0/guava-22.0.jar
Download https://plugins.gradle.org/m2/com/google/googlejavaformat/google-java-format/1.6/google-java-format-1.6.jar
Download https://plugins.gradle.org/m2/com/google/code/findbugs/jsr305/1.3.9/jsr305-1.3.9.jar
Download https://plugins.gradle.org/m2/com/google/errorprone/error_prone_annotations/2.0.18/error_prone_annotations-2.0.18.jar
Download https://plugins.gradle.org/m2/com/google/j2objc/j2objc-annotations/1.1/j2objc-annotations-1.1.jar
Download https://plugins.gradle.org/m2/org/codehaus/mojo/animal-sniffer-annotations/1.14/animal-sniffer-annotations-1.14.jar
Download https://plugins.gradle.org/m2/com/google/errorprone/javac-shaded/9+181-r4173-1/javac-shaded-9+181-r4173-1.jar

FAILURE: Build failed with an exception.

* Where:
Build file '/tmp/pip-req-build-lpcdwxb3/frontend/build.gradle' line: 28

* What went wrong:
A problem occurred evaluating root project 'frontend'.
> Could not open dsl remapped class cache for 4116txk5uyl7fsx1ml41gcwgu (/home/ec2-user/.gradle/caches/4.9/scripts-remapped/formatter_8g1qvyqrv7q4wxxey9do0wgaw/4116txk5uyl7fsx1ml41gcwgu/dsl1724d65be2ee623103b4c593e57fc0c5).
   > Could not open dsl generic class cache for script '/tmp/pip-req-build-lpcdwxb3/frontend/tools/gradle/formatter.gradle' (/home/ec2-user/.gradle/caches/4.9/scripts/4116txk5uyl7fsx1ml41gcwgu/dsl/dsl1724d65be2ee623103b4c593e57fc0c5).
      > com/google/googlejavaformat/java/Main : Unsupported major.minor version 52.0

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.

* Get more help at https://help.gradle.org

Deprecated Gradle features were used in this build, making it incompatible with Gradle 5.0.
Use '--warning-mode all' to show the individual deprecation warnings.
See https://docs.gradle.org/4.9/userguide/command_line_interface.html#sec:command_line_warnings

BUILD FAILED in 13s
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-req-build-lpcdwxb3/setup.py", line 160, in <module>
    license='Apache License Version 2.0'
  File "/home/ec2-user/environment/serve/serve-setup/venv/lib64/python3.6/dist-packages/setuptools/__init__.py", line 143, in setup
    return distutils.core.setup(**attrs)
  File "/usr/lib64/python3.6/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/usr/lib64/python3.6/distutils/dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "/usr/lib64/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/home/ec2-user/environment/serve/serve-setup/venv/lib64/python3.6/dist-packages/setuptools/command/install.py", line 61, in run
    return orig.install.run(self)
  File "/usr/lib64/python3.6/distutils/command/install.py", line 593, in run
    self.run_command('build')
  File "/usr/lib64/python3.6/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/usr/lib64/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/usr/lib64/python3.6/distutils/command/build.py", line 135, in run
    self.run_command(cmd_name)
  File "/usr/lib64/python3.6/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/usr/lib64/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/tmp/pip-req-build-lpcdwxb3/setup.py", line 98, in run
    self.run_command('build_frontend')
  File "/usr/lib64/python3.6/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/usr/lib64/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/tmp/pip-req-build-lpcdwxb3/setup.py", line 85, in run
    subprocess.check_call('frontend/gradlew -p frontend clean build', shell=True)
  File "/usr/lib64/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'frontend/gradlew -p frontend clean build' returned non-zero exit status 1.

----------------------------------------

Command "/home/ec2-user/environment/serve/serve-setup/venv/bin/python3 -u -c "import setuptools, tokenize;file='/tmp/pip-req-build-lpcdwxb3/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /tmp/pip-record-3aonh65v/install-record.txt --single-version-externally-managed --compile --install-headers /home/ec2-user/environment/serve/serve-setup/venv/include/site/python3.6/torchserve" failed with error code 1 in /tmp/pip-req-build-lpcdwxb3/
You are using pip version 18.1, however version 20.0.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
`

pom.xml file has references to MXNet server

See https://github.com/pytorch/serve/blob/master/serving-sdk/pom.xml

This appears to be related to the serving-sdk component specifically. I'm assuming we don't want a bunch of references to MXNet Model Server or its source control in the project.

Is this SDK component going to be part of the final project? If not, is it something that can be cut, rather than leaving old stuff around?

dockerfile for gpu machines install

Please provide a gpu dockerfile for serving gpu dependent models

config steps for serving via public-ip

The default configuration works for serving models on localhost. In order to run predictions on models via the public-ip (making calls from a different host) one has to specify 0.0.0.0 or the explicit IP-address through the --ts-config file settings. This is not very clear in the documentation. Add clear instructions for new users to get started, otherwise can take a while to figure out why predictions are not working even after opening up the external security ports on the machine.

Command line does not allow starting with multiple model versions

In my model store, I have three model archives:

name densenet161, version 1.0, file densenet161.mar
name densenet161, version 1.1, file densenet161a.mar
name tsd161, version 1.0, file tsd161.mar

I attempted to use this command line to start the server:

torchserve --start --model-store model_store --models densenet161=densenet161.mar densenet161=densenet161a.mar tsd161=tsd161.mar

Expected behavior: There would be two versions of the densenet161 endpoint, as specified in the two model archive files.

Actual behavior from curl http://localhost:8081/models:

{
  "models": [
    {
      "modelName": "densenet161",
      "modelUrl": "densenet161a.mar"
    },
    {
      "modelName": "tsd161",
      "modelUrl": "tsd161.mar"
    }
  ]
}

Adding new model from s3 url via management api does not store it into the model store directory

On adding a new model using the management api, if the model path is from s3 url, the model does not get added to torchserve's model store path. One would expect that models registered via urls will be downloaded locally for faster predictions, and for persistence when torchserve is restarted.

Avoid reading GPU ID from properties for RR worker assignment

#71
mentioned about assigning GPU ID from properties explicitly which will work.
However, this will require end user to know the number of GPU devices present on an instance prior to assigning.

#71 (comment) should suffice IMO. Please look into this. Thanks.

`torch.eval()` call in the wrong place

The various included model handlers include calls to model.eval(), which is good - we want the model put in that mode for inference.

Where this call really belongs, though, is in the parent class for all handlers, where the model gets loaded. In fact, calling model.eval() should probably be the next thing that happens after the model loads - someplace like line 61 in the current version of https://github.com/pytorch/serve/blob/master/ts/torch_handler/base_handler.py

This protects against someone forgetting to add the call (and unwittingly hobbling their performance), and doesn't stop them from calling model.train() if for some reason that's what they really need.

Model Archiver instructions incomplete & incorrect in README

In /README.md, in the "Serve a Model" section, there is a command line for creating a model archive for the downloaded DenseNet-161 model:

torch-model-archiver --model-name densenet161 --model-file serve/examples/densenet_161/model.py --serialized-file densenet161-8d451a50.pth --extra-files serve/examples/index_to_name.json

This fails with the message:

torch-model-archiver: error: the following arguments are required: --handler

Also, the paths are incorrect - it should be examples/image_classifier/densenet_161 for both of the specified paths.

Example on readme uses MxNet

The example in the README uses a SqueezeNet model built with MxNet, we should update it to use PyTorch.

Following instructions, model server does not start

» torchserve --start --model-store model_store --models densenet161=densenet161.mar
» Error: Could not find or load main class org.pytorch.serve.ModelServer

Install prereqs are met:

» python --version
Python 3.6.9 :: Anaconda, Inc.
» java -version
java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)

Also, java is in my PATH.

Incomplete OpenSSL instructions

This refers to the SSL instructions in the config doc: https://github.com/pytorch/serve/blob/master/docs/configuration.md#enable-ssl

It looks like there was an intent to create instructions for making a self-signed cert with openssl, but it is incomplete. The config instructions also look like they were copied & pasted from the keytool instructions immediately above.

Error installing

Error during install while running tests:

    2019-12-16 13:55:36,353 [WARN ] W-9008-noop-stderr org.pytorch.serve.wlm.WorkerLifeCycle -     import psutil
    2019-12-16 13:55:36,353 [WARN ] W-9008-noop-stderr org.pytorch.serve.wlm.WorkerLifeCycle - ModuleNotFoundError: No module named 'psutil'

Looks like a Python dependency issue.

pytorch / serve Goto Github PK

serve's Introduction

TorchServe

🚀 Quick start with TorchServe

🚀 Quick start with TorchServe (conda)

🐳 Quick Start with Docker

⚡ Why TorchServe

🤔 How does TorchServe work

🏆 Highlighted Examples

🛡️ TorchServe Security Policy

🤓 Learn More

🫂 Contributing

📰 News

💖 All Contributors

⚖️ Disclaimer

serve's People

Contributors

Stargazers

Watchers

Forkers

serve's Issues

Error details:

Recommend Projects

Recommend Topics

Recommend Org