Writing a Serving Service
HyperAI model deployment currently supports two modes:
- Standard `predictor.py` mode
- Fully customizable mode, which bypasses the framework provided by HyperAI deployment
The fully customizable mode is designed for advanced users who need precise control over how the service is deployed. If you are not sure whether you need the custom mode, you most likely do not; we recommend starting with the standard mode.
Standard predictor.py mode
Dependencies
In addition to the libraries your business code uses, the `HyperAI-serving` package is required as an extra dependency. Please keep this library updated to the latest version.
Directory structure
A model deployment must include two parts:
- `predictor.py` and the files it depends on, which handle model requests
- The model files

The `predictor.py` interface is introduced below. Exporting model files is covered in detail in Model export.
Any files referenced in `predictor.py` must be placed in the same directory as `predictor.py` or in one of its subdirectories. For example, if a `classes.json` file stores the classification labels, it can be accessed from `predictor.py` as follows:
```python
import json

class Predictor:
    def __init__(self):
        # classes.json lives next to predictor.py, so a relative path works
        with open('classes.json', 'r') as f:
            values = json.load(f)
        self.values = values
    ...
```
The pytorch/image-classifier-resnet50 example contains a complete project.
Predictor template
The `Predictor` template is shown below:
```python
import HyperAI_serving as serv

class Predictor:
    def __init__(self):
        """
        Load the model and initialize any metadata.
        """
        pass

    def predict(self, json):
        """
        Called on every request.

        Receives the content of the HTTP request (`json`), performs the
        necessary preprocessing, runs the prediction, post-processes the
        results, and returns them to the caller.

        Args:
            json: The request data.

        Returns:
            The prediction results.
        """
        pass

if __name__ == '__main__':  # Runs when predictor.py is executed directly, not when it is imported by another file
    serv.run(Predictor)     # Start serving
```
The `json` parameter is parsed according to the `Content-Type` header of the HTTP request:
- For `Content-Type: application/json`, `json` is parsed from JSON into a dictionary (`dict`)
- For `Content-Type: application/msgpack`, or any other MessagePack alias, `json` is handled the same way as JSON (parsed into a dictionary)
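For illustration, here is a minimal client-side sketch of both request styles; the service URL is hypothetical, and the `requests` and `msgpack` packages are assumed to be installed:

```python
import msgpack
import requests

SERVICE_URL = "http://localhost/"  # hypothetical; replace with your deployment's URL

# JSON request: the body is parsed into a dict before predict() is called
requests.post(SERVICE_URL, json={"key": "value"})

# MessagePack request: parsed into a dict in exactly the same way
requests.post(
    SERVICE_URL,
    data=msgpack.packb({"key": "value"}),
    headers={"Content-Type": "application/msgpack"},
)
```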
Example
Below is the `predictor.py` file from the pytorch/object-detector example:
```python
from io import BytesIO

import requests
import torch
from PIL import Image
from torchvision import models
from torchvision import transforms

import HyperAI_serving as serv

class Predictor:
    def __init__(self):
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        print(f"using device: {self.device}")

        model = models.detection.fasterrcnn_resnet50_fpn(pretrained=False, pretrained_backbone=False)
        model.load_state_dict(torch.load("fasterrcnn_resnet50_fpn_coco-258fb6c6.pth"))
        model = model.to(self.device)
        model.eval()

        self.preprocess = transforms.Compose([transforms.ToTensor()])

        with open("coco_labels.txt") as f:
            self.coco_labels = f.read().splitlines()

        self.model = model

    def predict(self, json):
        # Only the json parameter is used; it is a dict, so values can be
        # read directly as json[key].
        threshold = float(json["threshold"])
        image = requests.get(json["url"]).content

        img_pil = Image.open(BytesIO(image))
        img_tensor = self.preprocess(img_pil).to(self.device)
        img_tensor.unsqueeze_(0)

        with torch.no_grad():
            pred = self.model(img_tensor)

        predicted_class = [self.coco_labels[i] for i in pred[0]["labels"].cpu().tolist()]
        predicted_boxes = [
            [(i[0], i[1]), (i[2], i[3])] for i in pred[0]["boxes"].detach().cpu().tolist()
        ]
        predicted_score = pred[0]["scores"].detach().cpu().tolist()

        # Scores are returned in descending order, so keep everything up to
        # the last detection above the threshold.
        predicted_t = [predicted_score.index(x) for x in predicted_score if x > threshold]
        if len(predicted_t) == 0:
            return [], []

        predicted_t = predicted_t[-1]
        predicted_boxes = predicted_boxes[: predicted_t + 1]
        predicted_class = predicted_class[: predicted_t + 1]
        return predicted_boxes, predicted_class

if __name__ == '__main__':
    serv.run(Predictor)
```
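As a usage sketch (the service URL is hypothetical), a client could call this detector like so; the request keys match the ones read in `predict` above:

```python
import requests

resp = requests.post(
    "http://localhost/",  # hypothetical; replace with your deployment's URL
    json={
        "url": "https://example.com/dog.jpg",  # image to run detection on
        "threshold": 0.8,                      # minimum confidence score to keep
    },
)
boxes, classes = resp.json()  # the (boxes, classes) tuple is serialized as a JSON array
print(classes)
```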
Predictor class interface
The `Predictor` class does not need to inherit from any other class, but it must provide at least two interfaces: `__init__` and `predict`.
__init__
`__init__` accepts a variable number of parameters with no ordering requirement, but it is sensitive to parameter names. Each recognized name provides a different function:
- `onnx`: the directory containing `predictor.py` is searched for a `*.onnx` file, which is loaded and passed in as this parameter
Example:
```python
class PredictorExample1:
    def __init__(self):
        pass

    def predict(self, json):
        return {'result': 'It works!'}

class PredictorExample2:
    def __init__(self, onnx):
        # onnx is the automatically loaded *.onnx model
        self.onnx = onnx

    def preprocess(self, v):
        ...

    def postprocess(self, v):
        ...

    def predict(self, json):
        onnx = self.onnx
        m = self.preprocess(json['data'])
        result = self.postprocess(onnx.run(None, {'data': m}))
        return result
```
predict
`predict` also accepts a variable number of parameters with no ordering requirement, but it is sensitive to parameter names. Each recognized name provides a different function:
- `json`: the parsed data; depending on the `Content-Type` value, JSON or MessagePack data is parsed into a Python object
- `payload`: the same as `json`, except that `json` returns a 400 error when the data cannot be parsed, while `payload` hands over the raw, unparsed data instead
- `data`: the raw POST data, always of type `bytes`
- `params`: the HTTP GET parameters, as a `dict`
- `headers`: the HTTP headers, as a `dict`
- `request`: the Flask HTTP `Request` object; refer to the Flask documentation for usage
Example:
```python
class PredictorExample1:
    def __init__(self):
        pass

    def predict(self, json, params):
        return {'message': f'Param foo is {params.get("foo")}'}
```
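As a further sketch (the class name is hypothetical), several of these parameters can be combined in one `predict` signature:

```python
class EchoPredictor:
    def __init__(self):
        pass

    def predict(self, data, headers):
        # Report the size of the raw POST body and its declared content type.
        return {
            'content_type': headers.get('Content-Type'),
            'size': len(data),  # data is always bytes
        }
```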
Return value of predict
The `predict` method can return any of the following:
- an object that can be serialized to JSON, such as a Python `list`, `dict`, or `tuple`
- a string of type `str`
- an object of type `bytes`
- an object of type `flask.Response`
Here are some examples:
```python
import pickle

import numpy as np

def predict(self, json):
    # A JSON-serializable object
    return [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def predict(self):
    # A str
    return "class 1"

def predict(self):
    # A bytes return value
    array = np.random.randn(3, 3)
    response = pickle.dumps(array)
    return response

def predict(self):
    # A flask.Response return value
    from flask import Response

    data = b"class 1"
    response = Response(data, mimetype="text/plain")
    return response
```
Fully customizable mode
If the deployed model contains a `start.sh` file, all of the preparation described above is skipped and that file is run directly. You must handle all of the initialization and start the service yourself. The service must listen on port `80` and handle HTTP requests.
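For illustration only, here is a minimal sketch of a custom service that a `start.sh` containing `python server.py` could launch; the file name is hypothetical, and Flask (which the standard mode already uses) is assumed to be available:

```python
# server.py -- hypothetical custom service started by start.sh
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route('/', methods=['POST'])
def predict():
    payload = request.get_json()       # parse the incoming JSON body
    # ... run model inference on the payload here ...
    return jsonify({'echo': payload})  # return a JSON response

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=80)   # the service must listen on port 80
```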