Writing a Serving Service

HyperAI model deployment currently supports two deployment methods:

  1. The standard predictor.py deployment method
  2. A fully customizable method that bypasses the framework provided by HyperAI

The custom approach is designed for advanced users who need precise control over the deployed service. If you are not sure whether you need the custom approach, you don't; we recommend deploying with the standard method.

Standard predictor.py method

Dependencies

In addition to the libraries your own code uses, the extra dependency HyperAI-serving is required. Please keep this library updated to the latest version whenever possible.
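For instance, the dependency could be declared in a requirements.txt file. This is only a sketch: the exact package name on PyPI is an assumption based on the import name used below, and the other entries stand in for whatever your own code needs.

# requirements.txt (package name assumed from the import HyperAI_serving)
HyperAI-serving
torch
torchvision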

Directory structure

A model deployment must contain two parts:

  1. predictor.py and the files it depends on, which handle model requests
  2. The model files

The interfaces of predictor.py are introduced one by one below. Exporting model files is described in detail in Model export.

All files referenced by predictor.py must be placed in the same directory as predictor.py or in one of its subdirectories. For example, suppose we need a classes.json file to store class information; it can then be accessed from predictor.py as follows:

import json

class Predictor:
    def __init__(self):
        with open('classes.json', 'r') as f:
            values = json.load(f)
        self.values = values

    ...

The pytorch/image-classifier-resnet50 example contains a complete project.
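For illustration only, a deployment directory might be laid out like this (the file names below are hypothetical):

my-model/
├── predictor.py       # request-handling code
├── classes.json       # auxiliary file referenced by predictor.py
└── model.onnx         # exported model file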

Predictor

Template

The Predictor template is shown below:

import HyperAI_serving as serv


class Predictor:
    def __init__(self):
        """
        Loads the model and initializes any metadata.
        """
        pass

    def predict(self, json):
        """
        Called on every request.
        Receives the content of the HTTP request (`json`),
        performs the necessary preprocessing (preprocess), runs the prediction,
        post-processes the result (postprocess), and returns it to the caller.

        Args:
            json: the request data

        Returns:
            the prediction result
        """
        pass


if __name__ == '__main__':  # Runs only when predictor.py is executed directly, not when it is imported by another file
    serv.run(Predictor)     # Start serving

The json parameter is parsed according to the Content-Type header of the HTTP request (see the client sketch after this list):

  • For Content-Type: application/json, json is parsed as JSON into a dictionary (Dict)
  • For Content-Type: application/msgpack, or any other MessagePack type alias, json is parsed in the same way as JSON (resolved into a dictionary)
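For illustration, a client could send either encoding as follows. This is a minimal sketch: the service URL is a placeholder, and the msgpack package on the client side is an assumption, not part of HyperAI-serving.

import requests
import msgpack  # client-side assumption, not part of HyperAI-serving

payload = {"threshold": 0.8, "url": "https://example.com/image.jpg"}

# JSON body: parsed into a dict and passed to predict() as `json`
requests.post("http://<service-host>/", json=payload)

# MessagePack body: parsed into the same dict structure
requests.post(
    "http://<service-host>/",
    data=msgpack.packb(payload),
    headers={"Content-Type": "application/msgpack"},
)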

Example

Below is the predictor.py file from pytorch/object-detector:

from io import BytesIO

import requests
import torch
from PIL import Image
from torchvision import models
from torchvision import transforms

import HyperAI_serving as serv


class Predictor:
    def __init__(self):
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        print(f"using device: {self.device}")

        model = models.detection.fasterrcnn_resnet50_fpn(pretrained=False, pretrained_backbone=False)
        model.load_state_dict(torch.load("fasterrcnn_resnet50_fpn_coco-258fb6c6.pth"))
        model = model.to(self.device)
        model.eval()

        self.preprocess = transforms.Compose([transforms.ToTensor()])

        with open("coco_labels.txt") as f:
            self.coco_labels = f.read().splitlines()

        self.model = model

    def predict(self, json):
        # Only the json parameter is used; it is a Dict, so values are read directly with json[key]
        threshold = float(json["threshold"])
        image = requests.get(json["url"]).content
        img_pil = Image.open(BytesIO(image))
        img_tensor = self.preprocess(img_pil).to(self.device)
        img_tensor.unsqueeze_(0)

        with torch.no_grad():
            pred = self.model(img_tensor)

        predicted_class = [self.coco_labels[i] for i in pred[0]["labels"].cpu().tolist()]
        predicted_boxes = [
            [(i[0], i[1]), (i[2], i[3])] for i in pred[0]["boxes"].detach().cpu().tolist()
        ]
        predicted_score = pred[0]["scores"].detach().cpu().tolist()
        predicted_t = [predicted_score.index(x) for x in predicted_score if x > threshold]
        if len(predicted_t) == 0:
            return [], []

        predicted_t = predicted_t[-1]
        predicted_boxes = predicted_boxes[: predicted_t + 1]
        predicted_class = predicted_class[: predicted_t + 1]
        return predicted_boxes, predicted_class


if __name__ == '__main__':
    serv.run(Predictor)
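A request to this predictor could then be issued like this (a sketch; the service address is a placeholder):

import requests

resp = requests.post(
    "http://<service-host>/",
    json={"url": "https://example.com/street.jpg", "threshold": 0.8},
)
boxes, classes = resp.json()  # predict() returns (predicted_boxes, predicted_class), serialized as a JSON list
print(classes)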

Predictor class interface

The Predictor class does not need to inherit from any other class, but it must provide at least the following two interfaces:

__init__

__init__ accepts a variable number of parameters, with no ordering requirement, but it is sensitive to the parameter names.

Each parameter that appears enables a different feature:

  • onnx: *.onnx files are searched for in the directory containing predictor.py, loaded, and passed in as this parameter

Example:

class PredictorExample1:

    def __init__(self):
        pass

    def predict(self, json):
        return {'result': 'It works!'}


class PredictorExample2:

    def __init__(self, onnx):
        self.onnx = onnx

    def preprocess(self, v):
        ...

    def postprocess(self, v):
        ...

    def predict(self, json):
        onnx = self.onnx
        m = self.preprocess(json['data'])
        result = self.postprocess(onnx.run(None, {'data': m}))
        return result

predict

predict accepts a variable number of parameters, with no ordering requirement, but it is sensitive to the parameter names.

Each parameter that appears enables a different feature:

  • json: the parsed data; the request body is parsed into a Python object as JSON or MessagePack according to Content-Type.
  • payload: the same as json, except that json returns a 400 error when the data cannot be parsed, while payload passes the unparsed data through.
  • data: the raw POST body, always of type bytes.
  • params: the HTTP GET parameters, as a dict.
  • headers: the HTTP headers, as a dict.
  • request: the Flask HTTP request object; refer to the Flask documentation for usage.

Example:

class PredictorExample1:

    def __init__(self):
        pass

    def predict(self, json, params):
        return {'message': f'Param foo is {params.get("foo")}'}
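As a further, hypothetical illustration of the other parameters (not taken from the source examples), a predictor could accept the raw request body together with the headers:

class PredictorExample2:

    def __init__(self):
        pass

    def predict(self, data, headers):
        # `data` is the unparsed POST body (bytes); `headers` is a dict of HTTP headers
        return {'received_bytes': len(data), 'num_headers': len(headers)}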

Return value of predict

The predict method may return any of the following:

  • An object that can be serialized to JSON, such as a Python list, dict, or tuple
  • A str string
  • A bytes object
  • A flask.Response object

Here are some examples:

def predict(self, json):
    return [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def predict(self):
    return "class 1"

def predict(self):
    # Returning a bytes object
    import pickle
    import numpy as np

    array = np.random.randn(3, 3)
    response = pickle.dumps(array)
    return response

def predict(self):
    from flask import Response

    data = b"class 1"
    response = Response(data, mimetype="text/plain")
    return response
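On the caller's side, the pickled bytes response above could be recovered like this (a sketch; the service URL is a placeholder):

import pickle
import requests

resp = requests.post("http://<service-host>/", json={})
array = pickle.loads(resp.content)  # turn the raw bytes body back into the numpy array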

Fully customizable approach

If a start.sh file is present in the deployed model, all of the preparation work described above is skipped and the file is run directly. In that case you are responsible for all initialization and for starting the service yourself.

The service must listen on port 80 and handle HTTP requests.
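As a minimal sketch of this mode (the file names and launch command are assumptions, not part of HyperAI), start.sh could simply run a small Flask app that listens on port 80:

# server.py -- start.sh might contain nothing more than: python server.py
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/", methods=["POST"])
def predict():
    payload = request.get_json()       # parse the incoming JSON body yourself
    return jsonify({"echo": payload})  # build and return the HTTP response yourself

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=80)   # the service must listen on port 80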