Writing a Serving Service
HyperAI model deployment currently supports two deployment modes:
- Standard predictor.py mode
- Fully customizable mode, which bypasses the framework provided by HyperAI deployment
The custom approach is intended for advanced users who need precise control over the deployed service. If you are not sure whether you need the custom approach, you most likely do not; we recommend deploying through the standard method.
Standard predictor.py mode
Dependencies
In addition to the libraries used by your business code, HyperAI-serving is required as an extra dependency. Please keep this library updated to the latest version whenever possible.
Directory structure
A model deployment must include two parts:
- predictor.py and the files it depends on, used to handle model requests
- Model files

The interfaces of predictor.py are introduced one by one below. Exporting model files is covered in detail in Model Export.
All files referenced by predictor.py must be placed in the same directory as predictor.py or in its subdirectories. For example, if we need a classes.json file to store class information, it can be accessed from predictor.py as follows:
import json
class Predictor:
    def __init__(self):
        with open('classes.json', 'r') as f:
            values = json.load(f)
        self.values = values
    ...
A complete project example is available in pytorch/image-classifier-resnet50.
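For illustration only (all file names except predictor.py are hypothetical), a deployment directory might look like this:
image-classifier-resnet50/
├── predictor.py     # request-handling code
├── classes.json     # auxiliary file referenced by predictor.py
└── resnet50.onnx    # exported model file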
Predictor
Template
The Predictor template is shown below:
import HyperAI_serving as serv
class Predictor:
    def __init__(self):
        """
        Responsible for loading the corresponding model and initializing metadata
        """
        pass
    def predict(self, json):
        """
        Called on every request.
        Receives the content of the HTTP request (`json`),
        performs any necessary preprocessing (preprocess), runs the prediction,
        post-processes the results (postprocess), and returns them to the caller.
        Args:
            json: The requested data
        Returns:
            Prediction results
        """
        pass
if __name__ == '__main__':  # Only when predictor.py is executed directly, not when it is imported by another file
    serv.run(Predictor)  # Start providing services
The json parameter is parsed according to the Content-Type header of the HTTP request:
- For Content-Type: application/json, json is parsed as JSON into a dictionary (Dict)
- For Content-Type: application/msgpack, or any other MessagePack type alias, json is handled the same way as JSON (parsed into a dictionary); see the client sketch below
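As a hedged illustration (the service address, payload fields, and the msgpack package below are assumptions, not part of this document), a client could send the same payload under either Content-Type like this:
import json
import requests
import msgpack  # any MessagePack encoder producing standard bytes works

payload = {"threshold": 0.8, "url": "https://example.com/image.jpg"}  # illustrative fields

# JSON request: predict() receives the payload as a dict
requests.post(
    "http://localhost:80/",  # placeholder service address
    data=json.dumps(payload),
    headers={"Content-Type": "application/json"},
)

# MessagePack request: parsed into the same dict on the server side
requests.post(
    "http://localhost:80/",
    data=msgpack.packb(payload),
    headers={"Content-Type": "application/msgpack"},
)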
Example
Here is the predictor.py file from the pytorch/object-detector example:
from io import BytesIO
import requests
import torch
from PIL import Image
from torchvision import models
from torchvision import transforms
import HyperAI_serving as serv
class Predictor:
    def __init__(self):
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        print(f"using device: {self.device}")
        model = models.detection.fasterrcnn_resnet50_fpn(pretrained=False, pretrained_backbone=False)
        model.load_state_dict(torch.load("fasterrcnn_resnet50_fpn_coco-258fb6c6.pth"))
        model = model.to(self.device)
        model.eval()
        self.preprocess = transforms.Compose([transforms.ToTensor()])
        with open("coco_labels.txt") as f:
            self.coco_labels = f.read().splitlines()
        self.model = model
    def predict(self, json):
        # Only the json parameter is used. Its content is a dict, so values can be read directly via json[key].
        threshold = float(json["threshold"])
        image = requests.get(json["url"]).content
        img_pil = Image.open(BytesIO(image))
        img_tensor = self.preprocess(img_pil).to(self.device)
        img_tensor.unsqueeze_(0)
        with torch.no_grad():
            pred = self.model(img_tensor)
        predicted_class = [self.coco_labels[i] for i in pred[0]["labels"].cpu().tolist()]
        predicted_boxes = [
            [(i[0], i[1]), (i[2], i[3])] for i in pred[0]["boxes"].detach().cpu().tolist()
        ]
        predicted_score = pred[0]["scores"].detach().cpu().tolist()
        predicted_t = [predicted_score.index(x) for x in predicted_score if x > threshold]
        if len(predicted_t) == 0:
            return [], []
        predicted_t = predicted_t[-1]
        predicted_boxes = predicted_boxes[: predicted_t + 1]
        predicted_class = predicted_class[: predicted_t + 1]
        return predicted_boxes, predicted_class
if __name__ == '__main__':
    serv.run(Predictor)
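As a quick local check (a sketch only; it assumes the weight file and coco_labels.txt shown above are present in the working directory and that the example is saved as predictor.py), the class can be exercised without starting the service:
from predictor import Predictor  # assumes the example above is saved as predictor.py

predictor = Predictor()
boxes, classes = predictor.predict({
    "threshold": "0.8",                    # parsed with float() inside predict()
    "url": "https://example.com/dog.jpg",  # placeholder image URL
})
print(list(zip(classes, boxes)))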
Predictor class interface
The Predictor class does not need to inherit from any other class, but it must provide at least the following two interfaces:
__init__
__init__ accepts a variable number of parameters with no requirement on their order, but it is sensitive to the parameter names.
Each parameter name enables a different behavior:
- onnx: searches the directory containing predictor.py for a *.onnx file, loads it, and passes it in as this parameter
Example:
class PredictorExample1:
    def __init__(self):
        pass
    def predict(self, json):
        return {'result': 'It works!'}
class PredictorExample2:
    def __init__(self, onnx):
        self.onnx = onnx
    def preprocess(self, v):
        ...
    def postprocess(self, v):
        ...
    def predict(self, json):
        onnx = self.onnx
        m = self.preprocess(json['data'])
        result = self.postprocess(onnx.run(None, {'data': m}))
        return result
predict
predict accepts a variable number of parameters with no requirement on their order, but it is sensitive to the parameter names.
Each parameter name enables a different behavior:
- json: parsed data; based on the Content-Type header, JSON or MessagePack data is parsed into a Python object
- payload: the same as json, except that when the data cannot be parsed, json returns a 400 error while payload hands over the raw, unparsed data
- data: raw POST data, always of type bytes
- params: the HTTP GET parameters, as a dict object
- headers: the HTTP headers, as a dict object
- request: the Flask HTTP Request object; see the Flask documentation for usage
Example:
class PredictorExample1:
    def __init__(self):
        pass
    def predict(self, json, params):
        return {'message': f'Param foo is {params.get("foo")}'}
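For completeness, here is a minimal sketch using the data and headers parameters (the header lookup and field names are illustrative assumptions, not taken from this document):
import json

class PredictorExample2:
    def __init__(self):
        pass
    def predict(self, data, headers):
        # data is the raw POST body (bytes); headers is a dict of HTTP headers
        if headers.get('Content-Type') == 'application/json':
            return json.loads(data)
        return {'raw_size': len(data)}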
The return value of predict
The predict method can return any of the following:
- An object that can be serialized to JSON, such as a Python List, Dict, or Tuple
- A str string
- A bytes object
- A flask.Response object
Here are some examples:
def predict(self, json):
    return [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
def predict(self):
    return "class 1"
def predict(self):
    # Returning a bytes object
    import pickle
    import numpy as np
    array = np.random.randn(3, 3)
    response = pickle.dumps(array)
    return response
def predict(self):
    from flask import Response
    data = b"class 1"
    response = Response(data, mimetype="text/plain")
    return response
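If the bytes branch above returns a pickled NumPy array, a client can restore it roughly like this (the service address and request payload are placeholders):
import pickle
import requests

resp = requests.post('http://localhost:80/', json={'input': [1, 2, 3]})  # placeholder address and payload
array = pickle.loads(resp.content)  # reverses the pickle.dumps() in the example above
print(array.shape)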
Fully customizable approach
If a start.sh file is present in the deployed model, all of the preparation work described above is skipped and this file is run directly.
You are responsible for all initialization and for starting the service yourself.
The service must listen on port 80 and handle HTTP requests.
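As a minimal sketch (assuming Flask is available in the deployment image; the file names are illustrative), a fully custom deployment could consist of a start.sh that simply runs python server.py, with server.py along these lines:
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route('/', methods=['POST'])
def predict():
    body = request.get_json()        # parse the incoming request yourself
    return jsonify({'echo': body})   # replace with real model inference

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=80)  # the service must listen on port 80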