
Introduction to Model Deployment

After a model has been trained, it can be deployed to a server or packaged for on-device use to provide real-time inference services. "Model deployment (Serving)" refers to the server-side model inference feature provided by HyperAI.

Creating a Serving involves the following steps:

  1. Export the trained model. Training can be completed in a Gear (computing power container).
  2. Write the predictor.py model deployment script. The writing rules are described in detail in Serving Service Writing, and the HyperAI-serving-examples model repository contains examples for reference.
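As a sketch of step 2, a minimal predictor.py might look like the following. The exact interface HyperAI expects is defined in Serving Service Writing; the `Predictor` class name, `config` argument, and payload shape used here are illustrative assumptions, and the stand-in "model" is a plain function rather than a real checkpoint.

```python
# Hypothetical predictor.py sketch -- names and signatures are assumptions,
# not the authoritative HyperAI interface (see Serving Service Writing).

class Predictor:
    def __init__(self, config):
        # A real deployment would load the model file that sits next to
        # predictor.py in the bound (read-only) directory, e.g. an ONNX
        # or PyTorch checkpoint. We use a stand-in function instead.
        self.model = lambda xs: [x * 2 for x in xs]

    def predict(self, payload):
        # payload is the deserialized request body, e.g. {"inputs": [...]}
        inputs = payload["inputs"]
        return {"outputs": self.model(inputs)}
```

A request carrying `{"inputs": [1, 2, 3]}` would then receive `{"outputs": [2, 4, 6]}` from this stand-in model.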

The "Data binding" step is similar to that in [Create a computing power container](/docs/concepts/#Computing power container-gear): binding both container outputs and paths in the data warehouse is supported.

The Model Deployment - Get Started Quickly article walks through a complete example of creating a model deployment.

Storing model data and predictor.py

Before creating a model deployment, place the trained model files and predictor.py in the same directory. The location of that directory is not restricted; it can be:

  • in the workspace of any computing power container
  • in a data warehouse populated by a simple file upload
  • or even in a publicly available HyperAI model
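For example, a bound directory might be laid out as follows (all file names are illustrative):

```
image-classifier/
├── predictor.py          # deployment script
├── model.onnx            # exported model file
├── requirements.txt      # optional: extra pip dependencies
└── .env                  # optional: environment variables
```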

The selection method during binding is very similar to the data binding used when creating a computing power container.

However, it differs from computing power containers in the following ways:

  1. Only one directory can be bound during model deployment, so the model files and predictor.py must belong to the same data space. The directory is bound read-only: there is no write permission, so do not store any data in it.
  2. When deploying a model, a specific subdirectory of a "data warehouse" or "model output" can be bound instead of only the root directory. As shown in the figure below, subdirectories such as onnx/image-classifier-modelnetv2 can be bound.

Content requirements for model deployment:

  1. A model deployment must include predictor.py and the files it depends on (for example, model files), which handle model requests.
  2. (Optional) requirements.txt, dependencies.sh, conda-packages.txt, and other dependency-management files, used to install dependencies beyond those pre-installed in the image. See Dependency Management for details.
  3. (Optional) .env, used to define environment variables used in predictor.py. Each line must take the form VARIABLE=value.
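To illustrate the VARIABLE=value format, here is a minimal sketch of how such a file could be parsed. This is only an illustration of the format; the platform presumably loads .env for you at deploy time, and the variable names below are made up.

```python
def parse_env(text):
    """Parse VARIABLE=value lines into a dict, skipping blanks and comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key] = value
    return env

# Hypothetical .env contents
sample = "MODEL_PATH=./model.onnx\nBATCH_SIZE=8\n"
print(parse_env(sample))
```

Inside predictor.py, such variables would then be read with `os.environ["MODEL_PATH"]` and so on.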

The requirements for predictor.py are described in detail in Serving Service Writing. Exporting model files is described in Model Export.

Version management

Model deployment has the concept of a "version". Versions are independent of each other and can use different runtime environments, resource types, and deployment content. When a new version is deployed, the old version automatically goes offline. As with "datasets", version numbers increase sequentially.

Detailed operations are described in Management of Model Deployment.