
Introduction to Data Warehouse

Note

The "data warehouse" was previously called the "dataset", a name that is most naturally understood as annotated data in machine learning scenarios. In fact, besides annotated data, any other type of data can be stored here, including code, trained model files, and so on. To avoid this ambiguity, and to prepare for the new "Model deployment" feature, HyperAI has revised the original "dataset" concept and renamed "dataset" to "data warehouse".

There are currently two categories in the data warehouse:

  • Dataset: stores any data other than model-related content
  • Model: stores model files and the code used together with them

Creating a Data Warehouse

Each of the two data warehouse types has its own creation entry point.

Create dataset

Create Model

Caution

Because both are data warehouses, a "Model" and a "Dataset" cannot share the same name.
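The naming rule above can be sketched as a simple pre-check before creating a data warehouse. This is only an illustration: the function `name_is_available` and its arguments are hypothetical and not part of any HyperAI API, and the comparison is assumed to be an exact, case-sensitive match, which the platform may or may not use.

```python
def name_is_available(name, dataset_names, model_names):
    """Return True if `name` is unused by both datasets and models.

    Hypothetical helper: datasets and models share one namespace,
    so a name taken in either category is unavailable in both.
    Assumes exact, case-sensitive matching.
    """
    taken = set(dataset_names) | set(model_names)
    return name not in taken


# Example: "resnet" already exists as a model, so it cannot be
# reused as a dataset name.
existing_datasets = ["cifar"]
existing_models = ["resnet"]
can_create = name_is_available("resnet", existing_datasets, existing_models)
```

In this example `can_create` is False, because the name is already taken by a model.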

Switching between data warehouse types

On the "Settings" page, you can switch a data warehouse from one type to the other:

Copy between data warehouses

To make it easier to manage your datasets, in addition to [creating a data warehouse version from a working directory](/docs/gear/output/#Create a working directory as a data warehouse version), you can also create a new data warehouse version from a subdirectory of an existing data warehouse:

As shown in the figure above, click "Copy the current directory to the dataset" on a directory in a data warehouse version, select the target dataset, and then choose either "Add to existing dataset" or "Create a new dataset".

  • "Add to existing dataset" adds the current subdirectory of the data warehouse to the selected existing dataset.
  • "Create a new dataset" creates a new dataset version under the target dataset from the current directory.

While the copy or creation is in progress, the new dataset version is marked as "Copying data". Once the copy finishes, the dataset version is marked as "Processing completed" and is ready to use.

Adding a README.md File to a Data Warehouse

Each model repository version can include a file named README.md that describes that version of the model repository. This file is displayed on the model repository version page.
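As a minimal sketch, a README.md can be generated locally before the directory is uploaded as a repository version. The helper `write_readme`, the directory name, and the placement of the file at the top level of the directory are assumptions made for illustration; the source only states that the file must be named README.md.

```python
from pathlib import Path


def write_readme(version_dir: str, title: str, description: str) -> Path:
    """Write a README.md at the top level of a local directory.

    Hypothetical helper: the directory is assumed to be the one that
    will later be uploaded as a repository version.
    """
    root = Path(version_dir)
    root.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    readme = root / "README.md"
    readme.write_text(f"# {title}\n\n{description}\n", encoding="utf-8")
    return readme
```

For example, `write_readme("my-model-v1", "My model", "Weights and inference code.")` creates `my-model-v1/README.md` with a title line followed by the description.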

Public Data Warehouse

A newly created data warehouse is a "Private data warehouse" by default. On the data warehouse "Settings" page, you can make the entire data warehouse a "Public Data Warehouse". All registered users can then access it through its URL.

Note

The number of "Public Data Warehouses" each user can create is limited. You can view the limit under "Resource utilization status" - "quota restriction" - "Public datasets".