Introduction to Data Warehouse
The "data warehouse" was previously called the "dataset", a name that is most readily understood as annotated data in machine learning scenarios. In practice, any other type of data can also be stored there, including code, trained model files, and so on. To avoid this ambiguity, and to prepare for the new "Model deployment" feature, HyperAI has adjusted the original "dataset" concept and renamed "dataset" to "data warehouse".
There are currently two categories in the data warehouse:
- Dataset: all data other than model-related content can be placed here
- Model: stores model files, code used together with the model files, and so on
Creating a Data Warehouse
The two types of data warehouse are created through two separate entry points.
Create dataset
Create Model
Because both types are data warehouses, an item under "Model" and an item under "Dataset" cannot share the same name.
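The naming rule can be pictured as one namespace shared by both categories. The following is a minimal sketch of that idea, not HyperAI's implementation; `WarehouseRegistry` and its `create` method are hypothetical names used only for illustration.

```python
class WarehouseRegistry:
    """Toy model of a single namespace shared by "dataset" and "model" warehouses."""

    CATEGORIES = ("dataset", "model")

    def __init__(self):
        # Maps warehouse name -> category. One dict for both categories,
        # so a name can be taken only once regardless of category.
        self._category_by_name = {}

    def create(self, name, category):
        """Register a new warehouse, rejecting names already in use."""
        if category not in self.CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")
        if name in self._category_by_name:
            existing = self._category_by_name[name]
            raise ValueError(f"name {name!r} is already used by a {existing}")
        self._category_by_name[name] = category

    def category_of(self, name):
        """Return the category recorded for an existing warehouse name."""
        return self._category_by_name[name]
```

With this sketch, creating a dataset named `mnist` and then a model named `mnist` raises `ValueError`, while a model with a different name is accepted.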
Switching between data warehouse types
On the "Settings" page, you can switch a data warehouse from one type to the other:
Copy between data warehouses
To make user datasets easier to manage, in addition to [Create a working directory as a data warehouse version](/docs/gear/output/#Create a working directory as a data warehouse version), you can also create a data warehouse version from a subdirectory of an existing data warehouse:
As shown in the figure above, open a directory in a data warehouse version and click "Copy the current directory to the dataset". You can then select the target dataset and choose either "Add to existing dataset" or "Create a new dataset".
"Add to existing dataset" adds the current directory of the data warehouse to the selected existing dataset. "Create a new dataset" creates a new dataset version for the current data directory under the target dataset.
While the copy or creation is in progress, the new dataset version is marked with the "Copying data" state. Once the copy is complete, the dataset version is marked "Processing completed" and is ready to use.
Adding a README.md file to a data warehouse
Each model repository version can include a file named README.md that describes that version of the model repository. This file is displayed on the model repository version page.
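As an illustration, a README.md can be prepared in a local directory before that directory is uploaded as a version. This is a minimal sketch under assumptions: the helper name `write_readme`, the directory layout, and the README contents are all hypothetical, and the upload step itself is not shown.

```python
from pathlib import Path


def write_readme(version_dir, title, description):
    """Create README.md inside version_dir and return the path to the file."""
    directory = Path(version_dir)
    # Create the directory if it does not exist yet.
    directory.mkdir(parents=True, exist_ok=True)
    readme = directory / "README.md"
    # A heading followed by a short description of this version.
    readme.write_text(f"# {title}\n\n{description}\n", encoding="utf-8")
    return readme
```

For example, `write_readme("my-model-v1", "My model", "Weights and inference code.")` writes `my-model-v1/README.md`; the directory name and text here are placeholders.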
Public Data Warehouse
By default, a newly created data warehouse is a "Private data warehouse". On the data warehouse's "Settings" page, you can set the entire data warehouse as a "Public Data Warehouse". All registered users can then access it through its URL.
The number of "Public Data Warehouses" each user can create is limited. The limit can be viewed under "Resource utilization status" - "quota restriction" - "Public datasets".