Introduction to the HyperAI Automatic Modeling Data Format Specification
Introduction
The HyperAI data format is a set of dataset formatting standards defined by HyperAI for use with HyperAI automatic modeling and related products. Once a dataset is formatted according to this specification, automatic modeling can use it to build deep learning models automatically.
In the HyperAI data format, meta.csv is the main format file of the dataset. The body of the file is in CSV format:
- The first line gives the field type and field name of each column, in the format [type]_[name].
- The second line and every subsequent line each hold one data sample.
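As a minimal sketch of how this layout can be read, the following Python splits each header cell into its type and name at the first underscore and treats the remaining lines as samples. The function name `parse_header` and the sample columns are illustrative assumptions, not part of the specification.

```python
import csv
import io


def parse_header(header_row):
    """Split each '[type]_[name]' header cell into a (type, name) pair."""
    fields = []
    for cell in header_row:
        # Split only at the first underscore: the type comes before it.
        field_type, sep, field_name = cell.partition("_")
        if not sep or not field_type or not field_name:
            raise ValueError(f"header cell {cell!r} is not in [type]_[name] form")
        fields.append((field_type, field_name))
    return fields


# Hypothetical meta.csv content, used only for illustration.
sample = "image_Source,category_Label\nimages/001.jpg,cat\n"
rows = list(csv.reader(io.StringIO(sample)))
fields = parse_header(rows[0])  # first line: field types and names
samples = rows[1:]              # remaining lines: one data sample each
```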
Field name
- Field names consist of uppercase and lowercase English letters.
- Fields starting with "*" are ignored by the automatic modeling training process.
- Label is a reserved field name that refers specifically to the labels in the training data. Only one field may be named Label.
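The naming rules above could be checked with a sketch like the following. It assumes the "*" marker sits at the start of the name part of the header cell, which the specification does not state precisely, and `check_field_names` is a hypothetical helper rather than a HyperAI API.

```python
import re

# Field names may contain only uppercase and lowercase English letters.
NAME_PATTERN = re.compile(r"[A-Za-z]+")


def check_field_names(names):
    """Return the names kept for training, raising on rule violations."""
    # Assumption: a leading "*" marks a field that training ignores.
    kept = [name for name in names if not name.startswith("*")]
    for name in kept:
        if not NAME_PATTERN.fullmatch(name):
            raise ValueError(f"invalid field name: {name!r}")
    if kept.count("Label") > 1:
        raise ValueError("only one field may be named Label")
    return kept
```

For example, `check_field_names(["Source", "*Note", "Label"])` keeps `Source` and `Label` and drops the ignored `*Note` field.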
Field type
The field type indicates the data type of the column. Simple fields are int, float, category, and txt; the value of a simple field is the value stored in the corresponding row and column of meta.csv. Complex fields are text, image, video, and json. Their values cannot be represented inside meta.csv, so the value of a complex field is a relative path that points to the corresponding file in the dataset.
- int - Integer value
- float - Floating point number
- category - Categorical value
- txt - Short text value
- text - Text file. The value is the entire content of the file
- image - Image file. Supported formats: jpg, png, tif
- video - Video file. Supported format: mp4
- json - Complex annotation data, defined separately for each type of problem
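To illustrate the difference between simple and complex fields, the sketch below converts one raw cell of meta.csv according to its field type. It assumes relative paths are resolved against the dataset root directory, and `load_value` is a hypothetical helper introduced only for this example.

```python
from pathlib import Path

# Simple fields: the cell itself is the value.
SIMPLE_TYPES = {"int": int, "float": float, "category": str, "txt": str}
# Complex fields: the cell is a relative path to a file in the dataset.
COMPLEX_TYPES = {"text", "image", "video", "json"}


def load_value(field_type, raw, dataset_root):
    """Convert a raw meta.csv cell according to its field type."""
    if field_type in SIMPLE_TYPES:
        return SIMPLE_TYPES[field_type](raw)
    if field_type in COMPLEX_TYPES:
        # Assumption: paths are relative to the dataset root directory.
        return Path(dataset_root) / raw
    raise ValueError(f"unknown field type: {field_type!r}")
```

A caller would then open the returned path with a reader suited to the type, for example an image decoder for image fields.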
Data formats for various types of problems
Object detection
Because the Label of an object detection sample contains multiple pieces of information, a separate JSON file is used as the annotation. 001.jpg is the original image, and 001.json contains the annotations and corresponding types of the objects in that image.
json_Label,image_Source
labels/001.json,images/001.jpg
For a detailed description, refer to the object detection documentation.
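A meta.csv in this shape could be generated with a sketch like the one below. It follows the labels/ and images/ layout of the example above; the function name `build_detection_meta` and the second sample "002" are assumptions made for illustration.

```python
import csv
import io


def build_detection_meta(stems):
    """Return meta.csv text pairing labels/<stem>.json with images/<stem>.jpg."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header line: [type]_[name] for each column.
    writer.writerow(["json_Label", "image_Source"])
    for stem in stems:
        writer.writerow([f"labels/{stem}.json", f"images/{stem}.jpg"])
    return buffer.getvalue()


meta_text = build_detection_meta(["001", "002"])
```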
Semantic segmentation
001_mask.jpg and 001.jpg are two images of the same size. Each pixel in 001_mask.jpg is the annotation for the pixel at the corresponding position in 001.jpg.
image_Label,image_Source
images/001_mask.jpg,images/001.jpg
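The pixel-to-pixel correspondence can be sketched as follows. To stay self-contained, the example represents decoded images as nested lists; real code would decode the files with an image library first. The helper names and the tiny grids are hypothetical.

```python
def shape(pixels):
    """Return (height, width) of a row-major pixel grid."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    return height, width


def annotation_for(mask, image, row, col):
    """Return the mask value that annotates image[row][col]."""
    if shape(mask) != shape(image):
        raise ValueError("mask and source image must have the same size")
    return mask[row][col]


# Tiny 2x3 grids standing in for decoded 001.jpg and 001_mask.jpg.
image = [[10, 20, 30], [40, 50, 60]]
mask = [[0, 0, 1], [0, 1, 1]]
```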
Instance segmentation
For a detailed description, refer to the instance segmentation documentation.
FAQ
- In annotation file names and file contents, it is best to use only English letters, digits, and characters such as underscores. Avoid Chinese characters to prevent unexpected encoding issues.
- All coordinates in the annotation specification are relative position coordinates. As shown in the following figure, a point at pixel position (X, Y) has the coordinate (X/800, Y/600).
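Both FAQ items can be sketched in a few lines of Python. The example assumes that 800 and 600 are the width and height of the image in pixels, and it checks names more strictly than the recommendation by allowing only ASCII letters, digits, and underscores. The helper names are hypothetical.

```python
import re

# Strict reading of the recommendation: ASCII letters, digits, underscore.
SAFE_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_safe_name(stem):
    """Check a file name (without extension) against the recommendation."""
    return SAFE_NAME.fullmatch(stem) is not None


def to_relative(x, y, width, height):
    """Convert a pixel position to relative coordinates."""
    return x / width, y / height


# A point at pixel (400, 300) in an 800 x 600 image.
center = to_relative(400, 300, 800, 600)
```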