Disk performance
Main factors affecting performance
- The total number of disks in the storage nodes. The more disks there are, the better the performance.
- Memory of the storage nodes. The more memory there is, the more content can be cached.
- Network bandwidth. A gigabit network will become a bottleneck; use at least a 10-gigabit network.
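The network point can be checked with simple arithmetic. The sketch below converts a link's line rate into its theoretical maximum payload rate in MB/s, ignoring protocol overhead. The function name gbps_to_mbytes is a hypothetical helper, not part of any tool.

```shell
# Convert a line rate in Gbps to a theoretical maximum in MB/s.
# 1 Gbps = 1000 Mbit/s, and 8 bits = 1 byte. Protocol overhead is ignored.
gbps_to_mbytes() {
  echo $(( $1 * 1000 / 8 ))
}

gbps_to_mbytes 1     # gigabit Ethernet: prints 125
gbps_to_mbytes 10    # 10-gigabit Ethernet: prints 1250
gbps_to_mbytes 100   # 100 Gbps Ethernet: prints 12500
```

A gigabit link therefore tops out at about 125 MB/s for all disks behind it combined, which is why it limits throughput long before the storage nodes do.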
Cluster configuration
Because storage systems vary greatly from cluster to cluster, the performance tests here are all based on one cluster that is currently in use. Its configuration is as follows:
- 4 storage nodes
- Two of the storage nodes have 22 HDD drives
- Two of the storage nodes have 24 NVMe SSD drives
- The internal network is 100 Gbps Ethernet
Concept Explanation
IO depth (iodepth): The maximum number of "in-flight" requests for a single test process, that is, requests that have been submitted but not yet completed. Once the number of in-flight requests reaches the IO depth, the test process does not submit further requests until some of them complete.
Number of processes (procs): The number of test processes running simultaneously.
Read-write mode (mode): read means reading, write means writing, and rw means mixed reading and writing. The rand prefix means random reads and writes; without it, access is sequential.
Block size (block size, bs): The size of the data block requested in a single IO. The 4k case typically occurs when reading and writing configuration files or many small, fragmented files. The 4M case typically occurs when reading and writing large archives, video files, and similar data.
IOPS: IO operations per second, that is, the number of requests processed per second.
Throughput: The amount of data read or written per second.
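The definitions above are linked by two simple relations: throughput equals IOPS multiplied by block size, and the total number of requests that can be in flight equals IO depth multiplied by the number of processes. The sketch below works through both with illustrative numbers. The function names and the sample values are assumptions for illustration, not measurements from this cluster.

```shell
# Throughput in MiB/s from IOPS and block size in KiB (integer arithmetic).
throughput_mib() {
  echo $(( $1 * $2 / 1024 ))
}

# Upper bound on in-flight requests across all test processes.
max_inflight() {
  echo $(( $1 * $2 ))   # iodepth * procs
}

throughput_mib 10000 4    # 10000 IOPS at 4k blocks: prints 39
throughput_mib 100 4096   # 100 IOPS at 4M blocks: prints 400
max_inflight 16 4         # iodepth=16 with 4 processes: prints 64
```

This is why small-block tests are usually judged by IOPS and large-block tests by throughput: a high IOPS figure at 4k can still correspond to a modest data rate.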
Temporary storage space performance test command
fio --name=disktest \
--ioengine=libaio \
--iodepth=$IODEPTH \
--numjobs=$PROCS \
--rw=$MODE \
--bs=$BLOCK_SIZE \
--direct=1 --buffered=0 \
--size=2G \
--runtime=30 \
--time_based \
--group_reporting \
--output-format=json
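The command above expects IODEPTH, PROCS, MODE, and BLOCK_SIZE to be set in the shell before it runs. One possible way to cover several combinations is sketched below. It only prints the fio command lines so they can be reviewed first. The function names fio_cmd and sweep, the values iodepth=16 and 4 processes, and the result file names are assumptions for illustration; the source does not state which combinations were tested.

```shell
# Print one fio command line for a given combination.
# $1 = iodepth, $2 = procs, $3 = mode, $4 = block size
fio_cmd() {
  echo "fio --name=disktest --ioengine=libaio" \
       "--iodepth=$1 --numjobs=$2 --rw=$3 --bs=$4" \
       "--direct=1 --buffered=0 --size=2G --runtime=30 --time_based" \
       "--group_reporting --output-format=json" \
       "--output=result_${3}_${4}.json"
}

# Print commands for every mode and block size (illustrative values only).
sweep() {
  local mode bs
  for mode in read write randread randwrite; do
    for bs in 4k 4M; do
      fio_cmd 16 4 "$mode" "$bs"
    done
  done
}

sweep
```

To actually run the tests, the printed lines can be piped into a shell, for example with sweep | sh, from the directory whose storage is being measured.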
Dataset performance testing command
# /input0/train.zip is a single 14G file
fio --name=disktest \
--ioengine=libaio \
--iodepth=$IODEPTH \
--numjobs=$PROCS \
--rw=$MODE \
--bs=$BLOCK_SIZE \
--direct=1 --buffered=0 \
--filename=/input0/train.zip \
--runtime=30 \
--time_based \
--group_reporting \
--output-format=json
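Because both commands emit JSON, the IOPS and throughput figures can be pulled out with a JSON tool. The sketch below assumes that jq is installed and that the fio output was saved to a file. The function name summarize is hypothetical. The fields used (jobs, read, write, iops, bw) are those found in fio's JSON output, where bw is reported in KiB/s; with --group_reporting the first entry in jobs holds the aggregated result.

```shell
# Print aggregated IOPS and bandwidth from a saved fio JSON result.
# $1 = path to the JSON file (hypothetical name, e.g. result.json)
summarize() {
  jq -r '.jobs[0]
         | "read: \(.read.iops | floor) IOPS, \(.read.bw) KiB/s; "
         + "write: \(.write.iops | floor) IOPS, \(.write.bw) KiB/s"' "$1"
}
```

For example, redirect the fio output to result.json and then call summarize result.json. For a pure read test the write figures are expected to be zero, and vice versa.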