Save/Load Disk Data
From Jacket Wiki
When processing huge amounts of data, using permanent storage (i.e. disk storage) is unavoidable. The following is a quick test of saving and loading a large 400 MB matrix (10000 x 10000, single precision). The source code for the benchmark is:
    clear all;

    %% USER INPUT
    Rows = 10000;
    Cols = 10000;

    %% GENERATE REFERENCE DATA
    Acpu = randn(Rows,Cols,'single');

    %% SAVE DATA ON 1TB AND SSD STORAGE FROM CPU
    tstart = tic;
    save('/1tb/fileX.mat', 'Acpu');
    T_CPU_1tb = toc(tstart);

    tstart = tic;
    save('/ssd/fileX.mat', 'Acpu');
    T_CPU_ssd = toc(tstart);

    % Compute data (4 bytes per single-precision element)
    Size = 4*Rows*Cols;
    Thruput_CPU_1tb = Size/T_CPU_1tb;
    Thruput_CPU_ssd = Size/T_CPU_ssd;

    % Print data
    fprintf('\nSAVE SPEED TEST:\n');
    fprintf('Matrix size: %6.2f [MB]\n', Size/1E6);
    fprintf('Thruput CPU/1TB: %6.2f [MB/s]\n', Thruput_CPU_1tb/1E6);
    fprintf('Thruput CPU/SSD: %6.2f [MB/s]\n', Thruput_CPU_ssd/1E6);

    clear all;

    %% LOAD DATA ON 1TB AND SSD STORAGE FROM CPU
    tstart = tic;
    load('/1tb/fileX.mat', 'Acpu');
    T_CPU_1tb = toc(tstart);

    tstart = tic;
    load('/ssd/fileX.mat', 'Acpu');
    T_CPU_ssd = toc(tstart);

    % Compute data
    [Rows,Cols] = size(Acpu);
    Size = 4*Rows*Cols;
    Thruput_CPU_1tb = Size/T_CPU_1tb;
    Thruput_CPU_ssd = Size/T_CPU_ssd;

    % Print data
    fprintf('\nLOAD SPEED TEST:\n');
    fprintf('Matrix size: %6.2f [MB]\n', Size/1E6);
    fprintf('Thruput CPU/1TB: %6.2f [MB/s]\n', Thruput_CPU_1tb/1E6);
    fprintf('Thruput CPU/SSD: %6.2f [MB/s]\n', Thruput_CPU_ssd/1E6);
The results on a Colfax custom workstation (Colfax Reference System #4), based on an Intel Core i7 975 Extreme and an FX-3800 GPU, are as follows:
    >> cpu_storage_test

    SAVE SPEED TEST:
    Matrix size: 400.00 [MB]
    Thruput CPU/1TB: 32.23 [MB/s]
    Thruput CPU/SSD: 32.21 [MB/s]

    LOAD SPEED TEST:
    Matrix size: 400.00 [MB]
    Thruput CPU/1TB: 166.95 [MB/s]
    Thruput CPU/SSD: 165.27 [MB/s]
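To put the CPU throughput figures in terms of wall-clock time, the short sketch below converts them back to seconds. The numbers are copied from the CPU listing above; the variable names are chosen here for illustration only.

```matlab
% Figures copied from the CPU result listing. Columns: 1TB disk, SSD.
Size_MB   = 400.00;               % matrix size [MB]
save_MBps = [32.23, 32.21];       % save throughput [MB/s]
load_MBps = [166.95, 165.27];     % load throughput [MB/s]

% Elapsed time = size / throughput
T_save = Size_MB ./ save_MBps;    % seconds per save
T_load = Size_MB ./ load_MBps;    % seconds per load

fprintf('Save time 1TB/SSD: %5.2f / %5.2f [s]\n', T_save(1), T_save(2));
fprintf('Load time 1TB/SSD: %5.2f / %5.2f [s]\n', T_load(1), T_load(2));
```

Each save takes roughly 12 seconds and each load under 3 seconds, with almost no difference between the two drives.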
The GPU test uses essentially the same source code as shown above. The only important change is that the matrix is cast to the GPU. Running this code on Colfax Reference System #4 gives the following results:
    >> gpu_storage_test

    SAVE SPEED TEST:
    Matrix size: 400.00 [MB]
    Thruput GPU/1TB: 30.76 [MB/s]
    Thruput GPU/SSD: 31.24 [MB/s]

    LOAD SPEED TEST:
    Matrix size: 400.00 [MB]
    Thruput GPU/1TB: 157.56 [MB/s]
    Thruput GPU/SSD: 156.13 [MB/s]
As the results above show, Jacket is slightly slower than standard MATLAB. The reason is simply the additional transfer overhead. When saving data from Jacket to disk, the path is GPU memory -> CPU memory -> disk, and the reverse when loading data. This is one more step than in standard MATLAB.
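The size of that overhead can be estimated directly from the two result listings. The following is a minimal sketch that computes the relative slowdown of the GPU runs. The throughput values are copied from the listings, while the variable names are introduced here for illustration.

```matlab
% Throughput figures [MB/s] copied from the CPU and GPU result listings.
% Columns: 1TB disk, SSD.
cpu_save = [32.23, 32.21];
gpu_save = [30.76, 31.24];
cpu_load = [166.95, 165.27];
gpu_load = [157.56, 156.13];

% Relative slowdown of the GPU runs, in percent of the CPU throughput
slow_save = 100 * (cpu_save - gpu_save) ./ cpu_save;
slow_load = 100 * (cpu_load - gpu_load) ./ cpu_load;

fprintf('Save slowdown 1TB/SSD: %4.1f%% / %4.1f%%\n', slow_save(1), slow_save(2));
fprintf('Load slowdown 1TB/SSD: %4.1f%% / %4.1f%%\n', slow_load(1), slow_load(2));
```

For this single run, the extra GPU-to-CPU step costs roughly 3 to 6 percent of throughput. With only one measurement per case, the differences between the individual figures should not be over-interpreted.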
The Jacket team has provided a piece of code that transfers data directly between disk and GPU memory. That test has not been conducted yet but will follow, as will an analysis with different matrix sizes.
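One way the matrix-size analysis could be set up on the CPU side is sketched below. It reuses the timing pattern of the benchmark above in a loop. The list of sizes and the use of a temporary file (instead of the /1tb and /ssd paths) are assumptions made for illustration, not part of the original test.

```matlab
% Sketch: repeat the save/load timing for several matrix sizes.
Ns    = [1000, 2000, 4000];        % square matrix sizes (hypothetical choice)
fname = [tempname, '.mat'];        % temporary file, stands in for /1tb or /ssd
Thruput_save = zeros(size(Ns));    % [bytes/s]
Thruput_load = zeros(size(Ns));    % [bytes/s]

for k = 1:numel(Ns)
    Acpu = randn(Ns(k), Ns(k), 'single');
    Size = 4 * Ns(k)^2;            % bytes, 4 per single-precision element

    % Time the save
    tstart = tic;
    save(fname, 'Acpu');
    Thruput_save(k) = Size / toc(tstart);

    % Time the load of the same file
    clear Acpu;
    tstart = tic;
    load(fname, 'Acpu');
    Thruput_load(k) = Size / toc(tstart);

    fprintf('%8.2f [MB]: save %7.2f [MB/s], load %7.2f [MB/s]\n', ...
            Size/1E6, Thruput_save(k)/1E6, Thruput_load(k)/1E6);
end

delete(fname);                     % remove the temporary file
```

Because each load here follows directly after the save of the same file, the operating system's file cache may inflate the load figures. A more careful study would repeat each measurement and average the results.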