Reference Systems
From Jacket Wiki
The simulation results have been obtained on the eight reference systems described below. A quick overview:
- The first reference system is a fast workstation based on a 3.33 GHz Intel Core i7 processor (975 Extreme version) with triple-channel RAM, two CUDA 1.3 capable GPUs, and one CUDA 1.1 GPU (NVIDIA C1060 Tesla / NVIDIA Quadro FX3800 / NVIDIA 9800GT). It runs 64-bit Windows 7 Enterprise.
- The second is an Apple MacBook Pro laptop with a 2.8 GHz Intel Core 2 Duo and an NVIDIA GeForce 9600M GT GPU. It runs Ubuntu Linux x64.
- The third is also an Apple MacBook Pro laptop with a 2.8 GHz Intel Core 2 Duo and an NVIDIA GeForce 9600M GT GPU. It runs Mac OS X Snow Leopard with a 32-bit MATLAB.
- The fourth is a workstation based on a 3.33 GHz Intel Core i7 processor (975 Extreme version) with triple-channel RAM and three CUDA 1.3 capable GPUs (based on Quadro FX-3800). It runs 64-bit Fedora 11 Linux.
- The fifth is a workstation based on a 3.2 GHz Intel Core i7 processor (960 version) with triple-channel RAM and three CUDA 1.3 capable GPUs (based on GeForce GTX260-216). It runs 64-bit Fedora Linux.
- The sixth is a Colfax GPU Cluster with one head node (2 Intel Xeon X5570 / 72 GB RAM / 12 TB storage) and eight compute nodes (2 Intel Xeon X5570 / 48 GB RAM / 2 NVIDIA C1060 Tesla). The operating system is Red Hat Enterprise Linux, and Platform HPC is used for job control and management.
- The seventh is a powerful gaming laptop, an Asus G51J with an Intel Core i7-720QM CPU and an NVIDIA GeForce GTX260M GPU with 1 GB of memory.
- The eighth is an Apple MacBook Pro laptop with a 2.66 GHz Intel Core i7 and an NVIDIA GeForce GT330M GPU. It runs Mac OS X Snow Leopard 10.6.3 with a 32-bit MATLAB.
The exact configurations are described below.
#1: Colfax CXT-2000 (Core i7-975 / Tesla | FX-3800 | 9800GT / Windows 7)
Colfax Custom (i7-975 / Tesla | FX-3800 | 9800GT / Windows 7): The examples have mainly been run on the reference system described below, which is a fairly powerful computer. The statements in the Wiki pages are intended to be fairly generic and should hold on most types of systems. The benchmark is therefore also a coarse-grained score, which should give a good estimate across different computer systems.
The reference system used in the various tests consists of the following:
- Colfax Custom Workstation:
- Antec P183 chassis - See more Antec info here.
- Coolermaster Real Power Pro 1250 Watt Power Supply - See more Coolermaster info here.
- Asus P6T7 WS Supercomputer motherboard - See more Asus info here.
- Intel Core i7 975 (3.33 GHz) CPU - See Intel info here.
- 12 GB DDR3 RAM (6 x 2048 MB, 1333 MHz, unbuffered DDR3).
- 2 x Western Digital VelociRaptor WD3000HLFS in RAID 1 for operating system and applications - See Western Digital info here.
- 3 x Western Digital VelociRaptor WD3000HLFS in RAID 5 for data - See Western Digital info here.
- PNY NVIDIA Quadro FX-3800 (1 GB, 192 CUDA cores, compute 1.3) - See NVIDIA info here and PNY info here.
- NVIDIA C1060 Tesla (4 GB, 240 CUDA cores, compute 1.3) - See NVIDIA info here.
- ASUS EN9800GT (1GB GDDR3, 112 CUDA cores, compute 1.1) - see NVIDIA info here and ASUS info here.
- Software:
- Microsoft Windows 7 Enterprise x64.
- MATLAB 7.9.0.529 (R2009b).
- Jacket 1.2.2 Rel. 3170.
- CUDA Driver: 190.38.
- CUDA SDK: 2.3.
Relative scores (the speed-up of the GPU compared to the CPU) are often used to assess Jacket performance. However, they can be misleading, in particular when comparing across different computer systems: the easiest way to obtain a high speed-up is simply to use a very slow CPU as the reference. On the other hand, for one specific system the speed-up gives a correct picture of the improvement that can be achieved. Both relative (speed-up) and absolute (time) values are therefore included for performance. The absolute run times for the various test examples make it possible to compare with your own computer system.
The reference system described above is based on a state-of-the-art Intel Core i7 975 (3.33 GHz) with triple-channel memory access. With this CPU, it can hardly be said that a high speed-up is easily achieved because a slow CPU is used as the reference. This should be kept in mind for the examples presented.
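The point about relative versus absolute scores can be illustrated with a small sketch. The timings below are hypothetical values chosen for illustration, not measurements from any of the reference systems, and the function name `speedup` is an assumption made for this example.

```python
def speedup(cpu_seconds: float, gpu_seconds: float) -> float:
    """Relative score: how many times faster the GPU run is than the CPU run."""
    if cpu_seconds <= 0 or gpu_seconds <= 0:
        raise ValueError("run times must be positive")
    return cpu_seconds / gpu_seconds


# Hypothetical timings (NOT measured results): one GPU run, two different CPUs.
gpu_seconds = 0.5
fast_cpu_seconds = 2.0
slow_cpu_seconds = 8.0

fast_cpu_speedup = speedup(fast_cpu_seconds, gpu_seconds)
slow_cpu_speedup = speedup(slow_cpu_seconds, gpu_seconds)

# The absolute GPU time is identical in both lines; only the reference CPU differs.
print(f"GPU time {gpu_seconds:.2f} s, speed-up vs fast CPU: {fast_cpu_speedup:.1f}x")
print(f"GPU time {gpu_seconds:.2f} s, speed-up vs slow CPU: {slow_cpu_speedup:.1f}x")
```

The same absolute GPU run time yields a four times larger speed-up when the slower CPU is used as the reference, which is why both values are reported together.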
#2: Apple MacBook Pro (Core 2 Duo 2.8GHz / 9600M GT / Ubuntu 9.04)
Apple MacBook Pro (Core 2 Duo 2.8GHz / 9600M GT / Ubuntu 9.04): Tests have also been run on an Apple MacBook Pro laptop with an NVIDIA GeForce 9600M GT Graphics Processing Unit and an Ubuntu Linux operating system, to obtain results for a completely different computer.
- Apple MacBook Pro:
- Intel Core 2 Duo (2.8 GHz) CPU.
- 4 GB DDR RAM.
- Apple 128 GB SSD.
- NVIDIA GeForce 9600M GT (512 MB, 32 CUDA cores, compute 1.1).
- Software:
- Ubuntu Linux 9.04 x64.
- MATLAB 7.9.0.529 (R2009b).
- Jacket 1.2.1 Rel. 2833.
- CUDA Driver: ???.??.
- CUDA SDK: 2.3.
This computer is no longer available, and results from this machine will gradually be replaced with newer ones.
#3: Apple MacBook Pro (Core 2 Duo 2.8GHz / 9600M GT / OSX Snow Leopard)
Apple MacBook Pro (Core 2 Duo 2.8GHz / 9400M | 9600M GT / Mac OSX): This system is an Apple MacBook Pro laptop with two NVIDIA GeForce (9400M and 9600M GT) Graphics Processing Units, and is running the Apple Mac OSX Snow Leopard operating system:
- Apple MacBook Pro:
- Intel Core 2 Duo (2.8 GHz) CPU.
- 8 GB DDR RAM.
- Apple 128 GB SSD.
- NVIDIA GeForce 9400M (256 MB, 16 CUDA cores, compute 1.1).
- NVIDIA GeForce 9600M GT (512 MB, 32 CUDA cores, compute 1.1).
- Software:
- Mac OS X Snow Leopard (Ver.: 10.6.2, Build: 10C540).
- MATLAB 7.9.0.529 (R2009b) 32 bit.
- Jacket 1.2.1 Rel. 2833.
- CUDA Driver: ???.??.
- CUDA SDK: 2.3.
This computer will soon be replaced with the new Apple MacBook Pro with Core i7 processor.
#4: Colfax CXT-2000 (Core i7-975 / FX-580 | 3 x FX-3800 / fedora 11)
Colfax Custom (i7-975 / FX-580 | 3 x FX-3800 / fedora 11): This reference system is based on three Quadro FX-3800 GPUs. It is intended as a mid-performance multi-GPU system: each of the 3 GPUs has 192 cores and supports double-precision variables. The reference system used in the various tests consists of the following:
- Colfax Custom Workstation:
- Antec P183 chassis - See more Antec info here.
- Coolermaster Real Power Pro 1250 Watt Power Supply - See more Coolermaster info here.
- Asus P6T7 WS Supercomputer motherboard - See more Asus info here.
- Intel Core i7 975 (3.33 GHz) CPU - See Intel info here.
- 12 GB DDR3 RAM (6 x 2048 MB, 1333 MHz, unbuffered DDR3).
- 1 x Western Digital RE3 320 GB for operating system and applications - See Western Digital info here.
- 1 x Western Digital RE3 1 TB for data - See Western Digital info here.
- 1 x Intel X25M-G2 160 GB Solid State Disk.
- 1 x PNY NVIDIA Quadro FX-580 (512 MB, 32 CUDA cores, compute 1.1) - see NVIDIA info here and PNY info here. This GPU is used for display control only.
- 3 x PNY NVIDIA Quadro FX-3800 (1 GB, 192 CUDA cores, compute 1.3) - See NVIDIA info here and PNY info here. All are used for computations only.
- Software:
- Fedora 10 Linux 64 bit.
- MATLAB 7.10.0.499 (R2010a).
- Jacket 1.2.2 Rel. 3170.
- CUDA Driver: 190.38.
- CUDA SDK: 2.3.
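As a rough illustration of which GPUs in this system can be used for double-precision work, the sketch below filters the GPUs listed above by compute capability. It relies on the fact that CUDA double-precision support starts at compute capability 1.3. The `Gpu` class and the function name are hypothetical helpers written for this example only; they are not part of Jacket or CUDA.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Gpu:
    name: str
    count: int
    cuda_cores: int
    compute_capability: Tuple[int, int]  # (major, minor)


# GPUs of reference system #4, taken from the hardware list above.
SYSTEM_4_GPUS = [
    Gpu("Quadro FX-580", 1, 32, (1, 1)),   # display control only
    Gpu("Quadro FX-3800", 3, 192, (1, 3)),  # computations only
]

# Double precision is available from compute capability 1.3 onwards.
DOUBLE_PRECISION_MIN = (1, 3)


def double_precision_gpus(gpus: List[Gpu]) -> List[Gpu]:
    """Return only the GPU models that support double-precision variables."""
    return [g for g in gpus if g.compute_capability >= DOUBLE_PRECISION_MIN]


capable = double_precision_gpus(SYSTEM_4_GPUS)
total_cores = sum(g.count * g.cuda_cores for g in capable)

for g in capable:
    print(f"{g.count} x {g.name}: {g.cuda_cores} cores each")
print(f"Total double-precision capable CUDA cores: {total_cores}")
```

Only the three FX-3800 cards pass the filter, which matches their role as the compute GPUs, while the FX-580 is left for display control.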
#5: Colfax CXT-2000 (Core i7-950 / 9800GT | 3 x GTX260 / fedora 11)
Colfax Custom (i7-950 / 9800GT | 3 x GTX260 / fedora 11): This reference system is based on three Palit GTX260 GPUs (216 cores each). It is intended as a relatively low-cost system that can be used for multi-GPU testing. The reference system used in the various tests consists of the following:
- Colfax Custom Workstation:
- Antec P183 chassis - See more Antec info here.
- Coolermaster Real Power Pro 1250 Watt Power Supply - See more Coolermaster info here.
- Asus P6T7 WS Supercomputer motherboard - See more Asus info here.
- Intel ???.
- 12 GB DDR3 RAM (6 x 2048 MB, 1333 MHz, unbuffered DDR3).
- 1 x Western Digital RE3 320 GB for operating system and applications - See Western Digital info here.
- 1 x Western Digital RE3 1 TB for data - See Western Digital info here.
- 3 x Palit GeForce GTX260 Sonic 216 SP (896 MB RAM, 216 CUDA cores, compute 1.3) - See NVIDIA info here and Palit info here.
- Software:
- Fedora 11 Linux 64 bit.
- MATLAB 7.9.0.529 (R2009b).
- Jacket 1.2.2 Rel. 3170.
- CUDA Driver: 190.38.
- CUDA SDK: 2.3.
#6: Colfax HPC GPU Cluster (16 x X5570 / 16 x Tesla / RHEL)
Colfax Cluster (16 x X5570 / 16 x Tesla / RHEL): This system is a Colfax Cluster with 16 NVIDIA C1060 Tesla Graphics Processing Units and a 64-bit Red Hat Enterprise Linux operating system:
- Colfax GPU Cluster:
- 1 Head Node with 2 Intel Xeon X5570 CPUs with 72 GB RAM and 12 TB storage.
- 8 Compute Nodes each with 2 Intel Xeon X5570 CPUs with 48 GB RAM and 2 NVIDIA C1060 Tesla (later C2070 Fermi).
- Software:
- Red Hat Enterprise Linux.
- Platform HPC.
- MATLAB 7.9.0.529 (R2009b) 32 bit.
- Jacket 1.2.1 Rel. 2833.
- CUDA Driver: ???.??.
- CUDA SDK: 2.3.
The system is currently (April 30, 2010) being tested and will be delivered for production runs within 2-3 weeks.
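The aggregate figures in the heading of this section follow from the per-node configuration. The short sketch below works through that arithmetic, assuming the original C1060 configuration (not the later C2070 one) and the 240 CUDA cores per C1060 listed for reference system #1. The constant and function names are chosen for this example only.

```python
# Per-node figures taken from the cluster description above.
HEAD_NODES = 1
COMPUTE_NODES = 8
CPUS_PER_NODE = 2            # Intel Xeon X5570, head and compute nodes alike
HEAD_RAM_GB = 72
COMPUTE_RAM_GB = 48
GPUS_PER_COMPUTE_NODE = 2    # NVIDIA C1060 Tesla
CUDA_CORES_PER_C1060 = 240   # as listed for the C1060 in reference system #1


def cluster_totals() -> dict:
    """Aggregate CPU, GPU, core, and RAM counts for the whole cluster."""
    gpus = COMPUTE_NODES * GPUS_PER_COMPUTE_NODE
    return {
        "compute_cpus": COMPUTE_NODES * CPUS_PER_NODE,
        "all_cpus": (HEAD_NODES + COMPUTE_NODES) * CPUS_PER_NODE,
        "gpus": gpus,
        "cuda_cores": gpus * CUDA_CORES_PER_C1060,
        "ram_gb": HEAD_NODES * HEAD_RAM_GB + COMPUTE_NODES * COMPUTE_RAM_GB,
    }


totals = cluster_totals()
for key, value in totals.items():
    print(f"{key}: {value}")
```

Under these assumptions, the "16 x X5570" in the heading corresponds to the CPUs in the compute nodes only; counting the head node as well gives 18 CPUs.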
#7: Asus G51J Laptop (Core i7-720QM / GTX260M)
#8: Apple MacBook Pro (Core i7 2.66GHz / GT330M / OSX Snow Leopard)
Apple MacBook Pro (Core i7 2.66GHz / GT330M / Mac OSX): This system is an Apple MacBook Pro laptop with one GeForce GT330M Graphics Processing Unit, and it is running the Apple Mac OSX Snow Leopard 10.6.3 operating system:
- Apple MacBook Pro:
- Intel Core i7 (2.66 GHz) CPU.
- 8 GB DDR RAM.
- Apple 128 GB SSD.
- NVIDIA GeForce GT330M (512 MB, 48 CUDA cores, compute 1.2).
- Software:
- Mac OS X Snow Leopard (Ver.: 10.6.3).
- MATLAB 7.10.0.499 (R2010a) 32 bit.
- Jacket 1.4 RC (Build 4496).
- CUDA Driver: 19.5.2f20.