MATLAB® is a registered trademark of MathWorks Inc.
Torben's Corner
From Jacket Wiki
"We are pleased to present you with this special section of our website - Torben's Corner. Torben became a Jacket customer in 2009 and has been generating some really cool results. Torben examines ideal Jacket usage scenarios and the intent is to provide a forum of discussion for application acceleration on the GPU. As Torben makes progress in his work, you will be able to watch his progress in these pages. Enjoy!" --John Melonakos, CEO of AccelerEyes.
"I want to thank AccelerEyes, and John in particular for his encouragement and trust to give me this opportunity. I am a Professor in Radio Frequency Electronics and Systems at Aalborg University, Denmark, where I mainly work with simulation and modeling techniques for wireless communication transceivers. I have used MATLAB for many years and started using Jacket just before the summer of 2009. I hope that "Torbens Corner" will produce material, which is of interest to the Jacket community. From the beginning I plan on making material for two areas: 1) Tips and Tricks, and 2) Jacket Performance Index. The former is as the name says a plan to show different examples of what Jacket can be used for. The latter will present a rating for how suitable different functions are to be computed by the GPU. In all cases focus will be on practical use of Jacket where all source code is public available." --Torben Larsen, Professor, Aalborg University, Denmark.
News
03-APR-2011: Examples of how class inheritance can be used to speed up array indexing operations.
01-APR-2011: Example of how to enable CPU and GPU computing in one file, selected by the class (type) of the input.
06-FEB-2011: Code and examples for two small functions to time events using MATLAB and Jacket. They ensure proper use of warm-up, loop compensation, synchronization/evaluation, etc.
05-DEC-2010: Code and results for moving data between CPU and GPU memory (both directions).
02-DEC-2010: A recommended procedure for ensuring that MATLAB/Jacket delivers the best possible performance on the given platform.
24-SEP-2010: A detailed description of how CUDA is installed is presented here.
09-SEP-2010: MATLAB R2010b has built-in support for using GPGPUs. This page compares MATLAB's utilization of GPUs to Jacket's.
03-SEP-2010: A new benchmarking scheme for Jacket functions. It has become too much of a job to regularly maintain a large number of figures for more than a handful of hardware platforms. Therefore, I have introduced a new benchmarking suite, which tests 30 functions with matrix/vector inputs in single/double precision. The code is freely available (the benchmark takes approximately 1.5-2 hours to complete). Tables for Wiki pages and LaTeX are generated automatically.
01-SEP-2010: A small collection of commands to extract various system information into your code. Very useful when you want to document which operating system, GPU, Jacket version, etc. were used for the given results.
10-AUG-2010: This entry shows an arithmetic-heavy matrix multiplication method to estimate the floating-point performance as seen from Jacket. Results are shown for a number of different GPUs, including the Tesla C2050 and a GeForce GTX 470.
Toolboxes/Libraries
This part aims at collecting various small and large toolboxes or libraries, which can be used in different aspects of Jacket:
- Jacket: Allowing Jacket code to run on plain MATLAB installations is one of the most important issues when designing toolboxes or libraries. Obviously the code will not get any speedup without Jacket, but it should be able to fall back to standard MATLAB. This collection of functions and the accompanying guide explain what can be done.
- Jacket_Benchmark_100: A comprehensive benchmarking suite testing more than 30 functions with matrix/vector inputs in single/double precision. It automatically generates a LaTeX table file and a Wiki table format file of the results. The benchmark takes 1.5-2 hours to complete.
Tips and Tricks
This section presents various Tips and Tricks for Jacket. Any issue can be treated as long as it is relevant for using Jacket to perform scientific computations. Contributions and requests from users are more than welcome and can be directed to Torben by E-mail. The intention is that Torben's Corner will be dynamic, with new material added frequently. In all cases the main focus is to show real-life Jacket code and practical examples from Jacket users. Therefore, it is essential that all Jacket code is freely accessible. The material is organized in groups to make it easier to identify relevant information.
General Issues
- CPU-GPU Data Transfer. Measured results are shown for transfer times from CPU to GPU and vice versa.
- Warming up. Specific guidelines for how to warm up Jacket and CUDA for computations.
- Jacket Benchmarking. Benchmarking is a necessity for optimizing code, and guidelines are given for how it can be done. There are several ways to get it wrong, and the techniques are discussed in detail.
- Is Jacket Column or Row Major? Sometimes it is possible to organize the data such that we can decide if we do the toughest computations along rows or down columns. This is the topic of this page.
- Save/Load Disk Data: In virtually all applications involving huge amounts of data, disk storage is unavoidable. For these applications it matters how fast we can save and load data. This is the topic discussed here, where a computer system containing both a server-grade 1 TB disk and an Intel 160 GB solid-state disk has been tested.
- What Computer/GPU Platform to Choose? This is a frequent question. Although there is no simple answer, some experiences have been compiled here. (UNDER REVISION)
- Handling Scalars In Jacket: A benchmark of how best to handle scalars in Jacket computations is presented. Some surprises may show up.
- Performance of GPU Use with MATLAB R2010b: MATLAB R2010b has built-in support for using GPGPUs. This page compares MATLAB's utilization of GPUs to Jacket's.
- Installation of Ubuntu/CUDA/MATLAB/Jacket: A brief but direct description on how to set up Ubuntu Linux to work with Jacket.
- CUDA On Ubuntu 10.04: A detailed description of how CUDA is installed on an Ubuntu platform is presented here.
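The warm-up and benchmarking items above mention three ingredients of a fair timing: warm-up, loop compensation, and making sure the work has actually finished before the clock is read. The following is only a minimal sketch of the first two, written in plain Python rather than MATLAB/Jacket, since the actual Jacket timing functions are not reproduced on this page. The name `time_call` and its parameters are hypothetical.

```python
import time


def time_call(func, repeats=100, warmup=3):
    """Estimate the seconds per call of func() (hypothetical helper).

    Assumes func() returns only after its work is complete. On a GPU the
    caller would have to force evaluation/synchronization inside func.
    """
    # Warm-up: untimed calls so one-off costs (lazy initialisation,
    # cold caches) do not end up in the measurement.
    for _ in range(warmup):
        func()

    # Time the loop that contains the call of interest.
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    loop_with_call = time.perf_counter() - start

    # Loop compensation: time the same loop around a no-op call and
    # subtract it, so loop and call overhead are not attributed to func.
    def noop():
        return None

    start = time.perf_counter()
    for _ in range(repeats):
        noop()
    loop_only = time.perf_counter() - start

    # Clamp at zero: for very cheap functions timer noise can make the
    # difference slightly negative.
    return max(loop_with_call - loop_only, 0.0) / repeats
```

For example, `time_call(lambda: sorted(range(10000)))` returns an estimate in seconds per call. Raising `repeats` reduces the influence of timer resolution.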
Classical Signal Processing
- Cross Correlation: Computation of cross correlations is described in this Wiki. The function xcorr from the Signal Processing Toolbox is tested with Jacket, as well as a similar FFT-based function.
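To illustrate what an FFT-based cross correlation involves, here is a small sketch in plain Python. It is not the function tested on the linked page and does not attempt to reproduce the output conventions of xcorr. The names `fft` and `xcorr_fft` are hypothetical, and the convention r[k] = sum over n of x[n+k] * conj(y[n]) is an assumption chosen for the example.

```python
import cmath


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unscaled). len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def xcorr_fft(x, y):
    """Linear cross correlation r[k] = sum_n x[n + k] * conj(y[n]).

    Returns (lags, values) for lags -(len(y) - 1) .. len(x) - 1.
    """
    n_out = len(x) + len(y) - 1
    # Zero-pad to a power of two >= n_out so the circular correlation
    # computed via the FFT equals the linear one (no wrap-around).
    size = 1
    while size < n_out:
        size *= 2
    xp = [complex(v) for v in x] + [0j] * (size - len(x))
    yp = [complex(v) for v in y] + [0j] * (size - len(y))

    # Correlation theorem: multiply one spectrum by the conjugate of the other.
    prod = [p * q.conjugate() for p, q in zip(fft(xp), fft(yp))]
    # The inverse transform is unscaled, so divide by the length here.
    circ = [v / size for v in fft(prod, inverse=True)]

    # Negative lags sit at the end of the circular result.
    lags = list(range(-(len(y) - 1), len(x)))
    values = [circ[k % size] for k in lags]
    return lags, values
```

With `x = [1, 2, 3]` and `y = [0, 1, 0.5]`, the lags are -2 to 2 and the real parts of the values are approximately 0.5, 2, 3.5, 3 and 0, which matches the direct sum. The FFT route pays off for long sequences, where its cost grows as N log N rather than with the product of the sequence lengths.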
Programming Tips
- Adaptive Code: How to write code that adapts, for example, to the available amount of GPU memory. This is very useful when the code must run on different hardware platforms.
- Making CPU/GPU enabled functions. Discusses how functions can be coded to easily allow for both CPU and GPU utilization. This is very useful when writing toolboxes, for example, where you don't know whether the user has Jacket available.
- Making loops with gfor.
- Programming With Multiple Heterogeneous GPUs: A detailed example of how to use Jacket with a computational platform consisting of heterogeneous GPUs for solving the same computational problem is shown.
- Influence of The Performance Setting Of The CPU: Performance of the CPU and GPU when benchmarking 'Turbo Speed' processors.
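As an illustration of the "Adaptive Code" idea in the list above, the sketch below splits a job into chunks sized to a memory budget. It is written in plain Python and is not taken from the linked page. The function name `chunk_rows`, the `safety` factor, and the assumption that the caller already knows the amount of free memory are all hypothetical. How free GPU memory is queried depends on the platform and is deliberately left out.

```python
def chunk_rows(total_rows, row_bytes, free_bytes, safety=0.5):
    """Split total_rows into (start, stop) ranges that fit a memory budget.

    free_bytes is supplied by the caller (hypothetical input; querying it
    is platform specific). Only the fraction `safety` of it is used, to
    leave headroom for temporaries.
    """
    if row_bytes <= 0:
        raise ValueError("row_bytes must be positive")
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")

    budget = int(free_bytes * safety)
    # Always process at least one row per chunk, even if the budget is
    # smaller than a single row; the caller must handle that case.
    rows_per_chunk = max(budget // row_bytes, 1)

    chunks = []
    start = 0
    while start < total_rows:
        stop = min(start + rows_per_chunk, total_rows)
        chunks.append((start, stop))
        start = stop
    return chunks
```

The same code then runs unchanged on cards with different amounts of memory: a larger `free_bytes` simply yields fewer, larger chunks.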
Multiple (2-4) GPUs
- Warming up. Specific guidelines for how to warm up Jacket and CUDA for computations when multiple GPUs are available.
GPU Clusters (>4 GPUs)
- Warming up. Specific guidelines for how to warm up Jacket and CUDA for computations when multiple GPUs are available.
Jacket Benchmark Tables
I regret to inform you that the earlier Jacket Performance Index has been abandoned. The reason is simply that it took so much time that I couldn't finish all the tests across a handful of hardware platforms before a new version of Jacket appeared. Therefore, I have designed a new benchmarking library, which allows me to test a platform in approximately 1.5-2 hours. It currently tests about 30 functions for speed-up when comparing Jacket to plain MATLAB. The plan is to expand it over time.
The code and results for a number of platforms are available at Jacket Benchmarks.
Benchmarking types:
- Benchmarking of Functions.
- Memory transfer between CPU memory and GPU memory.
- Saving/Loading data to/from disk.
Submitting Material
You are more than welcome to submit material to Torben if you think it could be useful to the Jacket community. You are also very welcome to submit requests for issues to be discussed or analyzed. Subjects should be ones that can be expected to interest a number of Jacket users. You can contact Torben by E-mail. All contributors will be properly referenced and recognized for their contribution. For specific issues such as installation problems, either contact support at AccelerEyes directly by E-mail or use the Forums.



