Genesis GE-i940 Tesla
On September 28, 2009, a Genesis GE-i940 Tesla workstation, based on both GPGPU* and NVIDIA CUDA** technologies, was installed at DSA/LabMNCP.
It is a testbed for developing advanced simulations in the following research fields:
- Stochastic simulation;
- Molecular Dynamics;
- Atmospheric and climate modeling;
- Weather forecast investigation;
- Grid/Cloud Hybrid Virtualization.
*
“GPGPU stands for General-Purpose computation on Graphics Processing Units, also known as GPU Computing. Graphics Processing Units (GPUs) are high-performance many-core processors capable of very high computation and data throughput.”
**
“NVIDIA® CUDA™ is a general purpose parallel computing architecture that leverages the parallel compute engine in NVIDIA graphics processing units (GPUs) to solve many complex computational problems in a fraction of the time required on a CPU.”
Hardware | |
---|---|
Mainboard | Asus X58/ICH10R, 3 PCI-Express x16, 6 SATA, 2 SAS, 3+6 USB |
CPU | Intel Core i7-940, 2.93 GHz, 133 MHz FSB, quad core, 8 MB cache |
RAM | 6 x 2 GB DDR3-1333 DIMM |
Hard Disk | 2 x 500 GB SATA, 16 MB cache, 7,200 RPM |
GPU | 1 x Quadro FX 5800, 4 GB RAM |
GPU | 2 x Tesla C1060, 4 GB RAM |
Software | |
---|---|
OS: | GNU/Linux CentOS 5.3 64-bit |
Driver: | NVIDIA CUDA 180.22 Linux 64-bit |
VMware: | VMware-server-2.0.2 |
OUTPUT of First Test: | Serial simulation (ms) | GPU (ms) |
---|---|---|
execution time for malloc | 0.02 | 175.21 |
execution time for RndGnr | 51430.92 | 2283.19 |
execution time for init | 275.48 | 0.31 |
execution time for computing | 391391.12 | 329.19 |
execution time for I/O | 56822.77 | 64740.54 |
execution time for GPU/CPU | | 198.43 |
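
The timings in the table can be turned into per-phase and overall speedup figures. The following is a minimal Python sketch of that arithmetic, not part of the original test program. It assumes that the single GPU/CPU transfer value belongs to the GPU run only, and the names `timings_ms`, `speedups` and `totals` are chosen here purely for illustration.

```python
# Timings in milliseconds, copied from the table above as (serial, GPU).
# None marks a phase that has no serial counterpart.
timings_ms = {
    "malloc": (0.02, 175.21),
    "RndGnr": (51430.92, 2283.19),
    "init": (275.48, 0.31),
    "computing": (391391.12, 329.19),
    "I/O": (56822.77, 64740.54),
    "GPU/CPU": (None, 198.43),  # assumption: transfer time applies to the GPU run only
}


def speedups(table):
    """Return the serial/GPU time ratio per phase, skipping phases without a serial time."""
    return {
        name: serial / gpu
        for name, (serial, gpu) in table.items()
        if serial is not None
    }


def totals(table):
    """Return (total serial time, total GPU time) in milliseconds."""
    serial_total = sum(serial for serial, _ in table.values() if serial is not None)
    gpu_total = sum(gpu for _, gpu in table.values())
    return serial_total, gpu_total


per_phase = speedups(timings_ms)
serial_total, gpu_total = totals(timings_ms)
overall = serial_total / gpu_total

for name, ratio in per_phase.items():
    print(f"{name:10s} speedup: {ratio:10.2f}x")
print(f"total serial: {serial_total:.2f} ms")
print(f"total GPU:    {gpu_total:.2f} ms")
print(f"overall speedup: {overall:.2f}x")
```

Read this way, the compute phase gains roughly three orders of magnitude on the GPU, while the end-to-end gain is far smaller because I/O takes about as long in both runs and accounts for most of the GPU total.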
Output using GPU:
device 0 : Quadro FX 5800
device 1 : Tesla C1060
device 2 : Tesla C1060
Selected device: 2 <<<<<<<<<<<<<<<<<<
device 2 : Tesla C1060
major/minor : 1.3 compute capability
Total global mem : -262144 bytes
Shared block mem : 16384 bytes
RegsPerBlock : 16384
WarpSize : 32
MaxThreadsPerBlock : 512
TotalConstMem : 65536 bytes
ClockRate : 1296000 (kHz)
deviceOverlap : 1
deviceOverlap : 1
MultiProcessorCount: 30
Using 1048576 particles 100 time steps
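
The negative "Total global mem" value in the device listing is most likely a display artifact rather than a real quantity: a byte count just under 4 GiB does not fit in a signed 32-bit integer and wraps around when printed as one. The sketch below assumes this is what happened in the test program, whose source is not shown here, and recovers the unsigned value. `as_unsigned32` is a hypothetical helper name used only for this illustration.

```python
def as_unsigned32(value):
    """Reinterpret a signed 32-bit integer as its unsigned 32-bit bit pattern."""
    return value & 0xFFFFFFFF


reported = -262144                      # value printed in the device listing
total_bytes = as_unsigned32(reported)   # assumption: the counter wrapped at 2**32
total_gib = total_bytes / 2**30

print(f"{total_bytes} bytes = {total_gib:.4f} GiB")
```

Under that assumption the recovered size is slightly below 4 GiB, which agrees with the 4 GB listed for the Tesla C1060 in the hardware table.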