NVIDIA HPC GPU Computing: Comparison Between CPU & GPU

1) GPU - Graphics Processing Unit. Purpose: scientific and engineering computing.
2) The CPU and GPU work together to complete a task.
3) The GPU is a co-processor for accelerating the rendering of computer graphics.
4) GPUs contain an array of uniform processors.
5) Mainly applicable to:
a) Parallel computing
b) GPGPU technologies
c) Scientific applications
6) GPUs calculate and generate the positioning of graphics on a computer screen. The same hardware can be used to implement any data-parallel algorithm.
[Image: http://www.nvidia.com/docs/io/40/image.jpg]
Only applications with a large amount of parallelism can benefit from GPUs.
7) GPUs contain multiple cores (CUDA cores) that use hardware multithreading.
8) The GPU accelerates applications running on the CPU by offloading some of the compute-intensive and time-consuming portions of the code.
9) A GPU consists of hundreds of smaller cores working together (for example, 240 or 448 CUDA cores).
[Image: http://www.nvidia.com/content/tesla/images/gpu-computing-image.png]
10) Parallel nested loops result in the GPU dynamically spawning new threads.
11) NVIDIA's processors are grouped into these product lines:
a) GeForce b) Quadro c) Tegra d) Tesla e) Legacy
12) GPU cards can draw a lot of power (270 W), and a lot of power means a lot of heat.
13) Important terms:
OpenCL - Open Computing Language.
OpenGL - Open Graphics Library.
The card is called a Host Interface Card.
GHIC - Graphics Host Interface Card.
NVSMI tool (nvidia-smi) - GPU monitoring tool.
SDK - collection of examples and documentation for GPU computing (refer to point 16).
GPU programming languages - OpenCL, CUDA (Compute Unified Device Architecture).
Applications -> System Tools -> nvidia-settings
GPGPU - General-Purpose computing on Graphics Processing Units.
16) ./deviceQuery shows the card status. After that you can try the other SDK samples, e.g.:
./oceanFFT -> shows the card status
./nbody -> shows the GPU quality

Comparison Between CPU & GPU Computing.
CPU and GPU processors are compared across four memory levels:
register, local memory, global memory, and disk.

Memory level | CPU | GPU
Register | Read / Write | Read / Write
Local Memory | Read / Write | None
Global Memory | Read / Write | Read / Write (read-only during computation; write-only at the end, to a pre-computed address)
Disk | Read / Write | Does not exist
NOTE: The GPU cannot read from or write to disk. It therefore needs the CPU to access data on disk and to exchange data between nodes in the cluster; GPU computing is not possible without the CPU.
NOTE: Because of this, the GPU only helps if the application is parallel. If the application is not parallel, the GPU is of no use.
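One common way to put numbers on this note is Amdahl's law, which the notes themselves do not name; it is used here only as an illustrative assumption. The sketch below estimates the best possible speedup when only a fraction of the runtime can be spread across the cores. The function name `amdahl_speedup` and the sample fractions are hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of the runtime can run in parallel.

    parallel_fraction: share of the runtime that can be parallelized (0.0 to 1.0)
    workers: number of cores the parallel part is spread across
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# 448 cores is the example core count used earlier in these notes.
for fraction in (0.0, 0.5, 0.95, 1.0):
    bound = amdahl_speedup(fraction, 448)
    print(f"parallel fraction {fraction:.2f}: speedup on 448 cores is at most {bound:.1f}x")
```

With no parallel part the bound stays at 1x however many cores are available, which is the point of the note: a non-parallel application gains nothing from the GPU.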
Aspect | CPU | GPU
Main objective | Increase performance via task parallelism | Increase data parallelism
Parallelism style | Independent processes (limited communication) | Elements of a vector can be divided up because they do not depend on one another; multiple processing units work on them
Number of ALUs | Low | High
Amount of cache | High | Low
Drawing a circle or triangle | Very difficult; a lot of code has to be written (e.g. drawcircle) | Very easy to draw and render circles and triangles using OpenGL
To perform mathematical calculations
Instruct Data
NOTE: The CPU allocates jobs to the GPU; the GPU is the accelerator.
Important Commands
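The data-parallel idea in the comparison above (vector elements that do not depend on one another can be divided among multiple processing units) can be sketched in plain Python. This is only an illustration of the decomposition, not GPU code: the helper names `scale_chunk` and `data_parallel_scale` are hypothetical, and Python threads stand in for GPU cores without giving GPU-like speed.

```python
from concurrent.futures import ThreadPoolExecutor


def scale_chunk(chunk, factor):
    """Work done by one worker: each element is handled independently."""
    return [factor * x for x in chunk]


def data_parallel_scale(vector, factor, workers=4):
    """Split the vector into chunks and process the chunks concurrently."""
    # Ceiling division so that every element lands in some chunk.
    chunk_size = max(1, -(-len(vector) // workers))
    chunks = [vector[i:i + chunk_size] for i in range(0, len(vector), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input chunks.
        results = list(pool.map(scale_chunk, chunks, [factor] * len(chunks)))
    # Stitch the partial results back into one vector.
    return [y for part in results for y in part]


print(data_parallel_scale([1, 2, 3, 4, 5], 2))  # [2, 4, 6, 8, 10]
```

Because no element needs the result of another, the chunks can be processed in any order and on any number of workers, which is what makes this kind of work a good fit for many small cores.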
1) # lspci -tv | grep VGA
2) # cat /proc/driver/nvidia/cards/0
3) # lsmod | grep nvidia
4) # cat /proc/driver/nvidia/version
5) # ls /dev/nvidia*
6) # nvcc -V
# nvidia-smi --loop-continuously --interval=60 --filename=/var/log/nvidia.log &
# nvidia-smi -g 0 -c 1 (set GPU 0 to exclusive access mode)
# nvidia-smi -g 1 -c 1 (set GPU 1 to exclusive access mode)
How CUDA cores are counted
CUDA cores = number of SMs in the GPU * cores per SM
Example: 16 SMs in a single GPU, 32 cores per SM
16 * 32 = 512 CUDA cores
The newer multiprocessor design is the SMX.
Fermi (an SM has 32 cores)
Kepler (an SMX has 192 cores)
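The counting rule above (multiprocessors times cores per multiprocessor) can be checked with a few lines of Python. The per-multiprocessor figures come from the notes; the function name `cuda_core_count` is hypothetical, and the 14-SM and 13-SMX counts are assumed values chosen for illustration (14 SMs reproduces the 448-core example mentioned earlier).

```python
FERMI_CORES_PER_SM = 32     # from the notes: a Fermi SM has 32 cores
KEPLER_CORES_PER_SMX = 192  # from the notes: a Kepler SMX has 192 cores


def cuda_core_count(multiprocessors: int, cores_per_multiprocessor: int) -> int:
    """Total CUDA cores = number of multiprocessors * cores in each one."""
    return multiprocessors * cores_per_multiprocessor


print(cuda_core_count(16, FERMI_CORES_PER_SM))    # 512, the example in the notes
print(cuda_core_count(14, FERMI_CORES_PER_SM))    # 448, assuming a 14-SM Fermi card
print(cuda_core_count(13, KEPLER_CORES_PER_SMX))  # 2496, assuming a 13-SMX Kepler card
```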
Note: SIMD - Single Instruction, Multiple Data
MIMD - Multiple Instruction, Multiple Data
To improve system performance, we can change either the
hardware or the software.

NOTE 1) With GPU computing,
we do not get the same level of efficiency as with CPU computation.
GPU - 62% (we get better performance than on the CPU, but relative to the number of cores the efficiency is lower)
For example, on a CPU with 6 cores we get about 80% efficiency,
but on a GPU with 448 cores we get about 60% efficiency.
CPU - 80% (because it has far fewer cores than the GPU)
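To see why lower efficiency can still mean much higher performance, here is a small worked example. It assumes that "efficiency" means speedup divided by the number of cores; the notes do not define the term, so this is an interpretation. The function name `implied_speedup` is hypothetical.

```python
def implied_speedup(cores: int, efficiency: float) -> float:
    """Speedup over one core, assuming efficiency = speedup / cores."""
    return cores * efficiency


cpu_speedup = implied_speedup(6, 0.80)    # 6 CPU cores at 80% efficiency
gpu_speedup = implied_speedup(448, 0.60)  # 448 GPU cores at 60% efficiency

print(f"CPU: {cpu_speedup:.1f}x over a single CPU core")
print(f"GPU: {gpu_speedup:.1f}x over a single GPU core")
```

Under this assumption the 6-core CPU reaches about 4.8x and the 448-core GPU about 268.8x, each measured against one of its own cores. A single GPU core is much slower than a single CPU core, so the two figures are not directly comparable; the example only shows that a lower percentage across many more cores can still add up to more total throughput.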
NOTE 2) The main advantage of GPU computing shows up when the application is related to graphical display, such as MATLAB, animation, or fluid physics simulation. Only in those scenarios do we get the benefit of the GPU. The GPU also provides programmable animation.
