From 0a425d27c38a386c272739f20dfaca5881ddba1e Mon Sep 17 00:00:00 2001
From: "W. Trevor King" <wking@drexel.edu>
Date: Sat, 13 Nov 2010 10:09:05 -0500
Subject: [PATCH] Add GPGPU post.

---
 posts/GPGPU.mdwn | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)
 create mode 100644 posts/GPGPU.mdwn

diff --git a/posts/GPGPU.mdwn b/posts/GPGPU.mdwn
new file mode 100644
index 0000000..ef5e8db
--- /dev/null
+++ b/posts/GPGPU.mdwn
@@ -0,0 +1,48 @@
+[[!meta  title="General-purpose computing on graphics processing units"]]
+
+[GPGPU][] utilizes the number crunching speed and massive
+parallelization of your [graphics card][GPU] to accelerate
+general-purpose tasks.  When your algorithm is compatible with GPU
+hardware, the speedup of running hundreds of concurrent threads can be
+enormous.
+
+There are a number of ways to implement GPGPU, ranging from
+multi-platform frameworks such as [OpenCL][] to single-company
+frameworks such as [NVIDIA][]'s [CUDA][].  I've gotten to play around
+with [CUDA][] while TAing the [[parallel computing]] class, and it's
+lots of fun.
+
+With NVIDIA (other vendors are probably similar; I'll update this as I
+learn more), each GPU *device* has a block of global memory serving a
+number of multi-processors, and each multi-processor contains several
+cores which can execute concurrent threads.
+
+Specs on NVIDIA's [GeForce GTX 580][]:
+
+* 512 cores (16 (MP) ⋅ 32 (Cores/MP))
+* 1.5 GB GDDR5 RAM
+* 192.4 GB/sec memory bandwidth
+* 1.544 GHz processor clock rate
+* 1,581.1 GFLOP/s
+
+Zoom.
+
+The GFLOP/s computation is `cores⋅clock⋅2`, because (from [page 94][]
+of the CUDA programming guide) each core can execute a single
+multiply-add operation (2 FLOPs) per cycle.  Also take a look at the
+graph of historical performance on [page 14][], the table of device
+capabilities that starts on [page 111][], and the description of
+*warps* on [page 93][].
+
+[GPGPU]: http://en.wikipedia.org/wiki/GPGPU
+[GPU]: http://en.wikipedia.org/wiki/Graphics_processing_unit
+[NVIDIA]: http://www.nvidia.com/
+[CUDA]: http://en.wikipedia.org/wiki/CUDA
+[GeForce GTX 580]: http://www.nvidia.com/object/product-geforce-gtx-580-us.html
+[page 14]: http://developer.download.nvidia.com/compute/cuda/3_2/toolkit/docs/CUDA_C_Programming_Guide.pdf#page=14
+[page 93]: http://developer.download.nvidia.com/compute/cuda/3_2/toolkit/docs/CUDA_C_Programming_Guide.pdf#page=93
+[page 94]: http://developer.download.nvidia.com/compute/cuda/3_2/toolkit/docs/CUDA_C_Programming_Guide.pdf#page=94
+[page 111]: http://developer.download.nvidia.com/compute/cuda/3_2/toolkit/docs/CUDA_C_Programming_Guide.pdf#page=111
+
+[[!tag tags/hardware]]
+[[!tag tags/programming]]
-- 
2.26.2