Parallel Computing: Clusters or Graphics?

I'm working on my AI research, which could benefit greatly from parallel computing. As I'm sure there's someone here who knows more than I do on the different platforms, which one would bring better results? (I know I could use both... but time's not really on my side.)

- nVidia CUDA [w/ GF8800s]
- OpenMPI [w/ about 50-100 computers, running P4D]

Some pages linking to benchmarks would be useful.

[Sorry for the low point count, I ran out. :(]

Thanks!
Asked by holobyted
 
Callandor commented:
One Tesla card (C870) is rated at 518 gigaflops; the D870 is two C870 cards, at just over a teraflop, and the S870 is four C870 cards, at just over 2 teraflops. The system scales linearly with additional cards, so extrapolating from the comparison of one card to 40 x86 CPUs, the D870 and S870 come out at roughly 80 and 160 x86 CPUs, respectively.
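
To make the extrapolation explicit, here is a rough sketch in Python. It assumes perfectly linear scaling and takes the thread's figures (518 gigaflops and about 40 x86 CPUs per card) as given; the names `tesla_estimate` and `configs` are only illustrative, and none of this is a measurement.

```python
# Figures quoted in this thread, treated as rough peak numbers, not benchmarks.
C870_GFLOPS = 518        # one Tesla C870 card
X86_CPUS_PER_CARD = 40   # the comparison quoted for a single card


def tesla_estimate(cards):
    """Return (gflops, equivalent x86 CPUs), assuming perfectly linear scaling."""
    return cards * C870_GFLOPS, cards * X86_CPUS_PER_CARD


# Number of C870 cards in each product, as described above.
configs = {"c870": 1, "d870": 2, "s870": 4}
estimates = {name: tesla_estimate(n) for name, n in configs.items()}

for name, (gflops, cpus) in estimates.items():
    print(f"{name}: ~{gflops} GFLOPS, ~{cpus} x86 CPUs")
```

Real workloads will rarely scale this cleanly, so treat these as upper bounds.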
 
Callandor commented:
One Tesla card can supposedly run at 518 gigaflops (http://xtreview.com/addcomment-id-2756-view-Nvidia-Tesla-c870,D870-and-s870.html), which is compared to the throughput of 40 x86 processors. There is a 4-card version for servers that is correspondingly more powerful. Graphics cards are designed for parallel processing of textures and have a much higher transistor count than CPUs, so it is not surprising that they can outperform general-purpose processors for certain applications.
 
holobyted (author) commented:
What would the higher-end Tesla card compare to? I.e., if one "normal" Tesla card compares to 40 x86 CPUs (which CPUs?), what would the other be?

 
holobyted (author) commented:
How would 35 Pentium 4 D @ 2.00GHz compare? What would be the "rated" Xflops? Assuming peak performance.
 
Callandor commented:
 
holobyted (author) commented:
If I recall correctly, P4Ds went up to 3.2GHz... According to Wikipedia, though (http://en.wikipedia.org/wiki/Pentium_D), you're right.

What would the approximate flops be for such a cluster? I'll try getting in touch with the owner of the 35 CPUs so I can get a real speed value. (Running OpenMPI)
 
Callandor commented:
A single PentiumD 3.2 clocks in at about 600 megaflops, so 35 of them will be around 21 gigaflops.  The PentiumD cpus are much lower in performance than the newer Core2 cpus, easily trounced by even AMD's X2 offerings.
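
For a quick sanity check of that arithmetic, here is a small Python sketch. It uses only the figures quoted in this thread (about 600 megaflops per Pentium D, 518 gigaflops for one Tesla card) and assumes perfect scaling with no MPI communication overhead. The two numbers may not have been measured the same way, so the comparison is only a rough one; `cluster_gflops` is an illustrative helper name.

```python
import math

PENTIUM_D_GFLOPS = 0.6    # ~600 megaflops per Pentium D 3.2, as quoted above
TESLA_C870_GFLOPS = 518   # single-card figure quoted earlier in the thread


def cluster_gflops(nodes, per_node=PENTIUM_D_GFLOPS):
    """Aggregate throughput, assuming perfect scaling and no MPI overhead."""
    return nodes * per_node


cluster = cluster_gflops(35)

# How many such CPUs it would take to reach the single-card figure.
nodes_to_match = math.ceil(TESLA_C870_GFLOPS / PENTIUM_D_GFLOPS)

print(f"35 nodes: ~{cluster:.0f} GFLOPS")
print(f"Nodes needed to match one C870: ~{nodes_to_match}")
```

In practice an OpenMPI cluster loses some of that aggregate to network latency, so the real gap is likely wider than the raw numbers suggest.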
 
holobyted (author) commented:
Wow. That's actually pretty depressing... 35 systems can't even match up to one graphics card. Too bad CUDA is a pain to implement...
 
Callandor commented:
Modern graphics cards are very powerful, and the ability to use them in non-graphics applications is very nice.  Think about a $200 card giving you the power of 10 modern cpus - that's quite a good deal.
 
holobyted (author) commented:
Yeah, I know. What's the GFlops on a "normal" GF8800 though? The tesla is outstanding, but that's cause it's a "small supercomputer for your workstation."

 
Callandor commented:
It's about the same - 500 gigaflops: http://en.wikipedia.org/wiki/GeForce_8_Series#8800_GT, though I don't know if all of that is available for number crunching.