Is Playstation 3 cluster suitable for CFD work

#1  Pei-Ying Hsieh (hsieh), April 20, 2007, 08:30
Hi,

According to this website, do you think the Sony PlayStation 3 is suitable for CFD work?

http://fah-web.stanford.edu/cgi-bin/...?qtype=osstats

pei

#2  Markus Hartinger (hartinger), April 20, 2007, 14:03
Yep, Dr. Frank Mueller can do it.
Give it a go; it might help me convince my sponsor to buy one.
Pure research, of course.

http://www.netscape.com/viewstory/2007/03/13/engineer-creates-first-academic-ps3-computing-cluster/?url=http%3A%2F%2Fwww.physorg.com%2Fnews92674403.html&frame=true

#3  Mattijs Janssens (mattijs), April 20, 2007, 14:24
The 8 processing units (PE?) each have only 256k of memory. So the 'easy' way, which would be to use them like a normal 8-processor run, would only work for a tiny case. (There is actually mention of an MPI port for it, from IBM I think.)

The other option would be to explicitly vectorise/parallelise bits of the code. Since there is no single bottleneck, you'd have to vectorise/parallelise a lot of code to get a decent overall speedup.
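
The point about there being no single bottleneck can be made concrete with Amdahl's law. The sketch below is only an illustration, not a measurement: the ported fractions of runtime and the kernel speedup of 8 are assumed numbers, chosen to show how little is gained while most of the code still runs unaccelerated.

```python
def overall_speedup(accelerated_fraction, kernel_speedup):
    """Amdahl's law: overall speedup when only part of the runtime is accelerated."""
    untouched = 1.0 - accelerated_fraction
    return 1.0 / (untouched + accelerated_fraction / kernel_speedup)

# Hypothetical numbers: every ported kernel runs 8x faster than before.
for fraction in (0.2, 0.5, 0.9):
    speedup = overall_speedup(fraction, 8.0)
    print(f"{fraction:.0%} of runtime ported -> {speedup:.2f}x overall")
```

With these assumed values, porting half of the runtime gives less than a 2x overall gain, which is why a code with many small hot spots needs a lot of rewriting before it pays off.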

#4  Vincent RIVOLA (vinz), May 16, 2007, 04:54
Hi everyone,

Interesting topic indeed. Actually, we bought a PS3 to test this, and it's not really straightforward.
It looks like there are two kinds of processors in this machine, PPE and SPE.
We installed Yellow Dog Linux on the PS3 and compiled OpenFOAM without problems on the PPE. But the PPE is not so fast (comparable to an Apple G4 or so), so the results aren't great. To compile for the SPEs, you need a software development kit from IBM which is not so easy to use.
However, we haven't given up, and we're going to try to benchmark different codes and machines to see what we can expect from this "computer".

#5  Mads Reck (gabriel_stokes), November 2, 2007, 05:38
Hey,

just curious: Did you come up with any results from the benchmarking?

#6  Marcelo M. Garcia (mgarcia), November 2, 2007, 07:21
Hi

Just to add some information about PS3 and IBM Cell (processor).

Cell has one general-purpose processor, PowerPC 970 (G5), and 8 co-processors (7 in the case of the PS3): the PPU and the SPUs. It is like the 386 and 387, to use an example from the old days.

To fully use the processor, you need to write a program for each part, PPU and/or SPU. If you are using gcc (from Sony or the Barcelona Supercomputing Center) you have to use POSIX threads.

The good news is that the new IBM SDK 3.0 comes with a single-source compiler (XL C/C++), which means that you can program using OpenMP.

The advantage of OpenMP is that most major compilers, like Intel's or Microsoft's, support it.

A very good survey of Cell (PS3) in scientific computing can be found in the article:
www.netlib.org/utk/people/JackDongarra/PAPERS/scop3.pdf

Regards

Marcelo

#7  Daniel P. Combest (chegdan), March 11, 2008, 15:52
Vincent,

Any benchmarks yet?

#8  Gunnar Vikberg (vikbergg), March 11, 2008, 23:18
Daniel,

>Posted by Daniel Combest on Tuesday, March 11, 2008 - 01:52 pm:
>Any benchmarks yet?

I would not expect the PS3 to outperform Intel, AMD, etc. chips at the moment. As it stands, I'm not sure there is an ongoing effort to port OpenFOAM to the PS3 in a way that takes full advantage of the Synergistic Processing Elements (SPEs). Remember that there is a considerable amount of complexity to deal with: compiling the current code will yield a running application, but one that uses only the Power Processing Element (PPE). Essentially, even though you are running OpenFOAM on the PS3, you are not taking full advantage of the Cell processor. The result would be very close to running OpenFOAM on a PowerPC 970 (the G5 processors from Apple), if not a bit slower.

In porting the OpenFOAM code to the Cell processor, one must keep in mind the limitations of the SPEs, as well as the complexity of transferring data efficiently across the Element Interconnect Bus (EIB) so that the SPEs don't starve waiting for new data to process. On the PS3 there is also the issue of how much memory is available. The specifications say 512MB, but in reality the Cell processor has 256MB, with the other 256MB reserved for the RSX graphics chip, which we don't have access to at the moment. Even so, the PS3's Cell processor is capable of roughly 110 GFLOPS (assuming 76% efficiency and 6 SPEs, since one is used by the hypervisor), which exceeds any other consumer chip on the market today.
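
As a rough check on where a figure of this size comes from, here is a small sketch of the arithmetic. The 76% efficiency and the 6 usable SPEs come from the paragraph above; the 3.2 GHz clock, the four single-precision SIMD lanes, and counting a fused multiply-add as two operations are assumptions added here, so treat the result as a ballpark only.

```python
CLOCK_GHZ = 3.2      # assumed Cell clock frequency on the PS3
SIMD_LANES = 4       # assumed: four single-precision values per SIMD instruction
FLOPS_PER_LANE = 2   # assumed: one fused multiply-add counted as two operations

def sustained_gflops(num_spes, efficiency):
    """Single-precision estimate: per-SPE peak times SPE count times efficiency."""
    peak_per_spe = CLOCK_GHZ * SIMD_LANES * FLOPS_PER_LANE
    return num_spes * peak_per_spe * efficiency

print(f"{sustained_gflops(6, 0.76):.1f} GFLOPS")
```

Under these assumptions the estimate comes out at about 117 GFLOPS, in the same range as the roughly 110 GFLOPS quoted. It applies to single precision only.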

Also, while the SPE's 256KB Local Store seems like a small space, the point is to run a very small program, one that applies one function or only a few functions (such as add, subtract, etc.) to many data elements (Single Instruction Multiple Data, SIMD). So, while the EIB is fast with its four "highways", it's not as fast as the SPEs if the program is parallelized correctly. Actually, if you do the calculations, you will see that we cannot efficiently transfer data to every available SPE at once.

The trick is to use the four communication buses to transfer program and data to the SPEs that will finish soonest, and then worry about the SPEs that are still hard at work and may take longer to finish. Essentially, you want to provide as much data as the current SPE program can operate on (leaving space for the result as well). As the operations are carried out, the DMA controller within the SPE requests the next program and data elements. At this point, the various delays (memory access, bus transfer to and from delay, results writing to memory, etc) will cause the new operations and data to arrive at the SPE a few cycles from the last execution completion (ideally; it is harder than it sounds).
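
The sizing rule described above (fill the Local Store with as much data as the program can work on, leaving room for the result) can be sketched as follows. This is a plain Python illustration, not Cell SDK code: `chunk_elements`, `stream`, the 64KB program size, and the two-buffer layout are all hypothetical choices, and `stream` processes chunks one after another rather than overlapping transfers with computation the way real DMA double buffering would.

```python
LOCAL_STORE_BYTES = 256 * 1024  # SPE Local Store size mentioned in the post

def chunk_elements(program_bytes, bytes_per_element, buffers=2):
    """Elements per chunk when every buffer holds the input plus an equally sized result.

    buffers=2 reserves space for double buffering: one chunk being processed,
    one chunk in flight.
    """
    free = LOCAL_STORE_BYTES - program_bytes
    if free <= 0:
        raise ValueError("program does not fit in the local store")
    bytes_needed_per_element = 2 * bytes_per_element * buffers  # input + result
    return free // bytes_needed_per_element

def stream(data, chunk, kernel):
    """Apply kernel to data one chunk at a time (sequential stand-in for DMA)."""
    out = []
    for start in range(0, len(data), chunk):
        out.extend(kernel(data[start:start + chunk]))
    return out

# Hypothetical case: a 64KB SPE program working on 4-byte floats.
n = chunk_elements(program_bytes=64 * 1024, bytes_per_element=4)
result = stream(list(range(10)), chunk=4, kernel=lambda xs: [2 * x for x in xs])
print(n, result)
```

The larger the SPE program, the smaller each chunk becomes, which is one reason the post stresses keeping the program on each SPE very small.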

Though rewriting and optimizing a program for the Cell processor may be a difficult task, the rewards are significant enough to justify at least an attempt. This is a very interesting processor, with an incredible capacity for number crunching and data parallelization. As Pei suggested, take a look at how much processing power the few PS3s around the world brought to the Folding@Home project. The PS3s in this case are beating all the other computers together by a huge margin! Also, consider that the formulas of fluid mechanics are parallelizable, making OpenFOAM a great candidate for it! Personally, I would be very interested in being part of the effort towards having OpenFOAM on the PS3 or IBM's QS21 blades, and taking full advantage of the Cell processor in them.

Gunnar Vikberg

#9  Marcelo M. Garcia (mgarcia), March 12, 2008, 05:58
Hi.

I'm running the tutorials on an IBM QS21. The results so far are published below [1]. I'm using Fedora 7 with the Barcelona Supercomputing Center kernel and IBM SDK 3.0 (gcc-4.1.1).

When porting to the Cell processor, I would consider using ALF (Accelerated Library Framework) to handle all (or most) of the details of DMA transfers, keeping the SPEs busy, etc.; see BrianDWatt's answer in [2].

In May IBM will release the QS22, with a new generation of Cell that has serious double-precision performance.


[1]
====
Application blockMesh - case cavity: completed
Application icoFoam - case cavity: completed in 39.58 s ClockTime
Application blockMesh - case cavityFine: completed
Application mapFields - case cavityFine: completed
Application icoFoam - case cavityFine: completed in 144.35 s ClockTime
Application blockMesh - case cavityGrade: completed
Application mapFields - case cavityGrade: completed
Application icoFoam - case cavityGrade: completed in 16.34 s ClockTime
Application blockMesh - case cavityHighRe: completed
Application icoFoam - case cavityHighRe: completed in 121.15 s ClockTime
Application blockMesh - case cavityClipped: completed
Application mapFields - case cavityClipped: completed
Application icoFoam - case cavityClipped: completed in 9.47 s ClockTime
Application fluentMeshToFoam - case elbow: completed
Application icoFoam - case elbow: completed in 525.75 s ClockTime
Application foamMeshToFluent - case elbow: completed
Application foamDataToFluent - case elbow: completed

Application blockMesh - case cavity: completed
Application turbFoam - case cavity: completed in 1550.09 s ClockTime

Application blockMesh - case pitzDaily: completed
Application simpleFoam - case pitzDaily: completed in 28467.4 s ClockTime
Application blockMesh - case pitzDailyExptInlet: completed
Application simpleFoam - case pitzDailyExptInlet: completed in 26881.6 s ClockTime
Application blockMesh - case pitzDaily3Blocks: ** FOAM FATAL ERROR **
Application simpleFoam - case pitzDaily3Blocks: ** FOAM FATAL ERROR **

Application blockMesh - case movingCone: completed
Application icoDyMFoam - case movingCone: unconfirmed completion

Application blockMesh - case offsetCylinder: completed
Application nonNewtonianIcoFoam - case offsetCylinder: completed in 9831.44 s ClockTime

Application blockMesh - case boundaryWallFunctions: completed
Application boundaryFoam - case boundaryWallFunctions: completed in 234.68 s ClockTime
Application blockMesh - case boundaryLaunderSharma: completed
Application boundaryFoam - case boundaryLaunderSharma: completed in 381.98 s ClockTime

Application blockMesh - case damBreak: completed
Application setFields - case damBreak: completed
Application interFoam - case damBreak: completed in 6196.97 s ClockTime
Application blockMesh - case damBreakFine: completed
Application setFields - case damBreakFine: completed
Application interFoam - case damBreakFine: unconfirmed completion
Application decomposePar - case damBreakFine: unconfirmed completion
Application reconstructPar - case damBreakFine: ** FOAM FATAL ERROR **

Application blockMesh - case nozzleFlow2D: completed
Application 1 - case nozzleFlow2D: unconfirmed completion
Application cellSet - case nozzleFlow2D: unconfirmed completion
Application refineMesh - case nozzleFlow2D: ** FOAM FATAL ERROR **
Application lesInterFoam - case nozzleFlow2D: unconfirmed completion

Application blockMesh - case damBreak: completed
Application setFields - case damBreak: completed
Application rasInterFoam - case damBreak: completed in 12606.4 s ClockTime
Application blockMesh - case damBreakFine: completed
Application setFields - case damBreakFine: completed
Application rasInterFoam - case damBreakFine: unconfirmed completion
Application decomposePar - case damBreakFine: unconfirmed completion
Application reconstructPar - case damBreakFine: ** FOAM FATAL ERROR **
======


[2] http://www.ibm.com/developerworks/blogs/page/powerarchitecture?entry=020808_forum_watch
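
For anyone wanting to compare the timings in [1] against another machine, the following is a minimal sketch of how the ClockTime values could be pulled out of a log in this format. The function name `clock_times` and the regular expression are illustrative assumptions; the sample lines are copied from the log above.

```python
import re

# Matches only the lines that report a wall-clock time.
LINE = re.compile(
    r"Application (?P<app>\S+) - case (?P<case>\S+): "
    r"completed in (?P<secs>[\d.]+) s ClockTime"
)

def clock_times(log_text):
    """Return (application, case, seconds) for every line that reports a ClockTime."""
    return [
        (m["app"], m["case"], float(m["secs"]))
        for m in LINE.finditer(log_text)
    ]

sample = """\
Application blockMesh - case cavity: completed
Application icoFoam - case cavity: completed in 39.58 s ClockTime
Application simpleFoam - case pitzDaily: completed in 28467.4 s ClockTime
Application blockMesh - case pitzDaily3Blocks: ** FOAM FATAL ERROR **
"""

for app, case, secs in clock_times(sample):
    print(f"{app:12s} {case:12s} {secs:10.2f} s")
```

Lines without a timing (mesh generation, failed or unconfirmed runs) are skipped, so they would need to be checked separately.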

#10  clockworker, August 16, 2015, 15:53
Digging in old threads
Hi there,

I just stumbled upon this old thread and became curious. My googling did not turn up any up-to-date information about the possibility of using the PS3 with OpenFOAM, so I am posting here. Perhaps someone can provide a recent benchmark.
Were you able to use the full capacity of the Synergistic Processing Elements?
Were you able to overcome the various delays:
Quote:
(memory access, bus transfer to and from delay, results writing to memory, etc)
I am thankful for any current information or sources.

Greetings and thanks for your time

Last edited by clockworker; August 18, 2015 at 06:07.