Hi. I have finished running the tutorial cases on a PS3. Not all cases were successful. If anyone wants to see the whole output, just let me know.
The successful ones are:
Application icoFoam - case cavity: completed in 6.05 s ClockTime
Application icoFoam - case cavityHighRe: completed in 18.63 s ClockTime
Application icoFoam - case cavityClipped: completed in 1.8 s ClockTime
Application icoFoam - case elbow: completed in 86.6 s ClockTime
Application turbFoam - case cavity: completed in 198.31 s ClockTime
Application simpleFoam - case pitzDailyExptInlet: completed in 4075.02 s ClockTime
Application nonNewtonianIcoFoam - case offsetCylinder: completed in 1765.49 s ClockTime
Application boundaryFoam - case boundaryWallFunctions: completed in 31.23 s ClockTime
Application boundaryFoam - case boundaryLaunderSharma: completed in 48.93 s ClockTime
Application interFoam - case damBreak: completed in 45.33 s ClockTime
Application rasInterFoam - case damBreak: completed in 73.91 s ClockTime
Hi Marcelo, I am very interested in your results, since I should begin a study of the PS3's capabilities at the beginning of next year.
I would really appreciate it if you could post the whole output here, or send it to me, as you mentioned.
Did you compile OpenFOAM in a special way to use it on the PS3, or was it straightforward?
Did you also compare the performance with another machine? If so, what are your impressions?
Thanks in advance,
Hi Vincent. To compile OpenFOAM I followed the wiki. You have to add an entry in the "rules" directory to specify the compiler and other settings. Then you have to edit "OpenFOAM/.OpenFOAM/bashrc" to add a "ppc64" option under the "uname -m" check. Basically, it comes down to prefixing every tool with "ppu", for example gcc becomes ppu-gcc and g++ becomes ppu-g++. I used the dummy option for MPI. I'm using Fedora 6 with SDK 2.1.
I can provide these files, rules and bashrc, as soon as I get access to the machine again.
I don't have results from another machine to compare with. One of the reasons I published the numbers was to get some feedback. But, as noted in another thread, don't expect the results to be great, because you are using just the PPU, which is an (almost) plain PowerPC 970. The interesting part would be to offload some computations to the SPUs.
I'm curious. What are you hoping to achieve by compiling FOAM on the PS3?
I've done a fairly extensive amount of development on the PS3, in an attempt to coax it towards CFD, but there are fundamental hurdles to cross here.
First of all, each of the SPEs has only 256 KB of memory for BOTH code and data. This has the small advantage of giving the developer explicit control over transfers to and from this memory using DMA, but it is quite a nasty beast to tame. The tools for programming this sort of architecture are just not mature enough.
IBM's premise is that asynchronous memory transfers can be overlapped with computation to hide the transfer cost, but for a bandwidth-bound application like a CFD code, this hardly works. 256 KB of memory is just not enough to do anything serious like a sparse-matrix multiply. The end result is that your application has all 8 SPEs (or 6, in the case of the PS3, ugh!) waiting for data almost all of the time.
I don't know how relevant this is, but the PS3 also works most efficiently in single precision, and even then in a mode that is not entirely IEEE 754 compliant.
I'd be interested to see what you come up with, but I'm not too hopeful about it myself.
Hi Sandeep. My target platform is an IBM BladeCenter with QS21 (2 dual core Cell BE) + JS21 (Power 970), and maybe an IBM Roadrunner (Cell BE + x86_64) in the near future. The PS3 was just something to use while I was configuring the blades.
My goal is to learn about the Cell BE, and I don't have a clear idea of what to achieve with OpenFOAM. I'm following an IBM article about porting financial applications to Cell. That project was to port a complex C++ code to Cell. If it proves hopeless, that's OK, at least we will have learned a lot. But I believe there could be a good gain in performance.
I'm not saying it will be easy. The size of the SPE local store (256 KB) is a problem, which probably means that functions will need to be split in order to fit in the LS. It will take a lot of (smart) DMA transfers to avoid leaving the SPEs waiting. It is difficult to program, etc.
Dongarra et al. have been doing very interesting research on mixed (single and double) precision computation. It seems that IBM will release a new version of the Cell BE in May with much improved double-precision performance.
Let's see what happens! I hope for something interesting.