CFD Online Discussion Forums

Parallel performance of icoFoam (http://www.cfd-online.com/Forums/openfoam/65027-parallel-performance-icofoam.html)

skabilan June 1, 2009 18:58

Parallel performance of icoFoam
 
Hi All,

I get linear speed-up up to 4 processors with icoFoam; when I try to use more than 4 processors, performance decreases. Has anyone come across this issue? I am happy to report more details if needed.

Regards,
Senthil

mbeaudoin June 2, 2009 05:11

Hello Senthil,

Could you give us more details on your hardware, software and problem configuration?

CPU type, amount of RAM, type of interconnect, version of OpenFOAM, size of meshes, etc.

Martin

Rachel June 2, 2009 13:17

Hello Senthil,
What was your problem size? Many solvers will not scale when the number of cells per CPU is less than 10,000. With so few cells per CPU, the communication overhead is large, so more time is spent communicating than actually computing, and parallel performance suffers.

skabilan June 2, 2009 14:47

Hi Martin/Rachel,

Thanks for your input. The machine I am using is a Dell dual-processor, quad-core Intel system (Harpertown, running at 2.33 GHz) with 16 GB of RAM and no interconnect (it just uses shared memory).

Here is the output from checkMesh...

Create time

Create polyMesh for time = constant

Time = constant

Mesh stats
points: 22303
faces: 209018
internal faces: 184318
cells: 98334
boundary patches: 10
point zones: 0
face zones: 0
cell zones: 0

Number of cells of each type:
hexahedra: 0
prisms: 0
wedges: 0
pyramids: 0
tet wedges: 0
tetrahedra: 98334
polyhedra: 0

Checking topology...
Boundary definition OK.
Point usage OK.
Upper triangular ordering OK.
Topological cell zip-up check OK.
Face vertices OK.
Face-face connectivity OK.
Number of regions: 1 (OK).

Checking patch topology for multiply connected surfaces ...
Patch    Faces   Points   Surface
inlet      111       72   ok (not multiply connected)
out2        74       52   ok (not multiply connected)
out3        62       40   ok (not multiply connected)
out4        65       44   ok (not multiply connected)
out5        44       31   ok (not multiply connected)
out6        58       38   ok (not multiply connected)
out7        51       34   ok (not multiply connected)
out8        50       34   ok (not multiply connected)
out1        46       32   ok (not multiply connected)
w1       24139    12150   ok (not multiply connected)

Checking geometry...
Domain bounding box: (-0.0609739 -0.106362 -0.025452) (0.0609426 0.106513 0.0254047)
Boundary openness (-2.2934e-17 -4.63998e-17 7.83197e-18) OK.
Max cell openness = 3.5536e-16 OK.
Max aspect ratio = 13.8974 OK.
Minumum face area = 5.56535e-09. Maximum face area = 1.18647e-05. Face area magnitudes OK.
Min volume = 2.28107e-13. Max volume = 1.27222e-08. Total volume = 5.34576e-05. Cell volumes OK.
Mesh non-orthogonality Max: 77.0421 average: 22.9327
*Number of severely non-orthogonal faces: 17.
Non-orthogonality check OK.
<<Writing 17 non-orthogonal faces to set nonOrthoFaces
Face pyramids OK.
Max skewness = 1.44383 OK.
All angles in faces OK.
All face flatness OK.

Mesh OK.

End
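
As a rough check, the cell count reported above (98334 cells) can be turned into an average number of cells per process and compared with the 10,000-cells-per-CPU rule of thumb mentioned earlier in this thread. This is only a sketch: it assumes a perfectly balanced decomposition, and the function name and the list of process counts are illustrative choices, not anything taken from OpenFOAM.

```python
# Average cells per process for the mesh reported by checkMesh (98334 cells).
TOTAL_CELLS = 98334
MIN_CELLS_PER_PROCESS = 10_000  # rule of thumb quoted earlier in the thread


def cells_per_process(total_cells, n_procs):
    """Average cell count per subdomain, assuming a perfectly balanced split."""
    return total_cells / n_procs


# Process counts are illustrative; 8 is the core count of the machine described.
for n in (1, 2, 4, 8):
    avg = cells_per_process(TOTAL_CELLS, n)
    side = "below" if avg < MIN_CELLS_PER_PROCESS else "above"
    print(f"{n} processes: {avg:8.0f} cells each ({side} the rule of thumb)")
```

With 8 processes this comes to roughly 12,300 cells each, which is only just above that threshold.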

schmidt_d July 14, 2009 15:47

Hi,
There are two things to look for here. The first is that as you decompose the domain into more and more small subdomains, the surface-area-to-volume ratio of each subdomain increases. Here surface area is measured in number of faces and volume in number of cells. At some point, your machine spends more time communicating than processing. I would not divide a domain into chunks of fewer than 50K cells.
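
The trend in that ratio can be sketched with an idealized model. The assumption here (not something from this case) is that each subdomain is a cube of hexahedral cells, so a subdomain of n cells has about 6·n^(2/3) faces on its surface. The real mesh is tetrahedral and the real subdomains are irregular, so only the trend is meaningful, not the absolute numbers.

```python
# Idealized surface-to-volume ratio of one subdomain.
# Assumption: cube-shaped subdomains made of hexahedral cells.
TOTAL_CELLS = 98334  # cell count from the checkMesh output in this thread


def surface_to_volume(cells_per_subdomain):
    """Surface faces divided by cells for a cube-shaped block of hex cells."""
    side = cells_per_subdomain ** (1.0 / 3.0)   # cells along one edge
    surface_faces = 6.0 * side ** 2             # six sides of the cube
    return surface_faces / cells_per_subdomain


# Process counts are illustrative.
for n_procs in (1, 2, 4, 8):
    cells = TOTAL_CELLS / n_procs
    print(f"{n_procs} subdomains: {cells:8.0f} cells each, "
          f"faces/cells ratio {surface_to_volume(cells):.3f}")
```

In this model the ratio grows like n^(-1/3), so halving the cells per subdomain raises the relative communication surface by about 26%.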

Secondly, shared-memory machines like yours will start to hit memory-transfer bottlenecks, where the fast processors spend much of their time waiting on accesses to main memory. In unstructured codes, prefetching is hard, even with the special ordering OpenFOAM uses.
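
A minimal way to picture this bottleneck is a saturation model: speed-up grows with the number of cores until their combined memory traffic reaches what the shared memory system can deliver, then flattens. The bandwidth figures below are made-up placeholders chosen for illustration, not measurements of this machine, and the model ignores caches and communication entirely.

```python
# Bandwidth-saturation sketch for a shared-memory machine.
# Both bandwidth values are hypothetical placeholders, not measured data.
PER_CORE_DEMAND_GBS = 3.0    # assumed memory traffic one core generates
TOTAL_BANDWIDTH_GBS = 12.0   # assumed bandwidth shared by all cores


def bandwidth_limited_speedup(n_cores, per_core_demand, total_bandwidth):
    """Ideal speed-up capped at the core count that saturates memory."""
    saturation_cores = total_bandwidth / per_core_demand
    return min(float(n_cores), saturation_cores)


for n in (1, 2, 4, 8):
    s = bandwidth_limited_speedup(n, PER_CORE_DEMAND_GBS, TOTAL_BANDWIDTH_GBS)
    print(f"{n} cores: speed-up {s:.1f}")
```

This model only produces a plateau; an actual drop in performance beyond the plateau would come from extra costs such as the communication overhead discussed above.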

-David

skabilan July 16, 2009 14:43

Hi David,

Thanks for the valuable input. Makes sense...

Warm Regards,
Senthil

aspera August 13, 2009 18:13

Quote:

Originally Posted by skabilan (Post 217845)
Hi All,

I get linear speed-up up to 4 processors with icoFoam; when I try to use more than 4 processors, performance decreases. Has anyone come across this issue? I am happy to report more details if needed.

Regards,
Senthil


This happens because you are using Harpertown processors. See the benchmark results here:
http://www.fluent.com/software/fluen.../truck_14m.htm
Pay attention to the line for INTEL WHITEBOX (INTEL_X5482_HTN4, 3200, RHEL5).


All times are GMT -4.