CFD Online Discussion Forums - Superlinear speedup in OpenFOAM 13

Hi OpenFOAMers,

I just finished testing OpenFOAM speedup on an 8-CPU (16-core) Opteron machine loaded with 60 GB RAM. The results are pretty impressive given that the latency in an SMP-like system is really very low. Firstly, here are some technical details of the hardware used:

[cfd@sunfire icoFoam]$ uname -a
Linux sunfire 2.6.9-42.0.2.ELlargesmp #1 SMP Tue Aug 22 18:52:10 CDT 2006 x86_64 x86_64 x86_64 GNU/Linux

(Basically Scientific Linux 4.x)

[cfd@sunfire ~]$ cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 885
stepping : 2
cpu MHz : 2613.696
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 5230.07
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 1
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 885
stepping : 2
cpu MHz : 2613.696
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 5226.93
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 2
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 885
stepping : 2
cpu MHz : 2613.696
cache size : 1024 KB
physical id : 1
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 5226.92
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 3
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 885
stepping : 2
cpu MHz : 2613.696
cache size : 1024 KB
physical id : 1
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 5226.88
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 4
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 885
stepping : 2
cpu MHz : 2613.696
cache size : 1024 KB
physical id : 2
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 5226.90
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 5
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 885
stepping : 2
cpu MHz : 2613.696
cache size : 1024 KB
physical id : 2
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 5226.89
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 6
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 885
stepping : 2
cpu MHz : 2613.696
cache size : 1024 KB
physical id : 3
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 5226.90
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 7
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 885
stepping : 2
cpu MHz : 2613.696
cache size : 1024 KB
physical id : 3
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 5226.90
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 8
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 885
stepping : 2
cpu MHz : 2613.696
cache size : 1024 KB
physical id : 4
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 5226.90
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 9
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 885
stepping : 2
cpu MHz : 2613.696
cache size : 1024 KB
physical id : 4
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 5226.88
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 10
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 885
stepping : 2
cpu MHz : 2613.696
cache size : 1024 KB
physical id : 5
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 5226.89
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 11
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 885
stepping : 2
cpu MHz : 2613.696
cache size : 1024 KB
physical id : 5
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 5226.89
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 12
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 885
stepping : 2
cpu MHz : 2613.696
cache size : 1024 KB
physical id : 6
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 5226.88
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 13
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 885
stepping : 2
cpu MHz : 2613.696
cache size : 1024 KB
physical id : 6
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 5226.90
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 14
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 885
stepping : 2
cpu MHz : 2613.696
cache size : 1024 KB
physical id : 7
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 5226.89
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 15
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 885
stepping : 2
cpu MHz : 2613.696
cache size : 1024 KB
physical id : 7
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 5226.89
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

[cfd@sunfire ~]$ free -mot
total used free shared buffers cached
Mem: 59923 48535 11387 0 83 37052
Swap: 59839 6 59833
Total: 119763 48541 71221

checkMesh output reads:

[cfd@sunfire icoFoam]$ checkMesh . one_sq_cyl_3d_unsteady_wtavg_4_2_cpus
/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.3 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

Exec : checkMesh . one_sq_cyl_3d_unsteady_wtavg_4_2_cpus
Date : Jan 08 2007
Time : 18:48:18
Host : sunfire
PID : 13166
Root : /home/cfd/OpenFOAM/cfd-1.3/run/tutorials/icoFoam
Case : one_sq_cyl_3d_unsteady_wtavg_4_2_cpus
Nprocs : 1
Create time

Create polyMesh for time = constant

Time = constant
Boundary definition OK.

Number of points: 11042070
edges: 32811382
faces: 32498784
internal faces: 31878048
cells: 10729472
boundary patches: 4
point zones: 0
face zones: 0
cell zones: 0

Checking topology and geometry ...
Point usage check OK.

Upper triangular ordering OK.

Topological cell zip-up check OK.

Face vertices OK.

Face-face connectivity OK.

Basic topo ok ...

Checking patch topology for multiply connected surfaces ...

Patch Faces Points Surface
ChannelWalls 604352 604734 ok (not multiply connected)
ObstacleWalls 6144 6240 ok (not multiply connected)
vinlet 5120 5265 ok (not multiply connected)
poutlet 5120 5265 ok (not multiply connected)

Patch topo ok ...
Topology check done.

Domain bounding box: min = (-1.165 -0.02 -0.05) max = (0.705 0.02 0.05) meters.

Checking geometry...
Boundary openness in x-direction = -7.92611080393212e-19
Boundary openness in y-direction = -3.33563488923205e-14
Boundary openness in z-direction = 1.36264413686285e-15
Boundary closed (OK).
Max cell openness = 2.49399995957591e-21 Max aspect ratio = 1.74011094308106. All cells OK.

Minumum face area = 1.95312499999909e-07. Maximum face area = 7.44396594083542e-06. Face area magnitudes OK.

Min volume = 1.8600559925712e-10. Max volume = 7.06069406503132e-09. Total volume = 0.00747000000001155. Cell volumes OK.

Mesh non-orthogonality Max: 0 average: 0
Non-orthogonality check OK.

Face pyramids OK.

Max skewness = 2.84219262700115e-10 percent. Face skewness OK.

Minumum edge length = 0.000312499999999998. Maximum edge length = 0.00312657168090757.

All angles in faces are convex or less than 10 degrees concave.

Face flatness (1 = flat, 0 = butterfly) : average = 1 min = 1
All faces are flat in that the ratio between projected and actual area is > 0.8

Geometry check done.

Number of cells by type:
hexahedra: 10729472
prisms: 0
wedges: 0
pyramids: 0
tet wedges: 0
tetrahedra: 0
polyhedra: 0
Number of regions: 1 (OK).
Mesh OK.

Time = 0
No mesh.

End

fvSchemes reads:

/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.3 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

// FoamX Case Dictionary.

FoamFile
{
version 2.0;
format ascii;

root "/home/madhavan/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam";
case "";
instance "system";
local "";

class dictionary;
object fvSchemes;
}

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

ddtSchemes
{
default CrankNicholson 1;
}

gradSchemes
{
default Gauss linear;
grad(p) Gauss linear;
}

divSchemes
{
default none;
div(phi,U) Gauss linear;
}

laplacianSchemes
{
default none;
laplacian(nu,U) Gauss linear corrected;
laplacian(1|A(U),p) Gauss linear corrected;
}

interpolationSchemes
{
default linear;
interpolate(HbyA) linear;
}

snGradSchemes
{
default corrected;
}

fluxRequired
{
default no;
p;
}

// ************************************************** *********************** //

And fvSolution:

/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.3 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

// FoamX Case Dictionary.

FoamFile
{
version 2.0;
format ascii;

root "/home/madhavan/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam";
case "";
instance "system";
local "";

class dictionary;
object fvSolution;
}

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

solvers
{
// p ICCG 1e-06 0;
p AMG 1e-06 0 25;
U BICCG 1e-05 0;
}

PISO
{
momentumPredictor yes;
nCorrectors 2;
nNonOrthogonalCorrectors 0;
pRefCell 0;
pRefValue 0;
}

// ************************************************** *********************** //

The case I ran was laminar unsteady vortex shedding past a square cylinder in a rectangular channel. The solver was a slightly modified version of icoFoam. Modifications included calculation of Lift/Drag coefficients [stolen from Frank Bos ;)] and writing out time-averaged velocity/pressure and velocity probes (stolen from oodles). 19 probeLocations were defined in all the simulations. The speedup was calculated as follows:

Speedup from 'N' CPUs = (ClockTime for a serial run) / (ClockTime for a parallel run with 'N' CPUs)

The time step chosen was 0.02 seconds. This ensured that the Maximum Courant number was well below 1 (typically around 0.4). Starting from time t = 0, the simulation was run upto time t = 0.68 seconds (i.e. 34 times steps). 'writeFormat' in controlDict was set to 'binary' and 'writePrecision' to 15. Metis decomposition was used with equal processor weighting throughout. All parallel runs were dedicated (i.e. only I was using the machine). LAM MPI was used throughout. It was seen that in each of the parallel runs, the total RES memory reported by 'top' was around 10.2 GB.

Keeping in mind that dual cores are memory bandwidth limited, two types of parallel configurations were tested:

1. In the first parallel configuration, only one core from each physical processor was used. This was possible using the 'taskset' command in GNU/Linux which allows one to hard-request specific cores (i.e. override the kernel CPU affinity mask). This command also makes sure that until the process quits, it will be locked on to only the user-specified set of CPUs. Thus the maximum number of CPUs for this case was 8.

2. The second parallel configuration was composed of using 2-cores from each processor. Thus for a 4-CPU run, two physical processors were hard-requested and so on. In this configuration, one could go upto 16 CPUs in total.

The speedup results are available here [http://www.ualberta.ca/~madhavan/openfoam_speedup.eps].

It can be seen that the first kind of parallel configuration (i.e. using just 8 CPUs [one core from each physical processor]) exhibits what appears to be a case of super-linear speedup. This is explained in the following wikipedia entry:

http://en.wikipedia.org/wiki/Speedup

Has anyone experienced this with OpenFOAM before?

The second parallel configuration (i.e. 16 cores) displays acceptable speedup as well. However it should be noted that the maximum speedup in this case was around 15.2 using 16 cores. A slightly higher speedup (15.964) was obtained using just 8 CPUs when working in the first kind of parallel configuration. Also noteworthy is the fact that memory bandwidth limitation when using both cores does not seem to detrimentally impair the speedup.

A sample log file from an 8-CPU run is shown below:

/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.3 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.3 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.3 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.3 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

Exec : icoFoam . one_sq_cyl_3d_unsteady_wtavg_4_8_cpus -parallel
Exec : icoFoam . one_sq_cyl_3d_unsteady_wtavg_4_8_cpus -parallel
Exec : icoFoam . one_sq_cyl_3d_unsteady_wtavg_4_8_cpus -parallel
Exec : icoFoam . one_sq_cyl_3d_unsteady_wtavg_4_8_cpus -parallel
/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.3 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.3 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

Exec : icoFoam . one_sq_cyl_3d_unsteady_wtavg_4_8_cpus -parallel
Exec : icoFoam . one_sq_cyl_3d_unsteady_wtavg_4_8_cpus -parallel
/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.3 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

Exec : icoFoam . one_sq_cyl_3d_unsteady_wtavg_4_8_cpus -parallel
/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.3 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

Exec : icoFoam . one_sq_cyl_3d_unsteady_wtavg_4_8_cpus -parallel
[1] Date : Dec 27 2006
[1] Time : 11:17:35
[1] Host : sunfire
[1] PID : 5608
[5] Date : Dec 27 2006
[5] Time : 11:17:35
[5] Host : sunfire
[5] PID : 5612
[7] Date : Dec 27 2006
[7] Time : 11:17:35
[7] Host : sunfire
[7] PID : 5614
[3] Date : Dec 27 2006
[3] Time : 11:17:35
[3] Host : sunfire
[3] PID : 5610
[4] Date : Dec 27 2006
[4] Time : 11:17:35
[4] Host : sunfire
[4] PID : 5611
[0] Date : Dec 27 2006
[0] Time : 11:17:35
[0] Host : sunfire
[0] PID : 5607
[1] Root : /home/madhavan/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam
[1] Case : one_sq_cyl_3d_unsteady_wtavg_4_8_cpus
[1] Nprocs : 8
[2] Date : Dec 27 2006
[2] Time : 11:17:35
[2] Host : sunfire
[2] PID : 5609
[6] Date : Dec 27 2006
[6] Time : 11:17:35
[6] Host : sunfire
[6] PID : 5613
[5] Root : /home/madhavan/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam
[5] Case : one_sq_cyl_3d_unsteady_wtavg_4_8_cpus
[5] Nprocs : 8
[7] Root : /home/madhavan/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam
[7] Case : one_sq_cyl_3d_unsteady_wtavg_4_8_cpus
[7] Nprocs : 8
[3] Root : /home/madhavan/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam
[3] Case : one_sq_cyl_3d_unsteady_wtavg_4_8_cpus
[3] Nprocs : 8
[4] Root : /home/madhavan/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam
[4] Case : one_sq_cyl_3d_unsteady_wtavg_4_8_cpus
[4] Nprocs : 8
[2] Root : /home/madhavan/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam
[2] Case : one_sq_cyl_3d_unsteady_wtavg_4_8_cpus
[2] Nprocs : 8
[6] Root : /home/madhavan/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam
[6] Case : one_sq_cyl_3d_unsteady_wtavg_4_8_cpus
[6] Nprocs : 8
[0] Root : /home/madhavan/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam
[0] Case : one_sq_cyl_3d_unsteady_wtavg_4_8_cpus
[0] Nprocs : 8
[0] Slaves :
[0] 7
[0] (
[0] sunfire.5608
[0] sunfire.5609
[0] sunfire.5610
[0] sunfire.5611
[0] sunfire.5612
[0] sunfire.5613
[0] sunfire.5614
[0] )
[0]
Create time

Create mesh for time = 0

Reading transportProperties

Reading field p

Reading field U

Reading/calculating face flux field phi

Creating field Umean

Creating field pMean

Reading probeLocations

Constructing probes

Starting time loop

Time = 0.02

Mean and max Courant Numbers = 0 0.0799610193770155
BICCG: Solving for Ux, Initial residual = 0.999999999999942, Final residual = 1.72057068708726e-06, No Iterations 2
BICCG: Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
BICCG: Solving for Uz, Initial residual = 0, Final residual = 0, No Iterations 0
AMG: Solving for p, Initial residual = 1, Final residual = 9.48240838699873e-07, No Iterations 264
time step continuity errors : sum local = 6.34770499582916e-11, global = -4.66773069030591e-12, cumulative = -4.66773069030591e-12
AMG: Solving for p, Initial residual = 0.000327390016863783, Final residual = 9.50144270815434e-07, No Iterations 125
time step continuity errors : sum local = 7.58317575730968e-08, global = -7.09519972870107e-09, cumulative = -7.09986745939137e-09

Wall patch = 0
Wall patch name = ChannelWalls
Uav = (1 0 0)
Aref = 1
nu = nu [0 2 -1 0 0 0 0] 1.00481e-06
DragCoefficient = 2.39031097936705e-05
pressureDragCoefficient = 1.10457835730627e-19
viscDragCoefficient = 2.39031097936704e-05
LiftCoefficient = -2.7464517768576e-08

Wall patch = 1
Wall patch name = ObstacleWalls
Uav = (1 0 0)
Aref = 1
nu = nu [0 2 -1 0 0 0 0] 1.00481e-06
DragCoefficient = 1.53640797773116e-05
pressureDragCoefficient = 1.51743063164737e-05
viscDragCoefficient = 1.89773460837957e-07
LiftCoefficient = 2.19062878500774e-10

ExecutionTime = 429.61 s ClockTime = 430 s

Time = 0.04

Mean and max Courant Numbers = 0.0520366499170818 0.618464623572251
BICCG: Solving for Ux, Initial residual = 0.9503276306348, Final residual = 7.30093823030692e-07, No Iterations 4
BICCG: Solving for Uy, Initial residual = 0.336108715228218, Final residual = 7.75236516500687e-06, No Iterations 3
BICCG: Solving for Uz, Initial residual = 0.318782629311303, Final residual = 2.46974968726866e-06, No Iterations 3
AMG: Solving for p, Initial residual = 0.00142885986455076, Final residual = 9.59204590238137e-07, No Iterations 161
time step continuity errors : sum local = 3.31915427708201e-08, global = -3.85201065960417e-09, cumulative = -1.09518781189955e-08
AMG: Solving for p, Initial residual = 0.00125690267648775, Final residual = 9.90718932840224e-07, No Iterations 148
time step continuity errors : sum local = 8.61031360523968e-09, global = -1.00374894983836e-09, cumulative = -1.19556270688339e-08

Wall patch = 0
Wall patch name = ChannelWalls
Uav = (1 0 0)
Aref = 1
nu = nu [0 2 -1 0 0 0 0] 1.00481e-06
DragCoefficient = 3.11686822943573e-05
pressureDragCoefficient = -1.19036971197506e-20
viscDragCoefficient = 3.11686822943573e-05
LiftCoefficient = 5.08497633583287e-08

Wall patch = 1
Wall patch name = ObstacleWalls
Uav = (1 0 0)
Aref = 1
nu = nu [0 2 -1 0 0 0 0] 1.00481e-06
DragCoefficient = -8.73183986075596e-07
pressureDragCoefficient = -1.12158407004779e-06
viscDragCoefficient = 2.48400083972195e-07
LiftCoefficient = -4.33142310759729e-10

ExecutionTime = 773.79 s ClockTime = 774 s

Time = 0.06

Mean and max Courant Numbers = 0.0520549574469423 0.633176355927174
BICCG: Solving for Ux, Initial residual = 0.729870873945099, Final residual = 8.71082566621832e-07, No Iterations 4
BICCG: Solving for Uy, Initial residual = 0.0449089477055162, Final residual = 1.17121591863239e-06, No Iterations 3
BICCG: Solving for Uz, Initial residual = 0.429338306203659, Final residual = 2.25243687316615e-06, No Iterations 3
AMG: Solving for p, Initial residual = 0.00746482234535778, Final residual = 9.66298628563578e-07, No Iterations 172
time step continuity errors : sum local = 3.73219678904151e-09, global = 4.02713235054858e-10, cumulative = -1.1552913833779e-08
AMG: Solving for p, Initial residual = 0.000155512648150767, Final residual = 9.95720557424151e-07, No Iterations 114
time step continuity errors : sum local = 1.43764071652251e-08, global = -1.63319844581523e-09, cumulative = -1.31861122795943e-08

Wall patch = 0
Wall patch name = ChannelWalls
Uav = (1 0 0)
Aref = 1
nu = nu [0 2 -1 0 0 0 0] 1.00481e-06
DragCoefficient = 1.94601024085051e-05
pressureDragCoefficient = -4.21515010300766e-20
viscDragCoefficient = 1.94601024085052e-05
LiftCoefficient = -2.75174260612183e-08

Wall patch = 1
Wall patch name = ObstacleWalls
Uav = (1 0 0)
Aref = 1
nu = nu [0 2 -1 0 0 0 0] 1.00481e-06
DragCoefficient = -5.42732911306796e-06
pressureDragCoefficient = -5.57975173572317e-06
viscDragCoefficient = 1.52422622655207e-07
LiftCoefficient = 1.90975313446053e-10

ExecutionTime = 1096.18 s ClockTime = 1097 s

Time = 0.08

Mean and max Courant Numbers = 0.0520609078097951 0.573409966876077
BICCG: Solving for Ux, Initial residual = 0.907786134944961, Final residual = 7.13709554853266e-07, No Iterations 4
BICCG: Solving for Uy, Initial residual = 0.218107164255757, Final residual = 4.31544674425797e-06, No Iterations 3
BICCG: Solving for Uz, Initial residual = 0.483568109971064, Final residual = 9.75818435573959e-06, No Iterations 2
AMG: Solving for p, Initial residual = 0.00175055175027143, Final residual = 9.66179519860942e-07, No Iterations 159
time step continuity errors : sum local = 1.28147244147868e-08, global = 1.42689767854083e-09, cumulative = -1.17592146010535e-08
AMG: Solving for p, Initial residual = 0.00173420165308252, Final residual = 9.83219295244694e-07, No Iterations 155
time step continuity errors : sum local = 1.95495752173065e-09, global = -2.18349730379148e-10, cumulative = -1.19775643314326e-08

Wall patch = 0
Wall patch name = ChannelWalls
Uav = (1 0 0)
Aref = 1
nu = nu [0 2 -1 0 0 0 0] 1.00481e-06
DragCoefficient = 1.2096984821921e-05
pressureDragCoefficient = 9.00013655768262e-22
viscDragCoefficient = 1.2096984821921e-05
LiftCoefficient = 3.17872454814612e-08

Wall patch = 1
Wall patch name = ObstacleWalls
Uav = (1 0 0)
Aref = 1
nu = nu [0 2 -1 0 0 0 0] 1.00481e-06
DragCoefficient = 3.05866437660425e-07
pressureDragCoefficient = 2.16859882289008e-07
viscDragCoefficient = 8.9006555371417e-08
LiftCoefficient = -2.80144135822249e-10

ExecutionTime = 1445.04 s ClockTime = 1446 s

Time = 0.1

Mean and max Courant Numbers = 0.0520483817059996 0.500862085074748
BICCG: Solving for Ux, Initial residual = 0.0654195035431545, Final residual = 4.684699571965e-06, No Iterations 2
BICCG: Solving for Uy, Initial residual = 0.0133065421499664, Final residual = 9.06483525526088e-06, No Iterations 2
BICCG: Solving for Uz, Initial residual = 0.107661631606992, Final residual = 4.57633279853188e-06, No Iterations 2
AMG: Solving for p, Initial residual = 0.0147390588023079, Final residual = 9.98397594852791e-07, No Iterations 149
time step continuity errors : sum local = 2.97501995648839e-10, global = -3.17354895118249e-11, cumulative = -1.20092998209444e-08
AMG: Solving for p, Initial residual = 0.00327776568097673, Final residual = 9.97367895344957e-07, No Iterations 132
time step continuity errors : sum local = 9.72865748170479e-11, global = -1.06901988694658e-11, cumulative = -1.20199900198139e-08

Wall patch = 0
Wall patch name = ChannelWalls
Uav = (1 0 0)
Aref = 1
nu = nu [0 2 -1 0 0 0 0] 1.00481e-06
DragCoefficient = 1.14058184148547e-05
pressureDragCoefficient = 3.64388470612232e-22
viscDragCoefficient = 1.14058184148547e-05
LiftCoefficient = -4.16144498831859e-08

Wall patch = 1
Wall patch name = ObstacleWalls
Uav = (1 0 0)
Aref = 1
nu = nu [0 2 -1 0 0 0 0] 1.00481e-06
DragCoefficient = 2.70235447485773e-07
pressureDragCoefficient = 1.87830258208696e-07
viscDragCoefficient = 8.24051892770773e-08
LiftCoefficient = 2.32675948985381e-10

ExecutionTime = 1757.44 s ClockTime = 1758 s

I would appreciate if anyone shares their thoughts/comments in this regard. I just finished compiling OpenFOAM with mvapi (infiniband) support through openmpi and plan to run the same case for a comparison.