IcoFoam parallel woes

#1 - Srinath Madhavan (a.k.a pUl|), Senior Member - July 25, 2006, 21:43

Hi,

I'm trying the same cavity case as explained in the user guide, but I've increased the mesh to 10000 cells and decreased the time step from 0.005 to 0.001. The case runs happily in serial mode. When I decompose the domain and run in parallel, it seems to run fine for a while, after which a solution singularity suddenly appears. The BCs and problem definition are OK, as the case runs properly in serial. Any suggestions? (Please look at the time steps between 0.013 and 0.015 seconds.) Thanks for your help.

PS: At Time = 0.024, I hit Ctrl+C (hence the mpirun "signal 15" message at the end of the log below).

Also, when running the case in parallel again (from scratch), sometimes it works fine up to 0.2 or 0.3 seconds, sometimes less, and sometimes it just hangs with no error messages, as if waiting for something. Has anybody experienced such random behavior? I'm using OpenFOAM on a 64-bit IBM p360 (4-CPU) machine running SUSE Linux 10.1 and LAM/MPI (compiled from the sources that came with OpenFOAM).
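
For reference, the only changes from the stock cavity tutorial are the block resolution and the time step. A minimal sketch of those two edits (an assumption, not posted in the original; the 2500-cell subdomains with 50-face processor boundaries in the decomposePar output below imply a uniform 100 x 100 x 1 block):

// constant/polyMesh/blockMeshDict -- refine the single block from (20 20 1):
blocks
(
    hex (0 1 2 3 4 5 6 7) (100 100 1) simpleGrading (1 1 1)
);

// system/controlDict -- the reduced time step:
deltaT          0.001;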


madhavan@varese:~/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam> ls
Allclean cavity cavityGrade cavity_large_mesh_backup machines
Allrun cavityClipped cavity_large_mesh elbow resetFixedWallsScr

madhavan@varese:~/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam> decomposePar . cavity_large_mesh

/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.3 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

Exec : decomposePar . cavity_large_mesh
Date : Jul 25 2006
Time : 19:20:21
Host : varese
PID : 32389
Root : /home/madhavan/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam
Case : cavity_large_mesh
Nprocs : 1
Create time

Time = 0
Create mesh


Calculating distribution of cells
Selecting decompositionMethod simple

Finished decomposition in 0.45 s

Calculating original mesh data

Distributing cells to processors

Distributing faces to processors

Calculating processor boundary addressing

Distributing points to processors

Constructing processor meshes

Processor 0
Number of cells = 2500
Number of faces shared with processor 1 = 50
Number of faces shared with processor 2 = 50
Number of boundary faces = 5100

Processor 1
Number of cells = 2500
Number of faces shared with processor 0 = 50
Number of faces shared with processor 3 = 50
Number of boundary faces = 5100

Processor 2
Number of cells = 2500
Number of faces shared with processor 0 = 50
Number of faces shared with processor 3 = 50
Number of boundary faces = 5100

Processor 3
Number of cells = 2500
Number of faces shared with processor 1 = 50
Number of faces shared with processor 2 = 50
Number of boundary faces = 5100

Processor 0: field transfer
Processor 1: field transfer
Processor 2: field transfer
Processor 3: field transfer

End.
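
The decomposition above corresponds to a system/decomposeParDict along these lines (a sketch; the actual dictionary was not posted, but four 2500-cell subdomains each sharing 50 faces with two neighbours imply a 2 x 2 x 1 simple split):

numberOfSubdomains  4;

method              simple;

simpleCoeffs
{
    n       (2 2 1);
    delta   0.001;
}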

madhavan@varese:~/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam> mpirun -np 4 icoFoam . cavity_large_mesh -parallel
/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.3 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.3 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.3 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.3 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

Exec : icoFoam . cavity_large_mesh -parallel
Exec : icoFoam . cavity_large_mesh -parallel
Exec : icoFoam . cavity_large_mesh -parallel
Exec : icoFoam . cavity_large_mesh -parallel
[2] Date : Jul 25 2006
[2] Time : 19:21:14
[0] Date : Jul 25 2006
[0] Time : 19:21:14
[0] Host : varese
[2] Host : varese
[2] PID : 32393
[3] Date : Jul 25 2006
[3] Time : 19:21:14
[0] PID : 32391
[3] Host : varese
[3] PID : 32394
[1] Date : Jul 25 2006
[1] Time : 19:21:14
[1] Host : varese
[1] PID : 32392
[1] Root : /home/madhavan/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam
[1] Case : cavity_large_mesh
[1] Nprocs : 4
[2] Root : /home/madhavan/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam
[2] Case : cavity_large_mesh
[2] Nprocs : 4
[3] Root : /home/madhavan/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam
[3] Case : cavity_large_mesh
[0] Root : /home/madhavan/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam
[0] Case : cavity_large_mesh
[0] Nprocs : 4
[0] Slaves :
[0] 3
[3] Nprocs : 4
[0] (
[0] varese.32392
[0] varese.32393
[0] varese.32394
[0] )
[0]
Create time

Create mesh for time = 0

Reading transportProperties

Reading field p

Reading field U

Reading/calculating face flux field phi


Starting time loop

Time = 0.001

Mean and max Courant Numbers = 0 0
BICCG: Solving for Ux, Initial residual = 1, Final residual = 9.04991e-06, No Iterations 18
BICCG: Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
ICCG: Solving for p, Initial residual = 1, Final residual = 6.86206e-07, No Iterations 201
time step continuity errors : sum local = 4.28884e-10, global = -9.48776e-21, cumulative = -9.48776e-21
ICCG: Solving for p, Initial residual = 0.672536, Final residual = 8.34589e-07, No Iterations 200
time step continuity errors : sum local = 7.54547e-10, global = -5.38652e-20, cumulative = -6.3353e-20
ExecutionTime = 2.51 s ClockTime = 3 s

Time = 0.002

Mean and max Courant Numbers = 0.0296832 0.810069
BICCG: Solving for Ux, Initial residual = 0.119078, Final residual = 6.90333e-06, No Iterations 17
BICCG: Solving for Uy, Initial residual = 0.386927, Final residual = 6.9427e-06, No Iterations 18
ICCG: Solving for p, Initial residual = 0.722155, Final residual = 9.66376e-07, No Iterations 198
time step continuity errors : sum local = 9.31977e-10, global = -1.34216e-19, cumulative = -1.97569e-19
ICCG: Solving for p, Initial residual = 0.633029, Final residual = 9.07363e-07, No Iterations 197
time step continuity errors : sum local = 9.3323e-10, global = 1.01922e-19, cumulative = -9.56466e-20
ExecutionTime = 4.04 s ClockTime = 4 s

Time = 0.003

Mean and max Courant Numbers = 0.0449432 0.926016
BICCG: Solving for Ux, Initial residual = 0.047979, Final residual = 9.80318e-06, No Iterations 15
BICCG: Solving for Uy, Initial residual = 0.146506, Final residual = 4.93936e-06, No Iterations 17
ICCG: Solving for p, Initial residual = 0.711663, Final residual = 9.89994e-07, No Iterations 197
time step continuity errors : sum local = 7.55206e-10, global = 4.25957e-20, cumulative = -5.30509e-20
ICCG: Solving for p, Initial residual = 0.663622, Final residual = 8.84782e-07, No Iterations 197
time step continuity errors : sum local = 6.7351e-10, global = 8.68074e-21, cumulative = -4.43702e-20
ExecutionTime = 5.55 s ClockTime = 6 s

Time = 0.004

Mean and max Courant Numbers = 0.053069 0.930456
BICCG: Solving for Ux, Initial residual = 0.026967, Final residual = 8.26492e-06, No Iterations 14
BICCG: Solving for Uy, Initial residual = 0.104978, Final residual = 5.43694e-06, No Iterations 16
ICCG: Solving for p, Initial residual = 0.496791, Final residual = 9.2743e-07, No Iterations 196
time step continuity errors : sum local = 5.71826e-10, global = -6.46623e-22, cumulative = -4.50168e-20
ICCG: Solving for p, Initial residual = 0.45407, Final residual = 8.34512e-07, No Iterations 196
time step continuity errors : sum local = 5.17042e-10, global = 1.43151e-19, cumulative = 9.81339e-20
ExecutionTime = 7.07 s ClockTime = 7 s

Time = 0.005

Mean and max Courant Numbers = 0.0597545 0.939777
BICCG: Solving for Ux, Initial residual = 0.0206086, Final residual = 5.9467e-06, No Iterations 14
BICCG: Solving for Uy, Initial residual = 0.0574585, Final residual = 8.81759e-06, No Iterations 14
ICCG: Solving for p, Initial residual = 0.41837, Final residual = 7.66076e-07, No Iterations 195
time step continuity errors : sum local = 3.66536e-10, global = 1.93771e-19, cumulative = 2.91905e-19
ICCG: Solving for p, Initial residual = 0.38174, Final residual = 8.99953e-07, No Iterations 194
time step continuity errors : sum local = 4.33183e-10, global = -1.20382e-19, cumulative = 1.71523e-19
ExecutionTime = 8.55 s ClockTime = 9 s

Time = 0.006

Mean and max Courant Numbers = 0.0648963 0.946156
BICCG: Solving for Ux, Initial residual = 0.0139187, Final residual = 7.52264e-06, No Iterations 13
BICCG: Solving for Uy, Initial residual = 0.0455918, Final residual = 7.22159e-06, No Iterations 14
ICCG: Solving for p, Initial residual = 0.253609, Final residual = 8.96376e-07, No Iterations 192
time step continuity errors : sum local = 3.92679e-10, global = 5.5118e-20, cumulative = 2.26641e-19
ICCG: Solving for p, Initial residual = 0.229742, Final residual = 8.75365e-07, No Iterations 192
time step continuity errors : sum local = 3.8378e-10, global = -8.76987e-20, cumulative = 1.38943e-19
ExecutionTime = 10.02 s ClockTime = 10 s

Time = 0.007

Mean and max Courant Numbers = 0.0692539 0.949465
BICCG: Solving for Ux, Initial residual = 0.0116939, Final residual = 6.09785e-06, No Iterations 13
BICCG: Solving for Uy, Initial residual = 0.0320625, Final residual = 7.43667e-06, No Iterations 13
ICCG: Solving for p, Initial residual = 0.180605, Final residual = 9.52495e-07, No Iterations 189
time step continuity errors : sum local = 3.83623e-10, global = -1.58405e-19, cumulative = -1.94624e-20
ICCG: Solving for p, Initial residual = 0.163707, Final residual = 9.64875e-07, No Iterations 188
time step continuity errors : sum local = 3.90012e-10, global = -7.23862e-20, cumulative = -9.18486e-20
ExecutionTime = 11.47 s ClockTime = 12 s

Time = 0.008

Mean and max Courant Numbers = 0.0729413 0.952808
BICCG: Solving for Ux, Initial residual = 0.00909364, Final residual = 8.65935e-06, No Iterations 12
BICCG: Solving for Uy, Initial residual = 0.0269566, Final residual = 6.7032e-06, No Iterations 13
ICCG: Solving for p, Initial residual = 0.109123, Final residual = 8.15401e-07, No Iterations 188
time step continuity errors : sum local = 3.10857e-10, global = 1.21822e-20, cumulative = -7.96664e-20
ICCG: Solving for p, Initial residual = 0.0988474, Final residual = 9.23239e-07, No Iterations 187
time step continuity errors : sum local = 3.52258e-10, global = 1.454e-19, cumulative = 6.57332e-20
ExecutionTime = 12.91 s ClockTime = 13 s

Time = 0.009

Mean and max Courant Numbers = 0.0761604 0.954918
BICCG: Solving for Ux, Initial residual = 0.00788061, Final residual = 8.02051e-06, No Iterations 12
BICCG: Solving for Uy, Initial residual = 0.0216089, Final residual = 7.38888e-06, No Iterations 12
ICCG: Solving for p, Initial residual = 0.0751862, Final residual = 8.52566e-07, No Iterations 184
time step continuity errors : sum local = 3.22218e-10, global = -2.50286e-19, cumulative = -1.84553e-19
ICCG: Solving for p, Initial residual = 0.0683289, Final residual = 8.71176e-07, No Iterations 184
time step continuity errors : sum local = 3.2978e-10, global = 2.72231e-20, cumulative = -1.5733e-19
ExecutionTime = 14.32 s ClockTime = 14 s

Time = 0.01

Mean and max Courant Numbers = 0.078984 0.956873
BICCG: Solving for Ux, Initial residual = 0.00663957, Final residual = 5.93408e-06, No Iterations 12
BICCG: Solving for Uy, Initial residual = 0.0188549, Final residual = 6.51374e-06, No Iterations 12
ICCG: Solving for p, Initial residual = 0.0476218, Final residual = 7.87959e-07, No Iterations 183
time step continuity errors : sum local = 2.92796e-10, global = -3.27487e-20, cumulative = -1.90078e-19
ICCG: Solving for p, Initial residual = 0.0435532, Final residual = 8.46472e-07, No Iterations 182
time step continuity errors : sum local = 3.1477e-10, global = 9.94609e-20, cumulative = -9.06174e-20
ExecutionTime = 15.72 s ClockTime = 16 s

Time = 0.011

Mean and max Courant Numbers = 0.0815052 0.958373
BICCG: Solving for Ux, Initial residual = 0.00586825, Final residual = 9.88343e-06, No Iterations 11
BICCG: Solving for Uy, Initial residual = 0.0161717, Final residual = 9.9482e-06, No Iterations 11
ICCG: Solving for p, Initial residual = 0.0356458, Final residual = 9.13337e-07, No Iterations 182
time step continuity errors : sum local = 3.37795e-10, global = -2.71539e-19, cumulative = -3.62156e-19
ICCG: Solving for p, Initial residual = 0.0329001, Final residual = 7.4762e-07, No Iterations 182
time step continuity errors : sum local = 2.7672e-10, global = -1.66423e-19, cumulative = -5.28579e-19
ExecutionTime = 17.13 s ClockTime = 17 s

Time = 0.012

Mean and max Courant Numbers = 0.0837613 0.959674
BICCG: Solving for Ux, Initial residual = 0.00515329, Final residual = 8.0906e-06, No Iterations 11
BICCG: Solving for Uy, Initial residual = 0.0144256, Final residual = 7.93775e-06, No Iterations 11
ICCG: Solving for p, Initial residual = 0.0248828, Final residual = 8.89308e-07, No Iterations 179
time step continuity errors : sum local = 3.28367e-10, global = -6.074e-20, cumulative = -5.89319e-19
ICCG: Solving for p, Initial residual = 0.0232373, Final residual = 9.85799e-07, No Iterations 178
time step continuity errors : sum local = 3.64155e-10, global = 1.95186e-19, cumulative = -3.94132e-19
ExecutionTime = 18.5 s ClockTime = 19 s

Time = 0.013

Mean and max Courant Numbers = 0.0858044 0.960774
BICCG: Solving for Ux, Initial residual = 0.00462194, Final residual = 7.26935e-06, No Iterations 11
BICCG: Solving for Uy, Initial residual = 0.012806, Final residual = 7.36052e-06, No Iterations 11
ICCG: Solving for p, Initial residual = 0.0208117, Final residual = 8.85416e-07, No Iterations 179
time step continuity errors : sum local = 3.25548e-10, global = 9.66412e-20, cumulative = -2.97491e-19
ICCG: Solving for p, Initial residual = 0.0196253, Final residual = 8.99491e-07, No Iterations 179
time step continuity errors : sum local = 3.30787e-10, global = -7.37018e-21, cumulative = -3.04861e-19
ExecutionTime = 19.88 s ClockTime = 20 s

Time = 0.014

Mean and max Courant Numbers = 0.0876578 0.961719
BICCG: Solving for Ux, Initial residual = 0.00415071, Final residual = 6.38942e-06, No Iterations 11
BICCG: Solving for Uy, Initial residual = 0.0115789, Final residual = 1.44374e+95, No Iterations 1001
ICCG: Solving for p, Initial residual = 1, Final residual = 9.6264e-07, No Iterations 189
time step continuity errors : sum local = 1.29818e+87, global = 5.93881e+76, cumulative = 5.93881e+76
ICCG: Solving for p, Initial residual = 0.934444, Final residual = 9.29122e-07, No Iterations 189
time step continuity errors : sum local = 1.32389e+87, global = 5.63113e+75, cumulative = 6.50193e+76
ExecutionTime = 25.46 s ClockTime = 26 s

Time = 0.015

Mean and max Courant Numbers = 7.04549e+93 3.68526e+94
BICCG: Solving for Ux: solution singularity
BICCG: Solving for Uy: solution singularity
ICCG: Solving for p, Initial residual = 1, Final residual = 1.40439, No Iterations 5001
time step continuity errors : sum local = 1.88237e+101, global = 1.44676e+86, cumulative = 1.44676e+86
ICCG: Solving for p: solution singularity
time step continuity errors : sum local = 3.57117e+109, global = -6.875e+92, cumulative = -6.87499e+92
ExecutionTime = 39.12 s ClockTime = 39 s

Time = 0.016

Mean and max Courant Numbers = 1.45147e+109 1.83744e+112
BICCG: Solving for Ux: solution singularity
BICCG: Solving for Uy: solution singularity
ICCG: Solving for p: solution singularity
time step continuity errors : sum local = 1.67439e+110, global = 2.48722e+94, cumulative = 2.41847e+94
ICCG: Solving for p: solution singularity
time step continuity errors : sum local = 1.26382e+111, global = -1.14441e+95, cumulative = -9.0256e+94
ExecutionTime = 39.47 s ClockTime = 40 s

Time = 0.017

Mean and max Courant Numbers = 7.14503e+110 8.90101e+113
BICCG: Solving for Ux: solution singularity
BICCG: Solving for Uy: solution singularity
ICCG: Solving for p: solution singularity
time step continuity errors : sum local = 4.72906e+112, global = -1.80693e+96, cumulative = -1.89719e+96
ICCG: Solving for p: solution singularity
time step continuity errors : sum local = 2.80161e+113, global = 7.47804e+96, cumulative = 5.58085e+96
ExecutionTime = 39.82 s ClockTime = 40 s

Time = 0.018

Mean and max Courant Numbers = 1.77155e+113 4.14651e+116
BICCG: Solving for Ux: solution singularity
BICCG: Solving for Uy: solution singularity
ICCG: Solving for p: solution singularity
time step continuity errors : sum local = 6.67211e+114, global = 1.87091e+99, cumulative = 1.87649e+99
ICCG: Solving for p: solution singularity
time step continuity errors : sum local = 2.64446e+115, global = 5.34057e+99, cumulative = 7.21706e+99
ExecutionTime = 40.16 s ClockTime = 40 s

Time = 0.019

Mean and max Courant Numbers = 1.73067e+115 2.43263e+118
BICCG: Solving for Ux: solution singularity
BICCG: Solving for Uy: solution singularity
ICCG: Solving for p: solution singularity
time step continuity errors : sum local = 1.03355e+116, global = -2.80081e+99, cumulative = 4.41625e+99
ICCG: Solving for p: solution singularity
time step continuity errors : sum local = 8.83257e+116, global = 6.15057e+100, cumulative = 6.5922e+100
ExecutionTime = 40.51 s ClockTime = 41 s

Time = 0.02

Mean and max Courant Numbers = 5.19846e+116 7.38682e+119
BICCG: Solving for Ux: solution singularity
BICCG: Solving for Uy: solution singularity
ICCG: Solving for p: solution singularity
time step continuity errors : sum local = 3.63936e+117, global = 3.23517e+101, cumulative = 3.89439e+101
ICCG: Solving for p: solution singularity
time step continuity errors : sum local = 8.63794e+118, global = 5.14892e+102, cumulative = 5.53836e+102
ExecutionTime = 40.85 s ClockTime = 41 s

Time = 0.021

Mean and max Courant Numbers = 4.54631e+118 1.97763e+122
BICCG: Solving for Ux: solution singularity
BICCG: Solving for Uy: solution singularity
ICCG: Solving for p: solution singularity
time step continuity errors : sum local = 2.1432e+120, global = 1.32031e+104, cumulative = 1.3757e+104
ICCG: Solving for p: solution singularity
time step continuity errors : sum local = 4.05285e+121, global = 1.8166e+105, cumulative = 1.95417e+105
ExecutionTime = 41.2 s ClockTime = 41 s

Time = 0.022

Mean and max Courant Numbers = 2.11858e+121 8.97658e+124
BICCG: Solving for Ux: solution singularity
BICCG: Solving for Uy: solution singularity
ICCG: Solving for p: solution singularity
time step continuity errors : sum local = 1.7996e+122, global = 8.53168e+105, cumulative = 1.04858e+106
ICCG: Solving for p: solution singularity
time step continuity errors : sum local = 2.79712e+124, global = -1.02687e+108, cumulative = -1.01639e+108
ExecutionTime = 41.55 s ClockTime = 42 s

Time = 0.023

Mean and max Courant Numbers = 1.43656e+124 8.42775e+127
BICCG: Solving for Ux: solution singularity
BICCG: Solving for Uy: solution singularity
ICCG: Solving for p: solution singularity
time step continuity errors : sum local = 1.90088e+125, global = -4.21083e+109, cumulative = -4.31247e+109
ICCG: Solving for p: solution singularity
time step continuity errors : sum local = 1.54769e+126, global = -6.58994e+109, cumulative = -1.09024e+110
ExecutionTime = 41.89 s ClockTime = 42 s

Time = 0.024

Mean and max Courant Numbers = 1.07014e+126 1.86495e+129
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 32391 failed on node n0 (127.0.0.1) due to signal 15.
-----------------------------------------------------------------------------
madhavan@varese:~/OpenFOAM/madhavan-1.3/run/tutorials/icoFoam>

#2 - Srinath Madhavan (a.k.a pUl|), Senior Member - July 26, 2006, 14:09

Ok, I think I've solved the problem. Earlier, I used the following command to run icoFoam in parallel:

mpirun -np 4 icoFoam . cavity_large_mesh_parallel -parallel

and I faced those random problems...

Now, I use the command below to run icoFoam in parallel:

mpirun -ssi rpi lamd C icoFoam . cavity_large_mesh_parallel -parallel

and it seems to run without problems. Note that I did not have to specify '-np 4'; apparently lamd and the (Linux) kernel take care of spawning the 4 processes, as discussed here [1].

In both cases, I first run 'lamboot -v machines'.

The machines file (again in both cases) contains:

localhost cpu=4
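
Putting the working sequence together (a sketch; lamhalt is LAM's standard shutdown command and is not from the original post):

lamboot -v machines        # boot the LAM run-time on the hosts listed in 'machines'
mpirun -ssi rpi lamd C icoFoam . cavity_large_mesh_parallel -parallel
lamhalt                    # shut the LAM daemons down when finished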

[1] http://www.lam-mpi.org/MailArchives/...01/04/2457.php

#3 - Eugene de Villiers, Senior Member - July 27, 2006, 05:36

Can you make it crash repeatedly and consistently when using the old command?

#4 - Srinath Madhavan (a.k.a pUl|), Senior Member - July 27, 2006, 14:13

If 10 times counts as "repeatedly", then yes. However, the crashes occur at widely separated time steps: with the old command, it could crash at the very beginning or just before the last few time steps. Like I said, it appears random. Out of curiosity, why is that important? Thanks!

#5 - Eugene de Villiers, Senior Member - July 28, 2006, 04:30

If the occurrence is random, it is likely that the problem is with your hardware/drivers (or, very unlikely in your case, an uninitialised pointer object). If it occurs consistently, it is more likely a problem with Foam or your setup.

I have run many cases on 8-CPU shared-memory Opterons using the "mpirun -np 8" command without problems, so I doubt it is a Foam issue.

#6 - Srinath Madhavan (a.k.a pUl|), Senior Member - July 28, 2006, 15:21

Thanks, Eugene. Like I said, the thing works fine if I use:

mpirun -ssi rpi lamd C icoFoam . cavity_large_mesh_parallel -parallel

#7 - Srinath Madhavan (a.k.a pUl|), Senior Member - July 19, 2007, 07:02

FYI: I have discovered a better rpi for shared memory that works without any problems:

mpirun C -ssi rpi usysv icoFoam_1 . unsteady_validation_refined -parallel > unsteady_validation_refined/log 2>&1 &
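
For anyone trying this, the rpi transports compiled into a LAM build can be listed with laminfo (a sketch; the module descriptions below are from LAM 7.x documentation, not the original post):

laminfo    # lists the SSI modules, including the available rpi transports
# usysv: shared memory synchronised with spin locks
# sysv:  shared memory synchronised with SysV semaphores
# lamd:  messages routed through the LAM daemons (slower, but more conservative)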

#8 - Srinath Madhavan (a.k.a pUl|), Senior Member - July 21, 2007, 05:58

Update: on this ppc64 system, although LAM 7.1.3 compiles fine, the MPI process randomly fails at different time steps even if I choose a different rpi such as usysv (basically the same problem as described at the beginning of this thread). An MPICH build fails with the following error message:

gcc -m64 -o serv_p4 serv_p4.o server_ssl.o -lcrypt
bin/mpicc -o /home/madhavan/OpenFOAM/OpenFOAM-1.4/src/mpich-1.2.7p1/bin/mpichversion /home/madhavan/OpenFOAM/OpenFOAM-1.4/src/mpich-1.2.7p1/util/mpichversion.c
/home/madhavan/OpenFOAM/OpenFOAM-1.4/src/mpich-1.2.7p1/util/mpichversion.c: In function 'main':
/home/madhavan/OpenFOAM/OpenFOAM-1.4/src/mpich-1.2.7p1/util/mpichversion.c:67: warning: incompatible implicit declaration of built-in function 'exit'
collect2: ld terminated with signal 11 [Segmentation fault]
/usr/bin/ld: warning: powerpc:common architecture of input file `mpichversion.o' is incompatible with powerpc:common64 output
mpichversion.o: In function `main':
mpichversion.c:(.text+0xe0): relocation truncated to fit: R_PPC_REL24 against `strcmp'
mpichversion.c:(.text+0x124): relocation truncated to fit: R_PPC_REL24 against `strcmp'
mpichversion.c:(.text+0x168): relocation truncated to fit: R_PPC_REL24 against `strcmp'
mpichversion.c:(.text+0x1ac): relocation truncated to fit: R_PPC_REL24 against `strcmp'
mpichversion.c:(.text+0x1f0): relocation truncated to fit: R_PPC_REL24 against `strcmp'
make[1]: *** [mpi-utils] Error 1
make: *** [mpi] Error 2


Luckily, Open MPI 1.2.3 built successfully and I am testing it now.
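
In case it helps others, the equivalent Open MPI invocation needs no lamboot step, since mpirun launches its own daemons; note that the hostfile syntax also differs from LAM's (a sketch, not from the original post):

# the machines file for Open MPI would read:  localhost slots=4
mpirun -np 4 --hostfile machines icoFoam . cavity_large_mesh -parallel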

#9 - iyer_arvind (Guest) - July 21, 2007, 06:22

I do not know if your problem is the same as mine, but in my case, on a 128-node ppc64 machine, I had compiled LAM and OpenFOAM and was facing a very similar problem: the run used to stop randomly. I was able to solve it by compiling OpenFOAM against the MPICH libraries I found already installed on the system.

I really don't know why it used to stop, but the above solved the problem completely.

#10 - Srinath Madhavan (a.k.a pUl|), Senior Member - July 22, 2007, 03:58

Thanks for sharing your experience. It appears that using Open MPI 1.2.3 has solved this problem for good. By the way, I easily get a 2x speedup when using two processors. IBM sure did a nice job with their dual cores.
