
Same case run twice I got different results, what's going wrong?


April 7, 2014, 17:51   #1
Daniel WEI (老魏) (lakeat), Senior Member, Boeing Research & Technology, Beijing, China
I ran a cavity case with icoFoam in parallel on 32 cores (i.e. two nodes) twice, but I got different results. By "different" I mean that the log files show different residuals.

log-1:
Quote:
Time = 2e-05

Courant Number mean: 0 max: 0
smoothSolver: Solving for Ux, Initial residual = 1, Final residual = 4.22814e-13, No Iterations 1000
smoothSolver: Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
GAMG: Solving for p, Initial residual = 1, Final residual = 4.09553e-11, No Iterations 1000
time step continuity errors : sum local = 8.15199e-17, global = -3.71519e-21, cumulative = -3.71519e-21
GAMG: Solving for p, Initial residual = 0.717435, Final residual = 4.10012e-13, No Iterations 1000
time step continuity errors : sum local = 1.13405e-18, global = 2.562e-22, cumulative = -3.45899e-21
ExecutionTime = 20.07 s ClockTime = 22 s
log-2:
Quote:
Time = 2e-05

Courant Number mean: 0 max: 0
smoothSolver: Solving for Ux, Initial residual = 1, Final residual = 4.22814e-13, No Iterations 1000
smoothSolver: Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
GAMG: Solving for p, Initial residual = 1, Final residual = 4.12736e-11, No Iterations 1000
time step continuity errors : sum local = 8.22913e-17, global = -2.10757e-21, cumulative = -2.10757e-21
GAMG: Solving for p, Initial residual = 0.717435, Final residual = 4.10737e-13, No Iterations 1000
time step continuity errors : sum local = 1.13587e-18, global = 2.40503e-23, cumulative = -2.08352e-21
ExecutionTime = 20.26 s ClockTime = 22 s
As you can see, already at this first time step the final residual for p starts to differ. I find this very confusing: the nodes used in both simulations belong to the same cluster, a newly bought one. I have tried tightening the tolerance even to 1e-30, but it does not help. relTol is set to zero.

Is there some random behavior in the CPUs? Or can we blame the AMG solver for this? (The scotch partitioning method is used.) Any ideas?
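For completeness, the decomposition and run setup is essentially the following (just a sketch; the hostfile name in the comment is a placeholder):
Code:
    // system/decomposeParDict (sketch of the setup described above)
    numberOfSubdomains 32;

    method          scotch;

    // launched with something like (hostfile name is a placeholder):
    //   mpirun -np 32 -hostfile machines icoFoam -parallel > log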

Last edited by lakeat; April 8, 2014 at 15:13.

April 8, 2014, 02:03   #2
Alexey Matveichev (alexeym), Senior Member, Nancy, France
Hi,

it seems you've got rather strange settings in fvSolution. 1000 is the default maximum number of iterations for the linear solvers, and since GAMG cannot satisfy your tolerance within 1000 iterations, the final residual can differ. Wrong GAMG settings can also lead to this situation; try switching to PCG and check whether the behavior is the same.
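Something along these lines for p, as a starting point (just a sketch, keeping the tolerances you quoted):
Code:
    // sketch of a PCG setup for the pressure equation;
    // maxIter is optional and defaults to 1000
    p
    {
        solver          PCG;
        preconditioner  DIC;
        tolerance       1e-06;
        relTol          0;
        maxIter         1000;
    }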

April 8, 2014, 10:05   #3
Daniel WEI (老魏) (lakeat), Senior Member, Beijing, China
I've never had success using the PCG solver in parallel; it always blows up.

Using a looser tolerance (p 1.0e-6, U 1.0e-5), as you can see below, the number of iterations no longer reaches the maximum of 1000, yet the difference is still there. Why? One run takes 120 iterations, the other 122. I don't know why there is such random behavior in the solver.

Quote:
Time = 2e-05

Courant Number mean: 0 max: 0
smoothSolver: Solving for Ux, Initial residual = 1, Final residual = 9.97037e-06, No Iterations 324
smoothSolver: Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
GAMG: Solving for p, Initial residual = 1, Final residual = 8.2019e-07, No Iterations 120
time step continuity errors : sum local = 1.4654e-12, global = -2.64538e-21, cumulative = -2.64538e-21
GAMG: Solving for p, Initial residual = 0.717438, Final residual = 8.45058e-07, No Iterations 115
time step continuity errors : sum local = 2.09456e-12, global = 5.24107e-23, cumulative = -2.59297e-21
ExecutionTime = 4.77 s ClockTime = 7 s
Quote:
Time = 2e-05

Courant Number mean: 0 max: 0
smoothSolver: Solving for Ux, Initial residual = 1, Final residual = 9.97037e-06, No Iterations 324
smoothSolver: Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
GAMG: Solving for p, Initial residual = 1, Final residual = 8.42585e-07, No Iterations 122
time step continuity errors : sum local = 1.5012e-12, global = -4.18191e-21, cumulative = -4.18191e-21
GAMG: Solving for p, Initial residual = 0.717438, Final residual = 8.17938e-07, No Iterations 118
time step continuity errors : sum local = 2.00798e-12, global = -1.10993e-23, cumulative = -4.19301e-21
ExecutionTime = 4.51 s ClockTime = 6 s
And even for the two results I showed in the first post: if in both runs the solver has not converged by the 1000th iteration, why the difference? I would have thought they should "converge" to the same point, or "diverge" at the same point, shouldn't they?

Edit:
-----
Here are the results of using the PCG solver for the pressure (with the same tolerance and relTol: 1.0e-6 and 0.0 for pressure, 1.0e-5 and 0.0 for velocity); there is still a difference in the final residuals. Note that it blows up at the 14th time step; what you see here is only the residual log of the first step:
Quote:
Time = 2e-05

Courant Number mean: 0 max: 0
smoothSolver: Solving for Ux, Initial residual = 1, Final residual = 9.69316e-06, No Iterations 160
smoothSolver: Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
DICPCG: Solving for p, Initial residual = 1, Final residual = 0.173177, No Iterations 1001
time step continuity errors : sum local = 3.25419e-07, global = 3.30872e-23, cumulative = 3.30872e-23
DICPCG: Solving for p, Initial residual = 0.839426, Final residual = 0.0686308, No Iterations 1001
time step continuity errors : sum local = 1.96116e-07, global = 5.67777e-22, cumulative = 6.00864e-22
ExecutionTime = 2.29 s ClockTime = 7 s
Quote:
Time = 2e-05

Courant Number mean: 0 max: 0
smoothSolver: Solving for Ux, Initial residual = 1, Final residual = 9.69316e-06, No Iterations 160
smoothSolver: Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
DICPCG: Solving for p, Initial residual = 1, Final residual = 0.173388, No Iterations 1001
time step continuity errors : sum local = 3.25795e-07, global = -2.8058e-22, cumulative = -2.8058e-22
DICPCG: Solving for p, Initial residual = 0.83943, Final residual = 0.0685477, No Iterations 1001
time step continuity errors : sum local = 1.95924e-07, global = -2.50139e-22, cumulative = -5.30719e-22
ExecutionTime = 3.48 s ClockTime = 5 s
PS: My solver settings for these cases.
Code:
    p
    {
       solver          GAMG;
       tolerance       1e-6;
       relTol          0.0;
       smoother        GaussSeidel;
       nPreSweeps      0;
       nPostSweeps     2;
       cacheAgglomeration on;
       agglomerator    faceAreaPair;
       nCellsInCoarsestLevel 20;
       mergeLevels     1;
    }

    pFinal
    {
       $p;
       smoother        DICGaussSeidel;
       tolerance       1e-6;
       relTol          0;
    };

    U
    {
       solver          smoothSolver;
       smoother        GaussSeidel;
       tolerance       1e-5;
       relTol          0;
    }
Code:
    p   
    {   
        solver          PCG;
        preconditioner  DIC;
        tolerance       1e-06;
        relTol          0.0;
    }   
        
    pFinal   
    {   
        $p;
        relTol          0;
    }   
        
    "(U|k|B|nuTilda)"   
    {   
        solver          smoothSolver;
        smoother        symGaussSeidel;
        tolerance       1e-05;
        relTol          0;  
    }

April 8, 2014, 10:40   #4
Alexey Matveichev (alexeym), Senior Member, Nancy, France
Well,

In the first part of your answer you used the GAMG solver; can you show your settings for it? Since GAMG interpolates between coarse and fine meshes, the difference in final residuals may come from that operation.

Blow-up of PCG usually means that your initial or boundary conditions do not make sense. Can you describe your case a bit: show your fvSchemes, fvSolution, and boundary conditions? For example, with the cavity case from the tutorials (which uses PCG and PBiCG), the residuals between runs are identical.
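For reference, the tutorial's fvSolution is more or less the following (quoted from memory, so the exact values may differ between OpenFOAM versions):
Code:
    // fvSolution of the stock icoFoam cavity tutorial (approximate;
    // exact contents differ between OpenFOAM versions)
    solvers
    {
        p
        {
            solver          PCG;
            preconditioner  DIC;
            tolerance       1e-06;
            relTol          0;
        }

        U
        {
            solver          PBiCG;
            preconditioner  DILU;
            tolerance       1e-05;
            relTol          0;
        }
    }

    PISO
    {
        nCorrectors     2;
        nNonOrthogonalCorrectors 0;
        pRefCell        0;
        pRefValue       0;
    }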

April 8, 2014, 10:54   #5
Daniel WEI (老魏) (lakeat), Senior Member, Beijing, China
It is exactly the same case, copied from the icoFoam cavity tutorial. The only difference is the mesh: the original is 20*20 and I refined it to 1000*1000. The fvSchemes settings are also the same. The boundary and initial conditions have been double-checked with vimdiff; they show no difference.

I also want to add that I have not seen this with my 4-core and 16-core decompositions. It only starts to happen when I use 32+ cores. It seems to have nothing to do with the maximum iteration count of 1000, because in a 4-core simulation, if I run twice with tolerance 1e-16 and relTol 0.0, the two log files show the same final residuals.

April 8, 2014, 11:02   #6
Alexey Matveichev (alexeym), Senior Member, Nancy, France
OK.

- What decomposition method do you use?
- Does 4 and 16 core decomposition mean that you're running the case on a single node?

April 8, 2014, 11:06   #7
Daniel WEI (老魏) (lakeat), Senior Member, Beijing, China
Quote:
Originally Posted by alexeym View Post
OK.

- What decomposition method do you use?
- Does 4 and 16 core decomposition mean that you're running the case on a single node?
(1) The decomposition method is scotch, but I have also tried simple with (32 1 1), and the final residuals still differ.

(2) 4 cores: that is on my personal workstation. 16 cores: that runs on a high-performance cluster, and with 16 cores on that cluster the case runs on a single node.

Why not try it yourself on your cluster? You just need to refine the cavity grid to 1000*1000*1, set the time step to 2e-5 and endTime to 0.001. With 32 cores it takes less than one minute.
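PS: the simple decomposition mentioned in (1) is essentially this (sketch):
Code:
    // system/decomposeParDict, "simple" variant mentioned in (1)
    numberOfSubdomains 32;

    method          simple;

    simpleCoeffs
    {
        n           (32 1 1);
        delta       0.001;
    }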

April 8, 2014, 11:11   #8
Alexey Matveichev (alexeym), Senior Member, Nancy, France
And the case blows up only when run on more than one node (i.e. the 4- and 16-core variants converge)?

April 8, 2014, 11:14   #9
Daniel WEI (老魏) (lakeat), Senior Member, Beijing, China
Oops! Can anyone verify this by running a small test on your machine?

Copy the cavity case from the tutorials, refine the mesh to 1000*1000*1, change deltaT to 2e-5 and endTime to 1e-3, keep everything else unchanged, and run icoFoam.

I ask because I found that even with one core the PCG solver is not able to converge. What have I done wrong?
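In other words, starting from the stock cavity case the only edits should be these (a sketch; the blockMeshDict lives in constant/polyMesh or system depending on your OpenFOAM version):
Code:
    // blockMeshDict: change the cell counts from (20 20 1)
    blocks
    (
        hex (0 1 2 3 4 5 6 7) (1000 1000 1) simpleGrading (1 1 1)
    );

    // system/controlDict: smaller time step and shorter end time
    deltaT          2e-05;
    endTime         0.001;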

Last edited by lakeat; April 8, 2014 at 15:09.

April 8, 2014, 14:52   #10
Bruno Santos (wyldckat), Retired Super Moderator, Lisbon, Portugal
Just a couple of quick notes:
  1. Do not disregard the compilation options! If you're using ICC and used the "-ffast-math" option, then it's natural that the results are slightly different.
    For some reference, check this comment: http://www.cfd-online.com/Forums/blo...ml#comment1577
  2. And make sure the BIOS settings are identical on all nodes... at least the settings related to the CPUs.

April 8, 2014, 15:02   #11
Daniel WEI (老魏) (lakeat), Senior Member, Beijing, China
Quote:
Originally Posted by wyldckat View Post
Just a couple of quick notes:
  1. Do not disregard the compilation options! If you're using ICC and used the "-ffast-math" option, then it's natural that the results are slightly different.
    For some reference, check this comment: http://www.cfd-online.com/Forums/blo...ml#comment1577
  2. And make sure the BIOS settings are identical on all nodes... at least the settings related to the CPUs.

I don't think I have used that option; I just use "-O2 -no-prec-div", and of course I have tried "-O3" as well.
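For reference, these flags sit in the wmake rules of the installation; roughly something like this (path and exact contents depend on the OpenFOAM version and compiler, and the -fp-model remark is a general ICC note, not something tested here):
Code:
    # $WM_PROJECT_DIR/wmake/rules/linux64Icc/c++Opt (approximate contents)
    c++DBUG    =
    c++OPT     = -O2 -no-prec-div

    # note: -no-prec-div itself relaxes division precision; for value-safe
    # floating point with ICC one would typically use -fp-model precise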

Quote:
And make sure the BIOS settings are identical on all nodes... at least the settings related to the CPUs.
Well, I am not ready to piss them off. But I don't see a reason why our cluster administrators would use different BIOS/CPU settings on different nodes (just a guess).

PS: post #9 has been edited. Thanks
