
Same case run twice I got different results, what's going wrong?


April 7, 2014, 17:51   #1
Daniel WEI (老魏) (lakeat), Senior Member, Boeing Research & Technology, Beijing, China
I ran a cavity case with icoFoam in parallel on 32 cores (i.e. two nodes) twice, but I got different results. By "different" I mean that the log files show different residuals.

log-1:
Quote:
Time = 2e-05

Courant Number mean: 0 max: 0
smoothSolver: Solving for Ux, Initial residual = 1, Final residual = 4.22814e-13, No Iterations 1000
smoothSolver: Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
GAMG: Solving for p, Initial residual = 1, Final residual = 4.09553e-11, No Iterations 1000
time step continuity errors : sum local = 8.15199e-17, global = -3.71519e-21, cumulative = -3.71519e-21
GAMG: Solving for p, Initial residual = 0.717435, Final residual = 4.10012e-13, No Iterations 1000
time step continuity errors : sum local = 1.13405e-18, global = 2.562e-22, cumulative = -3.45899e-21
ExecutionTime = 20.07 s ClockTime = 22 s
log-2:
Quote:
Time = 2e-05

Courant Number mean: 0 max: 0
smoothSolver: Solving for Ux, Initial residual = 1, Final residual = 4.22814e-13, No Iterations 1000
smoothSolver: Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
GAMG: Solving for p, Initial residual = 1, Final residual = 4.12736e-11, No Iterations 1000
time step continuity errors : sum local = 8.22913e-17, global = -2.10757e-21, cumulative = -2.10757e-21
GAMG: Solving for p, Initial residual = 0.717435, Final residual = 4.10737e-13, No Iterations 1000
time step continuity errors : sum local = 1.13587e-18, global = 2.40503e-23, cumulative = -2.08352e-21
ExecutionTime = 20.26 s ClockTime = 22 s
As you can see, already at this first time step the final residual for p starts to differ. I find this very confusing: the nodes used in both simulations belong to the same cluster, a newly bought one. I have tried tightening the tolerance even to 1e-30, but it does not help. relTol is set to zero.

Is there some random behavior in the CPUs? Or can we blame the AMG solver for this? (The scotch partitioning method is used.) Any ideas?
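For completeness, the decomposition and run setup is essentially the following (just a sketch; the hostfile name in the comment is a placeholder):
Code:
    // system/decomposeParDict (sketch of the setup described above)
    numberOfSubdomains 32;

    method          scotch;

    // launched with something like (hostfile name is a placeholder):
    //   mpirun -np 32 -hostfile machines icoFoam -parallel > log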

Last edited by lakeat; April 8, 2014 at 15:13.

April 8, 2014, 02:03   #2
Alexey Matveichev (alexeym), Senior Member, Nancy, France
Hi,

it seems you've got rather strange settings in fvSolution. 1000 is the default maximum number of iterations for the linear solvers, and since GAMG cannot satisfy your tolerance within 1000 iterations, the final residual can differ. Wrong GAMG settings can also lead to this situation; try switching to PCG and check whether the behavior is the same.
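Something along these lines for p, as a starting point (just a sketch, keeping the tolerances you quoted):
Code:
    // sketch of a PCG setup for the pressure equation;
    // maxIter is optional and defaults to 1000
    p
    {
        solver          PCG;
        preconditioner  DIC;
        tolerance       1e-06;
        relTol          0;
        maxIter         1000;
    }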

April 8, 2014, 10:05   #3
Daniel WEI (老魏) (lakeat), Senior Member, Beijing, China
I've never had success using the PCG solver in parallel; it always blows up.

Using a looser tolerance (p 1.0e-6, U 1.0e-5), as you can see below, the number of iterations no longer reaches the maximum of 1000, yet the difference is still there. Why? One run takes 120 iterations, the other 122. I don't know why there is such random behavior in the solver.

Quote:
Time = 2e-05

Courant Number mean: 0 max: 0
smoothSolver: Solving for Ux, Initial residual = 1, Final residual = 9.97037e-06, No Iterations 324
smoothSolver: Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
GAMG: Solving for p, Initial residual = 1, Final residual = 8.2019e-07, No Iterations 120
time step continuity errors : sum local = 1.4654e-12, global = -2.64538e-21, cumulative = -2.64538e-21
GAMG: Solving for p, Initial residual = 0.717438, Final residual = 8.45058e-07, No Iterations 115
time step continuity errors : sum local = 2.09456e-12, global = 5.24107e-23, cumulative = -2.59297e-21
ExecutionTime = 4.77 s ClockTime = 7 s
Quote:
Time = 2e-05

Courant Number mean: 0 max: 0
smoothSolver: Solving for Ux, Initial residual = 1, Final residual = 9.97037e-06, No Iterations 324
smoothSolver: Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
GAMG: Solving for p, Initial residual = 1, Final residual = 8.42585e-07, No Iterations 122
time step continuity errors : sum local = 1.5012e-12, global = -4.18191e-21, cumulative = -4.18191e-21
GAMG: Solving for p, Initial residual = 0.717438, Final residual = 8.17938e-07, No Iterations 118
time step continuity errors : sum local = 2.00798e-12, global = -1.10993e-23, cumulative = -4.19301e-21
ExecutionTime = 4.51 s ClockTime = 6 s
And even for the two results I showed in the first post: if in both runs the solver has not converged by the 1000th iteration, why the difference? I would have thought they should "converge" to the same point, or "diverge" at the same point, shouldn't they?

Edit:
-----
Here are the results of using the PCG solver for the pressure (with the same tolerance and relTol: 1.0e-6 and 0.0 for pressure, 1.0e-5 and 0.0 for velocity); there is still a difference in the final residuals. Note that it blows up at the 14th time step; what you see here is only the residual log of the first step:
Quote:
Time = 2e-05

Courant Number mean: 0 max: 0
smoothSolver: Solving for Ux, Initial residual = 1, Final residual = 9.69316e-06, No Iterations 160
smoothSolver: Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
DICPCG: Solving for p, Initial residual = 1, Final residual = 0.173177, No Iterations 1001
time step continuity errors : sum local = 3.25419e-07, global = 3.30872e-23, cumulative = 3.30872e-23
DICPCG: Solving for p, Initial residual = 0.839426, Final residual = 0.0686308, No Iterations 1001
time step continuity errors : sum local = 1.96116e-07, global = 5.67777e-22, cumulative = 6.00864e-22
ExecutionTime = 2.29 s ClockTime = 7 s
Quote:
Time = 2e-05

Courant Number mean: 0 max: 0
smoothSolver: Solving for Ux, Initial residual = 1, Final residual = 9.69316e-06, No Iterations 160
smoothSolver: Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
DICPCG: Solving for p, Initial residual = 1, Final residual = 0.173388, No Iterations 1001
time step continuity errors : sum local = 3.25795e-07, global = -2.8058e-22, cumulative = -2.8058e-22
DICPCG: Solving for p, Initial residual = 0.83943, Final residual = 0.0685477, No Iterations 1001
time step continuity errors : sum local = 1.95924e-07, global = -2.50139e-22, cumulative = -5.30719e-22
ExecutionTime = 3.48 s ClockTime = 5 s
PS: My solver settings for these cases.
Code:
    p
    {
       solver          GAMG;
       tolerance       1e-6;
       relTol          0.0;
       smoother        GaussSeidel;
       nPreSweeps      0;
       nPostSweeps     2;
       cacheAgglomeration on;
       agglomerator    faceAreaPair;
       nCellsInCoarsestLevel 20;
       mergeLevels     1;
    }

    pFinal
    {
       $p;
       smoother        DICGaussSeidel;
       tolerance       1e-6;
       relTol          0;
    };

    U
    {
       solver          smoothSolver;
       smoother        GaussSeidel;
       tolerance       1e-5;
       relTol          0;
    }
Code:
    p   
    {   
        solver          PCG;
        preconditioner  DIC;
        tolerance       1e-06;
        relTol          0.0;
    }   
        
    pFinal   
    {   
        $p;
        relTol          0;
    }   
        
    "(U|k|B|nuTilda)"   
    {   
        solver          smoothSolver;
        smoother        symGaussSeidel;
        tolerance       1e-05;
        relTol          0;  
    }

April 8, 2014, 10:40   #4
Alexey Matveichev (alexeym), Senior Member, Nancy, France
Well,

In the first part of your answer you used the GAMG solver; can you show your settings for it? Since GAMG interpolates between coarse and fine meshes, the difference in final residuals may come from that operation.

Blow-up of PCG usually means that your initial or boundary conditions do not make sense. Can you describe your case a bit: show your fvSchemes, fvSolution, and boundary conditions? For example, with the cavity case from the tutorials (which uses PCG and PBiCG), the residuals between runs are identical.
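For reference, the tutorial's fvSolution is more or less the following (quoted from memory, so the exact values may differ between OpenFOAM versions):
Code:
    // fvSolution of the stock icoFoam cavity tutorial (approximate;
    // exact contents differ between OpenFOAM versions)
    solvers
    {
        p
        {
            solver          PCG;
            preconditioner  DIC;
            tolerance       1e-06;
            relTol          0;
        }

        U
        {
            solver          PBiCG;
            preconditioner  DILU;
            tolerance       1e-05;
            relTol          0;
        }
    }

    PISO
    {
        nCorrectors     2;
        nNonOrthogonalCorrectors 0;
        pRefCell        0;
        pRefValue       0;
    }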

April 8, 2014, 10:54   #5
Daniel WEI (老魏) (lakeat), Senior Member, Beijing, China
It is exactly the same case, copied from the icoFoam cavity tutorial. The only difference is the mesh: the original is 20*20 and I refined it to 1000*1000. The fvSchemes settings are also the same. The boundary and initial conditions have been double-checked with vimdiff; they show no difference.

I also want to add that I have not seen this with my 4-core and 16-core decompositions. It only starts to happen when I use 32+ cores. It seems to have nothing to do with the maximum iteration count of 1000, because in a 4-core simulation, if I run twice with tolerance 1e-16 and relTol 0.0, the two log files show the same final residuals.

April 8, 2014, 11:02   #6
Alexey Matveichev (alexeym), Senior Member, Nancy, France
OK.

- What decomposition method do you use?
- Does 4 and 16 core decomposition mean that you're running the case on a single node?

April 8, 2014, 11:06   #7
Daniel WEI (老魏) (lakeat), Senior Member, Beijing, China
Quote:
Originally Posted by alexeym View Post
OK.

- What decomposition method do you use?
- Does 4 and 16 core decomposition mean that you're running the case on a single node?
(1) The decomposition method is scotch, but I have also tried simple with (32 1 1), and the final residuals still differ.

(2) 4 cores: that is on my personal workstation. 16 cores: that runs on a high-performance cluster, and with 16 cores on that cluster the case runs on a single node.

Why not try it yourself on your cluster? You just need to refine the cavity grid to 1000*1000*1, set the time step to 2e-5 and endTime to 0.001. With 32 cores it takes less than one minute.
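PS: the simple decomposition mentioned in (1) is essentially this (sketch):
Code:
    // system/decomposeParDict, "simple" variant mentioned in (1)
    numberOfSubdomains 32;

    method          simple;

    simpleCoeffs
    {
        n           (32 1 1);
        delta       0.001;
    }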

April 8, 2014, 11:11   #8
Alexey Matveichev (alexeym), Senior Member, Nancy, France
And the case blows up only when run on more than one node (i.e. the 4- and 16-core variants converge)?

April 8, 2014, 11:14   #9
Daniel WEI (老魏) (lakeat), Senior Member, Beijing, China
Oops! Can anyone verify this by running a small test on your machine?

Copy the cavity case from the tutorials, refine the mesh to 1000*1000*1, change deltaT to 2e-5 and endTime to 1e-3, keep everything else unchanged, and run icoFoam.

I ask because I found that even with one core the PCG solver is not able to converge. What have I done wrong?
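In other words, starting from the stock cavity case the only edits should be these (a sketch; the blockMeshDict lives in constant/polyMesh or system depending on your OpenFOAM version):
Code:
    // blockMeshDict: change the cell counts from (20 20 1)
    blocks
    (
        hex (0 1 2 3 4 5 6 7) (1000 1000 1) simpleGrading (1 1 1)
    );

    // system/controlDict: smaller time step and shorter end time
    deltaT          2e-05;
    endTime         0.001;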

Last edited by lakeat; April 8, 2014 at 15:09.

April 8, 2014, 14:52   #10
Bruno Santos (wyldckat), Retired Super Moderator, Lisbon, Portugal
Just a couple of quick notes:
  1. Do not disregard the compilation options! If you're using ICC and used the "-ffast-math" option, then it's natural that the results are slightly different.
    For some reference, check this comment: http://www.cfd-online.com/Forums/blo...ml#comment1577
  2. And make sure the BIOS settings are identical on all nodes... at least the settings related to the CPUs.

April 8, 2014, 15:02   #11
Daniel WEI (老魏) (lakeat), Senior Member, Beijing, China
Quote:
Originally Posted by wyldckat View Post
Just a couple of quick notes:
  1. Do not disregard the compilation options! If you're using ICC and used the "-ffast-math" option, then it's natural that the results are slightly different.
    For some reference, check this comment: http://www.cfd-online.com/Forums/blo...ml#comment1577
  2. And make sure the BIOS settings are identical on all nodes... at least the settings related to the CPUs.

I don't think I have used that option; I just use "-O2 -no-prec-div", and of course I have tried "-O3" as well.
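For reference, these flags sit in the wmake rules of the installation; roughly something like this (path and exact contents depend on the OpenFOAM version and compiler, and the -fp-model remark is a general ICC note, not something tested here):
Code:
    # $WM_PROJECT_DIR/wmake/rules/linux64Icc/c++Opt (approximate contents)
    c++DBUG    =
    c++OPT     = -O2 -no-prec-div

    # note: -no-prec-div itself relaxes division precision; for value-safe
    # floating point with ICC one would typically use -fp-model precise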

Quote:
And make sure the BIOS settings are identical on all nodes... at least the settings related to the CPUs.
Well, I am not ready to piss them off. But I don't see a reason why our cluster administrators would use different BIOS/CPU settings on different nodes (just a guess).

PS: post #9 has been edited. Thanks
