
May 28, 2018, 03:28 
Limiters at interfaces in MPI

#1 
New Member
Nived M R
Join Date: May 2018
Posts: 6
Rep Power: 4 
We have a 3D unstructured-grid serial code which we are trying to parallelize using MPI. For cases involving shocks we have convergence problems when using a 2nd-order upwind scheme: the code is blowing up.
We have tried communicating the higher-order reconstructed values at the interface faces, but there are oscillations along the shock in the parallel code compared with the serial code. Can anyone suggest a possible solution to the issue? Thanks.

May 28, 2018, 03:43 
Blowing up

#2 
Senior Member
Selig
Join Date: Jul 2016
Posts: 213
Rep Power: 7 
When you say "blow up", are you referring to numerical blow-up or your simulation crashing due to memory leaks? If it is numerical blow-up, I would run your serial code under the exact same conditions to confirm it is not an implementation issue. For debugging MPI codes I would use TotalView; it is pretty much the standard.


May 28, 2018, 04:09 

#3 
New Member
Nived M R
Join Date: May 2018
Posts: 6
Rep Power: 4 
The blowup is a numerical blowup. The parallel code converges for a few time-step iterations and then diverges. We have set the same conditions in both cases. The first-order accurate simulations converge well; the issue is in the 2nd-order accurate convective scheme implementation.


May 28, 2018, 11:53 

#4 
Senior Member
Join Date: Oct 2011
Posts: 130
Rep Power: 11 
Did you first check that you get exactly the same results (up to machine accuracy) between the serial and parallel code at first order? You did not mention it.
If so, then on to second order. You mention that you have a communication point for the limited reconstructed states, I guess prior to the Riemann solver. How do you compute the reconstructed states? Assuming you first compute the gradients with Green-Gauss or least-squares, both methods need average values coming from a neighbourhood. Did you communicate these average values prior to the gradient computation? Depending on your implementation, the same remark may hold for the limiter; for example the Barth-Jespersen (BJ) limiter needs the min and max over a neighbourhood.
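A minimal 1D sketch (hypothetical names, not the poster's code) of the point above: the Green-Gauss gradient of a cell next to a partition boundary needs a cell average owned by the other rank, so the ghost averages must be filled before the gradient step. Here the halo exchange is mimicked by slicing a global array; in MPI it would be a send/recv.

```python
import numpy as np

def green_gauss_grad(u_with_ghosts, dx):
    # Face value = average of the two adjacent cell averages;
    # gradient = sum of face fluxes divided by the cell volume (dx in 1D).
    u_face = 0.5 * (u_with_ghosts[:-1] + u_with_ghosts[1:])
    return (u_face[1:] - u_face[:-1]) / dx

dx = 0.1
u_global = ((np.arange(10) + 0.5) * dx) ** 2        # smooth field u = x^2

# Serial reference: gradients of interior cells 1..8.
grad_serial = green_gauss_grad(u_global, dx)

# "Parallel": rank 0 owns cells 0..4, rank 1 owns cells 5..9; each rank gets
# one ghost cell average from its neighbour (here taken from the global array).
grad0 = green_gauss_grad(
    np.concatenate((u_global[0:1], u_global[0:5], u_global[5:6])), dx)
grad1 = green_gauss_grad(
    np.concatenate((u_global[4:5], u_global[5:10], u_global[9:10])), dx)

# The interior gradients of both ranks reproduce the serial ones exactly;
# a stale or missing ghost average would break this agreement.
```

Skipping the exchange (or doing it after the gradient step) leaves the partition-boundary cells using stale averages, which is exactly the kind of serial/parallel mismatch discussed here.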

May 28, 2018, 12:01 

#5 
Senior Member
Join Date: Dec 2017
Posts: 150
Rep Power: 4 
Hi,
As people here are saying, you have a bug in the parallel code. Trust me, I have done the same job with my code, and the results between the two codes must be the same. Since for second order you need the upwind and the downwind cell, have you checked that the ghost nodes are exchanged correctly? Naturally this is just a guess!

May 29, 2018, 01:42 

#6 
New Member
Nived M R
Join Date: May 2018
Posts: 6
Rep Power: 4 
Thanks for the replies.
I would like to clarify that we are getting a maximum error of about 3.47 x 10^3 percent in the first-order calculations while using 16 processors. I have attached images of the results showing this for both 1st-order and 2nd-order runs. Is this fine? Our code uses implicit Jacobians solved with the LUSGS method, and we haven't modified the Jacobian part for parallel computations so far. Regarding the communication of reconstructed values: we communicate the solution at the interface cells once we get the solution at a new time step, then use these to find the average values of the variables. These are then used to calculate the Green-Gauss gradient at all cell centres. We then calculate the limited reconstructed values at the faces, which are communicated prior to solving for the solution at the next time step. We use the Venkatakrishnan limiter, and we calculate the minimum and maximum across the neighbourhood considering the neighbour cells in other zones as well, which does not change the result.
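For reference, a minimal scalar sketch of the Venkatakrishnan limiter mentioned above (assumed names; the smoothing parameter eps2, usually (K*h)^3, is taken as a given input), showing where the neighbourhood min/max, which on a partitioned mesh must include ghost neighbours from other ranks, enters the formula:

```python
def venkat_limiter(u_c, u_face_unlim, u_min, u_max, eps2):
    # Venkatakrishnan limiter value for one reconstructed face state of cell c.
    # u_min/u_max: min/max cell average over the cell's neighbourhood --
    # on a partitioned mesh this must include ghost neighbours from other ranks.
    d2 = u_face_unlim - u_c                      # unlimited reconstruction increment
    if d2 == 0.0:
        return 1.0
    d1 = (u_max - u_c) if d2 > 0.0 else (u_min - u_c)
    num = (d1 * d1 + eps2) * d2 + 2.0 * d2 * d2 * d1
    den = d2 * (d1 * d1 + 2.0 * d2 * d2 + d1 * d2 + eps2)
    return num / den

# Nearly smooth data: the limiter stays close to 1, preserving 2nd order.
phi_smooth = venkat_limiter(1.0, 1.001, 0.9, 1.1, 1e-12)
# Strong overshoot (e.g. across a shock): the reconstruction is cut down hard.
phi_shock = venkat_limiter(1.0, 2.0, 0.9, 1.1, 1e-12)
```

If the ghost neighbours are left out of u_min/u_max on one rank, d1 changes there and the limited face states differ between serial and parallel runs, even when the gradients themselves agree.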

May 29, 2018, 02:11 

#7 
Senior Member
Join Date: Dec 2017
Posts: 150
Rep Power: 4 
Hi,
Unfortunately a bug in MPI code is very tedious to find. In my case, for example, I plotted contours of the central coefficient of the matrices for the serial and parallel versions, and did the same for the RHS, then looked at the results together. With this method you can see immediately whether your equations are the same, and I mean exactly the same to the 16th decimal digit. Maybe you can try the same: if the RHS values are different, you can start the debugging by working backwards from there.
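That comparison is easy to automate once the parallel field is gathered back to the serial ordering; a small hypothetical helper (assumed names, not from either code) that reports the first entry where the two fields disagree:

```python
import numpy as np

def first_mismatch(serial, parallel, tol=0.0):
    # Entry-by-entry comparison of a serial field against the same field
    # gathered from the parallel run. Returns (index of the first entry that
    # differs by more than tol, max abs difference), or (None, 0.0) on a match.
    diff = np.abs(np.asarray(serial) - np.asarray(parallel))
    bad = np.flatnonzero(diff > tol)
    if bad.size == 0:
        return None, 0.0
    return int(bad[0]), float(diff.max())

rhs_serial   = np.array([1.0, 2.0, 3.0, 4.0])
rhs_parallel = np.array([1.0, 2.0, 3.0 + 1e-9, 4.0])   # injected discrepancy
idx, dmax = first_mismatch(rhs_serial, rhs_parallel)   # idx -> first bad cell
```

Mapping the first mismatching index back to a mesh cell (is it on a partition boundary?) usually points straight at the missing communication.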

May 29, 2018, 04:10 

#8 
Senior Member
Join Date: Oct 2011
Posts: 130
Rep Power: 11 
Your first-order results are in my opinion NOT fine. Even if they look almost the same on the graphs, the error is much higher than machine precision.
So before switching to second order, you should first fix first order. Can you switch your temporal scheme to a simple explicit time integrator like the 1st-order Euler scheme for testing purposes? If that works, then you add the new ingredients one at a time: implicit, 2nd order, etc.
Edit: you say you fill the ghost cells at the END of the time step. Is the initial solution communicated for the first time step?
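The suggested test can be mocked up in a few lines; a sketch (assumed 1D linear advection, not the original code) showing that one explicit Euler step with 1st-order upwind fluxes, run on two partitions with a single ghost value exchanged, reproduces the serial step exactly:

```python
import numpy as np

def euler_upwind_step(u_with_left_ghost, a, dt, dx):
    # One explicit Euler step for linear advection u_t + a u_x = 0 (a > 0)
    # with 1st-order upwind fluxes; needs one ghost value on the left.
    u = u_with_left_ghost
    return u[1:] - a * dt / dx * (u[1:] - u[:-1])

a, dt, dx = 1.0, 0.05, 0.1
u = np.sin(2.0 * np.pi * (np.arange(10) + 0.5) * dx)

# Serial step, updating cells 1..9 (cell 0 would need a boundary condition).
u_serial = euler_upwind_step(u, a, dt, dx)

# "Parallel": rank 0 updates cells 1..4, rank 1 updates cells 5..9; u[4] is
# the ghost value rank 1 receives -- in MPI, a send/recv before the step.
u_rank0 = euler_upwind_step(u[0:5], a, dt, dx)
u_rank1 = euler_upwind_step(u[4:10], a, dt, dx)
u_parallel = np.concatenate((u_rank0, u_rank1))
# u_parallel matches u_serial bit for bit, since each cell sees identical data.
```

Once this level of agreement holds, any remaining serial/parallel difference must come from the ingredient added next (the limiter, then the implicit solver), which is the incremental strategy described above.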

May 30, 2018, 01:01 

#9 
New Member
Nived M R
Join Date: May 2018
Posts: 6
Rep Power: 4 
Dear naffrancois,
As per your suggestion, I tried running the same case using explicit Euler time stepping, and it gives an exact match (13 decimal places) for the first-order calculations. Please see the attached results. But once I use the 2nd-order limiter, it gives a result similar to the implicit code, even with explicit Euler time stepping. I tried running the explicit code with the limiter value fixed to 1.0, which gives a match up to 11-12 decimal places, suggesting that the error is not in the gradient calculations.

May 31, 2018, 07:42 

#10 
New Member
Nived M R
Join Date: May 2018
Posts: 6
Rep Power: 4 
We have finally fixed a bug in the 2nd-order convective flux limiter calculation, so the serial and parallel results with explicit Euler time stepping now match to machine precision.
Now we have a slight difference between the serial and parallel results with implicit time stepping using the LUSGS solver for the same case. Could it be because of a mismatch in the LUSGS solver calculation caused by the decomposition of the computational domain in the parallel code? Kindly suggest a way to fix it.

May 31, 2018, 17:47 

#11 
Senior Member
Join Date: Oct 2011
Posts: 130
Rep Power: 11 
Well, you have made a big step finding the bug in the explicit solver. I am not familiar with LUSGS time stepping, but clearly extra care has to be taken. Not long ago I parallelized an unstructured implicit scheme using a 1st-order-in-space Jacobian with GMRES, and I did it this way: each CPU owns a submatrix of size Nelem*Nelem_overlap, like an augmented matrix. In Krylov linear solvers, the typical operations are matrix-vector products Jac*x and scalar products. Communication of the x vector is then necessary prior to the matrix-vector product, and a sum reduction (MPI_ALLREDUCE) for the scalar products.
I don't know how this applies to LUSGS; maybe you forgot contributions coming from ghost cells in the upper part? By the way, I assume you are testing LUSGS with 1st order.
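The augmented-matrix pattern described above can be mimicked in a single process; a sketch (two fake "ranks", a dense global matrix for brevity, assumed names) of where the halo exchange of x and the MPI_ALLREDUCE for the scalar product would sit:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))     # global 1st-order Jacobian (dense here)
x = rng.standard_normal(4)

# Rank 0 owns elements/rows 0..1, rank 1 owns 2..3; for the product each rank
# also needs the overlap (ghost) entries of x owned by the other rank.
x0_own, x1_own = x[:2], x[2:]

# --- halo exchange: in MPI, a send/recv of the ghost entries of x ---
x0_full = np.concatenate((x0_own, x1_own))   # rank 0: owned + ghost entries
x1_full = np.concatenate((x0_own, x1_own))   # rank 1: owned + ghost entries

# Local matrix-vector products on the owned rows (the augmented submatrix).
y0 = A[:2] @ x0_full
y1 = A[2:] @ x1_full

# --- scalar product: local partial sums, then MPI_ALLREDUCE with MPI_SUM ---
dot_local_0 = float(y0 @ y0)
dot_local_1 = float(y1 @ y1)
dot_global = dot_local_0 + dot_local_1       # stands in for the Allreduce

y_gathered = np.concatenate((y0, y1))        # gathered here only for checking
```

In a real Krylov solver y is never gathered; only x's ghost entries travel before each product and one scalar per dot product goes through the reduction.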

June 1, 2018, 02:30 

#12 
New Member
Nived M R
Join Date: May 2018
Posts: 6
Rep Power: 4 
Right now we are not using any MPI functions inside the LUSGS solver. We are working on a possible solution to account for the missing neighbour-cell information in the LUSGS matrix in all the processes.


June 4, 2018, 15:22 

#13 
Senior Member
Join Date: Oct 2011
Posts: 130
Rep Power: 11 
I would say that as long as you have off-diagonal blocks in your matrix, i.e. it is not a block-diagonal matrix, you have neighbour information for each element. Then at some point in your LUSGS solver you will need MPI calls to synchronize values between the partitions. Any time you need values coming from a neighbouring element, you have to ask yourself whether that value may belong to a different CPU.

