CFD Online Discussion Forums

Thread: Limiters at interfaces in MPI (https://www.cfd-online.com/Forums/main/202327-limiters-interfaces-mpi.html)

nived_mr May 28, 2018 03:28

Limiters at interfaces in MPI
 
We have a 3D unstructured-grid serial code that we are trying to parallelize using MPI. For cases involving shocks we have convergence problems when using a 2nd-order upwind scheme: the code blows up.

We have tried communicating the higher-order reconstructed values at the interface faces, but the parallel code shows oscillations along the shock that are not present in the serial code.

Can anyone suggest a possible solution to the issue?

Thanks.

selig5576 May 28, 2018 03:43

Blowing up
 
When you say "blow up", are you referring to numerical blow-up or to the simulation crashing due to memory leaks? If it is numerical blow-up, I would run the serial code under the exact same conditions to confirm it is not an implementation issue. For debugging MPI codes I would use TotalView; it is pretty much the standard.

nived_mr May 28, 2018 04:09

The blow-up is a numerical blow-up. The parallel code converges for a few time-step iterations and then diverges. We have set the same conditions in both cases. The first-order accurate simulations converge well; the issue is in the implementation of the 2nd-order accurate convective scheme.

naffrancois May 28, 2018 11:53

Did you check, at first order, that you get exactly the same results (up to machine accuracy) between the serial and parallel codes? You did not mention it.

If so, then move on to second order. You mention that you have a communication point for the limited reconstructed states, I guess prior to the Riemann solver.

How do you compute the reconstructed states? Assuming you first compute the gradients with Green-Gauss or least-squares, both methods need average values coming from a neighbourhood. Did you communicate these average values prior to the gradient computation?

Depending on your implementation, the same remark may hold for the limiter; for example, the Barth-Jespersen (BJ) limiter needs the min and max over a neighbourhood.
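
For what it is worth, the exchange of the cell averages with the partition neighbours can be as simple as the sketch below (C + MPI; every name here, nvar, send_list, recv_offset, nbr, is a placeholder and not from your code, and a real code would call this for every neighbouring rank). Once this is done before the gradient loop, and again for whatever min/max the limiter needs, the ghost cells can be treated like ordinary neighbours.

Code:

#include <stdlib.h>
#include <mpi.h>

/* Exchange the averages of the cells adjacent to one partition boundary
 * with the neighbouring rank "nbr". Ghost-cell storage is assumed to sit
 * contiguously after the interior cells, starting at recv_offset. */
void exchange_cell_averages(double *u, int nvar,
                            int n_send, const int *send_list,
                            int n_recv, int recv_offset,
                            int nbr, MPI_Comm comm)
{
    /* pack the averages of the interface cells into a send buffer */
    double *sbuf = malloc((size_t)n_send * nvar * sizeof(double));
    for (int i = 0; i < n_send; ++i)
        for (int v = 0; v < nvar; ++v)
            sbuf[i * nvar + v] = u[send_list[i] * nvar + v];

    /* receive straight into the ghost-cell part of u */
    MPI_Request req[2];
    MPI_Irecv(&u[recv_offset * nvar], n_recv * nvar, MPI_DOUBLE,
              nbr, 0, comm, &req[0]);
    MPI_Isend(sbuf, n_send * nvar, MPI_DOUBLE, nbr, 0, comm, &req[1]);
    MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
    free(sbuf);
}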

AliE May 28, 2018 12:01

Hi,

As others here are saying, you have a bug in the parallel code. Trust me, I have done the same job with my code, and the results between the two codes must be the same. Since for second order you need the upwind and the downwind cells, have you checked that the ghost nodes are exchanged correctly? Naturally this is just a guess!

nived_mr May 29, 2018 01:42

3 Attachment(s)
Thanks for the replies.

I would like to clarify that we get a maximum error of about 3.47 x 10^-3 percent in the first-order calculations when using 16 processors. I have attached images of the results for both the 1st-order and 2nd-order cases. Is this acceptable? The code uses implicit Jacobians solved with the LU-SGS method, and we have not yet modified the Jacobian part for parallel computation.

Regarding the communication of reconstructed values: we communicate the solution at interface cells once the solution at a new time step is available, then use these values to find the averages of the variables. These are used to calculate the Green-Gauss gradients at all cell centres. We then calculate the limited reconstructed values at the faces, which are communicated prior to solving for the solution at the next time step.

We use the Venkatakrishnan limiter, and we calculate the minimum and maximum over the neighbourhood including the neighbour cells in other zones as well, which does not change the result.
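
In outline, one time step on our side currently looks roughly like this (every name below is just a placeholder to pin down where the two exchanges sit):

Code:

/* Outline of one time step as described above; all function names are
 * placeholders, not the actual routines. */
void advance_one_time_step(void)
{
    exchange_ghost_cell_averages();   /* interface-cell solution -> neighbours   */
    compute_green_gauss_gradients();  /* uses interior + ghost averages          */
    compute_limited_face_states();    /* Venkatakrishnan limiter: min/max over
                                         neighbours, ghost cells included        */
    exchange_ghost_face_states();     /* reconstructed states at partition faces */
    compute_convective_fluxes();      /* Riemann solver at every face            */
    update_solution();                /* explicit Euler or implicit LU-SGS       */
}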


AliE May 29, 2018 02:11

Hi,

Unfortunately a bug in MPI code is very tedious to find. In my case, for example, I plotted a contour of the central coefficient of the matrices for the serial and parallel versions, and did the same for the RHS. Then I compared the results side by side. With this method you can see immediately whether your equations are the same, and I mean exactly the same to the 16th decimal digit :) Maybe you can try the same. If the RHS values are different, you can start debugging by working backwards from there.
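
Something along these lines works (C, placeholder names; it assumes every cell carries a global index that is identical in the serial and the partitioned mesh):

Code:

#include <stdio.h>
#include <mpi.h>

/* Write the central (diagonal) coefficient and the RHS of every owned cell,
 * keyed by its global index, so the serial and parallel runs can be sorted
 * and diffed digit by digit. All names are placeholders. */
void dump_system(const double *diag, const double *rhs,
                 const long *global_id, int n_owned, MPI_Comm comm)
{
    int rank;
    MPI_Comm_rank(comm, &rank);

    char fname[64];
    snprintf(fname, sizeof fname, "system_rank%03d.txt", rank);

    FILE *f = fopen(fname, "w");
    if (!f) return;
    for (int i = 0; i < n_owned; ++i)
        fprintf(f, "%ld %.16e %.16e\n", global_id[i], diag[i], rhs[i]);
    fclose(f);
}

Then something like "sort -n system_rank*.txt > parallel.txt" and a diff against the serial dump shows immediately which cells differ first.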

naffrancois May 29, 2018 04:10

Those first-order results are, in my opinion, NOT fine. Even if they look almost the same on the graphs, the error is much larger than machine precision.

Before switching to second order, you should first fix it at first order. Can you switch your temporal scheme to a simple explicit time integrator, like a 1st-order Euler scheme, for testing purposes? If that works, then add the other ingredients one at a time: implicit, 2nd order, etc.

Edit: you say you fill in the ghost cells at the END of the time step. Is the initial solution communicated before the first time step?

nived_mr May 30, 2018 01:01

3 Attachment(s)
Dear naffrancois,

As per your suggestion, I tried running the same case using explicit Euler time stepping, and it gives an exact match (13 decimal places) for the first-order calculations. Please see the attached results. But once I use the 2nd-order limiter, it gives a result similar to the implicit code, even with explicit Euler time stepping.

I also tried running the explicit code with the limiter value fixed at 1.0, which gives a match up to 11-12 decimal places, suggesting that the error is not in the gradient calculations.

nived_mr May 31, 2018 07:42

We have finally fixed a bug in the 2nd-order convective flux limiter calculation, so the serial and parallel results with explicit Euler time stepping now match to machine precision.
However, we still have a slight difference between the serial and parallel results for the same case with implicit time stepping using the LU-SGS solver. Could this be because of a mismatch in the LU-SGS calculation caused by the decomposition of the computational domain in the parallel code? Kindly suggest a way to fix it.

naffrancois May 31, 2018 17:47

Well, you have made a big step by finding the bug in the explicit solver. I am not familiar with LU-SGS time stepping, but clearly extra care has to be taken. Not so long ago I parallelized an unstructured implicit scheme with a 1st-order-in-space Jacobian and GMRES, and I did it this way: each CPU owns a submatrix of size Nelem x Nelem_overlap, like an augmented matrix. In Krylov linear solvers, the typical operations are matrix-vector products (Jac*x) and scalar products. Communication of the x vector is then necessary before each matrix-vector product, and a sum reduction (MPI_ALLREDUCE) is needed for the scalar products.

I don't know how this applies to LU-SGS; maybe you forgot contributions coming from ghost cells in the upper part?

By the way, I assume you are testing LU-SGS at 1st order.
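
In code, the two communication points look roughly like this (C; the CSR-style arrays and halo_exchange are placeholders of mine, not taken from any particular solver):

Code:

#include <mpi.h>

/* Placeholder: fills the ghost part of x, e.g. with the Irecv/Isend
 * pattern discussed earlier in the thread. */
void halo_exchange(double *x, MPI_Comm comm);

/* Parallel scalar product: local partial sum, then a global reduction. */
double par_dot(const double *a, const double *b, int n_owned, MPI_Comm comm)
{
    double local = 0.0, global = 0.0;
    for (int i = 0; i < n_owned; ++i)
        local += a[i] * b[i];
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, comm);
    return global;
}

/* Matrix-vector product y = Jac*x: the ghost part of x must be up to date
 * before the off-diagonal contributions are summed. */
void par_matvec(const double *val, const int *col, const int *row_ptr,
                double *x, double *y, int n_owned, MPI_Comm comm)
{
    halo_exchange(x, comm);  /* fills x[n_owned .. n_owned + n_ghost - 1] */
    for (int i = 0; i < n_owned; ++i) {
        double s = 0.0;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            s += val[k] * x[col[k]];  /* col[k] may point into the ghost range */
        y[i] = s;
    }
}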

nived_mr June 1, 2018 02:30

Right now we are not using any MPI functions inside the LU-SGS solver. We are working out a possible solution to account for the missing neighbour-cell information in the LU-SGS matrix on each process.

naffrancois June 4, 2018 15:22

I would say that as long as you have off-diagonal blocks in your matrix, i.e. it is not a block-diagonal matrix, you have neighbour information for each element. Then at some point in your LU-SGS solver you will need MPI calls to synchronize values between the partitions. Any time you need a value coming from a neighbour element, you have to ask yourself whether that value may belong to a different CPU.
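
To make it concrete, here is a schematic of such a neighbour loop inside a sweep (all names are placeholders; the ghost values would be filled by an MPI exchange before or between sweeps, and the crudest variant simply drops those contributions, which makes the scheme block-Jacobi across partitions):

Code:

/* Schematic accumulation of the off-diagonal (neighbour) contributions for
 * cell i during an LU-SGS sweep. Neighbours with index >= n_owned are ghost
 * cells whose increments come from another CPU via an MPI exchange. */
double neighbour_contrib(int i, int n_owned,
                         const int *nbr_start, const int *nbr,
                         const double *offdiag,
                         const double *du, const double *du_ghost)
{
    double sum = 0.0;
    for (int k = nbr_start[i]; k < nbr_start[i + 1]; ++k) {
        int j = nbr[k];
        if (j >= n_owned)
            sum += offdiag[k] * du_ghost[j - n_owned];  /* value from another CPU */
        else
            sum += offdiag[k] * du[j];                  /* local value            */
    }
    return sum;
}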

AliE June 5, 2018 05:17

Quote:

Originally Posted by naffrancois (Post 694566)
I would say that as long as you have off-diagonal blocks in your matrix, i.e. it is not a block-diagonal matrix, you have neighbour information for each element. Then at some point in your LU-SGS solver you will need MPI calls to synchronize values between the partitions. Any time you need a value coming from a neighbour element, you have to ask yourself whether that value may belong to a different CPU.

naffrancois is right. Have you tried using PETSc to solve the linear system? It solved a lot of the problems related to parallelizing the linear system in my case, and it is super fast!
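
As a rough idea of what it looks like, here is a minimal sketch (for a recent PETSc version, with a dummy diagonal system purely so it runs; this is not my actual code):

Code:

/* Minimal PETSc example: assemble a distributed matrix and solve A*x = b
 * with a Krylov method chosen at run time, e.g. -ksp_type gmres -pc_type bjacobi.
 * The matrix here is a dummy diagonal system just for illustration. */
#include <petscksp.h>

int main(int argc, char **argv)
{
    PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

    const PetscInt n_local = 100;   /* rows owned by this rank (dummy size) */
    Mat A; Vec x, b; KSP ksp;

    PetscCall(MatCreateAIJ(PETSC_COMM_WORLD, n_local, n_local,
                           PETSC_DETERMINE, PETSC_DETERMINE,
                           1, NULL, 0, NULL, &A));

    PetscInt istart, iend;
    PetscCall(MatGetOwnershipRange(A, &istart, &iend));
    PetscCall(MatCreateVecs(A, &x, &b));

    for (PetscInt i = istart; i < iend; ++i) {
        PetscCall(MatSetValue(A, i, i, 2.0, INSERT_VALUES));  /* A(i,i) = 2 */
        PetscCall(VecSetValue(b, i, 1.0, INSERT_VALUES));     /* b(i)   = 1 */
    }
    PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
    PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
    PetscCall(VecAssemblyBegin(b));
    PetscCall(VecAssemblyEnd(b));

    PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
    PetscCall(KSPSetOperators(ksp, A, A));
    PetscCall(KSPSetFromOptions(ksp));
    PetscCall(KSPSolve(ksp, b, x));

    PetscCall(KSPDestroy(&ksp));
    PetscCall(MatDestroy(&A));
    PetscCall(VecDestroy(&x));
    PetscCall(VecDestroy(&b));
    PetscCall(PetscFinalize());
    return 0;
}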

