Home > Forums > General Forums > Main CFD Forum

How does parallelisation work?


July 16, 2020, 04:42
How does parallelisation work?
New Member
Join Date: Nov 2016
Posts: 6
Rep Power: 8
Martin007 is on a distinguished road
Hi everyone,

I am currently running a parallelized calculation of a bi-periodic channel flow. The channel is divided into 4 parts, each assigned to a processor.
For instance, if we have 4 processors distributed as follows:
| 1 | 3 |
--------- -> flow direction
| 2 | 4 |

(___ represents the walls of the channel; | and --- represent the boundaries of the domain assigned to each processor)

I would like to know how and when information is transferred between processors. How can processors 3 and 4 work in parallel if they do not have the flow characteristics resulting from the calculations of processors 1 and 2? The same question applies to the boundary between processors 1 and 2. There should be continuous interaction between all processors, but I don't understand how it works. Can someone explain it to me?

Thank you very much,
Martin007 is offline

July 16, 2020, 07:28
Join Date: Sep 2019
Posts: 51
Rep Power: 5
gnwt4a is on a distinguished road
Without knowing the numerical method, one cannot say. Moreover, are you talking about 4 cores on the same chip, or four separate multicore chips?

In general, whatever the method, if the flow is incompressible there must be a global exchange of information once per time step.
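A minimal serial sketch of what such a once-per-step global exchange amounts to, with made-up numbers (in a real parallel code this would be a single MPI_Allreduce over all ranks):

```python
# Serial sketch of a global reduction: an incompressible solver typically
# needs at least one global exchange per time step, e.g. summing the local
# continuity residuals of every processor so that all ranks agree on the
# same global residual and convergence decision. Values are illustrative.
local_residuals = [0.012, 0.007, 0.004, 0.009]  # one value per processor

# Every processor ends up holding the same global value:
global_residual = sum(local_residuals)
converged = global_residual < 1e-3

print(global_residual, converged)
```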

gnwt4a is offline

July 16, 2020, 08:06
Senior Member
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,146
Rep Power: 61
LuckyTran has a spectacular aura about
The | and --- are called inter-processor boundary faces, and they are tagged as such in the decomposed mesh. Eventually what you need in FVM is the face values (more correctly, the face fluxes) on these shared inter-processor boundary faces. The approach for determining the face fluxes is defined by your gradient/interpolation scheme, which generally requires cell values on either side of the inter-processor faces.

In modern parallelized codes (pretty much anything that runs on MPI), the values from cells at neighboring processors are streamed to one another. That is, the left side of | sends its cell values to the right side and vice versa. It's straightforward to stream cell values of the adjacent cells (i.e., 1 layer deep). What is not trivial is streaming cell values multiple layers deep, and that's why your discretization schemes at inter-processor boundaries are usually limited in many commercial codes.
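A minimal serial sketch of this one-layer exchange, assuming two 1D subdomains held in plain NumPy arrays (in a real MPI code each array lives on its own rank and the copies below would be send/receive calls; all names and values are illustrative):

```python
import numpy as np

n = 4  # interior cells per subdomain
# Each subdomain stores its own cells plus one ghost cell per side.
left = np.zeros(n + 2)
right = np.zeros(n + 2)
left[1:-1] = [1.0, 2.0, 3.0, 4.0]    # cells owned by "processor 1"
right[1:-1] = [5.0, 6.0, 7.0, 8.0]   # cells owned by "processor 3"

def exchange_halo(left, right):
    # left's right-side ghost receives right's first interior cell,
    # and right's left-side ghost receives left's last interior cell
    left[-1] = right[1]
    right[0] = left[-2]

exchange_halo(left, right)

# With the ghosts filled, each side can evaluate the shared face exactly
# as it would an interior face, e.g. a central interpolation:
face_value = 0.5 * (left[-2] + left[-1])
print(face_value)
```

Once the ghost cells are filled, the inter-processor face is indistinguishable from an interior face as far as the flux computation is concerned.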
LuckyTran is offline

July 19, 2020, 06:00
Senior Member
sbaffini's Avatar
Paolo Lampitella
Join Date: Mar 2009
Location: Italy
Posts: 2,014
Blog Entries: 29
Rep Power: 38
sbaffini will become famous soon enough
The idea is that each processor owns not only its own cells, as depicted by you, but also some from the neighboring processors. These can be one or multiple layers; the tradeoff between memory consumption and the amount of data exchanged typically favors using just a single layer.

This sounds more difficult than it is: you actually just have a larger-than-expected mesh on each processor and keep track of which part is effectively owned by the processor and which part actually belongs to the neighboring ones.

At the start of each iteration, in exactly the same way as you would initialize your variables on a single grid, you perform the parallel exchanges between neighboring processors. Once that is done, you can practically treat the computation on each processor as if it were serial.
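This "exchange, then compute as if serial" pattern can be sketched with explicit 1D diffusion on two subdomains; with a one-layer ghost exchange at the start of every step, the decomposed run reproduces the single-grid serial update exactly (names and the zero-gradient boundary treatment are illustrative):

```python
import numpy as np

n, steps, d = 8, 5, 0.25                   # cells per subdomain, time steps, diffusion number
serial = np.linspace(0.0, 1.0, 2 * n)      # serial reference field on the full grid
a = np.zeros(n + 2); a[1:-1] = serial[:n]  # subdomain of "processor 1" plus ghosts
b = np.zeros(n + 2); b[1:-1] = serial[n:]  # subdomain of "processor 2" plus ghosts

def update_interior(u):
    # identical stencil to the serial one; ghost cells supply neighbor data
    u[1:-1] += d * (u[2:] - 2.0 * u[1:-1] + u[:-2])

for _ in range(steps):
    # the "parallel exchange" (MPI messages in a real code)
    a[-1], b[0] = b[1], a[-2]
    # physical boundaries: zero-gradient ghosts on the outer ends
    a[0], b[-1] = a[1], b[-2]
    # serial reference step with the same zero-gradient boundary treatment
    ref = np.concatenate(([serial[0]], serial, [serial[-1]]))
    serial += d * (ref[2:] - 2.0 * ref[1:-1] + ref[:-2])
    # after the exchange, each subdomain updates as if it were serial
    update_interior(a)
    update_interior(b)

decomposed = np.concatenate((a[1:-1], b[1:-1]))
print(np.allclose(decomposed, serial))
```

The key point is that once the ghosts are filled, `update_interior` is exactly the serial stencil, so the decomposed and serial runs produce the same field.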

There are just a couple of caveats:

1) If your algorithm needs cell gradients, it is typically better to exchange them as well, instead of computing them from the exchanged values. So after you compute them, you exchange them too. If computing them requires iterations (as some gradient computation methods do), you exchange them after each iteration.

2) If you need to solve a linear system, say because you are using an implicit method, parallelization is needed there as well, but you need to see it as part of the linear system solver. In practice, if you use, say, SOR, the idea is that you exchange the variables solved for in the linear system (as opposed to the general variables used in the code) after each linear iteration. So you effectively work Jacobi-like between processors and SOR-like on each processor. That's typically a good compromise. I'm not an expert here, but Krylov methods should then just need some global reductions to work on top of such a SOR-like preconditioner.
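The Jacobi-between / SOR-within idea in point 2 can be sketched for a 1D Poisson problem -u'' = 1 with homogeneous Dirichlet boundaries, split across two "processors". Interface values are frozen during each local SOR sweep and refreshed once per outer linear iteration, which is the Jacobi-like coupling (the relaxation factor, iteration count, and all names are illustrative choices, not from any particular code):

```python
import numpy as np

# Unknowns x[1..2n] on a uniform grid; Dirichlet zeros sit in x[0], x[-1].
n = 10
h = 1.0 / (2 * n + 1)
f = np.ones(2 * n)          # right-hand side
x = np.zeros(2 * n + 2)     # solution vector including boundary values
omega = 1.2                 # SOR relaxation factor (illustrative)

def sor_sweep(x, lo, hi, ghost_lo, ghost_hi):
    # one forward SOR sweep over local rows lo..hi-1 with frozen ghosts
    for i in range(lo, hi):
        left = ghost_lo if i == lo else x[i - 1]
        right = ghost_hi if i == hi - 1 else x[i + 1]
        gs = 0.5 * (left + right + h * h * f[i - 1])  # Gauss-Seidel value
        x[i] = (1.0 - omega) * x[i] + omega * gs

for _ in range(2000):                  # outer linear iterations
    g1, g2 = x[n + 1], x[n]            # exchange interface values once
    sor_sweep(x, 1, n + 1, x[0], g1)            # rows of "processor 1"
    sor_sweep(x, n + 1, 2 * n + 1, g2, x[-1])   # rows of "processor 2"

# -u'' = 1 with u(0) = u(1) = 0 has exact solution u = s(1-s)/2, which
# second-order central differences reproduce exactly at the nodes.
s = np.arange(1, 2 * n + 1) * h
exact = 0.5 * s * (1.0 - s)
print(np.max(np.abs(x[1:-1] - exact)))
```

Freezing the interface values once per outer iteration converges somewhat more slowly than a fully sequential sweep, but it needs only one exchange per linear iteration, which is the compromise described above.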
sbaffini is offline


Tags: information transfer, parallel calculation, processors

