CFD Online Discussion Forums - Parallel Scaling in Unsteady Sliding Mesh Cases

Parallel Scaling in Unsteady Sliding Mesh Cases (https://www.cfd-online.com/Forums/fluent/27864-parallel-scaling-unsteady-sliding-mesh-cases.html)

Jonas Larsson

August 30, 2000 06:31

Parallel Scaling in Unsteady Sliding Mesh Cases

I'm running a large unsteady sliding-mesh case (an axial turbine with several blade rows). The case has several million cells and has to be run in parallel. Normally a case of this size would scale very well up to more than 20 CPUs. However, for some reason this is not so when you run it unsteady with sliding meshes in Fluent. When you increase the number of CPUs, the iteration time for each time-step decreases, but this gain in speed is lost because with more parallel CPUs the time to update the solution between time-steps increases dramatically. For example, with 20 CPUs the subiterations in each time-step finish in a few minutes, and then the update to the next time-step can take 10 minutes (the solver just writes out "Updating solution at time levels N and N-1." and stops for 10 minutes!). If you instead run on 6 CPUs, the subiterations take about 10 minutes but the time-step update is much quicker, so the case actually runs just as fast on 6 CPUs as on 20.

I see no good reason for this rapid increase in time-step-update time when you run on many CPUs. Is there one? Is there any way to reduce this time? I have noticed that some cases seem to be a little better behaved than others in this respect. Can the specific partitioning used affect the time-step-update time?
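
The trade-off described above can be sketched with a rough per-time-step model. The thread gives only approximate figures, so two values here are assumptions: "a few minutes" of subiterations on 20 CPUs is taken as 3 minutes, and the "much quicker" update on 6 CPUs is also taken as 3 minutes. The function names are hypothetical.

```python
# Rough wall-clock model of one time-step: subiterations plus the
# time-step update. All numbers are illustrative, not measurements.

def step_time(subiter_minutes, update_minutes):
    """Total wall-clock minutes for one time-step."""
    return subiter_minutes + update_minutes


def update_fraction(subiter_minutes, update_minutes):
    """Share of the time-step spent in the time-step update."""
    return update_minutes / step_time(subiter_minutes, update_minutes)


# 20 CPUs: subiterations assumed to take 3 min, update about 10 min.
t20 = step_time(subiter_minutes=3.0, update_minutes=10.0)
# 6 CPUs: subiterations about 10 min, update assumed to take 3 min.
t6 = step_time(subiter_minutes=10.0, update_minutes=3.0)

print(f"20 CPUs: {t20:.0f} min/step, update share {update_fraction(3.0, 10.0):.0%}")
print(f" 6 CPUs: {t6:.0f} min/step, update share {update_fraction(10.0, 3.0):.0%}")
```

Under these assumed numbers both runs cost the same per time-step, but on 20 CPUs most of the time is spent in the update rather than in the subiterations, which is why adding CPUs stops paying off.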

John C. Chien

August 30, 2000 10:30

Re: Parallel Scaling in Unsteady Sliding Mesh Case

(1). I am just curious about this "updating solution" operation. (2). Is it writing the results to a disk file? If you have several million cells, how big is your results file? Are the results saved on each machine, or in one single file? (3). I am not a parallel-processor user, and all I know is that writing a huge file to disk can take a lot of time (depending on the hardware and operating system, I guess).

Glenn Price

August 30, 2000 12:15

Re: Parallel Scaling in Unsteady Sliding Mesh Case

I believe the updating solution step includes moving the sliding mesh. Maybe this is being done in an inefficient way. Jonas, have you talked to Fluent about this?

Regards, Glenn

Jonas Larsson

August 30, 2000 13:33

Re: Parallel Scaling in Unsteady Sliding Mesh Case

The case and data files are several hundred megabytes so saving them, as you say, takes a long time (they are saved as one big file by one control-process on one CPU). This is not what causes the delay though - no files are saved during the "time-step-update".

Jonas Larsson

August 30, 2000 13:42

Re: Parallel Scaling in Unsteady Sliding Mesh Case

Yep, the sliding mesh parts are moved, and I also assume that the interfaces between sliding and non-sliding parts must be updated. I don't see any reason why this process should become so terribly slow on many CPUs though. If you run a steady case without sliding meshes, parallel scaling on this kind of large case is very good up to at least 40 CPUs. With the unsteady sliding meshes things slow down very quickly on more than 6 CPUs. As you say, Fluent might be doing something a bit inefficient in the time-step-update process.

The problem is not caused by a slow network: the same slowdown occurs both on our Linux cluster (100 Mbit Fast Ethernet) and on our HP V-class parallel SMP machines (internal crossbar switch with Gbit speed).

I have talked to Fluent support in the UK to get some feedback on what might cause this. They will look into it and I'm waiting for their response. If they have a good explanation I'll post it here.

Nath Gopalaswamy

September 1, 2000 17:58

Re: Parallel Scaling in Unsteady Sliding Mesh Case

When the mesh is moved, re-establishing the connectivity after the mesh motion can be more expensive for the parallel solver because of a particular labelling function. I think this is what you are experiencing. Based on some of our recent findings, this labelling algorithm could become the dominant consumer of elapsed time as the number of processes grows. This is something we are looking into for Fluent6, to improve the efficiency of transient calculations with sliding boundaries for the parallel solver.

Sincerely,

Nath Gopalaswamy

Jonas Larsson

September 2, 2000 06:34

Re: Parallel Scaling in Unsteady Sliding Mesh Case

Nath, many thanks for the explanation of what is going on. Is there any way that I can reduce the time that this "relabelling" takes, for example by carefully choosing the partitioning method?

I have run the same basic case twice (I ran a different operating point a few months ago). The second time I started from scratch, re-created the sliding interfaces and re-partitioned the case. For some reason this second version takes much longer between time-steps. I'm not sure which partitioning method I used the first time, but the partitioning and the boundary conditions are the only big differences I can think of, aside from the fact that the second case is run with two species in order to simulate the CO2 seeding used in experiments (can this affect the relabelling time?).

In order to get the case to run in parallel I have to de-select "partitioning across zone boundaries". This means that, if I for example partition for 6 CPUs, each mesh-part will be split into 6 partitions. I have 5 mesh-parts, with 4 general interfaces (two steady and two sliding), giving a total of 30 partitions. This doesn't seem optimal. Can I somehow get the partitioning to only create 6 partitions for running on 6 CPUs?
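
The partition count above follows from simple arithmetic. Below is a minimal sketch, assuming (as the post describes, not as Fluent documents it) that with across-zone partitioning disabled every mesh-part is split into as many pieces as there are CPUs, and that with it enabled the whole mesh is split into that many pieces. The function name is hypothetical.

```python
def total_partitions(n_zones, n_cpus, across_zone_boundaries):
    """Number of mesh partitions under the two settings described above.

    Assumption: per-zone partitioning yields n_zones * n_cpus pieces,
    across-zone partitioning yields n_cpus pieces.
    """
    if across_zone_boundaries:
        return n_cpus
    return n_zones * n_cpus


N_ZONES = 5  # mesh-parts in the turbine case described above

for n_cpus in (6, 20):
    per_zone = total_partitions(N_ZONES, n_cpus, across_zone_boundaries=False)
    whole = total_partitions(N_ZONES, n_cpus, across_zone_boundaries=True)
    print(f"{n_cpus} CPUs: {per_zone} partitions per-zone, {whole} across zones")
```

Under this assumption the partition count grows as zones times CPUs, so every added CPU adds one more piece to each of the five mesh-parts.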

John C. Chien

September 2, 2000 10:45

Re: Parallel Scaling in Unsteady Sliding Mesh Case

(1). Instead of solving your problem right away, they are spending time on improving it in version 6. A good sign. (2). Version 6 may also have other things that need to be improved. So, in using commercial codes, you will have to be patient. (3). If you ran a case with much better performance before, I am sure that you can get back to it again. (There is still this uncertainty: if you partition it in different ways, would you still get the same results?) (4). I have a friend who has used a commercial code for a while (on and off), so he had all of his files set up nicely to run it. Now, with many newer versions out, the old features are no longer available. So he has two choices: one is to learn the new approach and convert everything to make it work (he failed with this approach); the second is to do nothing and see whether one day he can get the old version back. (5). As an extra tool, commercial codes are nice to have. But like any other product, if it's there, use it right away; otherwise, next week, it may not be there. (For my friend, the old vendor also changed hands, so now he needs to wait longer.) (6). I still think that the only reliable way to do CFD is to write my own codes, if accuracy and reliability are very important to the task. For odd jobs, one can try commercial codes; if the solution is there, it's there. Otherwise, just look for something else.

Nath Gopalaswamy

September 3, 2000 17:55

Re: Parallel Scaling in Unsteady Sliding Mesh Case

As the number of partition interfaces grows, the relabelling time will most probably increase. You mentioned that you had to deselect partitioning across zones in order to get it to run in parallel the second time. I think this could be the source of the problem, because of the increased number of partition interfaces. You could look at the output of Grid/Partition/Print partitions for the old case and the new case, and compare the number of interface faces per CPU. Does the parallel solver crash if partitioning across zones is turned on? You may want to log this issue with your Fluent Support staff who can investigate this further and suggest a solution or pass it on to development if a bug has caused the crash.

Sincerely, Nath
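
The comparison suggested above could be done along these lines, assuming the per-partition interface-face counts have been copied by hand from the Print partitions output of each case. The layout of that report is not assumed here, the function name is hypothetical, and the numbers are placeholders for illustration only, not values from either case.

```python
from statistics import mean


def summarize(interface_faces):
    """Summary of per-partition interface-face counts for one case."""
    total = sum(interface_faces)
    worst = max(interface_faces)
    avg = mean(interface_faces)
    return {
        "total": total,
        "max": worst,
        "mean": avg,
        # Ratio > 1 means one partition carries more interface faces
        # than the average partition.
        "imbalance": worst / avg,
    }


# Placeholder counts (hypothetical), one entry per partition.
old_case = [1200, 1100, 1300, 1250, 1150, 1200]
new_case = [2400, 2600, 2500, 2300, 2700, 2500]

for name, faces in (("old", old_case), ("new", new_case)):
    s = summarize(faces)
    print(f"{name}: total={s['total']} max={s['max']} "
          f"mean={s['mean']:.0f} imbalance={s['imbalance']:.2f}")
```

A clearly larger total or maximum in the new case would be consistent with the explanation that more partition interfaces lead to longer relabelling times.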

Jonas Larsson

September 5, 2000 10:13

Re: Parallel Scaling in Unsteady Sliding Mesh Case

Thanks again for the information. I actually brought up this issue with support when I first started running this type of sliding-mesh case in Fluent about a year ago. The advice I got then was that I had to deactivate "partition across zone" in order to get it to work. I did this both for the first and for the latest case. If I didn't deselect this across-zone partitioning, I think I got negative-volume errors after the first time-step. I haven't tried this lately though. Are you saying that it shouldn't be necessary to deactivate across-zone partitioning for sliding-mesh cases?

All times are GMT -4.