CFD Online Discussion Forums

ErenC June 18, 2021 03:10

Parallel run on 2x Xeon is really slow
 
Hello!

I am something of a newbie with CFX; it is not my main CFD solver. We have a dual-Xeon workstation. I ran a 5M case on 24 cores, but after 16 hours it had only run 1200 iterations. It is an isothermal, MRF, steady-state case. There are roughly 200k nodes per core, so I think it should be much faster.

I set up almost nothing. I just selected Intel MPI Local Parallel with 24 partitions, changed the Memory Allocation Factor to 1.3, and ran it. When I check the CPU (with Ctrl+Alt+Del), 56% of it is in use, and 20 GB of RAM is being used.

If anyone can give me tips on what might be wrong, I would be glad.

Eren.
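
For reference, the figures in this post can be turned into two quick numbers. This is only a back-of-the-envelope sketch: it takes the "5M case" to mean about 5 million nodes (which matches the roughly 200k nodes per core quoted), and the function names are purely illustrative.

```python
def nodes_per_partition(total_nodes, partitions):
    """Average number of mesh nodes handled by each partition."""
    return total_nodes / partitions


def seconds_per_iteration(wall_hours, iterations):
    """Average wall-clock time per solver iteration, in seconds."""
    return wall_hours * 3600.0 / iterations


# Figures quoted in the post: ~5 million nodes (assumed), 24 partitions,
# 1200 iterations after 16 hours.
print(f"{nodes_per_partition(5_000_000, 24):,.0f} nodes per partition")
print(f"{seconds_per_iteration(16, 1200):.1f} s per iteration")
```

That works out to about 208k nodes per partition and about 48 seconds per iteration.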

Martin_Sz June 18, 2021 04:54

How many threads does your processor have? It's possible that you are only using a little of your CPU power.

ErenC June 18, 2021 05:00

Thank you for your answer. Both processors have 12 cores and 24 threads. I'm not trying to split it into more partitions than the number of cores, if that is what you are asking.

ghorrocks June 18, 2021 05:06

There are many, many things to check to get maximum speed out of a computer.

First, on the computer itself:
* Is the CPU, motherboard, memory suitable? (I don't mean it turns on, I mean does it run continuously at high speed)
* Are the BIOS and firmware correct? If you have a CPU which is not supported by the BIOS it will run dog slow until you fix the BIOS (I fell for this one years ago)
* Are all the drivers up to date and suitable for your hardware?
* Is there other software or users hogging CPU?
* Is it being anti-virused to death?

Then the simulation:
* Is it all in RAM or going to swap?
* Do you have a suitable number of nodes per core?

Then some basic checks:
* Run a benchmark simulation in serial mode on this machine and an alternate machine you know is working well. Is the speed difference what you expect? You can estimate CPU speed using CPU benchmark websites (spec.org is my favourite)
* Run a series of simulations with 2, 4, 8, 16 etc partitions. Is it scaling with partitions as you expect? All machines will taper off in performance somewhere. Also, some machines do not run fastest with all cores. They run better when a few cores are left idle.
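
Once the series of runs is done, the comparison can be tabulated with a few lines of Python. This is a minimal sketch, assuming you have noted the wall-clock time of the same case at each partition count; the timings in the example are hypothetical placeholders, not measurements from any machine.

```python
def scaling_table(wall_times):
    """Return (partitions, speedup, efficiency) rows.

    wall_times maps partition count -> wall-clock seconds for the same case.
    The run with the fewest partitions is used as the baseline.
    """
    base_n = min(wall_times)
    base_t = wall_times[base_n]
    rows = []
    for n in sorted(wall_times):
        speedup = base_t / wall_times[n]    # relative to the baseline run
        efficiency = speedup * base_n / n   # 1.0 means ideal scaling
        rows.append((n, speedup, efficiency))
    return rows


# Hypothetical timings in seconds, for illustration only (not measured data).
example_times = {1: 1000.0, 2: 520.0, 4: 280.0, 8: 170.0, 16: 130.0}

for n, speedup, efficiency in scaling_table(example_times):
    print(f"{n:>2} partitions: speedup {speedup:5.2f}, efficiency {efficiency:6.1%}")
```

The partition count where the efficiency column starts to drop sharply is where the machine tapers off.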

ErenC June 18, 2021 05:29

Thank you for the detailed answer. Well, I started working here a few months ago, and I think the BIOS is the answer, because the HPC machine was sitting dead until now. I'll check all the parameters you mentioned if the BIOS update doesn't work. (By the way, I know it is really slow because an i7 4k series PC solves it on 4 cores at a similar speed.)

Opaque June 18, 2021 09:01

For clarification, does your machine have 12 physical cores, and 24 logical cores?

On such a machine, you will rarely (if ever) see a speedup close to 12x for any real industrial HPC simulation.

If you want to find out for yourself what your machine can do, take the StaticMixer tutorial.

1 - Set your BIOS to NOT use multi-threading, i.e. only use the maximum number of physical cores.

2 - Run your simulation with 1 partition, and make that your baseline.

3 - Repeat the simulation with 2, 4, 8, 12 and compare the numbers

4 - You will then find out the true scaling that specific machine can offer.
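
Steps 2 and 3 could be scripted along these lines. This is only a sketch: it builds the command lines without running them, the file name `StaticMixer.def` is assumed, and the `-def`, `-par-local` and `-partition` options are written from memory of the `cfx5solve` command-line help, so check them against `cfx5solve -help` on your installation before relying on them.

```python
def cfx_scaling_commands(def_file, partition_counts=(1, 2, 4, 8, 12)):
    """Build one solver command line per partition count.

    The 1-partition case is treated as a plain serial run (the baseline).
    The option names are assumptions to be verified with `cfx5solve -help`.
    """
    commands = []
    for n in partition_counts:
        cmd = ["cfx5solve", "-def", def_file]
        if n > 1:
            # Local parallel run with n partitions (assumed option names).
            cmd += ["-par-local", "-partition", str(n)]
        commands.append(cmd)
    return commands


# Print the commands so they can be reviewed before anything is run.
for cmd in cfx_scaling_commands("StaticMixer.def"):
    print(" ".join(cmd))
```

Timing each run (for example from the wall-clock time reported at the end of the solver output) then gives the numbers to compare against the 1-partition baseline.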

HPC scaling is only possible when the pieces align correctly: software and hardware.

Having a lot of cores with a not-so-great memory layout is a recipe for problems. The different partitions must be in independent memory regions; otherwise, there will be a lot of OS overhead between the independent partitions.

You may need other tools (beyond me) to get at such information and see what is really happening.

It is like an overhyped car with a lot of power in the engine but not-so-great suspension and handling. Can it drive really fast on a real track, not just straight ahead?

Hope the above helps,

ErenC June 18, 2021 15:39

Thank you for your answer, @Opaque. I am sorry that I wasn't clear.

It is a Dell PowerEdge R730 server with dual Xeon E5-2670 CPUs (so 12x2 physical cores and 24x2 logical cores) and 96 GB of RAM. So I don't think the issue lies with the hardware, because this machine was built for HPC, right? But you are right, I remember reading about "only use physical cores". This is my first HPC experience (I am actually an academic 2D thermal flow guy), so I appreciate all your remarks.

To be clearer: as I said, it had been sitting dead for about a year and was freshly formatted, so a virus is not an option. I installed ANSYS two days ago (I would like to give my thanks to our ANSYS distributor, because they haven't answered my question about speed) and nothing else is installed (well, I installed AnyDesk so I won't have to go to the cluster every time). After your suggestions, I updated the BIOS and the firmware (the last BIOS update was in 2018? Why are server updates not more frequent?) and started a run. I'll see the results on Monday.

I'll try a benchmark problem for core scaling as you suggest, but I am on a really tight schedule at the moment (for two weeks), so if you have more suggestions I'll gladly listen to them.
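
With these core counts, the 56% CPU reading from the first post can be put in context. The sketch below assumes that the Windows CPU figure is an average over all logical processors and that each solver partition keeps roughly one logical processor busy; both are assumptions, and the function name is illustrative.

```python
def expected_cpu_percent(busy_processes, logical_cpus):
    """Overall CPU utilisation if each busy process saturates one logical CPU."""
    return 100.0 * min(busy_processes, logical_cpus) / logical_cpus


# 24 partitions on a machine with 24x2 = 48 logical cores.
print(f"{expected_cpu_percent(24, 48):.0f}% expected from the solver alone")
```

Under those assumptions, 24 busy partitions account for about 50% on 48 logical cores, so a reading of 56% does not by itself suggest idle partitions.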

evcelica June 21, 2021 15:30

Try turning off hyperthreading as Opaque suggested.
Which version of the E5-2670 is it? The original only has 8 cores, and the v2 has 10, so yours must be a v3?

Most importantly, 96 GB of RAM sounds like a misconfigured (unbalanced) memory setup. How many memory DIMMs do you have? You should have all 4 memory channels of each CPU populated evenly (research "balanced memory" configurations). It is not possible to do this with 96 GB, since 12 GB server DIMMs are not really available. My guess is your memory is misconfigured.
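
The arithmetic behind this can be sketched as follows. It assumes 2 CPUs with 4 memory channels each, as described above, and considers only the simplest layout of one identical DIMM per channel; the list of DIMM sizes is an assumption for illustration, and layouts with several DIMMs per channel are not modelled.

```python
def per_channel_gb(total_gb, sockets=2, channels_per_socket=4):
    """Memory each channel must hold for an even split across all channels."""
    return total_gb / (sockets * channels_per_socket)


def single_dimm_sizes_that_fit(total_gb, dimm_sizes_gb=(4, 8, 16, 32, 64),
                               sockets=2, channels_per_socket=4):
    """DIMM sizes (from an assumed list) that give total_gb with one DIMM per channel."""
    share = per_channel_gb(total_gb, sockets, channels_per_socket)
    return [size for size in dimm_sizes_gb if size == share]


for total in (64, 96, 128):
    print(total, "GB ->", per_channel_gb(total), "GB per channel,",
          "matching DIMM sizes:", single_dimm_sizes_that_fit(total))
```

For 96 GB this gives 12 GB per channel, which matches none of the assumed DIMM sizes, whereas 64 GB and 128 GB map onto 8 GB and 16 GB DIMMs.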

Martin_Sz June 23, 2021 02:01

How many cores does your HPC license cover?
Remember that Ansys counts cores, threads, and CPUs as the same thing, so if you have
6 cores / 12 threads and an HPC license for 6, you only use 50% of your machine's capacity.
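
The 50% figure follows from simple arithmetic, sketched below. It applies the counting rule exactly as stated in this post (every logical processor counts against the license); that rule is taken from the post and is not a statement of the actual Ansys licensing terms.

```python
def licensed_fraction(licensed_cores, logical_cpus):
    """Fraction of the logical processors that the license allows the solver to use."""
    return min(licensed_cores, logical_cpus) / logical_cpus


# Example from the post: 6 cores / 12 threads with an HPC license for 6.
print(f"{licensed_fraction(6, 12):.0%} of the machine is usable")
```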


All times are GMT -4.