Running CFX solver in batch parallel mode

January 19, 2023, 08:05   #1
Felipe Silva Maffei, New Member
Join Date: Dec 2015
Posts: 11
Hi,

I am trying to run the CFX solver using my university cluster, but when I execute the following command:

Code:
cfx5solve -numa auto -def Fluid\ Flow\ CFX.def -start-method "Intel MPI Distributed Parallel" -par-dist n02*40,n03*40 -batch -monitor log
Using htop, I see 40 processes on each node, but only node n03 runs its 40 CPUs at 100%. The processes on node n02 use only 3% of the CPUs.

Do you know how to make the CFX solver use all the CPUs on both nodes?

January 19, 2023, 16:29   #2
Glenn Horrocks, Super Moderator
Join Date: Mar 2009
Location: Sydney, Australia
Posts: 17,728
The most likely cause for this is that you are trying to feed the multipartition data for 80 partitions through one interconnect. Unless you have a very high-end network between these two machines, the interconnect will be flooded and will bottleneck the simulation.

It would be better to have 8 machines with 10 partitions each rather than 2 machines with 40 partitions each as the network load will get spread over more interconnects.
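
The two layouts compared above can be written out as `-par-dist` values. This is only a sketch: it reuses the `host*count` syntax from the command in the first post, the helper names `par_dist` and `total_partitions` are made up for illustration, and the host names n04 to n09 are hypothetical stand-ins for whatever nodes the scheduler actually allocates.

```python
def par_dist(hosts, partitions_per_host):
    """Build a -par-dist value such as 'n02*40,n03*40' (syntax copied from post #1)."""
    if partitions_per_host < 1:
        raise ValueError("partitions_per_host must be at least 1")
    return ",".join(f"{host}*{partitions_per_host}" for host in hosts)


def total_partitions(value):
    """Sum the partition counts in a 'host*count,host*count' string."""
    return sum(int(item.split("*")[1]) for item in value.split(","))


# Layout from the original post: 2 machines with 40 partitions each.
two_hosts = par_dist(["n02", "n03"], 40)

# Suggested layout: 8 machines with 10 partitions each.
# Host names n02..n09 are hypothetical; substitute the real node names.
eight_hosts = par_dist([f"n{i:02d}" for i in range(2, 10)], 10)

print(two_hosts, "->", total_partitions(two_hosts), "partitions")
print(eight_hosts, "->", total_partitions(eight_hosts), "partitions")
```

Both layouts run the same 80 partitions; only the number of partitions sharing each machine's interconnect changes. When pasting such a value into a shell command it is safer to quote it, since `*` is a glob character in most shells.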
Note: I do not answer CFD questions by PM. CFD questions should be posted on the forum.

January 19, 2023, 16:32   #3
Glenn Horrocks, Super Moderator
Oh, and the problem is not necessarily the network connection. It could be the connection between the network adapter and the CPU, meaning the FSB, memory interconnect and other motherboard components. So motherboard quality is critical as well.

January 20, 2023, 08:44   #4
Felipe Silva Maffei, New Member
Thanks for the answer,

Ok, I will check these points and the possibility of using more nodes with fewer CPUs on each one. Is there another possibility? I ask because I ran one test where I started the same case but did not allocate the node that worked well on the first try, and I observed one node with 40 CPUs working at 3% of its capacity. I was wondering if it is something related to the host machine configuration (either Ansys or the cluster).

January 21, 2023, 01:18   #5
Glenn Horrocks, Super Moderator
It is a really good idea to test the capabilities of your cluster before you start using it. I recommend getting a benchmark simulation, running it with 1, 2, 4, 8, 16, 32, etc. partitions on one machine and checking the scaling, and then doing the same across 2 machines, then 4, 8, etc.

This will tell you how many partitions you can put on a single node (as it is likely performance will drop off before you use all cores), and how it scales across multiple nodes. It is very instructive - and it will also tell you the optimum configuration you should define to get best performance from your cluster. As you are seeing, optimum performance is almost certainly not using the maximum number of partitions.
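
As a rough sketch of how the benchmark results could be evaluated, the snippet below computes speedup and parallel efficiency relative to the run with the fewest partitions. The wall-clock times are placeholders invented purely for illustration, not measurements from any cluster, and the 0.7 efficiency cut-off and the function names are arbitrary assumptions.

```python
def scaling_table(timings):
    """timings maps partition count -> wall-clock seconds for the same benchmark.

    Returns (partitions, seconds, speedup, efficiency) rows, sorted by partitions.
    Speedup and efficiency are relative to the run with the fewest partitions.
    """
    base_n = min(timings)
    base_t = timings[base_n]
    rows = []
    for n in sorted(timings):
        speedup = base_t / timings[n]
        efficiency = speedup * base_n / n
        rows.append((n, timings[n], speedup, efficiency))
    return rows


def largest_efficient(rows, threshold=0.7):
    """Largest partition count whose efficiency is still at or above the threshold."""
    ok = [n for n, _, _, eff in rows if eff >= threshold]
    return max(ok) if ok else rows[0][0]


# PLACEHOLDER timings in seconds (made up for illustration only).
example = {1: 1000.0, 2: 520.0, 4: 280.0, 8: 160.0, 16: 110.0, 32: 100.0}

rows = scaling_table(example)
for n, seconds, speedup, eff in rows:
    print(f"{n:3d} partitions  {seconds:7.1f} s  speedup {speedup:5.2f}  efficiency {eff:4.2f}")
print("largest count above threshold:", largest_efficient(rows))
```

With these placeholder numbers the efficiency falls below 0.7 beyond 8 partitions, which is the kind of drop-off the single-node test is meant to reveal. The same table can be built for the 2, 4, 8, etc. machine runs to compare layouts.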

January 21, 2023, 07:32   #6
Felipe Silva Maffei, New Member
Ok, thanks for the explanation. I will try it. One more thing: can cluster performance be software dependent? I ask because a lab colleague runs OpenFOAM on 4 nodes using all CPUs, and he doesn't report problems like this.

January 21, 2023, 17:19   #7
Glenn Horrocks, Super Moderator
Yes, cluster performance is software dependent. Different software places different loads on main memory, L1 and L2 caches, hard drive, inter-partition communications and so on. Also, the optimisation options used when the software is compiled make a big difference.

But most respectable Navier-Stokes based CFD codes should be very similar. You should only notice major differences if going to extreme numbers of partitions. But if you compare CFD software with ray tracing software, for instance, I would expect them to scale very differently on a large cluster.

Tags
ansys, cfx, cluster

All times are GMT -4.