Upgrade from 2x E5-2687W v3 for Comsol 5.3 electromagnetic simulations

#1 | August 6, 2019, 04:44
fernbedienung (Joshua), New Member
Hello everyone,

We are looking for an upgrade for our workstation, as it is heavily used for all sorts of different tasks.

We are only doing electromagnetic simulations with COMSOL 5.3 and a relatively low node count of 10,000 up to 250,000. A typical simulation uses less than 10 GB of RAM. The new workstation will be used exclusively for these calculations.

Our current machine:
Dual E5-2687W v3 (each 10 cores, 3.1 GHz Haswell)
192 GB (8x 16GB) of dual rank DDR4 RAM running in quad channel at 2132 MHz

We are looking for a performance increase of about 50%; otherwise, a new machine is not worth the investment. Initially we planned to spend about 4000€ max. Unfortunately, benchmarks for COMSOL are not very common, and other benchmarks are all over the place.
Our first plan was to buy a Threadripper 2990WX and the fastest available RAM for it, but after reading in this forum I’m not so sure anymore, because two of its four dies have no direct access to a memory controller.
We would greatly appreciate your opinion on the performance and your suggestions.

#2 | August 6, 2019, 12:30
flotus1 (Alex), Senior Member, Germany
First things first: You won't get a 50% upgrade over your current machine with 4000$. Maybe not even with 20000$. And avoid TR 2990WX at all costs.

Reading through some of the advice for COMSOL and combining it with your requirements of very low element counts I come to the conclusion that there may not be anything on the market worth buying compared to your current machine. The problems in particular:

Scaling with low element counts is bad
That's just normal and there is not really a way around this. Parallelization overhead increases as element count decreases. So just buying something with more cores and more memory bandwidth won't help much for strong scaling of small cases.

The solver seems to have a significant serial fraction
The extent seems to change depending on the exact solver type you use. But in general this means that high core counts don't help much, and even higher memory bandwidth (which the solver definitely likes) does not help much either beyond a certain point. What is needed here is high single-core performance, and the Xeon E5-2687W v3 is quite good in that regard. Amdahl's law at work.
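A minimal sketch of Amdahl's law, to illustrate why extra cores stop helping once a serial fraction dominates. The 20% serial fraction below is an assumed, illustrative value, not a measurement of COMSOL, and the function name is hypothetical.

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores when `serial_fraction` of the
    single-core runtime cannot be parallelized (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


# Assumption: 20% of the runtime is serial (illustrative only).
ASSUMED_SERIAL_FRACTION = 0.2

for cores in (1, 2, 4, 10, 20):
    speedup = amdahl_speedup(ASSUMED_SERIAL_FRACTION, cores)
    print(f"{cores:2d} cores: {speedup:.2f}x")
```

Under this assumption the speedup can never exceed 1 / 0.2 = 5x, no matter how many cores are added, which is why single-core performance matters so much here.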

You can verify how your small cases scale on your machine by running them on 1, 2, 4, ... cores and comparing the execution times. This could definitely help in choosing an upgrade path.
There would be a simple way around all of this in case your workflow and licenses allow it: Instead of running 1 case on all cores of the machine, run several cases at the same time with lower core count each. The cases will of course run slower due to the memory bottleneck, but this resolves scaling issues and leads to higher overall throughput. So e.g. running 4 cases at the same time will only take 2-3 times as long as running a single case.
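The throughput argument can be put into numbers with a short sketch. The function name is hypothetical; the inputs are the figures from the example above (4 concurrent cases taking 2 to 3 times as long as a single case).

```python
def throughput_gain(concurrent_cases: int, slowdown: float) -> float:
    """Cases finished per unit time, relative to running one case at a time.

    slowdown: wall time of the concurrent batch divided by the wall time
    of one case running alone on the whole machine.
    """
    return concurrent_cases / slowdown


for slowdown in (2.0, 3.0):
    gain = throughput_gain(4, slowdown)
    print(f"4 cases at {slowdown:.0f}x the single-case time: {gain:.2f}x throughput")
```

So even in the pessimistic case (3 times as long), four concurrent cases would still finish about a third more work per hour than running them one after another.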

I assume you already tweaked the solver settings and disabled SMT. If available, activating cluster-on-die mode for your CPUs should also yield some performance gains combined with the right execution flags.
There is a discussion about the fastest settings here, specifically with a NUMA machine: https://www.comsol.com/forum/thread/...opteron-system

Edit: thinking about this again, maybe there is a chance to get a relatively cheap upgrade, assuming the following criteria are met:
1) you cannot run more than 1 case at the same time due to licensing constraints
2) the case scales better on a single CPU than distributed across both CPUs
3) overall bad scaling beyond around 8 cores
In this case, a Core i7-9800X along with 4x16GB of the fastest memory you can afford might perform better.

Last edited by flotus1; August 7, 2019 at 04:43.

#3 | August 7, 2019, 13:14
Simbelmynė, Senior Member
Just to add to this. I tested one of our dual EPYC 7301 machines with Comsol. Just out of curiosity since I do not normally use Comsol.


An i7-8700K was faster for a CFD case with approximately one million degrees of freedom.

I think Comsol suffers a lot from Amdahl's law. We have seen better scaling on Intel CPUs, though, so the CPU architecture might also play a role.

#4 | August 7, 2019, 13:47
flotus1 (Alex), Senior Member, Germany
The way I understood it, COMSOL has two kinds of parallelism implemented. One for shared memory and one for distributed memory systems.
For a workstation with a rather complicated NUMA topology like 2xEpyc, choosing the right one and getting the settings for core binding correct is crucial for performance. I would imagine that just starting it with -np will lead to abysmal performance.
They have two pretty in-depth articles about setting up each parallel mode:
distributed: https://www.comsol.com/support/knowledgebase/1001/
shared: https://www.comsol.com/support/knowledgebase/1096/

#5 | August 8, 2019, 04:22
fernbedienung (Joshua), New Member
Thanks for your testing and considerations.
To fill in the missing gaps of information, some more info from my side:
- Licensing should not be a problem, as we can compute as many cases as we want, as long as COMSOL is running on one PC (a cluster should be possible as well, but we have not tried that yet). From what I read on the forums, however, this is only really true for Windows; on Linux, each instance of COMSOL uses one license.
- Turning off Hyperthreading did not change the computing speed.
- We have not tried different execution flags so far, but we will definitely try.
- We have seen performance scale roughly with the square root of the number of threads, but this was only tested for “high” core counts (20, 15, 10, ...).
- We’re using the MUMPS solver, which, as the COMSOL link suggests, is not ideal for high core counts.
- The current PC is used by other users and programs, so if we can get at least the same performance for, let’s say, about 2000€, we would buy a new machine too.
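As a rough cross-check of the square-root scaling mentioned in the list above, the Karp-Flatt metric estimates the effective serial fraction from a measured speedup. This is only a sketch: the speedups below are computed from the "square root of the thread count" rule of thumb, not from individual measurements, and the estimate lumps parallel overhead together with truly serial work.

```python
import math


def karp_flatt(speedup: float, threads: int) -> float:
    """Experimentally determined serial fraction (Karp-Flatt metric)
    for a given speedup on `threads` threads."""
    if threads < 2:
        raise ValueError("the metric needs at least 2 threads")
    return (1.0 / speedup - 1.0 / threads) / (1.0 - 1.0 / threads)


# Assumption: speedup ~ sqrt(threads), as reported for high core counts.
for threads in (10, 15, 20):
    assumed_speedup = math.sqrt(threads)
    fraction = karp_flatt(assumed_speedup, threads)
    print(f"{threads} threads: speedup {assumed_speedup:.2f}x, "
          f"effective serial fraction {fraction:.2f}")
```

With these assumed speedups the effective serial fraction comes out at roughly 0.18 to 0.24, which would be consistent with a solver that gains little from additional cores.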

I will try to get some scaling and flag benchmarks in the meantime.


Edit:
We tried the PARDISO solver and it appears to be faster using all cores (typical sweep: 8 min vs. 12 min). However, testing the expected different scaling of this solver has to wait some time, as the workstation is being used by other people as well right now, which might skew the results.

Last edited by fernbedienung; August 8, 2019 at 05:39.

#6 | August 16, 2019, 08:02
fernbedienung (Joshua), New Member
Quote (originally posted by flotus1):
You can verify how your small cases scale on your machine by running it on 1,2,4...cores and comparing the execution times. This could definitely help choosing an upgrade path.
There would be a simple way around all of this in case your workflow and licenses allow it: Instead of running 1 case on all cores of the machine, run several cases at the same time with lower core count each. The cases will of course run slower due to the memory bottleneck, but this resolves scaling issues and leads to higher overall throughput. So e.g. running 4 cases at the same time will only take 2-3 times as long as running a single case.
To give you a quick update: we were able to do some tests on the machine.
Running the cases of a parameter sweep as a batch sweep massively improved computation time!
1 process, 20 cores: 9:03 h
10 processes, 2 cores each: 2:40 h
20 processes, 1 core each: 2:31 h
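For reference, the speedups implied by the three timings above can be worked out with a few lines of Python. The helper name and dictionary labels are only illustrative; the times are the ones listed above.

```python
def to_minutes(hmm: str) -> int:
    """Convert a wall time given as 'h:mm' into minutes."""
    hours, minutes = hmm.split(":")
    return int(hours) * 60 + int(minutes)


# Wall times of the same parameter sweep, as listed above.
runs = {
    "1 process x 20 cores": "9:03",
    "10 processes x 2 cores": "2:40",
    "20 processes x 1 core": "2:31",
}

baseline = to_minutes(runs["1 process x 20 cores"])
speedups = {label: baseline / to_minutes(t) for label, t in runs.items()}

for label, speedup in speedups.items():
    print(f"{label}: {to_minutes(runs[label])} min, {speedup:.2f}x vs. baseline")
```

That is roughly a 3.4x speedup with 10 processes and 3.6x with 20 single-core processes, compared with one process on all 20 cores.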



COMSOL is already aware of the topology of two CPUs with 10 cores each (also in the settings, not only via flags), but as I understand it, the additional flags should only affect the execution when using more than 1 core for one calculation?

If so, I guess this is the best optimization we can get. So thanks for all your great help!!!

#7 | August 16, 2019, 08:12
flotus1 (Alex), Senior Member, Germany
Quote:
but as I understand, the additional flags should only affect the execution when using more than 1 core for one calculation?
Correct, running single-threaded removes most of the pitfalls of a NUMA system. The only issue remaining would be your operating system swapping threads around on the physical CPUs. But I would assume COMSOL pins its threads to physical cores.
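To illustrate what pinning means, here is a small Linux-only sketch using Python's `os.sched_setaffinity`. It only demonstrates the operating-system mechanism on the current process; it is not a description of how COMSOL itself binds its threads, and the helper name is hypothetical.

```python
import os


def pin_to_core(core: int) -> set:
    """Restrict the calling process to a single logical core and return
    the resulting affinity set. Relies on a call that Python exposes only
    on some Unix platforms such as Linux."""
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("os.sched_setaffinity is not available here")
    os.sched_setaffinity(0, {core})  # pid 0 means the calling process
    return os.sched_getaffinity(0)


if hasattr(os, "sched_getaffinity"):
    allowed = sorted(os.sched_getaffinity(0))
    print("cores allowed before pinning:", allowed)
    print("cores allowed after pinning: ", sorted(pin_to_core(allowed[0])))
else:
    print("CPU affinity calls are not exposed by Python on this platform")
```

Once a process is pinned like this, the scheduler can no longer migrate it to another core or socket, so its memory stays local to the NUMA node it started on.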
Anyway, great to hear that you got a 250% performance increase for free. With that out of the way, you could of course upgrade to a faster machine now. With scaling issues resolved, you can now benefit from the hardware improvements of the last 5 years.
