travonz April 1, 2013 15:57

Few fast cores or a lot of slow cores

I wonder whether there is a big performance gap between these two configurations for OpenFOAM:

Dual E5-2620, so 12 cores at 2 GHz, memory speed 1600 MHz.
Single i7-3930K, so 6 cores at 3.2 GHz, with overclockable memory speed (up to 2400 MHz).

Here is a link to the detailed configurations:

Thanks for your help.

evcelica April 1, 2013 16:08

I would bet that the dual E5 machine would be faster since you would have 8 memory channels instead of just 4 with the i7 machine. I don't think the faster CPU clock or memory speed would be enough for the i7 system to make up for having half the number of memory channels.
This is just my opinion though; I have no first-hand experience running these two types of configurations directly against each other.
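
As a rough illustration of the memory-channel argument, here is a minimal sketch of the theoretical peak bandwidth of both configurations. It assumes the quoted "MHz" figures are DDR3 transfer rates in MT/s, that each channel is 64 bits (8 bytes) wide, and that all channels are populated; the function name is made up for this example. Real OpenFOAM throughput will be lower and also depends on latency and the solver.

```python
def peak_bandwidth_gb_s(channels, mega_transfers_per_s, bytes_per_transfer=8):
    """Theoretical peak DRAM bandwidth in GB/s.

    Assumes 64-bit (8-byte) channels and that every channel is populated.
    """
    return channels * mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


# Dual E5-2620: 2 sockets x 4 channels, memory at 1600 MT/s (figure from the post)
dual_e5 = peak_bandwidth_gb_s(channels=8, mega_transfers_per_s=1600)

# Single i7-3930K: 4 channels, memory at stock 1600 MT/s and overclocked to 2400 MT/s
i7_stock = peak_bandwidth_gb_s(channels=4, mega_transfers_per_s=1600)
i7_oc = peak_bandwidth_gb_s(channels=4, mega_transfers_per_s=2400)

print(f"dual E5-2620     : {dual_e5:.1f} GB/s")
print(f"i7-3930K (1600)  : {i7_stock:.1f} GB/s")
print(f"i7-3930K (2400)  : {i7_oc:.1f} GB/s")
```

Under these assumptions, even with memory at 2400 the single-socket system has a lower theoretical peak than the dual-socket one, which is the point made above.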

dkokron April 1, 2013 22:17

I agree with Erik on this one. According to Intel's VTune tool, OpenFOAM keeps all eight memory channels in my dual E5-2643 system very busy.


travonz April 4, 2013 06:04

Thank you very much for your answers.

I also need some advice about the motherboard. If a dual-processor motherboard supports quad-channel memory, does that mean there are 8 channels in total? Or do I have to be careful about the number of channels when I buy a dual-processor motherboard?

Thanks a lot

evcelica April 4, 2013 07:33

If it is a dual-socket board for Xeon E5s, then yes, it will definitely have quad channel for each CPU, and possibly more than one set of four slots for each CPU. So it could have 8 memory slots for each CPU, 16 total, but it's still only quad channel per CPU, 8 channels total, since each CPU can only communicate through 4 memory channels at once, even if more modules are connected.

travonz April 4, 2013 09:09

ok, I understand better how it works.

thanks for your help.

abdul099 April 5, 2013 20:52

Well, the i7 is also a Sandy Bridge-E CPU and therefore also has 4 memory channels. And if he really runs the memory at 2400 MHz, it will also perform well, since performance depends not only on memory bandwidth but also on memory latency. Communication between the larger number of partitions also takes some time, especially since usually (depending on the solver design) one process needs to collect some values from all the cores involved.
The main issue is that it strongly depends on the solver and on any serial processes that might be involved.

For sure the i7 is more optimized for high performance than the slow Xeons, which are good for high stability.

So the dual E5 Xeons might still be slightly faster, but of course at an almost 20% higher price.
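
The point about serial parts of the solver can be illustrated with Amdahl's law, which is brought in here only as a simple model and is not something measured in this thread. The sketch below assumes a hypothetical serial fraction of 5% and ignores communication cost, clock speed and memory bandwidth, so it only shows how the benefit of extra cores shrinks as the serial share grows.

```python
def amdahl_speedup(serial_fraction, n_procs):
    """Ideal speedup over one core when a fixed fraction of the work is serial."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)


# Hypothetical serial fraction, chosen only for illustration
serial_fraction = 0.05

speedup_6 = amdahl_speedup(serial_fraction, 6)    # i7-3930K core count
speedup_12 = amdahl_speedup(serial_fraction, 12)  # dual E5-2620 core count

print(f"6 cores : {speedup_6:.2f}x over one core")
print(f"12 cores: {speedup_12:.2f}x over one core")
print(f"gain from doubling the cores: {speedup_12 / speedup_6:.2f}x")
```

In this model, doubling the core count gives less than a 2x gain as soon as any serial work is present, so part of the dual Xeon's core-count advantage is lost before clock speed is even considered.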

