September 15, 2013, 05:00 |
Workstation with new E5 Xeon Ivy-bridge
|
#1 |
New Member
Manuel
Join Date: Sep 2013
Location: Spain
Posts: 12
Rep Power: 13 |
I'm involved in a coal combustion simulation in Fluent. Since Intel released its new E5 Ivy Bridge Xeon v2 processors last week, I thought it was a good time to upgrade my hardware. But what I have read in this forum has confused me a bit, and I'm not sure I'm selecting the best hardware configuration to put my money into.
I would appreciate it if someone could give me some advice. I apologize in advance, as there may be other threads in the forum that answer my question, but I've done some searching and couldn't completely resolve my doubts. Here is the workstation I initially selected:
- 2 x Intel Xeon E5-2697 v2 (12 cores per processor, 24 cores total, 30 MB cache). This chip is the new flagship of Intel's two-socket Xeon line and has only been available for a few days.
- 8 x 8 GB = 64 GB RAM at 1866 MHz (the maximum speed supported by Ivy Bridge Xeons). Eight memory channels in total, as there are two sockets.
- Nvidia 650 Ti graphics card.
- ASUS Z9PE-D8 WS motherboard.
- A good cooling setup.
This computer would cost around 8,000 €. For that money or even less I could probably buy two of the following computers, linked by Gigabit Ethernet to run in distributed parallel:
- Intel i7-4960X (15 MB cache, 6 cores). This was also released last week, and I think it should beat Intel's previous top desktop i7 (the 3970X).
- 8 x 4 GB = 32 GB RAM at 2400 MHz (overclocked). Four memory channels.
- The rest would be similar to the Xeon configuration.
The first configuration has more cores (24 vs 12, and each Xeon core may be better than an i7 core), the same total cache (30 vs 15 + 15 MB), the same number of memory channels per processor (4), but lower memory speed (1866 vs 2400 MHz). So in terms of processing power I think the Xeon is the better choice, but in terms of memory bandwidth it is probably the opposite. As for the connection between processes, the Xeon's high-speed QPI link would probably beat a Gigabit Ethernet link. I thought the Xeon was the best choice, but after looking at some threads in this forum I've seen that the most important parameter from the CFD point of view is memory bandwidth, so I wonder if the pair of i7s would beat the Xeon machine.
I say this because the Xeon can't go beyond 1866 MHz, but the i7 can easily reach 2400 MHz and run stably at that speed. Another point is network speed and latency. Do you think conventional Gigabit Ethernet would perform properly as the interconnect, or should I link the machines with 10 Gb Ethernet? Here is an outline of the simulation I'm involved in:
- Coal combustion in a multi-burner boiler with DPM, eddy-dissipation/finite-rate chemistry, and DO radiation (around 10 million cells).
- Steady-state simulation.
- RSM turbulence model.
I have a license for up to 32 processes, so there wouldn't be any problem on the software side. Thanks. Last edited by Manuelo; September 15, 2013 at 19:09. Reason: typo errors |
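The bandwidth question in the post above can be checked with a quick back-of-envelope calculation. This is only a sketch of theoretical peak bandwidth (channels x transfer rate x 8 bytes per transfer); real sustained bandwidth is lower, but the comparison between the two configurations still holds:

```python
# Theoretical peak memory bandwidth in GB/s: channels * MT/s * 8 bytes / 1000
def peak_bw_gbs(channels, mts):
    return channels * mts * 8 / 1000

# Dual Xeon E5-2697 v2: 2 sockets x 4 channels of DDR3-1866, 24 cores total
xeon_bw = 2 * peak_bw_gbs(4, 1866)   # ~119.4 GB/s aggregate
xeon_per_core = xeon_bw / 24         # ~5.0 GB/s per core

# Two i7-4960X nodes: 2 x 4 channels of DDR3-2400 (overclocked), 12 cores total
i7_bw = 2 * peak_bw_gbs(4, 2400)     # ~153.6 GB/s aggregate
i7_per_core = i7_bw / 12             # ~12.8 GB/s per core

print(xeon_bw, i7_bw)                # the i7 pair wins on raw bandwidth
print(xeon_per_core, i7_per_core)    # and even more so per core
```

So per core the i7 pair has more than twice the headroom, which is why memory-bound CFD codes can favor it despite the lower core count; the Xeon still wins on total compute and interconnect latency.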
|
September 15, 2013, 13:23 |
|
#2 |
New Member
Manuel
Join Date: Sep 2013
Location: Spain
Posts: 12
Rep Power: 13 |
Another question: is it worth buying the top E5 Ivy Bridge processor, given that the system could be bottlenecked by memory bandwidth? Would I get roughly the same performance from a processor with fewer cores (e.g. the E5-2687W v2 with 8 cores instead of 12)?
The 12-core one costs 50% more than the 8-core one. Last edited by Manuelo; September 15, 2013 at 19:10. |
|
September 15, 2013, 19:51 |
|
#3 | |
Member
Join Date: Dec 2012
Posts: 47
Rep Power: 13 |
Quote:
I would get the Xeon machine, but I would get a Quadro or Tesla GPU instead of the 650 Ti. That way you'll have ECC (error correction) at the CPU, RAM, and GPU levels. This will prevent corruption of data in RAM, the likelihood of which increases with RAM size. Judging from the fact that your license covers 32 cores, you should optimize for performance, not cost.
__________________
Find me online here. |
||
September 16, 2013, 02:38 |
|
#4 |
New Member
Manuel
Join Date: Sep 2013
Location: Spain
Posts: 12
Rep Power: 13 |
Thanks jdrch,
I thought there was no point in investing money in such a powerful graphics card because my software (Fluent) doesn't support GPU computing. And the memory isn't ECC because I couldn't find ECC modules rated at 1866 MHz; at least, my hardware supplier says they can't find them. |
|
September 16, 2013, 03:01 |
|
#5 | |
Member
Join Date: Dec 2012
Posts: 47
Rep Power: 13 |
Quote:
This is totally your decision. I'm just saying that with that much RAM, memory errors become a significant risk, which is problematic given CFD's large in-memory data requirements.
__________________
Find me online here. |
||
September 16, 2013, 07:03 |
|
#6 |
Member
Kim Bindesbøll Andersen
Join Date: Oct 2010
Location: Aalborg, Denmark
Posts: 39
Rep Power: 16 |
I have been working for some months now on specifying a hardware solution for optimal utilization of a 32-core license. We are about to order the system below:
2 machines, each with:
- Dual-CPU motherboard with 2 x Xeon E5-2667 v2, 3.3 GHz, 8 cores each
- 32 GB RAM: 8 x 4 GB at 1866 MHz
- HDD: 2 x 140 GB, 15,000 rpm SATA-600, RAID 1
- InfiniBand network adapter (cluster interconnect)
- Gigabit Ethernet network adapter (system network)
One machine, used for data storage, additionally has 3 x 500 GB 10,000 rpm HDDs in RAID 5. As the 32 cores cannot fit in a single machine, two dual-socket machines were chosen. This works out to 8 cores per CPU (2 x 2 x 8 = 32). There is no benefit in more cores per CPU, as the system is most likely memory-bandwidth limited. InfiniBand was chosen for the cluster interconnect because its latency is low. Bandwidth is not that important, as the amount of data exchanged between the cluster nodes should not be huge once the solver is up and running, but the speed of the individual data requests does matter. InfiniBand switches are costly, but one can be omitted here: with only 2 nodes in the system, they can be connected directly without a switch. The pre- and post-processing will be done on my present CFD workstation (AMD FirePro 7900), so no special graphics are required for the cluster nodes. |
|
September 16, 2013, 08:07 |
|
#7 |
New Member
Manuel
Join Date: Sep 2013
Location: Spain
Posts: 12
Rep Power: 13 |
Hi bindesboll,
Thanks for your advice. Just a few quick questions:
- Are you sure that more powerful processors would leave you with a memory-bandwidth-limited system?
- I've never dealt with InfiniBand parts. Roughly how much does an InfiniBand adapter cost?
- What about 10 Gb Ethernet? Could it also be considered for the two-machine cluster you've proposed?
- Why don't you go with SSDs? |
|
September 16, 2013, 08:58 |
|
#8 | |||
Member
Kim Bindesbøll Andersen
Join Date: Oct 2010
Location: Aalborg, Denmark
Posts: 39
Rep Power: 16 |
Quote:
Actually, you could even consider quad-core or six-core CPUs to get better utilization of each core. However, you would then need more than 2 cluster nodes (increased cost), which would also decrease the scaling efficiency and require a costly InfiniBand switch (5,000-6,000 €). Quote:
Quote:
From 2 to 4 cores: 0.91
From 4 to 8 cores: 0.62 (clearly memory-bandwidth limited; more cores will not increase speed)
From 8 to 16 cores (motherboard CPU interconnect): 0.92
From 16 to 32 cores (cluster interconnect, GigE): 0.82
As you can see, the efficiency of using 8 cores instead of 4 is poor, indicating that the performance of 8 cores is limited by memory bandwidth. The efficiency of using 32 cores instead of 16 is also suboptimal, indicating that the cluster interconnect (in this case 1 Gb/s Gigabit Ethernet) is limiting; thus our decision to go for InfiniBand. SSDs will only increase calculation speed if heavy disk-write operations are done during solving (frequent writes of transient data); otherwise an SSD will make no difference to the solving time. Maybe the load time of CFD cases will improve by a few seconds, but since we have 10,000 rpm disks in RAID 5 this will be insignificant. |
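One way to read those per-doubling efficiencies is to compose them into the overall speedup from 2 to 32 cores. A minimal sketch using the figures quoted in the post (each doubling of cores would ideally give a 2x speedup):

```python
# Per-doubling scaling efficiencies reported above for 2->4, 4->8, 8->16, 16->32
doubling_eff = [0.91, 0.62, 0.92, 0.82]

# Compose them: each doubling actually delivers (2 * efficiency) x speedup
speedup = 1.0
for eff in doubling_eff:
    speedup *= 2 * eff

ideal = 2 ** len(doubling_eff)   # 16x if scaling from 2 to 32 cores were perfect
print(speedup, speedup / ideal)  # ~6.8x actual, ~0.43 overall efficiency
```

So the 16x core increase buys roughly a 6.8x speedup, with the 4-to-8-core memory-bandwidth wall (0.62) responsible for the largest single loss.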
||||
September 16, 2013, 11:38 |
|
#9 |
New Member
Manuel
Join Date: Sep 2013
Location: Spain
Posts: 12
Rep Power: 13 |
Ok, thanks for your explanations. Really interesting.
As memory speed has been increased in Ivy Bridge (+16.6%), there will be higher memory bandwidth. Hopefully this will help us. Why didn't you choose the 2687W? It also has 8 cores. Maybe for power savings? |
|
September 17, 2013, 03:48 |
|
#10 |
Member
Kim Bindesbøll Andersen
Join Date: Oct 2010
Location: Aalborg, Denmark
Posts: 39
Rep Power: 16 |
I would expect the performance of the 2687W v2 and 2667 v2 to be very similar. Depending on the degree to which memory bandwidth limits the system, even the 2650 v2 could be interesting, especially as its price is only half that of the other two.
|
|
September 25, 2013, 18:19 |
|
#11 |
New Member
Manuel
Join Date: Sep 2013
Location: Spain
Posts: 12
Rep Power: 13 |
Bindesboll, thanks for your comments.
I've found an Intel report showing a significant performance difference between the 2687W v2 and 2697 v2 processors. They don't give many details... Here is the link: http://www.intel.com/content/www/us/...ys-fluent.html What's your opinion of it? |
|
September 26, 2013, 03:12 |
|
#12 |
Member
Kim Bindesbøll Andersen
Join Date: Oct 2010
Location: Aalborg, Denmark
Posts: 39
Rep Power: 16 |
It scales surprisingly well. The scaling efficiency is 0.77: for linear scaling it ought to scale by 12/8 = 1.5 (cores), but it only scales by 1.41/1.22 = 1.16 (performance), thus 1.16/1.5 = 0.77. This is better than the scaling efficiency of 0.62 going from 4 to 8 cores on the Xeon E5-2670 I presented above. So it seems the scaling at high core counts is significantly better on the new Xeon E5-2600 v2 generation.
Still, I would look into price vs. performance before buying the 2697 v2 processor, as price usually doesn't scale well on top-model CPUs :-) And if your license is limited to 32 cores, you would still end up with a system of 2 nodes, dual-CPU motherboards, and 8-core CPUs (2687W v2 or 2667 v2). Using the 2697 v2 you could configure a single-node system with a dual-CPU motherboard and 12-core CPUs, but that would only give you 24 cores. I wouldn't expect that to outperform a 32-core system. |
|
September 26, 2013, 03:23 |
|
#13 |
Member
Join Date: Dec 2012
Posts: 47
Rep Power: 13 |
FWIW, don't forget about Opteron options: http://www.padtinc.com/blog/the-focu...-vs-intel-xeon <- The article here shows a 16-core dual-Xeon machine being absolutely trounced in FLUENT benchmarks by a 16-core Opteron machine costing much less.
__________________
Find me online here. |
|
September 26, 2013, 04:41 |
|
#14 |
New Member
Manuel
Join Date: Sep 2013
Location: Spain
Posts: 12
Rep Power: 13 |
Bindesboll, I agree with you regarding the licensing aspect. But I'm not sure about the pricing point. The price of a workstation with 2 x E5-2687W v2 is about 7,000 €, and one with 2 x E5-2697 v2 is about 8,000 €, so the ratio is 8/7 = 1.14. That fits the performance ratio pretty well.
jdrch, thanks for your advice, I'll have a look at it. But it's not easy for me to find AMD Opterons, while Xeons are everywhere. |
|
September 26, 2013, 04:56 |
|
#16 |
New Member
John McEntee
Join Date: Jun 2013
Posts: 8
Rep Power: 13 |
I thought I would add some information of interest, as I have been heavily researching a CFD cluster recently.
InfiniBand switches have a very big price range: I have quotes from £2,000 (Intel) to £18,000 (re-badged Mellanox) for a 36-port switch. InfiniBand cables also range from a sensible £50 up to £550, and InfiniBand cards from £250 to £950. The latency of QDR and FDR is very similar, so for CFD, QDR is more cost-effective. Intel also does a compute server platform, the H2312XXKR, a 2U rackmount that takes 4 dual-CPU server nodes; with the right model it can have onboard InfiniBand for about £12,000 + VAT (80 cores using 10-core CPUs), so just 2 of its nodes for 32 cores would be around £6,000, though you would also need a good desktop (roughly £1,000) for pre- and post-processing. What looks like a good guide to setting up InfiniBand is at: http://pkg-ofed.alioth.debian.org/ho...owto.html#toc4 John |
|
September 26, 2013, 06:02 |
|
#17 |
New Member
Manuel
Join Date: Sep 2013
Location: Spain
Posts: 12
Rep Power: 13 |
I've never dealt with servers but I'll check it because it looks quite interesting.
Thanks for your contribution. |
|
September 26, 2013, 13:09 |
|
#18 |
New Member
Manuel
Join Date: Sep 2013
Location: Spain
Posts: 12
Rep Power: 13 |
bindesboll, check these figures:
2697 v2: 12 x 3.0 GHz (turbo boost on 12 cores) = 36.0 GHz
2687W v2: 8 x 3.6 GHz (turbo boost on 8 cores) = 28.8 GHz
So the ratio between the two processors is 36/28.8 = 1.25, and the scaling efficiency is 1.16/1.25 = 0.928. Am I right? What's your opinion? Do you trust that benchmark? |
|
September 27, 2013, 04:18 |
|
#19 |
Member
Kim Bindesbøll Andersen
Join Date: Oct 2010
Location: Aalborg, Denmark
Posts: 39
Rep Power: 16 |
Yes, you can calculate the scaling efficiency that way. However, it doesn't change the conclusion that you gain 16% performance by using 50% more licenses (cores).
|
|
November 19, 2013, 09:18 |
|
#20 | |
Senior Member
Philipp
Join Date: Jun 2011
Location: Germany
Posts: 1,297
Rep Power: 27 |
Quote:
__________________
The skeleton ran out of shampoo in the shower. |
||