CFD Online Forums (www.cfd-online.com): General Forums > Hardware

Workstation with new E5 Xeon Ivy-bridge

September 15, 2013, 05:00   #1
Workstation with new E5 Xeon Ivy-bridge
Manuelo (New Member; Join Date: Sep 2013; Location: Spain)
I'm involved in a coal combustion simulation in Fluent. As Intel released its new E5 Ivy Bridge Xeon v2 processors last week, I thought it would be a good idea to upgrade my hardware. But what I have read on this forum has confused me a bit, and I'm not sure I'm choosing the best configuration to put my money into.

I would appreciate it if someone could give me some advice. I apologize in advance because there may be other threads on the forum that answer my question, but I've done some searching and couldn't completely resolve my doubts.

Here is the workstation I initially selected:

- 2 x Intel Xeon E5-2697 v2 (12 cores each, 24 cores total, 30 MB cache each). This chip is the new flagship of Intel's two-socket Xeon line and became available only a few days ago.
- 8 x 8 GB = 64 GB RAM at 1866 MHz (the maximum speed supported by Ivy Bridge Xeons). Eight memory channels in total, four per socket.
- Nvidia GTX 650 Ti graphics card.
- ASUS Z9PE-D8 WS motherboard.
- A good CPU cooling setup.

This computer would cost around 8,000 €. For that money, or even less, I could probably buy two of the following machines and link them over Gigabit Ethernet to run in distributed parallel:

- Intel i7-4960X (15 MB cache, 6 cores). This was also released last week, and I think it should beat Intel's previous top desktop i7 (the 3970X).
- 8 x 4 GB = 32 GB RAM at 2400 MHz (overclocked), quad-channel.
- The rest would be similar to the Xeon configuration.

The first configuration has more cores (24 vs. 12, and each Xeon core may be better than an i7 core), comparable cache (30 vs. 15 + 15 MB), the same number of memory channels per processor (four), but lower memory speed (1866 vs. 2400 MHz). So in terms of processing power I think the Xeon is the better choice, but in terms of memory bandwidth it is probably the other way around. As for the connection between processes, the Xeons (with their high-speed QPI link) would probably beat a Gigabit Ethernet link.
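For a rough sanity check on the bandwidth side, peak theoretical memory bandwidth per socket is channels x transfer rate x 8 bytes per 64-bit transfer. A small sketch using the DDR3 speeds quoted above (theoretical peaks only, not benchmark numbers):

```python
def peak_bandwidth_gb_s(channels: int, speed_mt_s: int) -> float:
    """Peak theoretical DDR3 bandwidth in GB/s:
    channels * transfers per second * 8 bytes per 64-bit transfer."""
    return channels * speed_mt_s * 8 / 1000.0

# Per-socket figures for the two configurations discussed above
xeon = peak_bandwidth_gb_s(4, 1866)  # E5-2697 v2 at DDR3-1866
i7 = peak_bandwidth_gb_s(4, 2400)    # i7-4960X at DDR3-2400 (overclocked)
print(round(xeon, 1), round(i7, 1))  # → 59.7 76.8
```

So per socket the overclocked i7 has roughly 29% more peak bandwidth, but the dual-socket Xeon box still offers more aggregate bandwidth (2 x 59.7, about 119 GB/s) than a single i7 node.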

I thought the Xeon was the best choice, but after reading some threads on this forum I've seen that the most important parameter from a CFD point of view is memory bandwidth, so I wonder whether the pair of i7s would beat the Xeon machine. I say this because the Xeon can't go beyond 1866 MHz, while the i7 can easily reach 2400 MHz and run stably at that speed.

Another point is network speed and latency. Do you think conventional Gigabit Ethernet would perform properly as the interconnect, or should I link the two machines with 10 Gb Ethernet?

Here is an outline of the simulation I'm involved in:
- Coal combustion in a multi-burner boiler with DPM, eddy-dissipation/finite-rate chemistry and DO radiation (around 10 million cells)
- Steady-state simulation
- RSM turbulence model

My license covers up to 32 processes, so there wouldn't be any problem on the software side.

Thanks.


September 15, 2013, 13:23   #2
Manuelo
Another doubt: is it worth buying the top E5 Ivy Bridge processor, given that the system could be bottlenecked by memory bandwidth? Would I get roughly the same performance with a processor with fewer cores (e.g. the E5-2687W v2 with 8 cores instead of 12)?

The 12-core part costs about 50% more than the 8-core one.


September 15, 2013, 19:51   #3
jdrch (Member; Join Date: Dec 2012)
Quote: Originally Posted by Manuelo
The most important parameter from the CFD point of view is the memory bandwidth
True, but only if all of that memory is on the same machine. Otherwise you run into latency and bandwidth bottlenecks transferring data over Ethernet.

I would get the Xeon machine, but with a Quadro or Tesla GPU instead of the 650 Ti. That way you'll have ECC (error correction) at the CPU, RAM, and GPU levels. This will prevent corruption of data in RAM, the likelihood of which increases with RAM size.

Judging from the fact that your license covers 32 cores, you should optimize for performance, not cost.

September 16, 2013, 02:38   #4
Manuelo
Thanks jdrch,

I thought there was no point in investing in such a powerful graphics card, because my software (Fluent) doesn't support GPU computing.

And the memory isn't ECC, because I couldn't find ECC modules rated at 1866 MHz. At least my hardware supplier says they can't find any.

September 16, 2013, 03:01   #5
jdrch
Quote: Originally Posted by Manuelo
I thought there was no point in investing in such a powerful graphics card, because my software (Fluent) doesn't support GPU computing.
Fair enough. I guess I'm biased because Quadros are the only GPUs I've ever run FLUENT on. Quadros are tested and certified for engineering analysis applications, which means you're less likely to run into rendering problems with them. Given the nonstandard graphics implementations found in many engineering packages, a certified card is a good thing to have.

Quote: Originally Posted by Manuelo
And the memory isn't ECC, because I couldn't find ECC modules rated at 1866 MHz.
This is totally your decision. I'm just saying that with that much RAM, memory errors become a significant risk, which is problematic given CFD's large in-memory data sets.

September 16, 2013, 07:03   #6
Kim Bindesbøll Andersen (bindesboll; Member; Join Date: Oct 2010; Location: Aalborg, Denmark)
I have been working for some months now on specifying a hardware solution that makes optimal use of a 32-core license. We are about to order the system below:

2 machines, each with:
Dual-CPU motherboard
2 x Xeon E5-2667 v2, 3.3 GHz, 8 cores
32 GB RAM: 8 x 4 GB at 1866 MHz
HDD: 2 x 140 GB, 15,000 rpm, SATA-600, RAID 1
InfiniBand network adapter (cluster interconnect)
Gigabit Ethernet network adapter (system network)

The machine used for data storage additionally has:
3 x 500 GB 10,000 rpm HDDs in RAID 5.

As 32 cores cannot fit in a single machine, two dual-socket machines were chosen. This works out to 8 cores per CPU (2 x 2 x 8 = 32). There is no benefit in more cores per CPU, as the system is most likely memory-bandwidth limited.

InfiniBand was chosen for the cluster interconnect because of its low latency. Bandwidth is less important, since the amount of data exchanged between the cluster nodes should not be that large once the solver is up and running, but the speed of the individual data requests matters. InfiniBand switches are costly, but with only two nodes in the system a switch can be omitted: the nodes can be connected directly.

The pre- and post-processing will be done on my present CFD workstation (AMD FirePro 7900), so no special graphics are required for the cluster nodes.

September 16, 2013, 08:07   #7
Manuelo
Hi bindesboll,

Thanks for your advice. Just a few quick questions:

- Are you sure that more powerful processors would leave you with a memory-bandwidth-limited system?

- I've never dealt with InfiniBand parts. Approximately how much does an InfiniBand adapter cost?

- What about 10 Gb Ethernet? Could it also be considered for the two-machine cluster you've proposed?

- Why don't you go with an SSD solution for the disks?

September 16, 2013, 08:58   #8
bindesboll
Quote: Originally Posted by Manuelo
- Are you sure that more powerful processors would leave you with a memory-bandwidth-limited system?
As the scaling efficiencies below show, the scaling from 4 to 8 cores is poor. More CPU power will therefore not improve speed, which indicates that the system is limited elsewhere: if the data cannot get in and out of the CPU fast enough, it does not matter how fast the CPU is or how many cores it has.
Actually, you could even consider quad-core or six-core CPUs to get better utilization of each core. However, you would then need more than two cluster nodes (increased cost), which would also decrease the scaling efficiency and require a costly InfiniBand switch (5,000-6,000 €).

Quote: Originally Posted by Manuelo
- I've never dealt with InfiniBand parts. Approximately how much does an InfiniBand adapter cost?
A 40 Gb/s InfiniBand network adapter (PCI Express) costs 300-400 €.

Quote: Originally Posted by Manuelo
- What about 10 Gb Ethernet? Could it also be considered for the two-machine cluster?
My ANSYS reseller has done some testing for me on a 32-core system based on the Xeon E5-2670 (8-core CPU). This system showed the following scaling efficiencies (1 = linear scaling):
From 2 to 4 cores: 0.91
From 4 to 8 cores: 0.62 (clearly memory-bandwidth limited; more cores will not increase speed)
From 8 to 16 cores (motherboard CPU interconnect): 0.92
From 16 to 32 cores (cluster interconnect, GigE): 0.82
As you can see, the efficiency of using 8 cores instead of 4 is poor, indicating that the performance of 8 cores is limited by memory bandwidth. The efficiency of using 32 cores instead of 16 is also not optimal, indicating that the cluster interconnect (in this case 1 Gb/s Gigabit Ethernet) is the limiting factor. Hence our decision to go for InfiniBand.

Quote: Originally Posted by Manuelo
- Why don't you go with an SSD solution?
An SSD only increases calculation speed if heavy disk writes occur during solving (e.g. frequent writes of transient data). Otherwise an SSD makes no difference to solving time. The load time of CFD cases might improve by a few seconds, but as we have 10,000 rpm disks in RAID 5 this will be insignificant.
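For reference, the efficiency numbers above come straight from solver wall-clock timings; a minimal sketch of the calculation (the 100 s baseline is made up for illustration, not from the benchmark):

```python
def scaling_efficiency(t_before, t_after, cores_before, cores_after):
    """Actual speedup divided by ideal (linear) speedup
    when going from cores_before to cores_after."""
    actual_speedup = t_before / t_after
    ideal_speedup = cores_after / cores_before
    return actual_speedup / ideal_speedup

# Going from 4 to 8 cores should ideally halve the run time.
# If a 100 s run only drops to about 80.6 s, the efficiency is ~0.62,
# matching the memory-bandwidth-limited step reported above.
print(round(scaling_efficiency(100.0, 80.6, 4, 8), 2))  # → 0.62
```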

September 16, 2013, 11:38   #9
Manuelo
OK, thanks for your explanations. Really interesting.

As the memory speed has increased in Ivy Bridge (+16.6%), there will be higher memory bandwidth. Hopefully this will help us.

Why didn't you choose the 2687W? It also has 8 cores. Maybe for power saving?

September 17, 2013, 03:48   #10
bindesboll
Quote: Originally Posted by Manuelo
Why didn't you choose the 2687W? It also has 8 cores. Maybe for power saving?
I would expect the performance of the 2687W v2 and the 2667 v2 to be very similar. Depending on the degree to which memory bandwidth limits the system, even the 2650 v2 could be interesting, especially as its price is only half that of the other two.

September 25, 2013, 18:19   #11
Manuelo
Bindesboll, thanks for your comments.

I've found an Intel report showing a significant performance increase between the 2687W v2 and 2697 v2 processors. They don't give many details...

Here is the link:

http://www.intel.com/content/www/us/...ys-fluent.html

What's your opinion on it?

September 26, 2013, 03:12   #12
bindesboll
It scales surprisingly well. The scaling efficiency is 0.77: with linear scaling it ought to scale by 12/8 = 1.5 (cores), but it only scales by 1.41/1.22 = 1.16 (performance), so 1.16/1.5 = 0.77. This is better than the 0.62 efficiency going from 4 to 8 cores on the Xeon E5-2670 I presented above. So it seems that scaling across many cores is significantly better on the new Xeon E5-2600 v2 generation.

Still, I would look at price vs. performance before buying the 2697 v2, as the price usually doesn't scale well on top-model CPUs :-)

And if your license is limited to 32 cores, you would still end up with a system with 2 nodes, dual-CPU motherboards and 8-core CPUs (2687W v2 or 2667 v2). With the 2697 v2 you could configure a single node with a dual-CPU motherboard and 12-core CPUs, but that would only give you 24 cores, and I wouldn't expect that to outperform a 32-core system.

September 26, 2013, 03:23   #13
jdrch
FWIW, don't forget about Opteron options: http://www.padtinc.com/blog/the-focu...-vs-intel-xeon <- The article there has a 16-core dual-Xeon machine being absolutely trounced in FLUENT benchmarks by a 16-core Opteron machine costing much less.

September 26, 2013, 04:41   #14
Manuelo
Bindesboll, I agree with you regarding the license aspect. But I'm not sure about the pricing point. A workstation with 2 x E5-2687W v2 costs about 7,000 € and one with 2 x E5-2697 v2 about 8,000 €, so the price ratio is 8/7 = 1.14. That fits the performance ratio pretty well.

jdrch, thanks for your advice, I'll have a look at it. But it's not easy for me to find AMD Opterons here, while Xeons are everywhere.

September 26, 2013, 04:53   #15
jdrch
Quote: Originally Posted by Manuelo
But it's not easy for me to find AMD Opterons here, while Xeons are everywhere.
I was just saying so for argument's sake. In reality, given your situation, you really should pick a Xeon machine; Opteron only if you're budget-conscious.

September 26, 2013, 04:56   #16
John McEntee (jmcentee; New Member; Join Date: Jun 2013)
I thought I would add some information of interest, as I have been researching a CFD cluster heavily recently.

InfiniBand switches vary widely in price: I have quotes ranging from £2,000 (Intel) to £18,000 (re-badged Mellanox) for a 36-port switch.

InfiniBand cables also range from a sensible £50 up to £550.

InfiniBand cards range from £250 to £950.

The latency of QDR and FDR is very similar, so for CFD, QDR is more cost-effective.

Intel also makes a compute server platform, the H2312XXKR, a 2U rackmount that takes 4 dual-CPU server nodes. With the right model it can have onboard InfiniBand, at about £12,000 + VAT for 80 cores using 10-core CPUs, so populating just 2 nodes for 32 cores would be around £6,000. You would also need a good desktop (around £1,000) for pre- and post-processing.

What looks like a good guide to setting up InfiniBand is at:
http://pkg-ofed.alioth.debian.org/ho...owto.html#toc4

John

September 26, 2013, 06:02   #17
Manuelo
I've never dealt with servers, but I'll look into it because it seems quite interesting.

Thanks for your contribution.

September 26, 2013, 13:09   #18
Manuelo
bindesboll, check these figures:

2697 v2: 12 x 3.0 GHz (Turbo Boost with all 12 cores active) = 36.0 GHz aggregate
2687W v2: 8 x 3.6 GHz (Turbo Boost with all 8 cores active) = 28.8 GHz aggregate

So the ratio between the two processors is 36/28.8 = 1.25, which gives a scaling efficiency of 1.16/1.25 = 0.928.

Am I right? What's your opinion? Do you trust that benchmark?
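The same arithmetic in a quick sketch, using the all-core turbo clocks and the 1.16 performance ratio quoted in this thread (the clock figures are as posted, not independently verified):

```python
# Aggregate all-core Turbo Boost clock: cores x GHz
agg_2697v2 = 12 * 3.0    # E5-2697 v2: 36.0 GHz
agg_2687wv2 = 8 * 3.6    # E5-2687W v2: 28.8 GHz

clock_ratio = agg_2697v2 / agg_2687wv2  # ideal speedup if clock-limited
perf_ratio = 1.16                       # measured ratio from the Intel report
efficiency = perf_ratio / clock_ratio
print(round(clock_ratio, 2), round(efficiency, 3))  # → 1.25 0.928
```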

September 27, 2013, 04:18   #19
bindesboll
Yes, you can calculate the scaling efficiency that way. However, it doesn't change the conclusion that you gain 16% performance by using 50% more licenses (cores).

November 19, 2013, 09:18   #20
Philipp (RodriguezFatz; Senior Member; Join Date: Jun 2011; Location: Germany)
Quote: Originally Posted by bindesboll
As the scaling efficiencies below show, the scaling from 4 to 8 cores is poor. More CPU power will therefore not improve speed, which indicates that the system is limited elsewhere.
What about one motherboard with two CPUs? Does it have twice the memory bandwidth compared with a single-CPU board? Do you get any improvement with an 8-core license when using two CPUs in one workstation?
