CFD Online Discussion Forums

AMD FX 8-core or 6-core? (https://www.cfd-online.com/Forums/hardware/95431-amd-fx-8-core-6-core.html)

Jordi December 17, 2011 08:05

AMD FX 8-core or 6-core?
 
It's time for me to upgrade my hardware and I'm considering two options: either a 6-core or an 8-core AMD FX. The price difference is only about 50 EUR, which is not much, but I suppose the 8-core option will also need a more expensive motherboard, cooler and power supply. Does anyone have experience with these processors to comment?

kyle December 17, 2011 14:18

Are you putting the chip in an existing motherboard that you already have? If you are buying a new system, you really should not be purchasing any AMD chip.

Jordi December 17, 2011 16:43

what's wrong with AMD?

kyle December 17, 2011 17:43

They are slower than Intel chips, especially for CFD. If you have >$400 to spend, then AMD does not make sense... memory bandwidth and cache performance are too low.

Jordi December 18, 2011 03:33

Yes, I have seen some benchmarks, and the AMD FX-8150 scores just below the i7-980, but the prices are 250€ versus 1000€+, and the performance difference is only a few percent... What I need is decent hardware at an affordable price.

kyle December 18, 2011 21:44

Who cares about the i7 980X? The $175 i5 2400 beats any AMD chip.

http://techreport.com/articles.x/21813/15

Until they make some serious advances, AMD is just a bad choice for CFD.

andyj December 21, 2011 16:16

AMD is very popular in supercomputers. The majority of supercomputers use AMD chips. And they all run Linux.

In the consumer market, Intel does have the lead right now. But not everyone can afford to go with the i7.

The most computing power for the money at the consumer level is AMD.

The new AMD chips will improve on Windows when Microsoft finishes optimizing multicore threading.
The AMD X6 can be found for $130 or less with a free motherboard.
The FX is going to be a good chip for the money as well. The FX-8120 8-core will improve once Windows 8 comes out with better support for multicore processors.

A great website with all the cpu benchmarks in nice color is http://www.cpubenchmark.net/high_end_cpus.html

kyle December 21, 2011 16:36

Who cares about whatever the hell a "PassMark" benchmark is? This is a CFD message board, use CFD benchmarks.

A system with an Intel Core i5 2400 is faster than ANY single socket system using an AMD chip. If you are in the US, you can build an i5 2400 system with 8gb of RAM for less than $400!

Just because Intel has a $1000 chip that most people cannot afford does not make everything the company sells a bad value. AMD currently only makes sense if you need to go ultra-cheap, like a $250 workstation.

sail December 21, 2011 16:44

Quote:

Originally Posted by andyj (Post 336662)
AMD is very popular in supercomputers. The majority of supercomputers use AMD chips. And they all run Linux.

In the consumer market, Intel does have the lead right now. But not everyone can afford to go with the i7.

The most computing power for the money at the consumer level is AMD.

The new AMD chips will improve on Windows when Microsoft finishes optimizing multicore threading.
The AMD X6 can be found for $130 or less with a free motherboard.
The FX is going to be a good chip for the money as well. The FX-8120 8-core will improve once Windows 8 comes out with better support for multicore processors.

A great website with all the cpu benchmarks in nice color is http://www.cpubenchmark.net/high_end_cpus.html

I agree with Kyle.

AMD has the lead in HPC due to the large number of cores per chip, lower price and higher memory bandwidth. But this applies only to servers.

The Bulldozer family chip is made of modules: a module contains 2 integer units and one shared FPU, and AMD's marketing department makes the assumption that one module = 2 cores. Unfortunately, CFD codes run mostly on the FPU, so buying FX chips might not be the best idea.

I would suggest going with the i7 family if you plan on using desktop parts.

Wesley December 25, 2011 14:44

Jordi's question is one I have been curious about myself.

I will freely admit that I choose AMD processors as an emotional/non-rational choice. My next build will be AMD. I am using some CFD to explore my job (flow of food materials) - I am not somebody who uses CFD to the maximum capabilities. I don't need the maximum possible performance, but want to make the better decision with the processor brand I will use.

For OpenFOAM, am I better off with the 6-core Phenom II or the 8-core FX? I have not tracked down any reviews I am willing to trust on this, at least so far. It looks like the OSU info may be worth checking into deeper.

The 8-core FX has the advantage of more cores, but an FPU is shared between the cores. As I understand the chip, the FPU can operate as a single 256-bit FPU or as two 128-bit FPUs in parallel. If somebody is familiar with the architectures of the two chips or has run both with OpenFOAM (or other parallel programs), I would be curious about how the chips compare.

scipy December 25, 2011 15:33

I would just like to chime in because I had a similar dilemma and my choice turned out to be wrong.

Two months ago or so, I sorely needed a computer upgrade from my old Core2Duo E6600 with 2 GB DDR2 to finish my final thesis for college. It was right about when the FX-8150 came out, and I searched the whole internet for a Fluent benchmark between the FX-8150 and an i7-2600K and found zilch. In all of the commercial apps that are used for CPU benchmarks, even the i5-2500K was beating the new AMD and the i7 was obliterating it... so I went with an i7-2600K.

Ever since then, I was trying to find someone in my poor country who had an 8150 to run a few tests and today I finally managed it. Here's the info on the testing method/results:

I used my project setup, which is external aerodynamics on a generic car model. The case has an unstructured tetrahedral mesh consisting of 6.8 million elements; the coupled solver is used with a Realizable K-epsilon model and standard wall functions. The usual calculation time on the i7-2600K with 4 threads (4 cores active, meaning hyperthreading is turned off in the BIOS) is about 10 hours and a few minutes to finish 1050 iterations. Since the calculation switches between 1st order upwind and 2nd order upwind discretisations, I tested both.

i7-2600K overclocked from 3.4 GHz to 4.5 GHz:
31.15 seconds per iteration during 1st order upwind phase
34.78 seconds per iteration during 2nd order upwind phase

FX-8150 at stock frequency (3.6 GHz):
25 seconds/iter @ 1st order upwind
27.5 seconds/iter @ 2nd order upwind

This makes the AMD 24.6 % faster in the 1st order calculations, and 26.5 % faster in 2nd order calculations.
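As a quick check of the percentages above, here is a minimal Python sketch that recomputes them from the reported seconds per iteration. The function and variable names are made up for illustration. It also shows the same numbers expressed as time saved per iteration, which is a different (and smaller) figure than "percent faster".

```python
# Seconds per iteration as reported in the post above.
TIMINGS = {
    "1st order upwind": {"i7-2600K": 31.15, "FX-8150": 25.0},
    "2nd order upwind": {"i7-2600K": 34.78, "FX-8150": 27.5},
}


def throughput_gain_percent(slow_s, fast_s):
    """Extra iterations per unit time of the faster run, in percent."""
    return (slow_s / fast_s - 1.0) * 100.0


def time_saved_percent(slow_s, fast_s):
    """Reduction in wall time per iteration of the faster run, in percent."""
    return (1.0 - fast_s / slow_s) * 100.0


for phase, t in TIMINGS.items():
    gain = throughput_gain_percent(t["i7-2600K"], t["FX-8150"])
    saved = time_saved_percent(t["i7-2600K"], t["FX-8150"])
    print(f"{phase}: {gain:.1f} % more iterations per hour, "
          f"{saved:.1f} % less time per iteration")
```

The throughput figures come out at 24.6 % and 26.5 %, matching the post; in terms of wall time the FX-8150 run needs roughly 20 % less time per iteration.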

Now, I know what you're going to say: "AMD has 8 cores and i7 only 4" and you are right.. the i7 has stronger per core performance and since Fluent is licensed per core, you are probably better off getting the fastest cores possible. But, if you're in a situation where the college or the company provides the licensing for up to 8 cores, then why the hell wouldn't you get the AMD?

Another thing is, AMD can also overclock to a stable 4.5-4.8 GHz so the difference might be even bigger (I will put my cpu back to 3.4 GHz tomorrow and come back with another set of results).

In any case, a test including CFD has become available meanwhile:
http://techreport.com/articles.x/21813/15

But it had me confused. They said "Here's how our contenders handled the test with optimal thread counts for each processor." What does that mean? If you are comparing 4, 6 and 8 core processors, what is the optimal thread count? When I tested today, my only goal was to get all the 8 cores of the FX-8150 to 100 % usage, and this meant 8 threads of Fluent. I tried to run 8 threads on my i7 with hyperthreading on and the cpu wasn't going above 50 % overall load, plus a Fluent whitepaper suggested that HT should be turned off. Me being a non-believer, I tested for myself, and it was true.. calculation time was similar but with HT on and using 8 threads the simulation was around 10 % slower.

Another test that I found was this:
http://pcper.com/reviews/Processors/...rimental-Tests

It seems to use the same software but displays the results based on thread numbers, but in this test their conclusion was: "It is obvious that the Euler3D test is not a fan of the Bulldozer architecture even in the 8 thread variation. The FX-8150 is about 44% slower than the Core i7-2600k and 22% slower than the Core i5-2500k."

So, I don't know.. but why the hell wouldn't you test in a REALISTIC real life test like Fluent? How can 8 threads lose to 4 in their test and win out by probably 35 % in mine (both overclocked or both stock)?

Discuss.

wyldckat December 26, 2011 12:27

Greetings to all!

@scipy: perhaps this might shed some more light on this issue: 4 cpu motherboard for CFD - post #5 talks about AMD Opterons... «CFD performance with unstructured grids on AMD's multi-socket boards is extremely poor.» ... and gives a link to an investigation on this issue: http://www.anandtech.com/show/4486/s...mark-session/5

Another detail to always take into account when racing AMD vs Intel is the compiler used: http://en.wikipedia.org/wiki/Intel_C...iler#Criticism

These two issues might explain why you've reached an opposite conclusion. It's as simple as:
  • Using the proper settings in the BIOS/EFI.
  • The CFD software should be compiled with compilers that aren't biased toward a CPU maker :p
  • Last but not least, the compiled software might not be properly identifying the CPU where it is running.

Best regards,
Bruno

scipy December 28, 2011 04:17

Hi Bruno,

I don't quite follow. What we're talking about here is a single socket "consumer" CPU. I've read both that post and the AnandTech article on the subject before, when I was researching the Bulldozer, and judging by everything said there about unstructured grids etc. (even though it's for multi-socket motherboards) the AMD should have lower performance.

But in the case of FX-8150 vs i7-2600K, on a strictly tetrahedral completely unstructured grid the FX-8150 is 25 % faster versus an overclocked i7. If this info was available to me somewhere at the time when I was upgrading my desktop pc, I would've gone for the Bulldozer without thinking any further. If the performance of the Bulldozer is even better on structured grids, then there's really nothing to even think about..

wyldckat December 28, 2011 07:51

Hi scipy,

Usually single and multiple sockets don't differ all that much. I've had results with a consumer grade, stock clocked AMD 1055T give run times almost identical to the respective Opteron on systems with dual sockets. The detail about AMD on those links was the issue of properly tuning the BIOS on how the memory was being used.

As for the difference in run times you are getting, I think the problem is specific to the compiler used to build those solvers you've used. If you used Fluent, search for the compiler(s) they use to build it. If they use Intel's compilers, then you were strangely lucky in having an AMD outperform an Intel machine.

If by any chance you used custom CFD code and compiled it yourself with Gcc or some other non-Intel compiler, then it's only natural that an AMD machine was faster and better than an Intel machine!

Best regards,
Bruno

kyle December 29, 2011 21:44

scipy,

What code are you using? I have benchmarked OpenFOAM and Star-CCM+ myself across a wide range of grid sizes and geometries. I have seen Fluent benchmarks from others as well. Bulldozer was always slower than the i7 2600k.

I certainly would like for AMD to be faster, because their systems tend to be cheaper.

scipy December 30, 2011 04:28

I was using Fluent 13.0 with a mesh done in ANSYS Meshing (automatic one). Patch conforming tetrahedrals with a few bodies of influence to capture the wake behind the car and underbody flows. Here's a picture of the mesh:

http://i.imgur.com/7HSlw.png

I don't know how you tested the bulldozer but as I described above, I ran 4 threads of Fluent on the i7 and 8 threads on the FX-8150 (so all cores were at 100 % on both cpu's).

andyj January 3, 2012 02:02

http://hpc.admin-magazine.com/Vendor...zer-processors

http://www.pgroup.com/about/news.htm#45

http://www.agner.org/optimize/blog/read.php?i=49

http://www.bostondeutschland.de/pres...computers.aspx

wyldckat January 3, 2012 03:54

Quote:

Originally Posted by andyj (Post 337625)
http://hpc.admin-magazine.com/Vendor...zer-processors

So, all we need now is to compare applications that are compiled with these compilers and with the respective options. Icc and Gcc are the easiest to get our hands on, as well as OpenFOAM. Now we still need time and an AMD bulldozer CPU :D

Quote:

Originally Posted by andyj (Post 337625)
http://www.pgroup.com/about/news.htm#45

I was expecting that either Star-CCM+ or Fluent would be compiled with PGI compilers... that was why I was alleging that Fluent 13 was getting so much performance for Scipy.

Quote:

Originally Posted by andyj (Post 337625)
http://www.agner.org/optimize/blog/read.php?i=49

This one is referenced in the wikipedia page I posted before... the one about criticism for Intel's compiler.

Quote:

Originally Posted by andyj (Post 337625)
http://www.bostondeutschland.de/pres...computers.aspx

Mmm, this is just marketing, there are no numbers for comparison :(

andyj January 3, 2012 05:19

PGI, which makes the compilers used on Cray's supercomputers, has optimized a new compiler for the Bulldozer. It's version 11.9 and just came out a couple of weeks ago.
What's interesting is that SGI has purchased OpenFOAM. They also have a new compiler and do supercomputer work.

You can run optimized OpenFOAM on their cloud. Amazon is likely cheaper; they have OpenFOAM as well.

Maybe Suse Linux Enterprise uses the compiler and instruction set that fits the bulldozer? RHEL is going to support the bulldozer also.

An OpenFOAM test case with known variables and results would likely give the best comparison.

andyj January 3, 2012 06:17

There is a very well documented test case complete with scripts and downloadable files, with 94,000 elements and 5000 iterations:
http://openfoamwiki.net/index.php/Sig_Turbomachinery_/_ERCOFTAC_centrifugal_pump_with_a_vaned_diffuser
Microsoft has even studied this case.

Microsoft Research on the Ercoftac-centrifugal-pump-openfoam-case-study
Would even allow comparison between Windows and Linux:
http://academic.research.microsoft.com/Publication/5891520/the-ercoftac-centrifugal-pump-openfoam-case-study

There is even a thesis on this study: http://www.tfd.chalmers.se/~hani/pdf_files/ShashaMasterThesis.pdf

GTCo8 January 6, 2012 14:51

Quote:

Originally Posted by Jordi (Post 336195)
what's wrong with AMD?

Nothing is wrong.

I see that the AMD FX 8150 (8 cores, Bulldozer) is comparable in speed to the Intel i7-2600 (4 cores, Sandy Bridge) in some activities, and it is only a little cheaper than the Intel. But only in some. Core for core, it is worse than Intel. Maybe this will change a little with an improved Windows 8 and a new Linux kernel. I have read that the FX 8100 will be optimal (price vs speed).

Another problem with AMD MOBOs is that they support less memory (usually 4 banks, Intel have 6 or even 12 banks). I mean desktop MOBOs, not Server MOBOs.

And one thing more: MOBOs for AMD are cheaper than for Intel.

wyldckat February 1, 2012 06:35

Greetings to all!

Just to give a small update on this subject:
So, when in doubt, contact your CFD software supplier and ask them if and which versions of your favourite solver/meshers have been tuned for the AMD Bulldozer or any other CPU architectures!

Best regards,
Bruno

daveatstyacht February 3, 2012 16:59

Scipy,
Thank you for going and benchmarking the two systems; I had not considered AMD as a possible candidate for my next CPU. The fact that overclocking the Intel did not help your performance says to me that a different part of your system is bottlenecking the performance (probably the bus or memory). Interestingly, I looked into the differences that might give the FX 8150 an edge, and one area where it beats the 2700k hands down is L2 cache (2048 KB per two cores vs 256 KB per core; in other words, each L2 is 8 times larger, which works out to 4 times more per core). Since cache misses can have a huge effect on performance, the larger L2 cache could help significantly, particularly if something like multi-grid is being utilized. Another thing to consider is the maximum supported memory speed: 1333 MHz for the 2700k vs 1866 MHz for the FX 8150. Memory speed is an important consideration for unstructured meshes. I think Bruno brings up an excellent point regarding the potential bias that can be introduced by the compiler.
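To put rough numbers on the cache and memory speed comparison, here is a small Python sketch. It assumes both desktop platforms run dual-channel memory with 64-bit channels and that the quoted speeds are transfer rates; those are my assumptions rather than figures from the product pages, and the result is a theoretical peak, not what a solver would actually sustain. The function names are made up for illustration.

```python
def peak_bandwidth_gb_s(transfer_rate_mt_s, channels=2, bus_width_bits=64):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    channels=2 and bus_width_bits=64 are assumptions for a dual-channel
    desktop DDR3 setup.
    """
    bytes_per_transfer = bus_width_bits // 8
    return transfer_rate_mt_s * 1e6 * bytes_per_transfer * channels / 1e9


def l2_per_core_kb(l2_size_kb, cores_sharing):
    """L2 capacity available per core when a cache is shared by several cores."""
    return l2_size_kb / cores_sharing


intel_bw = peak_bandwidth_gb_s(1333)   # memory speed quoted for the 2700k
amd_bw = peak_bandwidth_gb_s(1866)     # memory speed quoted for the FX 8150
print(f"Peak bandwidth: {intel_bw:.1f} GB/s vs {amd_bw:.1f} GB/s "
      f"(ratio {amd_bw / intel_bw:.2f})")

print(f"L2 per core: {l2_per_core_kb(256, 1):.0f} KB vs "
      f"{l2_per_core_kb(2048, 2):.0f} KB")
```

Under those assumptions the faster memory is worth about 40 % more peak bandwidth, and the L2 advantage per core is 1024 KB against 256 KB.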

Dave

Sources:
Comparison of the two:

http://www.knowbytes.com/home/articl...us-amd-fx-8150

And to confirm the numbers are all correct go to the product pages:

http://www.amd.com/us/products/deskt...omparison.aspx
http://ark.intel.com/products/61275/...che-3_5-GHz%29

rmh26 February 7, 2012 09:25

The L2 and L3 caches are shared on Intel's processors (smart cache), so a four core i7 would be able to share 1 MB of L2 and 8 MB of L3, while an FX4100 would have two independent 2 MB L2 caches (one per FPU unit) and a shared 8 MB L3. The completely shared cache would seem to give Intel an edge for smaller threaded problems, while AMD would win out when working on larger sets.


I would say a big plus on AMD's side is the inclusion of FMA which would greatly speed up many linear algebra operations. I don't think Intel will get FMA until Haswell.

mlotek April 30, 2013 22:41

Amd
 
What about cost vs performance? When I saw the prices of Intel processors...

I acquired 4 AMD Interlagos 6274s (64 cores) for under 250.00 USD each (granted, they were used). I am currently running unstructured meshes in OpenFOAM with > 16,000,000 cells and find convergence times to be very reasonable (even with the memory bandwidth bottleneck...). I am considering getting another 64 cores because the cost is so low...

https://kudlaengineering.wordpress.com/

I would not rule out AMD, especially if it's a cost vs performance issue. Now a watt/performance issue may be a different story.

Best,

Tom

wyldckat May 1, 2013 05:04

Greetings Tom,

You didn't mention how many sockets you're using per motherboard. If you have one motherboard per processor, along with 4 DDR3 memory modules per motherboard, then it's only natural that you have very good performance! Each processor can use quad-channel DDR3, which makes for a good configuration.

But you should also try to test using only 8 cores per processor and compare the performance you're getting. There was a post somewhere that described how to set the core affinity in mpirun... it's this thread: http://www.cfd-online.com/Forums/har...arameters.html
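A minimal Python sketch of how the core list for such a test could be built, picking one core per Bulldozer module so that no two processes share an FPU. It assumes the OS numbers the two cores of a module consecutively (0 and 1, 2 and 3, and so on), which should be checked against the topology your system actually reports; the helper names are hypothetical. The resulting list can then be handed to whatever affinity mechanism your MPI launcher provides.

```python
def one_core_per_module(total_cores, cores_per_module=2):
    """Return the IDs of the first core in each module.

    Assumes sibling cores of a module have consecutive IDs, which is an
    assumption about OS numbering and not guaranteed on every system.
    """
    return list(range(0, total_cores, cores_per_module))


def as_cpu_list(core_ids):
    """Format core IDs as a comma-separated list, e.g. '0,2,4'."""
    return ",".join(str(c) for c in core_ids)


# One 16-core processor: 8 modules, so 8 cores are selected.
print(as_cpu_list(one_core_per_module(16)))

# Four 16-core processors on one board: 32 of the 64 cores are selected.
print(len(one_core_per_module(64)))
```

Running the same case once on all cores and once on this reduced list would show how much the shared FPU and memory bandwidth are holding things back.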

Best regards,
Bruno

CapSizer May 1, 2013 17:38

Looks like a handy system Thomas. What did you do for CPU cooling, and how are the noise levels?

mlotek May 3, 2013 21:44

Hi wyldckat,

All 6274s are on one board, an H8QGi-F. The hard drive is a 256 GB SSD (very fast but way too small). I read about the performance penalty for unstructured grids due to the memory bandwidth (though this board has quad memory channels), but given the used prices I found, I went for it anyway. So far, I am very happy with this setup and have left space for a second board and switch (maybe InfiniBand if I can find a decent used one for not too much money; I have heard horror stories about Ethernet performance from HPC engineers).

Speed-wise, I have not run any benchmarks - it's been running OpenFOAM almost nonstop since I bought it and hasn't skipped a beat.

Hi CapSizer,
I used (4) Noctua fans (which are very quiet). Unfortunately, the two fans on the server case are very loud and will need to be replaced.

I would like to run some benchmarks, but not just with 8 cores...

Best,

Tom


All times are GMT -4.