Advice on the technical requirements for a new Fluent Workstation

Old   March 7, 2013, 05:22
Senior Member
Join Date: Feb 2011
Location: Earth (Land portion)
Posts: 694
Rep Power: 13
evcelica is on a distinguished road
Geez, a little sensitive are we? I didn't go crying and running away when you gave me your pompous attitude and acted like I know nothing:

Originally Posted by Daveo643 View Post
It's unfortunate that it doesn't work for you, but I don't run Mechanical and neither does the original poster of this thread. I'm going to keep trying and waiting for an authoritative answer on this. Thank you anyway.
That slightly overclocked machine shouldn't account for it being 60% faster than the Tesla. Any one of the upper end Sandy-Bridge E based CPUs should beat the Tesla in that benchmark.

Some guys at my work run Teslas with ANSYS, so I've seen their real world performance or lack thereof in certain cases. I'm sorry, but you come here and recommend something very expensive based on no personal experience, and just some marketing pamphlets you found. Then you get mad at me and stomp off for urging people to think about the real world performance and applications before making such a purchase? Fine, Sayonara buddy.
ghost82 and HMN like this.

Old   March 7, 2013, 16:55
Senior Member
Join Date: Mar 2009
Location: Austin, TX
Posts: 147
Rep Power: 11
kyle is on a distinguished road
For what it's worth, I agree with evcelica, and WOW Daveo643 is a sensitive one.

The software to make GPUs a good choice just is not there yet. It seems to me that they show well with certain simulations on structured meshes where you can efficiently organize the simulation in memory, but they are a poor investment for the types of simulations that most of us are doing. You may be able to eke out a 1.5x speedup on certain cases, but it often comes at 3x the cost. If you are writing research code for DNS flow in a square channel, then it might make sense to invest in a GPU for the computations. I don't see how it makes sense for anyone solving industrial problems.
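To put a number on that trade-off, here is a minimal sketch that divides speedup by relative cost. The function name `speedup_per_cost` is made up for illustration, and the 1.5x and 3x figures are just the rough numbers quoted above, not measurements.

```python
def speedup_per_cost(speedup: float, cost_ratio: float) -> float:
    """Performance gained per unit of money, relative to the baseline machine.

    speedup    -- how many times faster the new machine is (baseline = 1.0)
    cost_ratio -- how many times more expensive it is (baseline = 1.0)
    """
    if speedup <= 0 or cost_ratio <= 0:
        raise ValueError("speedup and cost_ratio must be positive")
    return speedup / cost_ratio


# Rough figures from the post: about 1.5x faster at about 3x the price.
gpu_value = speedup_per_cost(1.5, 3.0)
print(f"Relative performance per unit cost: {gpu_value:.2f}")
```

A value below 1.0 means the upgrade buys less performance per unit of money than the baseline did. This ignores license costs, which can change the picture a lot.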

The fastest machines for traditional CFD on unstructured meshes use the i7 CPUs with the X79 chipset and high speed memory. As stated earlier in this thread, this is because that combination offers the most memory bandwidth per core available.
HMN, Anna Tian and firat like this.

Old   March 12, 2013, 18:55
New Member
Join Date: Apr 2012
Posts: 27
Rep Power: 7
HMN is on a distinguished road
Thanks evcelica for linking me to this discussion.

I had exactly the same question in another post and now it's clear: there's no support at all for the GeForce GPUs and there probably never will be. I don't really understand why. They are not as powerful and have less memory (2+ GB) than the Quadro or Tesla, but they could be a solution in cases where there isn't much money to invest.

Perhaps there are agreements to support only that hardware... so they can sell it for a higher cost.

Old   March 20, 2013, 21:52
Senior Member
Join Date: Feb 2011
Location: Earth (Land portion)
Posts: 694
Rep Power: 13
evcelica is on a distinguished road
I thought this was hilarious and would like to share it.

I was reading one of the marketing pamphlets Daveo643 posted: Boost your productivity through HPC (pdf)

On slide 19 one "customer" of ANSYS and their HPC and GPU solutions is analyzing 3D glasses and says:

By optimizing our solver selection and workstation configuration,
and including GPU acceleration, we’ve been able to dramatically
reduce turnaround time from over two days to just an hour. This
enables the use of simulation to examine multiple design ideas and
gain more value out of our investment in simulation.

I'm thinking, what kind of piece of crap computer did they have before this 77x speedup? They say they upgraded "solver selection and workstation configuration, and including GPU acceleration", so I'm thinking these results are total B.S. and aren't comparable at all. They obviously compared a single core of the biggest P.O.S. computer, solving out of core and disk thrashing, to a high performance cluster just to make the comparison look good... Then I look at who the quote is from:

-Berhanu Zerayohannes, Senior Mechanical Engineer, NVIDIA

Ha Ha HA..
An NVIDIA engineer saying that NVIDIA Tesla GPUs gave him 77x speedup!!!!

REALLY?!?, marketing at its worst!


Old   July 11, 2013, 10:44
Hyper-Threading
New Member
Join Date: Aug 2011
Posts: 28
Rep Power: 8
schwermetall is on a distinguished road
Hi everyone,
just a comment on the aforementioned Hyper-Threading. As far as I know, CFD runs don't benefit from Hyper-Threading. The reason is that the idea of Hyper-Threading is to make unused CPU power accessible.
Say you have a dual-core machine with Hyper-Threading; then you can run 4 processes that each need half the performance of a single core. Without Hyper-Threading you will have difficulties with such a task. So Hyper-Threading is a way of managing processes so that you can use 100% of your performance with multiple processes. The point with CFD runs is that a single process uses 100% of a core's performance, so there is no speedup with Hyper-Threading.
I'm not sure whether this description of Hyper-Threading is technically correct, but I'm sure of the statement that Hyper-Threading doesn't speed up CFD runs. I worked on different projects at university with OpenFOAM and CFX, and this statement was confirmed by every PhD I asked. On our 12-core cluster no one uses the 24 logical cores (it does have Hyper-Threading capability).
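In practice this means launching one solver process per physical core rather than per logical CPU. A minimal sketch, assuming Hyper-Threading is enabled with two hardware threads per core; the helper name `solver_process_count` and the default of 2 are illustrative assumptions, and `os.cpu_count()` reports logical CPUs:

```python
import os


def solver_process_count(logical_cpus: int, threads_per_core: int = 2) -> int:
    """Number of solver processes to start: one per physical core.

    threads_per_core is an assumption (2 with Hyper-Threading on, 1 with it off);
    check the actual CPU topology on your own machine.
    """
    if logical_cpus < 1 or threads_per_core < 1:
        raise ValueError("logical_cpus and threads_per_core must be >= 1")
    return max(1, logical_cpus // threads_per_core)


# The cluster from the post: 24 logical CPUs, 12 physical cores.
print(solver_process_count(24))

# On the local machine, os.cpu_count() returns the logical CPU count (or None).
local_logical = os.cpu_count() or 1
print(solver_process_count(local_logical))
```

The resulting number is what you would pass as the process count to your MPI launcher or solver; the exact option depends on the solver.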

Hope it helps

Old   July 17, 2013, 09:54
New Member
Phillip Boldra
Join Date: Mar 2013
Location: Central Texas
Posts: 1
Rep Power: 0
anvaloy is on a distinguished road
I am running FLOW-3D Cast for HPDC on a BOXX Technologies Extreme 3D with an i7 3940 overclocked to 4.5 GHz, 32 GB of RAM and an NVIDIA Quadro 8000 video card, and I can run a 134 GB flow simulation within 24 hours.

Old   July 17, 2013, 10:03
New Member
Join Date: Aug 2011
Posts: 28
Rep Power: 8
schwermetall is on a distinguished road
Hi anvaloy,
that's interesting, can you give a little more detail on your simulation? Number of cells and which kind of simulation you're running (steady, unsteady, RANS...)?

Just out of curiosity: I ran OpenFOAM on an i7-2600K overclocked to 4x 3.8 GHz, and it crashed after 1.5 years and fried the mainboard as well. How long have you been running that system overclocked?


Old   February 11, 2014, 16:20
Senior Member
Anna Tian's Avatar
Meimei Wang
Join Date: Jul 2012
Posts: 494
Rep Power: 9
Anna Tian is on a distinguished road
Originally Posted by CapSizer View Post
OK, it's a quiet Saturday evening, so I will bite ;-)

First of all, before you start agonizing about the hardware, it is necessary to address the question of the software licenses at your disposal. CFD software is way more expensive than the hardware you will be running on, so this needs to be sorted out first. There is no merit in getting a 16-core workstation if you can only run 8-way parallel. AFAIK, the way that Ansys markets parallel Fluent these days, you buy the "HPC" facility in steps of 8, 32, 128, 512 cores. If you have only paid for one HPC pack, you can only run 8-way parallel, so you have to figure out the best hardware configuration for 8 cores. The next step is the ability to run on 32 cores, which is a really big step. There's not much sense in having 32-core capability but only running say 16 cores. Sort this question out first before committing to any hardware. In my experience, 8 million cells will definitely be much better on 32-way parallel than 8-way, but there may be only relatively small gains if you try to go for more parallel cores than that.
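The licensing point can be sketched in a few lines. The step sizes 8, 32, 128 and 512 are the ones quoted above; the function names are hypothetical and this is not an ANSYS tool or API, just the arithmetic.

```python
# Licensed core counts as quoted in the post (not taken from ANSYS documentation).
HPC_STEPS = (8, 32, 128, 512)


def usable_processes(licensed_cores: int, hardware_cores: int) -> int:
    """Parallel processes you can actually run: limited by license and hardware."""
    return min(licensed_cores, hardware_cores)


def idle_hardware_cores(licensed_cores: int, hardware_cores: int) -> int:
    """Cores you paid for in hardware but cannot use under the license."""
    return max(0, hardware_cores - licensed_cores)


def smallest_step_covering(hardware_cores: int):
    """Smallest license step that covers all hardware cores, or None if none does."""
    for step in HPC_STEPS:
        if step >= hardware_cores:
            return step
    return None


# A 16-core workstation with an 8-core license: half the machine sits idle.
print(usable_processes(8, 16), idle_hardware_cores(8, 16))
# Using all 16 cores means paying for the 32-core step.
print(smallest_step_covering(16))
```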

8 million cells, coupled solver, 2-equation turbulence model .... That would fit (I think!) fine in 16 GB of RAM if you are running single precision, but you may run out of memory if you need to use double precision. RAM is inexpensive, so I would say go for 32 GB rather. The rule for total memory is that you need to have enough, but no more. You can't gain speed by adding more RAM, provided that you have enough to start with. If you run out of memory, everything stops. The computer will try to swap memory to disk, but that is so slow and unresponsive that you will be tempted to pull the plug in order to stop things. Not a good idea, by the way.

I have been advised by hardware experts to use ECC rather than ordinary RAM, but frankly I cannot say that I have ever seen a benefit from using ECC RAM on a single socket machine. Many (most, all?) multi-socket systems require ECC however.

The two characteristics of memory that do make a huge difference to CFD speed are the actual memory clock speed (get the fastest supported by the chipset) and the number of memory channels. Inexpensive single socket systems (AMD FX, Intel i5) use two parallel memory channels (typically, but not always, 4 slots in total). By contrast the current Intel i7 uses four (either 4 or 8 slots in total). When you measure CFD performance, you find that this is what really makes the difference. The server CPUs (Intel Xeon E5 and AMD Socket G34) will use 4 parallel channels per socket. So the nice thing about a dual socket server board is that you have a total of 8 memory channels feeding the CPUs, instead of the 4 that you would get from a single Core i7. Two Core i7 systems linked with Gigabit Ethernet will probably be competitive with a dual socket workstation (i.e. 8 memory channels for both systems), and probably cost a bit less, but distributed parallel is always just a little bit of a pain to deal with.

Neither CPU clock speed, nor cache size, nor even architecture, is as significant as the memory system when it comes to performance in CFD. For example, an AMD FX-8150 running either 4 or 8 cores will be close in performance to the very different Intel Core i5 (4 cores), because both can use the same memory system (two channel 1600 MHz DDR3 as standard, although there are overclocking options). Neither can match the Core i7, with its 4 memory channels. The same effect is likely to be seen when comparing Opterons and Xeons. Yes, you can get 16 cores in an Opteron CPU, but these are fed by 4 memory channels, just like the 8 core Xeon, so don't expect it to be any quicker.
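The channel argument can be made concrete with theoretical peak numbers. This is a minimal sketch, assuming a standard 64-bit (8 byte) DDR3 channel and DDR3-1600 on every configuration so that only channels and cores differ. These are paper figures, not measured bandwidth, and the function names are illustrative.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: float, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth of one socket in GB/s.

    channels * transfers per second * bytes per transfer; bus_bytes=8 assumes
    a 64-bit channel, as used by DDR3.
    """
    return channels * mega_transfers_per_s * bus_bytes / 1000.0


def bandwidth_per_core_gbs(channels: int, mega_transfers_per_s: float, cores: int) -> float:
    """Peak bandwidth divided evenly over the cores sharing the socket."""
    if cores < 1:
        raise ValueError("cores must be >= 1")
    return peak_bandwidth_gbs(channels, mega_transfers_per_s) / cores


# (label, memory channels, cores), following the channel and core counts in the post.
configs = [
    ("dual channel, 4 cores", 2, 4),
    ("dual channel, 8 cores", 2, 8),
    ("quad channel, 8 cores", 4, 8),
    ("quad channel, 16 cores", 4, 16),
]

for label, channels, cores in configs:
    total = peak_bandwidth_gbs(channels, 1600)
    per_core = bandwidth_per_core_gbs(channels, 1600, cores)
    print(f"{label}: {total:.1f} GB/s total, {per_core:.1f} GB/s per core")
```

The quad channel rows have the same total, so doubling the cores on the same four channels only halves the bandwidth each core can draw on, which is the point made above about the 16-core Opteron versus the 8-core Xeon.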

This is not to say, however, that clock speed, core count and cache are insignificant, but sort the software license and memory questions out first. If your software license requires you to pay per parallel process, get a smaller number of the fastest cores that you can get. If parallel licensing is a flat fee (like Adapco power session) it starts making sense to go for more cores and more memory channels.

If all that you can afford is an 8-process HPC license, think in terms of two linked Core i7s, or a Xeon workstation with two E5-2643 CPUs.
Your post is very helpful and interesting. Thanks!

You mentioned that memory bandwidth is usually what matters. I'm wondering whether there is an upper limit: a point above which further increases in memory bandwidth become very costly and no longer improve CFD running speed significantly?
Best regards,

Last edited by Anna Tian; February 12, 2014 at 13:51.

Old   February 18, 2014, 06:40
technical requirements for a new Fluent Workstation
New Member
Join Date: Feb 2014
Posts: 1
Rep Power: 0
Rachita is on a distinguished road
ANSYS Mechanical is used for mechanical and structural engineering. The solution is used to compute the response of a structural system. The equation solvers that are used to drive the simulation are computationally intensive. The equation solvers run on central processing unit (CPU) core(s) and in addition can run on graphics processing unit (GPU) hardware, which is a parallel computer architecture. The CPU core(s) will continue to be used for all other computations in and around the equation solver when GPU hardware is used.

The large arrays of the equation solvers and the datasets used in the simulation require a large, fast memory system. Data storage files accessed during simulation benefit from dedicated, fast storage I/O systems. Use as much memory as possible to minimize the I/O required.

The application has the ability to use parallel computing (both shared memory and distributed memory). The distributed model can run on a single machine or across machines/nodes connected via high speed
