OF Parallel Processing with Core i7 - How to Handle Hyperthreading

#1 | February 28, 2012, 20:23 | dancfd (Daniel)
Hello all,

Does anyone have experience with parallel processing on a Core i7 processor? It has 4 cores, but it appears as 8 in System Monitor because each core is hyperthreaded. My question is this: I am running a 2-D airfoil simulation on a C-grid, which breaks up very nicely into 4 parts for parallel processing, but not into 8. If I run in parallel on 4 processors, it looks as if I will only use half of my computer's processing power. Any ideas on how I can best utilise the system's resources?

Thanks,

Dan

#2 | February 28, 2012, 21:30 | kmooney (Kyle Mooney)
What kind of CPU utilization do you get with an 8 processor decomposition? I use an i7 and get pretty good efficiency with 7 or 8 threads.

#3 | February 29, 2012, 03:41 | flavio_galeazzo (Flavio Galeazzo)
Hi dancfd,

I also work with i7 processors in our cluster. You already have the clues that lead to the answer: the i7 processor has 4 physical cores, which appear as 8 virtual cores due to Hyper-Threading (HT). The processing power is not doubled when using HT; the physical cores remain the same. I have tested OpenFOAM in three configurations:

1. HT deactivated - running in parallel with 4 processes
2. HT activated - running in parallel with 4 processes
3. HT activated - running in parallel with 8 processes

In all configurations I am saturating the processing power of the machine, regardless of what the task manager says. The performance difference is very small, with a 2-3% advantage for configuration 1 (probably due to the overhead of the HT system in configurations 2 and 3).

In short, once all your physical cores are fully loaded, there is no advantage in splitting each physical core into 2 virtual ones and filling those up.
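To see how many physical cores sit behind the logical ones that System Monitor reports, the two counts can be compared directly. The following is only a minimal sketch, assuming a Linux x86 machine whose /proc/cpuinfo records carry "processor", "physical id" and "core id" fields; the function name count_cores and the SAMPLE text are hypothetical and only mimic a 4-core CPU with HT enabled.

```python
def count_cores(cpuinfo_text):
    """Return (logical, physical) CPU counts from /proc/cpuinfo-style text.

    Assumption: records are separated by blank lines and use the
    'processor', 'physical id' and 'core id' keys (typical on x86 Linux).
    """
    logical = 0
    physical = set()
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, value = line.split(":", 1)
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue  # not a logical-CPU record
        logical += 1
        # Two hyperthreads of one core share the same (package, core) pair.
        # If 'core id' is missing, fall back to one core per logical CPU.
        package = fields.get("physical id", "0")
        core = fields.get("core id", fields["processor"])
        physical.add((package, core))
    return logical, len(physical)


# Hypothetical sample: 8 logical CPUs mapped onto 4 cores of one package.
SAMPLE = "\n\n".join(
    f"processor\t: {cpu}\nphysical id\t: 0\ncore id\t\t: {cpu % 4}"
    for cpu in range(8)
)

logical, physical = count_cores(SAMPLE)
print(f"logical CPUs: {logical}, physical cores: {physical}")
```

On a real Linux machine the same function could be fed the contents of /proc/cpuinfo, for example count_cores(open("/proc/cpuinfo").read()).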

#4 | February 29, 2012, 04:16 | robbirobocop (Rob)
From my point of view, you cannot estimate your performance from the number of cores or nodes alone. It depends heavily on the case you are running in parallel. I have an i7 as well, and for the last large case I ran (a steam drum, by the way), 5 cores with the hierarchical method was the fastest configuration.

So if you have a huge case that needs a simulation time of around 2 or 3 days, I recommend first running a small parallelisation study to see which configuration takes the least time. That way you can save computational time.
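A first step in such a study is listing the decompositions worth timing. This is only a rough sketch for a 2-D case, assuming the domain is split as (nx, ny, 1) with nx * ny equal to the number of subdomains; the function name candidate_splits is hypothetical and not part of OpenFOAM.

```python
def candidate_splits(n_subdomains):
    """List every (nx, ny, 1) split whose product equals n_subdomains.

    Assumption: a 2-D case, so the third direction is never split.
    """
    return [
        (nx, n_subdomains // nx, 1)
        for nx in range(1, n_subdomains + 1)
        if n_subdomains % nx == 0
    ]


# Print the candidates for a few subdomain counts; each candidate would
# then be run for a fixed number of iterations and timed.
for n in (4, 5, 6, 8):
    print(n, candidate_splits(n))
```

Each split would then be run for the same short number of iterations, and the one with the shortest wall-clock time kept for the long run. Note that a prime count such as 5 only yields strip-like splits (1 5 1 or 5 1 1).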

#5 | March 2, 2012, 19:20 | dancfd (Daniel)
Hello all,

Thank you for the detailed responses. I will run a parallelization study and post the results here.

Regards,

Dan

#6 | March 5, 2012, 21:43 | dancfd (Daniel)
Hello all,

In case anyone is still interested, I ran a 42k cell 2D airfoil C-grid mesh on 4, 6 and 8 processors, decomposed as follows with equal weighting given to each processor:

Code:
#Cores   Decomposition   Run Time [s]
4        2 2 1           4231
6        2 3 1           3103
8        2 4 1           3252
I ran 3000 timesteps in simpleFoam and found that 6 cores took 27% less run time than 4, and 8 cores took 23% less run time than 4. I did not run any tests to check the effect of changing the processor weighting or the decomposition (e.g. 2-4-1 vs 4-2-1). For your information,

Dan
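The percentages quoted in post #6 can be reproduced from the run times in the table. The sketch below is only illustrative: it takes the three measured run times, uses the 4-process run as the baseline, and reports both the speedup factor and the reduction in run time. The names RUNS and relative_to_baseline are hypothetical.

```python
# Processes -> run time in seconds, copied from the table in post #6.
RUNS = {4: 4231, 6: 3103, 8: 3252}


def relative_to_baseline(runs, baseline):
    """Return (n, time, speedup, saving_percent) rows against runs[baseline]."""
    t0 = runs[baseline]
    rows = []
    for n, t in sorted(runs.items()):
        speedup = t0 / t                  # how many times faster than baseline
        saving = 100.0 * (t0 - t) / t0    # percent of baseline run time saved
        rows.append((n, t, speedup, saving))
    return rows


for n, t, speedup, saving in relative_to_baseline(RUNS, 4):
    print(f"{n} processes: {t} s, speedup {speedup:.2f}x, "
          f"run time reduced by {saving:.0f}%")
```

With these numbers, 6 processes come out at roughly a 1.36x speedup (27% less run time) and 8 processes at roughly 1.30x (23% less), both relative to the 4-process run.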