
[Other] Comparison of OpenFOAM on i7, Xeon@32 cores, Xeon Phi Knights Landing, Tesla K20m

#1 ma-tri-x (Member), September 28, 2016, 08:48
Hey users, I thought this might be interesting for you.

I ran a simple damBreak case with interFoam, using 100x100x100 cells and 115 time steps, on the following machines:

- Intel Core i7, 3.4 GHz, 4 cores, with OpenFOAM 2.3.0
- Tesla K20M with RapidCFD
- Intel Xeon 2.2 GHz, 32 cores
- Xeon Phi Knights Landing, 64 cores

So this covers more or less every kind of architecture currently available.
RapidCFD and OpenFOAM 2.3.0 are, I think, comparable in their usage. As far as I know, RapidCFD is a port of OpenFOAM 2.3.0 to CUDA.

Here are the values ("machine" <computation time [s]>):
"OF, Knights Landing Xeon Phi at 64 cores" 142.17
"OF, CPU Xeon 2.0 GHz at 32 cores" 309.28
"RapidCFD, Tesla K20M at 100 W/250 W" 558.05
"OF, CPU Core i7 3.40 GHz at 4 cores" 1687.29

So the Knights Landing is well ahead. It seems that OpenFOAM was already prepared for vectorization, at least to some extent. For the KNL, OpenFOAM was compiled with cray-mpi and icc/icpc 17.xxxx, using the vectorization flag "-xmic-avx512" instead of "-mmic".
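To put the timings above on a common scale, here is a minimal Python sketch that expresses each run as a speedup over the slowest machine. The dictionary labels and the `speedups` helper are illustrative names, not part of the original post; only the four wall-clock times are taken from it.

```python
# Wall-clock times in seconds, copied from the list above.
times_s = {
    "KNL Xeon Phi, 64 cores": 142.17,
    "Xeon 2.0 GHz, 32 cores": 309.28,
    "Tesla K20M, RapidCFD": 558.05,
    "Core i7 3.4 GHz, 4 cores": 1687.29,
}


def speedups(times, baseline):
    """Return how many times faster each machine is than the baseline."""
    base = times[baseline]
    return {name: base / t for name, t in times.items()}


# Hypothetical choice of baseline: the slowest run (the Core i7).
rel = speedups(times_s, "Core i7 3.4 GHz, 4 cores")

# Print the fastest machine first.
for name, s in sorted(rel.items(), key=lambda kv: -kv[1]):
    print(f"{name:26s} {s:6.2f}x")
```

These ratios only compare whole machines on this one case; they say nothing about per-core or per-watt performance.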

I tried to find the market prices, but I can't guarantee their accuracy:
- Xeon Phi KNL: 6000 € + "mother hardware"
- 32-core Xeon: > 4000 € (I don't have an exact figure)
- Tesla K20M: no longer seems to be in stock; K20: $2500
- Core i7: about $330 + "mother hardware"

The computation times are attached as a histogram. Here is also the blockMeshDict that I used for the case:

Code:
convertToMeters 1;

vertices
(
    (0 0 0)  // Vertex bld = 0 
    (1 0 0)  // Vertex brd = 1 
    (1 0 -1)  // Vertex frd = 2 
    (0 0 -1)  // Vertex fld = 3 

    (0 1 0)  // Vertex blt = 4 
    (1 1 0)  // Vertex brt = 5 
    (1 1 -1)  // Vertex frt = 6 
    (0 1 -1)  // Vertex flt = 7 
);

blocks
(
    hex (0 1 2 3   4 5 6 7) (100 100 100) simpleGrading (1 1 1)
);

edges
(
);

boundary
(
    Wall
    {
        type wall;
        faces
        (
            (0 1 2 3)
            (4 7 6 5)
            (3 7 4 0)
            (1 5 6 2)
            (3 2 6 7)
            (0 4 5 1)
        );
    }
);

mergePatchPairs
(
);
Attached Files
File Type: pdf compData.pdf (7.8 KB, 244 views)

#2 wyldckat (Bruno Santos, Retired Super Moderator), September 28, 2016, 16:37
Hi ma-tri-x,

Many thanks for the report and tests!

If you could test building with OpenFOAM-dev on the Knights Landing, you should see a considerable improvement! Paul Edwards from Intel has been working directly with the OpenFOAM Foundation to improve performance even further!
Look for the abstract "Performance Optimization of OpenFOAM on the new Intel® Xeon Phi™ Processor" on the Agenda page for the 4th Annual OpenFOAM User Conference 2016: http://www.esi-group.com/company/eve...ce-2016/agenda

Also, you can see dedicated rules for the KNL in OpenFOAM-dev: "linux64IccKNL" and "linux64GccKNL"

By the way, which model of KNL are you using? Is it the one that is directly installed on the motherboard or the PCI-E card edition?

Best regards,
Bruno

#3 ma-tri-x (Member), September 29, 2016, 06:26
Hi wyldckat!

Thanks for the quick reply! Yes, I was considering going to Cologne, but my schedule won't allow it.

As far as I know, it's the newest integrated version of the KNL. It is definitely not the PCIe version. It is part of the HLRN's HPC system.

Good news that OpenFOAM is going to be optimized for the KNL!

#4 Sumeet Patil (New Member), April 3, 2017, 06:00
Hi,

Has anyone worked on profiling OpenFOAM on Intel Xeon Phi?
Can you help me out? I'm unable to profile the OpenFOAM solver execution on the MIC.

#5 pm11dt (Daniel W Theobald, New Member), February 27, 2018, 13:46
Quote (originally posted by ma-tri-x):
Hey users, I thought this might be interesting for you.

I ran a simple damBreak case with interFoam, using 100x100x100 cells and 115 time steps, on the following machines:

- Intel Core i7, 3.4 GHz, 4 cores, with OpenFOAM 2.3.0
- Tesla K20M with RapidCFD
- Intel Xeon 2.2 GHz, 32 cores
- Xeon Phi Knights Landing, 64 cores

So this covers more or less every kind of architecture currently available.
RapidCFD and OpenFOAM 2.3.0 are, I think, comparable in their usage. As far as I know, RapidCFD is a port of OpenFOAM 2.3.0 to CUDA.

Here are the values ("machine" <computation time [s]>):
"OF, Knights Landing Xeon Phi at 64 cores" 142.17
"OF, CPU Xeon 2.0 GHz at 32 cores" 309.28
"RapidCFD, Tesla K20M at 100 W/250 W" 558.05
"OF, CPU Core i7 3.40 GHz at 4 cores" 1687.29

So the Knights Landing is well ahead. It seems that OpenFOAM was already prepared for vectorization, at least to some extent. For the KNL, OpenFOAM was compiled with cray-mpi and icc/icpc 17.xxxx, using the vectorization flag "-xmic-avx512" instead of "-mmic".

I am assuming you have decomposed the cases in the following way: Xeon Phi -np 64, Xeon 2.0 GHz -np 32, Core i7 -np 4.

In which case, how can you say the Xeon Phi is faster when you're running on twice as many processors (compared to the Xeon 2.0 GHz)!? Obviously it's going to be twice as fast on twice as many processors!

I'm looking in more depth into speed-up on Xeon Phi KNL based environments for OpenFOAM. I'm not seeing anything like the kinds of speed-ups mentioned in the literature just yet, even when compiling with special flags etc.
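The decomposition assumed above can be made concrete with a short Python sketch: it computes the cells per MPI rank for the 100x100x100 mesh and the core-seconds each run consumed. The rank counts (one rank per core) are this post's assumption and were not confirmed by ma-tri-x; `per_core_view` and the dictionary labels are illustrative names.

```python
# Mesh size from the blockMeshDict in post #1: 100 x 100 x 100 cells.
n_cells = 100 * 100 * 100

# (assumed MPI ranks, measured wall-clock time in s).
# The rank counts are an assumption (one rank per core), not a confirmed setup.
runs = {
    "KNL Xeon Phi": (64, 142.17),
    "Xeon 2.0 GHz": (32, 309.28),
    "Core i7 3.4 GHz": (4, 1687.29),
}


def per_core_view(ranks, wall_s, cells):
    """Return (cells per rank, core-seconds) for one run."""
    return cells / ranks, ranks * wall_s


summary = {name: per_core_view(r, t, n_cells) for name, (r, t) in runs.items()}

for name, (cells_per_rank, core_s) in summary.items():
    print(f"{name:16s} {cells_per_rank:9.0f} cells/rank {core_s:10.2f} core-s")
```

Under that assumption, the KNL and Xeon runs use a similar number of core-seconds (the KNL roughly 8% fewer), so on this case a KNL core did about as much work per second as a Xeon core.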

#6 ma-tri-x (Member), March 6, 2018, 06:35
Quote:
In which case, how can you say the Xeon Phi is faster when you're running on twice as many processors (compared to the Xeon 2.0 GHz)!? Obviously it's going to be twice as fast on twice as many processors!

Obviously, you haven't got much experience with speedups on different machines. Speedups are not determined by the number of cores alone. If they were, why don't the numbers look like this:
64*1.1 GHz = 70.4
32*2.0 GHz = 64
--> almost the same speed, but the KNL turned out to be twice as fast.

Or even:
4*3.4 GHz = 13.6
32*2.0 GHz = 64
--> a factor of 4.7,
but 1687.29/309.28 = 5.46
??

It depends on what you want to compare and on how well the software can use the hardware resources (a processor is not characterized only by its GHz figure and its number of cores). It also depends on the compiler and its flags, and sometimes on the operating system.
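The arithmetic above can be reproduced with a small Python sketch that compares the naive "cores x GHz" prediction with the measured ratio of run times. The clock speeds are the ones quoted in this post; `machines` and `naive_vs_measured` are illustrative names, and the cores-times-clock model is the deliberately simplistic estimate being argued against, not a recommended performance model.

```python
# (cores, clock in GHz, measured wall-clock time in s), as quoted in this thread.
machines = {
    "KNL": (64, 1.1, 142.17),
    "Xeon": (32, 2.0, 309.28),
    "i7": (4, 3.4, 1687.29),
}


def naive_vs_measured(fast, slow):
    """Return (predicted, measured) speedup of `fast` over `slow`."""
    cores_f, ghz_f, time_f = machines[fast]
    cores_s, ghz_s, time_s = machines[slow]
    predicted = (cores_f * ghz_f) / (cores_s * ghz_s)  # ratio of cores x GHz
    measured = time_s / time_f                         # ratio of run times
    return predicted, measured


for fast, slow in [("KNL", "Xeon"), ("Xeon", "i7")]:
    p, m = naive_vs_measured(fast, slow)
    print(f"{fast} vs {slow}: predicted {p:.2f}x, measured {m:.2f}x")
```

In both pairings the measured ratio is larger than the naive prediction, which is the point of the comparison: core count and clock speed alone do not explain the timings.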

#7 pm11dt (Daniel W Theobald, New Member), March 10, 2018, 09:00
Quote (originally posted by ma-tri-x):
Obviously, you haven't got much experience with speedups on different machines. Speedups are not determined by the number of cores alone. If they were, why don't the numbers look like this:
64*1.1 GHz = 70.4
32*2.0 GHz = 64
--> almost the same speed, but the KNL turned out to be twice as fast.

Or even:
4*3.4 GHz = 13.6
32*2.0 GHz = 64
--> a factor of 4.7,
but 1687.29/309.28 = 5.46
??

It depends on what you want to compare and on how well the software can use the hardware resources (a processor is not characterized only by its GHz figure and its number of cores). It also depends on the compiler and its flags, and sometimes on the operating system.

You make a good point; however, you didn't make your basis of comparison clear at all in your post. You didn't even mention the clock speed of the KNL cores.

I mean, ultimately, wouldn't the best comparison be one based on performance for a system with a given FLOPS rating?

Also, in addition to your point: you're right, but you would also expect the KNL to run slower on 64 cores (compared to 32 on the Xeon) because of the MPI latency and communication overheads involved.

And I am aware of the vectorization architecture of the KNL nodes, of compiling with -O3 optimization and such.

I hope to do some serious work on this subject and potentially even publish. It will be interesting to see whether the same performance boost is seen across multiple Xeon Phi compute nodes (scale-up).

My work generally involves cases running on 240 cores or more (standard 2.5 GHz Xeon cores), so this is an area I am invested in.
