# How to assess performance of a CFD Application?


November 17, 2016, 13:12
#1
Senior Member

Hector Redal
Join Date: Aug 2010
Location: Madrid, Spain
Posts: 191
Rep Power: 9
Hi,

I am interested in evaluating the performance of a CFD application I have developed. I am simulating a discretized domain with the following values:

Number of nodes = 19716
Number of tria elements = 38887
Method used: FEM (Finite Element Method)
Initial Time = 0
Final Time = 140
Delta Time = 0.0004

This means that the number of iterations is 350000. The simulation takes more or less 4 days.
I would like to know whether the application I have developed is too slow and needs further improvement / optimization. What simulation time should I expect?
I see that the calculation of every time step takes more or less 1 second.

Any input / comment / opinion is welcome.

Best regards,
Hector.
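The figures quoted in the opening post can be cross-checked with a few lines of arithmetic. The sketch below only uses the numbers stated there; the variable names are illustrative, and the "cost per node per step" figure is just one possible way to normalise the run time, not something the poster reported.

```python
# Figures quoted in the opening post.
t_final = 140.0      # final simulated time
dt = 0.0004          # time step
n_nodes = 19716      # mesh nodes
wall_days = 4.0      # "more or less 4 days"

# Number of time steps implied by the final time and the step size.
n_steps = round(t_final / dt)

# Wall-clock cost per time step, and per node per time step.
wall_seconds = wall_days * 24 * 3600
sec_per_step = wall_seconds / n_steps
usec_per_node_step = sec_per_step / n_nodes * 1e6

print(n_steps)
print(f"{sec_per_step:.3f} s per time step")
print(f"{usec_per_node_step:.1f} microseconds per node per time step")
```

The step count and the roughly one second per step agree with the post. The per-node figure is hardware dependent, so it is only meaningful when compared against another code on the same machine.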

November 17, 2016, 13:33
#2
Senior Member

Filippo Maria Denaro
Join Date: Jul 2010
Posts: 3,487
Rep Power: 40
Quote:
 Originally Posted by HectorRedal Hi, I am interested in evaluating the performance of a CFD Application I have developed. I am simulating a discretized domain with the following values: Number of nodes = 19716 Number of tria elements = 38887 Method used: FEM (Finite Element Method). Initial Time = 0 Final Time = 140 Delta Time = 0.0004 This means that the number of iterations is 350000. The simulation time takes more or less 4 days. I would like to know if the application I have developed is too slow and it needs further improvement / optimization. Which should be the expected simulation time? I see that more or less the calculation of every time step takes 1 second. Any input / comment / opinion is welcome. Best regards, Hector.

Generally, before assessing the performance, the key is to do a careful validation. Then, you can look at performance in terms of work units. If you use a Fortran compiler, you can simply start with profiling to check the percentage of work spent in each part of your code. I do not consider the wall-clock time, since it depends strongly on the computer hardware you are using.
In general, most of the computational time is spent in the linear algebra solvers.
You should also check the scaling of your code for an increasing number of unknowns.

November 17, 2016, 13:34
#3
Senior Member

Filippo Maria Denaro
Join Date: Jul 2010
Posts: 3,487
Rep Power: 40
P.S.: 4 days on such a coarse grid is too much time.

November 17, 2016, 13:40
#4
Senior Member

Hector Redal
Join Date: Aug 2010
Location: Madrid, Spain
Posts: 191
Rep Power: 9
Hi Filippo,

Well, it is a transient solution. It simulates the flow past a circular cylinder.
I have developed the application in C/C++. I am using a profiler, and what I see is that most of the time is spent preparing the data for the matrix solver in each time step. The inversion of the matrix to compute the pressure is not the hot spot.

But anyway, your response is quite enough for what I was looking for. I thought that my grid was fine enough, but it appears not :-(

Best regards,
Hector.

November 17, 2016, 13:45
#5
Senior Member

Filippo Maria Denaro
Join Date: Jul 2010
Posts: 3,487
Rep Power: 40
Are you assembling a matrix with coefficients that depend on time?

November 17, 2016, 13:50
#6
Senior Member

Filippo Maria Denaro
Join Date: Jul 2010
Posts: 3,487
Rep Power: 40
You can see here how we analysed the performance of our own in-house code:
https://www.researchgate.net/publica...putation_tools

November 18, 2016, 05:12
#7
Senior Member

Hector Redal
Join Date: Aug 2010
Location: Madrid, Spain
Posts: 191
Rep Power: 9
Apart from the matrices that account for the advection terms, I am not assembling matrices with time-dependent coefficients.

November 18, 2016, 05:15
#8
Senior Member

Hector Redal
Join Date: Aug 2010
Posts: 191
Rep Power: 9
Quote:
 Originally Posted by FMDenaro you can see here how we analysed the performances of a own-made code https://www.researchgate.net/publica...putation_tools

I will take a look at the reference you have provided.
It looks quite interesting. I have only read the title and the abstract so far. It appears to be related to the optimization of sparse matrix calculations.

As far as I can see from my profiler, the bottleneck in the application right now is not the sparse matrix calculations. But maybe I am wrong. I am going to double-check this.

November 18, 2016, 05:24
#9
Senior Member

Filippo Maria Denaro
Join Date: Jul 2010
Posts: 3,487
Rep Power: 40
Quote:
 Originally Posted by HectorRedal Apart from the matrices that consider the advecting terms, I am not assembling matrices with time dependent coefficients.
OK, if you are using a linearization of the convective term, you do have to assemble the matrix of the momentum equation at each time step. Conversely, we use the explicit Adams-Bashforth scheme for the convection, so our matrix assembly is done outside the time integration loop. The pressure equation then leads to coefficients that do not depend on time.
However, your computational cost for the assembly is too high compared to the solution of the algebraic system. You should consider a less expensive method.
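To illustrate the idea of keeping the matrix assembly outside the time loop, here is a minimal sketch on a deliberately simple model problem. Everything in it is an assumption made for illustration: a 1D viscous Burgers equation with zero Dirichlet boundaries, central finite differences rather than FEM, backward Euler for diffusion and second-order Adams-Bashforth for convection. The function names are hypothetical, and this is not the code of either poster.

```python
import math

def factor_tridiagonal(lower, diag, upper):
    """Factor a tridiagonal matrix once (Thomas algorithm, no pivoting).
    lower[i] = A[i][i-1] (lower[0] unused), upper[i] = A[i][i+1]."""
    n = len(diag)
    d = diag[:]          # modified diagonal
    m = [0.0] * n        # elimination multipliers
    for i in range(1, n):
        m[i] = lower[i] / d[i - 1]
        d[i] = diag[i] - m[i] * upper[i - 1]
    return m, d, upper

def solve_factored(factors, rhs):
    """Reuse the stored factors: forward elimination, then back substitution."""
    m, d, upper = factors
    n = len(d)
    y = rhs[:]
    for i in range(1, n):
        y[i] -= m[i] * y[i - 1]
    x = [0.0] * n
    x[-1] = y[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (y[i] - upper[i] * x[i + 1]) / d[i]
    return x

def convection(u, dx):
    """Nonlinear term u * du/dx, central differences, zero boundary values."""
    n = len(u)
    c = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        c[i] = u[i] * (right - left) / (2.0 * dx)
    return c

def run(n=50, nu=0.05, dt=1e-3, n_steps=200):
    dx = 1.0 / (n + 1)
    r = nu * dt / dx ** 2
    # Matrix (I - dt*nu*D2) has constant coefficients: build and factor it ONCE.
    factors = factor_tridiagonal([-r] * n, [1.0 + 2.0 * r] * n, [-r] * n)
    u = [math.sin(math.pi * (i + 1) * dx) for i in range(n)]
    c_old = convection(u, dx)  # makes the first step a forward Euler step
    for _ in range(n_steps):
        c_new = convection(u, dx)
        # Explicit Adams-Bashforth convection goes to the right-hand side only.
        rhs = [u[i] - dt * (1.5 * c_new[i] - 0.5 * c_old[i]) for i in range(n)]
        u = solve_factored(factors, rhs)
        c_old = c_new
    return u

u = run()
```

Inside the loop only the right-hand side changes, so each step costs one convection evaluation and one solve with the stored factors. A linearized implicit convection term would instead force a new assembly and factorization at every step.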

November 18, 2016, 07:36
#10
Senior Member

Paolo Lampitella
Join Date: Mar 2009
Location: Italy
Posts: 761
Blog Entries: 17
Rep Power: 21
Profiling is certainly necessary. Still, it only gives you a relative figure of merit. For example, you might be doing everything correctly, or even state of the art, with the best O(n) algorithms but, say, have all the variables allocated wrongly, so that every access produces a cache miss; or you might not be using the best optimization flags for your architecture.

As Filippo wrote, scaling with respect to the problem size n is also a useful check (just to be sure that you do not have anything above n log n like, say, n^2).

Still, in the end, your description will always be too vague to give a sharp estimate of how much time your simulation should take. There are just too many variables to consider.

I suggest finding a second code which does similar things and is well accepted by your community. Then just try to compare with that, using similar settings on the same machine.

November 18, 2016, 07:58
#11
Senior Member

Filippo Maria Denaro
Join Date: Jul 2010
Posts: 3,487
Rep Power: 40
I can only suggest considering an explicit integration for the convection, so that the matrix is assembled only once, outside the time integration loop.

November 19, 2016, 17:54
#12
Senior Member

Hector Redal
Join Date: Aug 2010
Posts: 191
Rep Power: 9
Quote:
 Originally Posted by sbaffini Profiling is certainly necessary. Still, it only gives you a relative figure of merit. For example, you might be doing everything correctly or even state of the art, with the best O(n) algorithms but, say, have all the variables allocated wrong, so every access produces a cache miss or, say, you might not be using the best optimization flags for your architecture. As Filippo wrote, scaling with respect to the problem size n is also a useful check (just to be sure that you do not have anything above n log n like, say, n^2). Still, in the end, your description will always be too much vague to have a sharp estimate of how much time should your simulation take. There are just too many variables to consider. I suggest to find a second code which does similar things and is well accepted by your community. Then just try to compare with that using similar settings on the same machine.
Hi Paolo,

First of all, I would like to thank you for your time and good explanation.

I am trying to optimize all the code: running the profiler several times after each change, looking for improvements in every nook and cranny I can.

The problem I see is that I used object-oriented programming to develop my application, and I don't know whether that was a good decision.

I will keep optimizing the application.

Best regards,
Hector.

November 19, 2016, 17:55
#13
Senior Member

Hector Redal
Join Date: Aug 2010
Posts: 191
Rep Power: 9
Quote:
 Originally Posted by FMDenaro I can just suggest to consider using an explicit integration for the convection, in such a way the assembling of the matrix is done only one time out of the cycle of time integration.
Well, this is the scheme I am using right now.

November 22, 2016, 11:41
#14
Senior Member

Hector Redal
Join Date: Aug 2010
Location: Madrid, Spain
Posts: 191
Rep Power: 9
Hi,

I have optimized my code, eliminating some bottlenecks. Now the simulation takes approximately one day (26 hours).
To judge whether this is good enough, as you commented, I need to compare it with some commercial software or equivalent. The point is that I do not have access to any commercial software.
Is there any resource (online or not) where such figures can be checked? Or at least any evaluation software I can use for the comparison?
Normally, evaluation software only allows simulating problems with a reduced number of nodes / elements, so I bet this will be difficult to investigate.

November 22, 2016, 12:23
#15
Senior Member

Filippo Maria Denaro
Join Date: Jul 2010
Posts: 3,487
Rep Power: 40
Quote:
 Originally Posted by HectorRedal Hi, I have optimized my code, eliminating some bottlenecks. Now, the simulation has taken approximately one day (26 hours). So as to considering if this is a enough or not, as you had commented I need to compare it with some commercial software or equivalent. The point is that I have not access to any commercial software. Is there any resource (online or not) that can be checked where this figures may appear? Or leat least any evaluation software I can use for comparing with it? Normally, evaluation software only allows to simulate problems with a reduced number of nodes / elements. So, I bet this will be difficult to investigate.

First, you need to check the scaling: double the number of unknowns in each direction and check the CPU time.
Then, you can use OpenFOAM, which is an open-source CFD code. Another free code is http://code-saturne.org/cms/
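The scaling check suggested above can be turned into a single number: if the CPU time behaves like t ~ n^p, two runs on meshes of different size are enough to estimate p. The sketch below is a minimal illustration; the function name is hypothetical and the timings are made-up placeholders, not measurements from this thread.

```python
import math

def scaling_exponent(n_coarse, t_coarse, n_fine, t_fine):
    """Estimate p in t ~ n**p from two (number of unknowns, CPU time) pairs."""
    return math.log(t_fine / t_coarse) / math.log(n_fine / n_coarse)

# Hypothetical timings for illustration only.
# Doubling the resolution in each direction of a 2D mesh gives about 4x unknowns.
n1, t1 = 19716, 100.0        # coarse mesh: unknowns, seconds for a fixed number of steps
n2, t2 = 4 * 19716, 410.0    # refined mesh: same number of steps
p = scaling_exponent(n1, t1, n2, t2)
print(f"estimated scaling exponent p = {p:.2f}")
```

An exponent close to 1 indicates roughly linear cost in the number of unknowns, while a value near 2 would point to an O(n^2) operation hidden somewhere in the time step. Both runs should use the same number of time steps and the same solver tolerance so that only the mesh size changes.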

November 22, 2016, 17:34
#16
Senior Member

Paolo Lampitella
Join Date: Mar 2009
Location: Italy
Posts: 761
Blog Entries: 17
Rep Power: 21
Dear Hector,

object orientation typically has a penalty, but this depends on the level at which you introduce it, the language and, in the end, the compiler. I usually use large objects (i.e. objects of arrays), but this decision is somewhat subjective.

For the comparison, I am aware of elmerfem (which may be more comparable to your FEM code), but in general there are plenty of open-source FEM codes (check home->wiki->codes on this site). Another option is ANSYS, which now gives its codes away for free, and they work up to 500k nodes (it's still better than nothing).

November 27, 2016, 23:43
#17
New Member

Join Date: May 2016
Posts: 1
Rep Power: 0
Performing validation studies in the form of spatial and temporal convergence tests is a better approach to assessing your code.

Quote:
 Originally Posted by HectorRedal Hi, I am interested in evaluating the performance of a CFD Application I have developed. I am simulating a discretized domain with the following values: Number of nodes = 19716 Number of tria elements = 38887 Method used: FEM (Finite Element Method). Initial Time = 0 Final Time = 140 Delta Time = 0.0004 This means that the number of iterations is 350000. The simulation time takes more or less 4 days. I would like to know if the application I have developed is too slow and it needs further improvement / optimization. Which should be the expected simulation time? I see that more or less the calculation of every time step takes 1 second. Any input / comment / opinion is welcome. Best regards, Hector.

November 28, 2016, 00:16
#18
Senior Member

Arjun
Join Date: Mar 2009
Location: Nurenberg, Germany
Posts: 704
Rep Power: 19
Quote:
 Originally Posted by HectorRedal Hi, I have optimized my code, eliminating some bottlenecks. Now, the simulation has taken approximately one day (26 hours). So as to considering if this is a enough or not, as you had commented I need to compare it with some commercial software or equivalent. The point is that I have not access to any commercial software. Is there any resource (online or not) that can be checked where this figures may appear? Or leat least any evaluation software I can use for comparing with it? Normally, evaluation software only allows to simulate problems with a reduced number of nodes / elements. So, I bet this will be difficult to investigate.
It is very hard to tell, but if you want a picture of how other codes are doing: I have FVUS/Wildkatze, which uses lots of object-oriented and C++ features like inheritance, dynamic casts, hash table searches and lots of other things that are supposed to slow the code down.

One of the test cases I use has 24000 cells, and it typically does 500 iterations in a minute or so on my i7 (second generation).
So if a time step involved 5 inner iterations, you would be looking at 144000 time steps in a day.

In that regard you are more than twice as fast, as far as efficiency goes. But if your time step involves 1 inner iteration (fractional-step type), then I think your solver is a bit slower (as FVUS actually did 5 x 144000 iterations in a day).

PS: These two solvers are apples and oranges, but if a similar problem is being solved, you are doing comparably to others.
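The comparison in this post can be retraced from the numbers given in the thread. The sketch below uses the FVUS rate quoted here and the 350000 time steps in 26 hours reported in post #14; the variable names are illustrative, and the two codes ran on different hardware, so the ratio is only indicative.

```python
# FVUS figures quoted in this post.
fvus_iters_per_minute = 500
inner_iters_per_step = 5          # the "5 inner iterations" scenario
fvus_iters_per_day = fvus_iters_per_minute * 60 * 24
fvus_steps_per_day = fvus_iters_per_day // inner_iters_per_step

# Figures from post #14: 350000 time steps in about 26 hours.
hector_steps = 350000
hector_hours = 26
hector_steps_per_day = hector_steps / hector_hours * 24

# Ratio of time steps per day under the 5-inner-iteration scenario.
ratio = hector_steps_per_day / fvus_steps_per_day

print(fvus_iters_per_day, fvus_steps_per_day)
print(f"{hector_steps_per_day:.0f} time steps per day, ratio {ratio:.2f}")
```

With 5 inner iterations per step the ratio comes out above 2, in line with "more than twice as fast". If every FVUS iteration were counted as a time step, the same figures would put it ahead.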

November 28, 2016, 17:11
#19
Senior Member

Hector Redal
Join Date: Aug 2010
Posts: 191
Rep Power: 9
Quote:
 Originally Posted by sbaffini Dear Hector, object orienting has typically a penalty, but this depends from the level where you introduce it, the language and, in the end, the compiler. I usually use large objects (i.e. objects of arrays) but this decision may have some subjectivity. For the comparison, i am aware of elmerfem (which maybe is more comparable to your fem code), but in general it is plenty of opensource fem codes (check home->wiki->codes on this site). Another option is ansys, which now gives its codes for free and they work up to 500k nodes (it's still better than nothing).
Well, the ANSYS option seems quite good. Simulating a problem of 500k nodes is far more than I am achieving with the code I developed. I will try to download the free version of this software and check which problems I can simulate.

Thanks for the information.

November 28, 2016, 17:24
#20
Senior Member

Hector Redal
Join Date: Aug 2010
Posts: 191
Rep Power: 9
Quote:
 Originally Posted by arjun It is very hard to tell but if you want to make picture of how other codes are doing, I have FVUS/Wildkatze using lots of object oriented and C++ stuff like inheritence, dynamic castings, hashtable searches and lots of things that are supposed to slow the code down. One of the test case that i use have 24000 cells and typically it does 500 iterations in a minute or so on my i7 (second generation). So if the time step involved 5 inner iterations you are looking at 144000 times steps in a day. With that regard you are more than twice as fast as far as efficiency goes. But if your time step involves 1 inner iteration (fractional type) then i think your solver is a bit slower (as FVUS actually did 5 x 144000 iterations in a day). PS: These two solvers are apples and oranges but if the similar problem be solved you are doing comparable to others.
First of all, many thanks for the information you have provided.
Obviously, as you state, comparing both algorithms is like comparing apples with oranges. But it gives a rough estimate of the performance of my code. Another point worth considering is the hardware. Different hardware will give different solution times: the more powerful the processor, the less time it will take. By the way, right now I am using a 4th-generation Xeon processor.

To answer your question, I am using the CBS algorithm (Characteristic-Based Split scheme), which is more or less similar to a fractional step method. It contains three steps, and in each of them a linear system of equations has to be solved (A x = b):
1st step: estimation of the velocity without considering the pressure term.
2nd step: calculation of the pressure, enforcing a divergence-free velocity field.
3rd step: correction of the velocity estimate.
So, to compare my results with the figures you provided: in every time step, my algorithm performs 3 iterations. This means that every minute I am solving 375 iterations.
As I see it, the code I have developed is 25% slower than the code you are using (taking this statement with all due care since, as you state, we are comparing two quite different algorithms running on different hardware).
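As a cross-check of the rate above, the arithmetic can be redone from figures stated earlier in the thread: 350000 time steps in about 26 hours (post #14) and three linear solves per time step. This is only a sketch under those assumptions; the timing behind the 375 figure is not given in the post, so it may refer to a different run.

```python
# Figures stated in the thread.
n_steps = 350000          # time steps in the full simulation
wall_hours = 26           # run time reported after optimization (post #14)
solves_per_step = 3       # three CBS steps, one linear system each

# Linear solves completed per minute of wall-clock time.
solves_per_minute = n_steps * solves_per_step / (wall_hours * 60)
print(f"{solves_per_minute:.0f} linear solves per minute")
```

With these inputs the rate comes out noticeably higher than 375 per minute, so the 25% estimate depends on which timing is actually used.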

One important thing I would like to highlight is that the tolerance I am using for solving the linear system is tol = 10e-8, where tol = norm (Ax - b) / norm (b).
So, I don't know whether this is too stringent for a transient simulation.
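The stopping criterion described above, norm(Ax - b) / norm(b), can be written out in a few lines. The sketch below assumes the Euclidean (2-) norm, which the post does not specify, and uses a made-up 2x2 system purely for illustration; the function name is hypothetical.

```python
import math

def relative_residual(A, x, b):
    """Return ||A x - b|| / ||b|| using the Euclidean norm (an assumption)."""
    r = [sum(A[i][j] * x[j] for j in range(len(x))) - b[i]
         for i in range(len(b))]
    return math.sqrt(sum(v * v for v in r)) / math.sqrt(sum(v * v for v in b))

# Tolerance as written in the post. Note that the literal 10e-8 means
# 10 * 10**-8, i.e. 1e-7, not 1e-8.
tol = 10e-8

# Small illustrative system (not from the thread).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_exact = [1.0 / 11.0, 7.0 / 11.0]   # exact solution of A x = b
x_rough = [0.09, 0.64]               # a two-digit approximation

print(relative_residual(A, x_exact, b) < tol)   # converged by this criterion
print(relative_residual(A, x_rough, b) < tol)   # not converged
```

Whether this tolerance is stricter than needed depends on the size of the time discretization error. One way to find out is to loosen it step by step and watch when a quantity of interest starts to change.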


All times are GMT -4.