
November 17, 2016, 13:12 
How to assess performance of a CFD Application?

#1 
Senior Member
Hector Redal
Join Date: Aug 2010
Location: Madrid, Spain
Posts: 191
Rep Power: 9 
Hi,
I am interested in evaluating the performance of a CFD application I have developed. I am simulating a discretized domain with the following values:
Number of nodes = 19716
Number of tria elements = 38887
Method used: FEM (Finite Element Method)
Initial Time = 0
Final Time = 140
Delta Time = 0.0004
This means that the number of iterations (time steps) is 350000. The simulation takes more or less 4 days, so each time step takes roughly 1 second. I would like to know whether the application I have developed is too slow and needs further improvement / optimization. What simulation time should I expect?
Any input / comment / opinion is welcome.
Best regards,
Hector.
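The figures quoted in this post can be cross-checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative, and the per-node cost is a derived number that is not stated in the post.

```python
# Back-of-the-envelope check of the figures quoted in the post.
t_initial = 0.0
t_final = 140.0
dt = 0.0004

# Number of time steps needed to reach the final time.
n_steps = round((t_final - t_initial) / dt)

# "More or less 4 days" of wall-clock time, expressed in seconds.
wall_clock_s = 4 * 24 * 3600
seconds_per_step = wall_clock_s / n_steps

# Cost normalised by problem size: microseconds per node per time step.
n_nodes = 19716
us_per_node_step = seconds_per_step / n_nodes * 1e6

print(n_steps, round(seconds_per_step, 3), round(us_per_node_step, 1))
```

A size-normalised figure such as the cost per node per time step is easier to compare across meshes than the total run time, although it still depends on the hardware.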

November 17, 2016, 13:33 

#2  
Senior Member
Filippo Maria Denaro
Join Date: Jul 2010
Posts: 3,487
Rep Power: 40 
Quote:
Generally, before assessing performance, the key is to do a careful validation. Then you can look at performance in terms of work units. If you use a Fortran compiler, you can simply start with profiling to check the percentage of work spent in each part of your code. I do not consider the wall-clock time, which depends strongly on the computer hardware you are using. In general, most of the computational time is spent in the linear algebra solvers. You should also check the scaling of your code for an increasing number of unknowns.

November 17, 2016, 13:34 

#3 
Senior Member
Filippo Maria Denaro
Join Date: Jul 2010
Posts: 3,487
Rep Power: 40 
P.S.: 4 days on such a coarse grid is too much time.


November 17, 2016, 13:40 

#4 
Senior Member
Hector Redal
Join Date: Aug 2010
Location: Madrid, Spain
Posts: 191
Rep Power: 9 
Hi Filippo,
Well, it is a transient solution. It simulates the flow past a circular cylinder. I have developed the application in C/C++. I am using a profiler, and what I see is that most of the time is spent preparing the data for the matrix solver in each time step. The inversion of the matrix to compute the pressure is not the hot spot. But anyway, your response is quite enough for what I was looking for. I thought that my grid was fine enough, but it appears not :(
Best regards,
Hector.

November 17, 2016, 13:45 

#5 
Senior Member
Filippo Maria Denaro
Join Date: Jul 2010
Posts: 3,487
Rep Power: 40 
Are you assembling a matrix with coefficients that depend on time?


November 17, 2016, 13:50 

#6 
Senior Member
Filippo Maria Denaro
Join Date: Jul 2010
Posts: 3,487
Rep Power: 40 
You can see here how we analysed the performance of an in-house code:
https://www.researchgate.net/publica...putation_tools 

November 18, 2016, 05:12 

#7 
Senior Member
Hector Redal
Join Date: Aug 2010
Location: Madrid, Spain
Posts: 191
Rep Power: 9 
Apart from the matrices for the advection terms, I am not assembling matrices with time-dependent coefficients.


November 18, 2016, 05:15 

#8  
Senior Member
Hector Redal
Join Date: Aug 2010
Location: Madrid, Spain
Posts: 191
Rep Power: 9 
Quote:
I will take a look at the reference you have provided. It appears quite interesting. I have only read the title and the abstract so far. It appears to be related to the optimization of sparse matrix calculations. As far as I have seen from my profiler, the bottleneck in the application right now is not the sparse matrix calculations. But maybe I am wrong. I am going to double check this.

November 18, 2016, 05:24 

#9  
Senior Member
Filippo Maria Denaro
Join Date: Jul 2010
Posts: 3,487
Rep Power: 40 
Quote:
However, your computational cost for the assembly is too high compared with the solution of the algebraic system. You should consider a less expensive method.

November 18, 2016, 07:36 

#10 
Senior Member

Profiling is certainly necessary. Still, it only gives you a relative figure of merit. For example, you might be doing everything correctly, or even state of the art, with the best O(n) algorithms, and yet have all the variables laid out badly in memory, so that every access produces a cache miss. Or you might not be using the best optimization flags for your architecture.
As Filippo wrote, scaling with respect to the problem size n is also a useful check (just to be sure that you do not have anything above n log n, such as n^2). Still, in the end, your description will always be too vague to give a sharp estimate of how much time your simulation should take. There are just too many variables to consider. I suggest finding a second code which does similar things and is well accepted by your community. Then just compare against it using similar settings on the same machine.
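The scaling check described above can be sketched as a least-squares fit of log(time) against log(n): the slope estimates the exponent p in time ~ n^p. The function name `scaling_exponent` and all timings below are hypothetical placeholders, not measurements from any code discussed in this thread.

```python
import math

def scaling_exponent(sizes, times):
    """Least-squares slope of log(time) versus log(problem size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

# Synthetic (NOT measured) timings for three idealised cost models.
sizes = [5000, 10000, 20000, 40000]
linear = [2.0e-5 * n for n in sizes]
n_log_n = [2.0e-6 * n * math.log(n) for n in sizes]
quadratic = [1.0e-9 * n * n for n in sizes]

for label, times in [("O(n)", linear), ("O(n log n)", n_log_n), ("O(n^2)", quadratic)]:
    print(label, round(scaling_exponent(sizes, times), 2))
```

In practice the timings would come from running the same case on successively refined meshes. A fitted slope well above 1 suggests that some part of the code scales worse than n log n.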

November 18, 2016, 07:58 

#11 
Senior Member
Filippo Maria Denaro
Join Date: Jul 2010
Posts: 3,487
Rep Power: 40 
I can only suggest that you consider an explicit integration for the convection, so that the matrix is assembled only once, outside the time-integration loop.
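The structure of this suggestion can be illustrated on a toy problem. The sketch below is a 1D finite-difference advection-diffusion model, not Hector's FEM code: diffusion is treated implicitly with a constant matrix that is factored once before the time loop, while convection is treated explicitly (first-order upwind, assuming a > 0) and only enters the right-hand side. All names and parameter values are assumptions chosen for illustration.

```python
import math

def factor_tridiagonal(n, lower, diag, upper):
    """Thomas-algorithm coefficients for a constant-coefficient tridiagonal
    matrix. Computed once, outside the time loop."""
    denom = [diag] * n
    cp = [0.0] * n
    cp[0] = upper / denom[0]
    for i in range(1, n):
        denom[i] = diag - lower * cp[i - 1]
        cp[i] = upper / denom[i]
    return denom, cp

def solve_factored(denom, cp, lower, rhs):
    """Forward and backward sweeps reusing the precomputed coefficients."""
    n = len(rhs)
    dp = [0.0] * n
    dp[0] = rhs[0] / denom[0]
    for i in range(1, n):
        dp[i] = (rhs[i] - lower * dp[i - 1]) / denom[i]
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def run(n=50, steps=200, a=1.0, nu=0.01, length=1.0, dt=0.001):
    dx = length / (n + 1)          # n interior nodes, u = 0 on both boundaries
    r = nu * dt / dx ** 2
    c = a * dt / dx                # convective CFL number, must stay below 1
    # Time-independent matrix (I - dt*nu*Laplacian): assembled and factored ONCE.
    denom, cp = factor_tridiagonal(n, -r, 1.0 + 2.0 * r, -r)
    u = [math.exp(-(((i + 1) * dx - 0.3) ** 2) / 0.005) for i in range(n)]
    for _ in range(steps):
        # Explicit upwind convection only changes the right-hand side.
        rhs = [u[i] - c * (u[i] - (u[i - 1] if i > 0 else 0.0)) for i in range(n)]
        u = solve_factored(denom, cp, -r, rhs)
    return u

u_final = run()
print(round(max(u_final), 3))
```

The price of the explicit convection term is a CFL-type restriction on the time step, so the saving in assembly cost has to be weighed against the number of steps required.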


November 19, 2016, 17:54 

#12  
Senior Member
Hector Redal
Join Date: Aug 2010
Location: Madrid, Spain
Posts: 191
Rep Power: 9 
Quote:
First of all, I would like to thank you for your time and good explanation. I am trying to optimize all the code: running the profiler several times after every change, and looking for improvements in every nook and cranny I can. The problem I see is that I have used object-oriented programming to develop my application, and I don't know whether that was a good decision. I am continuing to optimize the application.
Best regards,
Hector.

November 19, 2016, 17:55 

#13 
Senior Member
Hector Redal
Join Date: Aug 2010
Location: Madrid, Spain
Posts: 191
Rep Power: 9 

November 22, 2016, 11:41 

#14 
Senior Member
Hector Redal
Join Date: Aug 2010
Location: Madrid, Spain
Posts: 191
Rep Power: 9 
Hi,
I have optimized my code, eliminating some bottlenecks. Now the simulation takes approximately one day (26 hours). To decide whether this is good enough or not, I need, as you commented, to compare it with some commercial software or an equivalent. The point is that I do not have access to any commercial software. Is there any resource (online or not) where such figures can be found? Or at least any evaluation software I can compare against? Normally, evaluation software only allows simulating problems with a reduced number of nodes / elements. So I bet this will be difficult to investigate.

November 22, 2016, 12:23 

#15  
Senior Member
Filippo Maria Denaro
Join Date: Jul 2010
Posts: 3,487
Rep Power: 40 
Quote:
First, you need to check the scaling: double the number of unknowns in each direction and check the CPU time. Then you can use OpenFOAM, which is an open-source CFD code. Another free code is http://codesaturne.org/cms/

November 22, 2016, 17:34 

#16 
Senior Member

Dear Hector, object orientation typically carries a penalty, but this depends on the level at which you introduce it, the language and, in the end, the compiler. I usually use large objects (i.e. objects holding arrays), but this decision may be somewhat subjective.
For the comparison, I am aware of elmerfem (which is maybe more comparable to your FEM code), but in general there are plenty of open-source FEM codes (check home>wiki>codes on this site). Another option is ANSYS, which now gives its codes away for free, and they work up to 500k nodes (it's still better than nothing).

November 27, 2016, 23:43 

#17  
New Member
Join Date: May 2016
Posts: 1
Rep Power: 0 
Performing validation studies in the form of spatial and temporal convergence tests is a better approach to assessing your code.
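A temporal convergence test of this kind can be sketched as follows: run a problem with a known exact solution at two step sizes and estimate the observed order of accuracy from the ratio of the errors. The model problem (forward Euler on du/dt = -u) and the function names are illustrative assumptions and have nothing to do with the CFD code discussed in this thread.

```python
import math

def observed_order(err_coarse, err_fine, refinement_ratio=2.0):
    """Observed order p, from err ~ C * h^p on two resolutions."""
    return math.log(err_coarse / err_fine) / math.log(refinement_ratio)

def euler_error(n_steps):
    """Error at t = 1 of forward Euler applied to du/dt = -u, u(0) = 1."""
    dt = 1.0 / n_steps
    u = 1.0
    for _ in range(n_steps):
        u += dt * (-u)
    return abs(u - math.exp(-1.0))

# Halving the time step should roughly halve the error of a first-order scheme.
p = observed_order(euler_error(100), euler_error(200))
print(round(p, 2))
```

The same formula applies to spatial refinement, with the mesh-size ratio in place of the time-step ratio. An observed order far from the theoretical one usually points to a bug or to a solution that is not yet in the asymptotic range.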
Quote:


November 28, 2016, 00:16 

#18  
Senior Member
Arjun
Join Date: Mar 2009
Location: Nurenberg, Germany
Posts: 704
Rep Power: 19 
Quote:
One of the test cases that I use has 24000 cells, and it typically does 500 iterations in a minute or so on my i7 (second generation). So if a time step involves 5 inner iterations, you are looking at 144000 time steps in a day. In that regard you are more than twice as fast, as far as efficiency goes. But if your time step involves 1 inner iteration (fractional-step type), then I think your solver is a bit slower (as FVUS actually did 5 x 144000 iterations in a day). PS: These two solvers are apples and oranges, but if a similar problem is being solved, your performance is comparable to others.
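The arithmetic behind this comparison, written out as a sketch. The 350000 time steps and the 26-hour run time are taken from Hector's earlier posts; the variable names are illustrative.

```python
# Reference solver: 500 iterations per minute, 5 inner iterations per time step.
iters_per_minute = 500
inner_iters_per_step = 5
iters_per_day = iters_per_minute * 60 * 24
ref_steps_per_day = iters_per_day // inner_iters_per_step

# Hector's run after optimisation: 350000 time steps in about 26 hours.
hector_steps = 350000
hector_hours = 26
hector_steps_per_day = hector_steps / hector_hours * 24

ratio = hector_steps_per_day / ref_steps_per_day
print(iters_per_day, ref_steps_per_day, round(hector_steps_per_day), round(ratio, 2))
```

The ratio ignores the difference in mesh size (24000 cells versus 19716 nodes), in discretization and in hardware, so it is only an order-of-magnitude indication.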

November 28, 2016, 17:11 

#19  
Senior Member
Hector Redal
Join Date: Aug 2010
Location: Madrid, Spain
Posts: 191
Rep Power: 9 
Quote:
Thanks for the information. 

November 28, 2016, 17:24 

#20  
Senior Member
Hector Redal
Join Date: Aug 2010
Location: Madrid, Spain
Posts: 191
Rep Power: 9 
Quote:
Obviously, as you say, comparing the two algorithms is like comparing apples with oranges. But it gives a rough estimate of the performance of my code. Another point worth considering is the hardware: different hardware will give different solution times, and the more powerful the processor, the less time the run will take. By the way, right now I am using a 4th-generation Xeon processor.
To answer your question, I am using the CBS algorithm (Characteristic Based Split scheme), which is more or less similar to a fractional-step method. It consists of three steps, each of which requires solving a linear system of equations (A x = b):
1st step: estimation of the velocity without considering the pressure term.
2nd step: calculation of the pressure, enforcing a divergence-free velocity field.
3rd step: correction of the velocity estimate.
So, to compare my results with the figures you provided: in every time step my algorithm performs 3 iterations, which means that I am solving 375 iterations every minute. As I see it, the code I have developed is 25% slower than the code you are using (taking this statement with all due care since, as you say, we are comparing two quite different algorithms running on different hardware).
One important thing I would like to highlight is that the tolerance I am using for solving the linear systems is tol = 10e-8, where tol = norm(Ax - b) / norm(b). I don't know if this is too stringent for a transient simulation.
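The relative-residual stopping test mentioned above, norm(Ax - b) / norm(b) <= tol, can be sketched with a plain Jacobi iteration on a small diagonally dominant system. Jacobi is only a stand-in here; the post does not say which linear solver is used, and the matrix, right-hand side and tolerances below are hypothetical.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def relative_residual(A, x, b):
    r = [ax - bi for ax, bi in zip(matvec(A, x), b)]
    return norm(r) / norm(b)

def jacobi(A, b, tol, max_iter=10000):
    """Jacobi iteration stopped when norm(Ax - b) / norm(b) <= tol."""
    n = len(b)
    x = [0.0] * n
    for it in range(max_iter):
        if relative_residual(A, x, b) <= tol:
            return x, it
        x = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
    return x, max_iter

# Small illustrative system (not taken from any CFD discretization).
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [1.0, 2.0, 3.0]

x_loose, it_loose = jacobi(A, b, tol=1e-4)
x_tight, it_tight = jacobi(A, b, tol=1e-8)
print(it_loose, it_tight)
```

Comparing the iteration counts for a loose and a tight tolerance gives a feel for how much solver work the tolerance buys. Whether a looser value is acceptable for the transient run has to be judged by its effect on the computed flow quantities.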
