FLOPS
How do I calculate FLOPS (floating-point operations) in Fortran code? Is there a standard subroutine or function for this? pratap
|
Re: FLOPS
FLOPS, or floating-point operations per second, is a characteristic of your computer and not of your code.
|
Re: FLOPS
Hmm.
If I compile my code with no optimization and measure the flop rate for a typical problem, I get a speed of n flops. If I recompile the same code with full optimization, then measure the flop rate on the same problem, I get a speed of 2n or 3n [I'm not the best programmer : ) ]. I ran both of these on the same computer. Perhaps the flop rate can depend on several different things other than the computer? |
FLOPS estimate
You can estimate FLOPS by first estimating the number of floating-point operations per cycle (time step or iteration). That is generally
(number of nodes) x (multiplies + adds + divides + subtracts per node) + (the same calculation for the application of boundary conditions)
Multiply that by cycles/sec (you can time this) to get FLOPS. The Los Alamos folks suggest a simpler alternative, the "cpu time per cell per cycle", or "grind". It's a lot easier to calculate than estimating all of the arithmetic operations. Probably the computer science folks have made all of this precise and mathematical?
|
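The estimate described in the post above can be sketched in C roughly as follows. This is only an illustration: the `OpCount` struct and the function names `flop_per_cycle`, `flops_estimate` and `grind_time` are made up for this example, the per-node operation counts have to be tallied by hand from your own source, and the elapsed time is assumed to come from whatever timer you already use (for instance `clock()` in C).

```c
/* Hypothetical container for per-node operation counts,
   tallied by hand from the source code (an assumption of this sketch). */
typedef struct {
    long multiplies;
    long adds;
    long divides;
    long subtracts;
} OpCount;

/* Total floating-point operations for one node. */
static long ops_total(OpCount c)
{
    return c.multiplies + c.adds + c.divides + c.subtracts;
}

/* Floating-point operations per cycle:
   (nodes x ops per node) + (boundary nodes x ops per boundary node). */
double flop_per_cycle(long nodes, OpCount per_node,
                      long boundary_nodes, OpCount per_boundary_node)
{
    return (double)nodes * (double)ops_total(per_node)
         + (double)boundary_nodes * (double)ops_total(per_boundary_node);
}

/* FLOPS = (operations per cycle) x (cycles per second).
   'seconds' is the measured time for 'cycles' cycles and must be > 0. */
double flops_estimate(double flop_cycle, long cycles, double seconds)
{
    return flop_cycle * ((double)cycles / seconds);
}

/* "Grind": CPU time per cell per cycle, in seconds. */
double grind_time(double seconds, long cells, long cycles)
{
    return seconds / ((double)cells * (double)cycles);
}
```

For example, 1000 interior nodes at 10 operations each plus 50 boundary nodes at 2 operations each give 10100 operations per cycle; if 200 cycles take 2 seconds, the estimate is 10100 x 100 = 1.01 Mflop/s. The grind time needs only the timing and the cell count, which is why it is the easier number to obtain.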
Re: FLOPS
Assuming that the previous posters have misunderstood your question...
There are no standard ways in FORTRAN or C to automatically count floating-point operations. My guess is that you're trying to optimise your implementation by minimising the number of floating-point operations carried out. If this is the case, some programming environments provide really detailed profiling tools. I got this from profiling a simple floating point loop program on my Compaq system here. Note the line that says the program did 201 flops.
HTH - Steve
% cc -g program.c
% pixie a.out
% a.out.pixie
% prof -pixstats
a.out:
1088 (1.037) cycles (2.176e-06s @ 500.0MHz)
1049 (1.000) instructions
419 (0.399) interlock cycles due to basic block boundary
10 (0.010) nops
391 (0.373) alu (including logicals, shifts)
157 (0.150) logicals (including ldah and lda)
0 (0.000) shifts
0 (0.000) prefetches
185 (0.176) loads
15 (0.014) stores
200 (0.191) loads+stores
34 (0.032) load followed by load
200 (0.191) data bus use
186 (0.177) sp+gp load/stores
201 (0.192) flops (92.4 mflop/s @ 500.0MHz)
135 (0.129) conditional branches
100 (0.095) branch to branch
100 (0.095) branch to branch taken
263 (0.251) basic blocks
6 (0.006) calls
0 (0.000) skip
<plus loads more output not included>
|
Re: FLOPS
In order to know MegaFLOPS (and other events like loads, stores and so on...) you have to instrument your code.
Take a look at: http://www.fz-juelich.de/zam/PCL/ where you can find a quite "portable" performance counter library. Best regards, Giorgio
|
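To give a rough idea of what "instrumenting" means, here is a hand-rolled sketch in C. It does not use PCL or any hardware counter: the `FlopRegion` type and the `region_*` functions are hypothetical names invented for this example, the operation count is added manually by the programmer, and timing uses the standard `clock()` call (CPU time). A real performance counter library reads counts from the processor instead, so its numbers reflect what was actually executed rather than what the source suggests.

```c
#include <time.h>

/* Hypothetical instrumentation record: not part of PCL or any library. */
typedef struct {
    clock_t start;     /* CPU clock at region start */
    double seconds;    /* CPU time spent in the region */
    long long flop;    /* manually counted floating-point operations */
} FlopRegion;

void region_start(FlopRegion *r)
{
    r->flop = 0;
    r->seconds = 0.0;
    r->start = clock();
}

/* Called by instrumented kernels to record n operations. */
void region_add(FlopRegion *r, long long n)
{
    r->flop += n;
}

void region_stop(FlopRegion *r)
{
    r->seconds = (double)(clock() - r->start) / CLOCKS_PER_SEC;
}

/* Mflop/s for the region; returns 0 if the region was too short to time. */
double region_mflops(const FlopRegion *r)
{
    return r->seconds > 0.0 ? (double)r->flop / r->seconds / 1.0e6 : 0.0;
}

/* Example instrumented kernel: y = a*x + y,
   counted as one multiply and one add per element. */
void axpy(FlopRegion *r, long n, double a, const double *x, double *y)
{
    for (long i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
    region_add(r, 2LL * n);
}
```

Usage would be `region_start(&r)`, then the instrumented kernels, then `region_stop(&r)` and `region_mflops(&r)`. Note that the counts are source-level: an optimising compiler may fuse, vectorise or eliminate operations, so the figure is an estimate, and very short regions will fall below the resolution of `clock()`.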