Your valuable opinion is needed
I have written a three-dimensional flow model. It has the following characteristics:
1. Incompressible laminar flow.
2. It uses a new finite-difference method.
3. As an example, I modelled a flow with about 13,500 nodes in 20 minutes on a 500 MHz computer.
There is very good agreement between the laboratory and modelling results.
Could you please give your opinion on the speed of my program? Thank you for your help. R. Amini
Re: Your valuable opinion is needed
Los Alamos (T-3 group) used a measure they defined as the "grind": the CPU time required to advance a solution for one computational cell through one time step. In LA-6296,
L. D. Cloutman, C. W. Hirt, and N. C. Romero, "SOLA-ICE: A Numerical Solution Algorithm for Transient Compressible Fluid Flows," Los Alamos National Lab (USA), July 1976, they report a grind time of 0.7 millisec/cell/cycle for a coarsely resolved, thermally driven cavity. This was run on a CDC 7600, the fastest computer of that time. I recently ran the same problem and algorithm on a 450 MHz Pentium II Dell Dimension. The grind time was 0.3. Of interest is that, on the PC, almost 80% of the clock time went to writing output files! If you're generating a steady-state solution with your code, you'll need to replace the time step with some other metric. Maybe the number of iterations?
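As a rough illustration of the definition above, the grind can be computed from three numbers: total CPU time, cell count, and number of cycles. The sketch below is only that; the function name `grind_time` and the sample run (10,000 cells, 500 cycles, 35 s of CPU) are made up for the example and are not taken from the report or from anyone's code in this thread. For a steady-state solver, the iteration count could stand in for the cycle count.

```python
def grind_time(cpu_seconds, n_cells, n_cycles):
    """Return CPU seconds per cell per cycle (the 'grind').

    n_cycles is the number of time steps, or the number of
    iterations for a steady-state solver.
    """
    if n_cells <= 0 or n_cycles <= 0:
        raise ValueError("n_cells and n_cycles must be positive")
    return cpu_seconds / (n_cells * n_cycles)


# Hypothetical run, not from the thread: 10,000 cells, 500 cycles, 35 s CPU.
example_grind = grind_time(35.0, 10_000, 500)
print(f"grind = {example_grind * 1e6:.1f} microsec/cell/cycle")
```

To make runs comparable, measure the CPU time with output writing excluded, or report it both ways.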
Re: Your valuable opinion is needed
Although I don't know what algorithm Cloutman, Hirt, and Romero used, that grind time sounds pretty slow for a 450 MHz Pentium II. My shareware MicroTunnel, which also computes transient and unsteady compressible flow (inviscid),
www.microcfd.com/prism.htm
integrates 800 x 600 = 480,000 cells per time step in 2.00 seconds on my Gateway 800 MHz Pentium III. The corresponding grind time would be
2.00 s / 480,000 cells / cycle = 4.2 microseconds / cell / cycle
Considering the difference in clock speed between the two processors, there is still a factor of 400 between these grind times:
(3.0E-03 s / 4.2E-06 s) * (450 MHz / 800 MHz) ~ 400
Does anybody else have any grind time data for transient compressible flow computations? Preferably inviscid.
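For anyone who wants to re-run the arithmetic in the post above, here is a small sketch. The variable names are invented for the example. The Pentium II figure is entered as 3.0E-03 s, which is how the comparison in the post reads it; that reading is an assumption here, not a measured value.

```python
# Figures quoted in the post above.
cells = 800 * 600                 # 480,000 cells per time step
seconds_per_step = 2.00           # wall time for one step on the 800 MHz PIII

grind_microtunnel = seconds_per_step / cells   # s / cell / cycle

# Assumption: Pentium II grind as used in the post's comparison.
grind_pii = 3.0e-3                # s / cell / cycle
clock_scale = 450.0 / 800.0       # crude linear clock-speed normalization

ratio = grind_pii / grind_microtunnel * clock_scale

print(f"MicroTunnel grind: {grind_microtunnel * 1e6:.2f} microsec/cell/cycle")
print(f"clock-adjusted ratio: {ratio:.0f}")
```

Scaling by clock frequency alone ignores cache, memory bandwidth, and compiler differences, so the ratio is only an order-of-magnitude guide.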
Re: Your valuable opinion is needed
I don't think that grind time is a good measure of the speed of a program, because it assumes the cost is linear in the number of cells. For example, in order to solve N linear equations by Gauss elimination you need on the order of N^3 operations. Now if you double the number of cells, the computation time becomes 8 times as long, not 4 times.
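The N^3 point can be put in numbers with a short sketch. It uses the standard leading-order operation count for dense Gaussian elimination, about (2/3) N^3, and ignores lower-order terms; the function name `gauss_ops` is invented for the example and does not describe any solver mentioned in this thread.

```python
def gauss_ops(n):
    """Leading-order floating-point operation count for dense
    Gaussian elimination on n equations: about (2/3) * n**3."""
    return 2.0 * n**3 / 3.0


n = 1000
total_growth = gauss_ops(2 * n) / gauss_ops(n)
# Cost per unknown, the analogue of a grind time for a direct solve.
per_unknown_growth = (gauss_ops(2 * n) / (2 * n)) / (gauss_ops(n) / n)

print(f"total work grows by a factor of {total_growth:.0f}")
print(f"work per unknown grows by a factor of {per_unknown_growth:.0f}")
```

So for a dense direct solve the cost per unknown is not constant: it rises with problem size, which is why a per-cell figure from one grid does not carry over to another for that kind of solver.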

Re: Your valuable opinion is needed
If you double the resolution on a 2D compressible flow problem, running an explicit solver, your overall computation time increases by a factor of 8. You have 4 times as many cells (twice in X and twice in Y), and due to stability constraints you can only use 1/2 the original integration time step. I think grind time gives you a measure of how quickly you can 'turn the crank', regardless of what algorithm you are using, as long as it is time marching of some sort. Since the actual integration time step is not part of the grind time equation, it does not tell you how quickly your computation converges.
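The factor of 8 can be worked out in a few lines. This sketch assumes an explicit solver whose stable time step shrinks in proportion to the cell size (a CFL-type limit), a fixed simulated time span, and a grind time that does not change with grid size. The function name `explicit_cost_factor` is made up for the example.

```python
def explicit_cost_factor(refinement, dims):
    """Growth in total run time when the grid is refined by
    `refinement` in each of `dims` directions.

    Assumes dt is proportional to dx (CFL-type limit) and a
    constant grind time per cell per cycle.
    """
    cell_factor = refinement ** dims   # more cells per step
    step_factor = refinement           # more steps to cover the same time span
    return cell_factor * step_factor


print(explicit_cost_factor(2, 2))  # 2D: 4x cells * 2x steps
print(explicit_cost_factor(2, 3))  # 3D: 8x cells * 2x steps
```

The grind time itself drops out of the ratio, which is the point made above: it measures the cost of one cell update, not how many updates a converged answer needs.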

Re: Your valuable opinion is needed
Taking out the IO time, the grind drops to 18.1 microsec.
The compiler is Absoft and optimization was not turned on. More to the point, I said that the grid was coarse, but didn't give the numbers. The grid used on the original calculation (and mine, to get a direct comparison) was 7 x 7; a lot of the calculation was imposing boundary conditions! Didn't really want to get into a grind shootout, just give the guy some idea of what he can use and a gross idea of the magnitude of the numbers. Is there any significant difference between PII and PIII for the same clock speed?
Re: Your valuable opinion is needed
>Taking out the IO time, the grind drops to 18.1 microsec.
Thanks for the info!
>The compiler is Absoft and optimization was not turned on. More to the point, I said that the grid was coarse, but didn't give the numbers. The grid used on the original calculation (and mine, to get a direct comparison) was 7 x 7; a lot of the calculation was imposing boundary conditions!
I would definitely be interested in a computation on a larger grid with all compile optimizations for speed turned on!
>Didn't really want to get into a grind shootout, just give the guy some idea of what he can use and a gross idea of the magnitude of the numbers.
Actually I would not mind a 'grind shootout'. I make the bold claim on my website that, "MicroTunnel runs the world's fastest unsteady inviscid flow solver, integrating nearly half a million cells each time step in under two seconds on a 1GHz Pentium PC. Its speed is derived from an assembly coded routine, which minimizes memory access by using all eight floating-point registers of the numeric coprocessor at 80-bit precision." If someone can beat me with a grind time of less than 4 microseconds for a similar solver and resolution, I may consider changing my sales pitch... :)
>Is there any significant difference between PII and PIII for the same clock speed?
Not that I know of. It has been my experience that, for floating-point-intensive tasks running on a single processor, the difference in performance between two Intel Pentium processors comes down mainly to their clock speed. For example, one integration cycle takes 2 s on my PIII 800 MHz Gateway, whereas it takes 9 s on my old PMMX 200 MHz Quantex. So about a factor of 4 in there. Also, if you take a look at the clock cycles for addition, subtraction, and multiplication, they have not changed since the first-generation Pentium (1 or 3 cycles depending on FPU state).