 I have a Polywell workstation with a dual 500 MHz Pentium III motherboard, but only one CPU is installed. The system has 512 MB of RAM and a 22 GB IBM EIDE 7200 rpm hard drive. I am running a CFD code (NPARC3D) with a grid slightly larger than 2.5M on Windows NT. Because of the size of the grid and the complexity of the geometry, the grid is divided into 4 blocks. Only one block can be in RAM at a time: it is processed and written out to disk, and then the next block is read into RAM. I would guess that this results in a lot of I/O between RAM and disk. It takes about one hour for 4 iterations, and it takes at least 20000 iterations to get a converged solution. At this rate, it will take 8 months to get a solution! How can I speed up this system within the next few days, as we need to get a set of solutions before Feb. 2000?
1) One solution is to get more RAM, because if I can fit the whole grid in RAM, I can get a solution about 3x faster. I would need 256 MB or 512 MB RAM cards, as the 4 slots are now filled with 128 MB cards. How much are 256 MB or 512 MB RAM cards?
2) Another solution may be to get a faster hard drive. How much faster is a SCSI drive? How much more expensive is it?
3) I can reduce the grid to the minimum necessary, or reduce the grid volume of the calculation by modifying the boundary conditions. Grid reduction will reduce the RAM required.
Are there any other solutions, besides buying a much more expensive system or getting other CFD software? The latter will be our long-term solution, as we are considering a CFD code (WIND) that will use multiprocessors.
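The run-time estimate in this post can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names and the 30.4-day average month are assumptions, and the 3x factor is the poster's own guess for the all-in-RAM case. For exactly 20000 iterations it gives roughly 7 months of continuous running, the same order as the 8 months quoted for "at least" 20000 iterations.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30.4  # assumed average calendar month


def wall_time_hours(total_iterations, iterations_per_hour, speedup=1.0):
    """Wall-clock hours needed at a given iteration rate and speedup factor."""
    if iterations_per_hour <= 0 or speedup <= 0:
        raise ValueError("rate and speedup must be positive")
    return total_iterations / (iterations_per_hour * speedup)


def hours_to_months(hours):
    """Convert hours of continuous running to approximate calendar months."""
    return hours / HOURS_PER_DAY / DAYS_PER_MONTH


# Figures from the post: 4 iterations per hour, 20000 iterations to converge.
baseline = wall_time_hours(20000, 4)
# The poster's guess: about 3x faster if the whole grid fits in RAM.
in_ram = wall_time_hours(20000, 4, speedup=3)

print(f"swapping blocks to disk: {baseline:.0f} h, {hours_to_months(baseline):.1f} months")
print(f"all blocks in RAM (3x):  {in_ram:.0f} h, {hours_to_months(in_ram):.1f} months")
```

Even with the 3x gain the run is a matter of months, which is why the thread also looks at grid reduction and parallel runs.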

 December 21, 1999, 04:58
Just buy some computing time on a fast computer.

 December 21, 1999, 11:25
Dennis, Thanks for taking the time to write your suggestions. I'll make a copy for future guidance.
1) Using IBORD and MBORD to keep a block within RAM for several iterations is a very good idea; I will do this when I am forced to use the hard disk to store intermediate calculations. But when using PCs for CFD, keeping ALL blocks in memory is much preferred, since the disk is too slow to be useful. For example, in my case, when all the blocks are in RAM (with MDISK=1), the number of iterations per minute is 6 times greater than with MDISK=0.
2) I usually use DTBLK, for the reasons that you state.
3) I've tried an inviscid solution as an initial solution to a viscous, turbulent solution, but NPARC3D seems to abort when this transition is tried. I suspect that the gradient at the wall is too great for NPARC3D. Maybe I should go from inviscid, to laminar, to turbulent, as you suggested. Have you tried this?
4) I use LREST=1 once FT37 and FT38 are generated.
We do plan to get WIND, since we will be doing work where variable gamma is required. Do you know if there is a Windows version of NPARC3D that can be run in parallel?

 December 21, 1999, 14:21
Have you simplified your problem enough? Can you make use of symmetry or periodicity? It is sometimes very difficult to give good suggestions without some details about the problem, but on the other hand, most people do not want to disclose certain aspects of their problems because of company rules. Good luck.

 December 21, 1999, 17:50
Amadou, I have used the symmetry condition and will also be using boundary conditions to exclude from the calculations those parts of the flowfield that will not influence the region of interest.

 December 21, 1999, 18:39
Steve, You mention that the code aborts. If by this you mean it stops with a negative density or pressure, then it may be that the gradient at the wall is preventing you from using your inviscid solution as a starting point for your turbulent flow calculations. Easing into the solution via a laminar run may help. If the gradients are severe, you may need to reduce the time step during the early phases of the turbulent run. You might be able to gradually increase it as the solution progresses.
Also keep in mind that the two-equation turbulence models in NPARC are not self-starting. They require that you first run another model (say Spalart or Baldwin-Lomax), so that you have a turbulent viscosity profile the code can use to back out non-uniform k and epsilon (or omega) values. Examine the turbulent viscosity (mu_t/mu) contained in the fort.31 output file. The maximum value in the flowfield should exceed 500 before you try starting the two-equation models. If there are no k and epsilon values in your initial restart file and the code expects them, it should stop. I don't know how meaningful an error message you get, but the code likely won't get past the first iteration.
NPARC3.0 and later had parallel processing capability, but I've never used it. Since it is distributed as raw source, you would need Fortran and C compilers on your PC. The makefile scripts were developed for Unix platforms, so they may need some editing. The manual mentions PVM (perhaps MPI will work too) and describes a master-worker scheme. The manual is on-line at http://www.arnold.af.mil/nparc/ under Original_NPARC_Code then Documentation.
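The two pieces of start-up advice in this post (a reduced time step that is raised gradually, and the mu_t/mu > 500 check before switching to a two-equation model) can be sketched as follows. This is a hypothetical illustration, not NPARC input or NPARC code: the function names, the geometric ramp shape, and all parameter values are assumptions, and reading the values out of fort.31 is not shown because the thread does not describe that file's format.

```python
def ramped_time_step(iteration, dt_start, dt_target, ramp_iterations):
    """Time step that grows geometrically from dt_start to dt_target.

    The geometric shape is an assumption; the post only says to start
    small and increase gradually.
    """
    if dt_start <= 0 or dt_target <= 0 or ramp_iterations <= 0:
        raise ValueError("time steps and ramp length must be positive")
    if iteration >= ramp_iterations:
        return dt_target
    fraction = max(iteration, 0) / ramp_iterations
    return dt_start * (dt_target / dt_start) ** fraction


def ready_for_two_equation(mu_t_over_mu_values, threshold=500.0):
    """True when the peak turbulent viscosity ratio exceeds the threshold.

    The default of 500 is the figure quoted in the post.
    """
    return max(mu_t_over_mu_values) > threshold


# Hypothetical schedule: ramp from 0.1 to 1.0 over the first 200 iterations.
for it in (0, 100, 200, 500):
    print(it, round(ramped_time_step(it, 0.1, 1.0, 200), 4))

# Hypothetical sample of mu_t/mu values taken from a one-equation run.
print(ready_for_two_equation([3.2, 180.0, 640.5]))
```

How such a schedule would actually be applied depends on the solver's own time step inputs, which the thread does not spell out.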

 December 22, 1999, 09:26
Dennis, Yes, I think that the error message was about negative pressure/density. Next time I will use the laminar flow and small time step suggestions. I'm using the Spalart-Allmaras (sp?) model; I ran about 1000 iterations with the Baldwin-Lomax model before starting Spalart-Allmaras. I'll use your suggestions if I use a two-equation model. I'm not familiar with multi-processor procedures, software or terminology. Is MPI available for Windows NT? I have just installed a 2nd CPU on our PC (the ICEM-CFD grid mesher will automatically use multiprocessors), and I think that it's worthwhile to pursue multi-processor NPARC computations on Windows NT. I would appreciate any help to implement it.

 December 22, 1999, 10:07
We are a very small company and probably can't afford the time needed.

 December 22, 1999, 11:17
Steve, The only things I know about PVM and MPI are their names. If you only have two processors, you may not even see an improvement. The NPARC3.0 documentation page I cited earlier says that it uses a master-worker approach. With two processors, only one of them is a worker, so it acts the same as a serial run (probably slower, though, due to the added overhead). Perhaps there is a way to assign three tasks (master-worker-worker), but this is beyond my limited knowledge of the subject.

 December 27, 1999, 11:11
Here are the answers to a couple of questions and some things to think about. MPI is available for Windows. WMPI (non-commercial) and MPIPro (commercial) are two choices that come to mind. I am not sure about Windows-based PVM. With MPI you can start a master-worker-worker problem on a single machine, and this will utilize both processors (85-100%). The issue will be memory. Each process (worker) is going to need enough memory to process its blocks and communications. With 3 processes (master-worker-worker), the memory each process needs is larger than just problem size/3. If you can get your code to run on dual processors with MPI, then extending it to run on multiple machines is pretty easy.
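The process counting discussed in the last two replies can be written down as a small sketch. It is only a rough upper bound under stated assumptions: the function names are hypothetical, the master is assumed to do no solver work and to sit mostly idle (so it does not need a CPU of its own), and communication overhead and memory limits are ignored.

```python
def worker_count(n_processes):
    """Workers available when one process is reserved as the master."""
    if n_processes < 2:
        raise ValueError("a master-worker run needs at least 2 processes")
    return n_processes - 1


def ideal_speedup(n_processes, n_cpus):
    """Best-case speedup over a serial run, ignoring all overhead.

    Assumes the master is mostly idle, so the workers can occupy
    every CPU, but no more workers than CPUs can run at once.
    """
    if n_cpus < 1:
        raise ValueError("need at least one CPU")
    return min(worker_count(n_processes), n_cpus)


# Dual-CPU machine, as in the thread.
print(ideal_speedup(2, 2))  # master + 1 worker
print(ideal_speedup(3, 2))  # master + 2 workers
```

Under these assumptions, two processes on two CPUs give no gain over a serial run, while master-worker-worker can keep both CPUs busy, provided each worker has enough memory for its blocks.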

 January 3, 2000, 10:39
Ken, Thanks for the info on WMPI. Do you have a source for the non-commercial WMPI? I can get a version of the code to run in parallel mode. How do I install WMPI to use this capability?

 January 4, 2000, 14:29
Try here ... http://dsg.dei.uc.pt/wmpi All my work has been with the previous version, but I plan on upgrading to v1.3 in the near future. The documentation should tell you how to hitch it to your code. Hope this helps.

 January 7, 2000, 03:45
Hi John, Some months ago, 1 CPU hour on a NEC-SX4, including the license fee for StarCD, cost about 75 DM (approx. $40). The speedup compared to 1 processor on an Origin2000 (195 MHz) was about a factor of 6, but the run included an unoptimized user subroutine, which slowed down the NEC a little bit. The guys there told me that a speedup of 10 or more is quite normal. So I think that this might be a realistic option if the project is set up carefully.

 January 7, 2000, 11:38
(1). That's encouraging. (2). It depends on whether his project is allowed to use any available computer, anywhere. Thank you very much for the cost information.

 January 10, 2000, 19:48
Do both one and two: get a faster disk (SCSI) and about two to four times more (faster) RAM. No matter the cost (<$1000 US), it'll be worth it. Three is up to you: how accurate do you want your results to be? Why not add the second processor? It probably won't hurt.

 January 10, 2000, 23:43
Clifford, I have bought 3 256 MB RAM cards by shopping through www.computershooper.com and saved about $1200 compared to the CompUSA prices. I have almost maxed out my machine (1 GB RAM), and I have added the second CPU. I can now run a CFD code and the grid generator simultaneously and get decent interaction with the grid generator. When I try running two CFD jobs, one job complains that the C disk does not have enough temp space. With Windows NT, the C disk (or partition) cannot be larger than 2 GB, so I have to move some files out of the C partition and onto the D partition. Hopefully the application files that I moved from the C partition will free sufficient space for the temp files.

