CFD Online Discussion Forums

PC-cluster

Chris January 20, 2000 10:30


I have 2 questions:

1) Which of the commercial CFD-codes can be operated on a PC-cluster with Linux as operating system?

2) Are there any research or other groups who are operating a commercial code or even their own code on a PC cluster?

Jonas Larsson January 20, 2000 11:44

Re: PC-cluster
Most commercial codes now support Linux - Fluent runs on Linux, Star-CD runs on Linux, CFD-ACE+ runs on Linux, ...

We have been running CFD on a Linux cluster for some time now. About 50% of our work is with an in-house CFD code based on PVM, the other 50% is Fluent for the moment. Fluent runs very well on Linux. We've had a few problems with Fluent cross-platform case file compatibility and also with post-processing, but otherwise our experience is very good.

Aaron J. Bird January 20, 2000 13:46

Re: PC-cluster
Our group has been considering this for some time. So far a Linux NFS (Red Hat 5.2) and a workstation have been set up, with the most difficult part being compatibility with our LAN. However, the surplus computers being used are 200 MHz with a maximum of 64 MB of RAM, so even if we are successful in running a parallelized code on this system, we will still be limited by the memory. On the other hand, the computers were free and won't have too many other processes running on them, so it may take a few days (or weeks...) longer to get a converged solution, but as long as that's planned for in advance, there shouldn't be any problems or surprises.

Personally I feel that there's nothing wrong with using several old computers and putting them together to make clusters. They are cheap enough now that lots of people or groups can build them. And the guidance that's available on the web is enough to set up these clusters (or NFS w/workstations) with only minor headaches.

If I remember correctly, there might be some guidance on the Linux HOWTO sites. If you build it, let us know how it goes.

Sergei Chernyshenko January 20, 2000 14:17

Re: PC-cluster

Look, especially, at the Beowulf Project.

clifford bradford January 24, 2000 17:21

Re: PC-cluster
i can answer #2) several people here in penn state's aerospace department have been running a lot of simulations on a linux cluster. mostly cfd and aeroacoustics calculations using in house codes and PUMA and some molecular dynamics. the performance is quite good and the system is used very heavily with good reliability. you can check out for more info.

Sergei Chernyshenko January 25, 2000 06:13

Re: PC-cluster
Hi, Clifford,

Do you know anything about the possibility of simulating a shared-memory architecture on such a cluster, say, with some OMP (OpenMP) implementation?

Rgds, Sergei

clifford bradford January 25, 2000 13:18

Re: PC-cluster
sorry. you can send email to the guy (Anirudh Modi) who maintains the page i cited in my previous message to ask. i suppose it would be possible by somehow making the system see all the memory on each separate board as one entity. i don't think it would be efficient though because it would require a lot of communication between physically separate memory over a high latency system. if you were using a low latency system like an SP-2 or SGI (Origin etc.) the efficiency would be a lot better.

Sergei Chernyshenko January 25, 2000 14:10

Re: PC-cluster
Thanks. I do use Origin. And indeed, the efficiency of a cluster cannot be expected to be high with shared memory, except that there is one interesting possibility. The structure of my problem, as in most cases in CFD, is such that memory partitioning, like that implied by MPI, is quite possible. And when I use shared memory, the possibility is still there. Now, a clever system would watch my shared-memory code at work and do some profiling, and the profile then would be used for optimizing the performance on a cluster. The idea is simple but implementation is, of course, difficult. Nevertheless, sooner or later it will be done, and it is hard to believe that nobody is trying to do this now. It would be very interesting to know about any progress.
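
A minimal sketch of the memory partitioning mentioned above, assuming a 1D grid and a simple neighbour-averaging sweep (both are illustrative choices, not anything from an actual code in this thread). The function names `smooth`, `partition` and `smooth_partitioned` are hypothetical. Each "process" needs only its own chunk plus one halo cell on each side, which is the kind of access pattern a profiler would have to discover in a shared-memory code:

```python
def smooth(u):
    """One neighbour-averaging sweep over the whole array; end values are held fixed."""
    interior = [0.5 * (u[i - 1] + u[i + 1]) for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]


def partition(n, nparts):
    """Split indices 0..n-1 into nparts contiguous (start, stop) ranges."""
    base, extra = divmod(n, nparts)
    ranges, start = [], 0
    for p in range(nparts):
        stop = start + base + (1 if p < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def smooth_partitioned(u, nparts):
    """Same sweep, but each 'process' reads only its chunk plus one halo cell per side."""
    n = len(u)
    result = []
    for start, stop in partition(n, nparts):
        lo = max(start - 1, 0)   # left halo cell (a message receive in a real MPI code)
        hi = min(stop + 1, n)    # right halo cell
        local = u[lo:hi]         # private copy: all the data this process needs
        for i in range(start, stop):
            if i == 0 or i == n - 1:
                result.append(u[i])  # physical boundary, held fixed
            else:
                j = i - lo           # index of cell i inside the local copy
                result.append(0.5 * (local[j - 1] + local[j + 1]))
    return result
```

The partitioned sweep gives the same result as the global one, so the shared-memory formulation hides a decomposition that a distributed-memory run could exploit.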

andy January 26, 2000 08:27

Re: PC-cluster
I may be wrong but isn't that how the Origin works?

Sergei Chernyshenko January 26, 2000 09:33

Re: PC-cluster
>I may be wrong but isn't that how the Origin works?

Hi, Andy,

Well, Origin is a shared-memory machine, not a distributed-memory machine, and it is not clear how to apply the idea there. It looks like there is no need for it. On the other hand, I did not try profiling on Origin and I am not sure myself. Maybe it is worth looking into more closely, so thanks for the idea.

If anyone has experience in profiling on Origin, again, comments would be welcomed.

Rgds, Sergei.

andy January 26, 2000 10:56

Re: PC-cluster
Again, I may well be wrong since I had only a limited play with an Origin 2000 a few years ago, but my understanding is that it is a distributed memory machine which can present a shared memory model to the user. In the shared memory mode the compiler parallelises the do loops and distributes the data out to the chunks of distributed memory. In addition, I have vague recollections of being told that data could migrate during execution in order to improve the load balance. SGI used to have an office in Manchester so it should be easy enough to find out what algorithms they use.

Certainly for my programs both the shared memory model and the MPI distributed memory model worked well. It was just a pity the machine was so expensive for the number of processors.

My experience of profiling was that it was very easy and quick. It took only 2 or 3 hours to achieve 75% efficiency for an implicit code (obviously an explicit code would be even easier). To improve things further required the solution procedure to be modified. Nothing major, in fact very similar modifications to those required to vectorise code for the old Crays, but I did not bother since I was well into diminishing returns (and the codes I was interested in were parallelised for distributed memory anyway). I did discover that you could lie to the compiler about data dependencies that existed but were not important (e.g. an iterative algorithm which would not be upset by using an old value or a current value).
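
A minimal sketch of the kind of harmless data dependency described above, assuming a 1D Laplace problem with fixed end values (an illustrative choice, not the code discussed here). The names `jacobi_sweep`, `gauss_seidel_sweep` and `solve` are hypothetical. One sweep reads only old values, the other reads the current value of its left neighbour, and both converge to the same solution:

```python
def jacobi_sweep(u):
    """Every update reads values from the previous sweep (no loop-carried dependency)."""
    old = list(u)
    for i in range(1, len(u) - 1):
        u[i] = 0.5 * (old[i - 1] + old[i + 1])


def gauss_seidel_sweep(u):
    """Each update reads u[i-1] already updated in this sweep (loop-carried dependency)."""
    for i in range(1, len(u) - 1):
        u[i] = 0.5 * (u[i - 1] + u[i + 1])


def solve(sweep, n=11, tol=1e-10, max_sweeps=100000):
    """Iterate on u'' = 0 with u(0) = 0, u(1) = 1 until the largest change is below tol."""
    u = [0.0] * n
    u[-1] = 1.0
    for count in range(1, max_sweeps + 1):
        before = list(u)
        sweep(u)
        if max(abs(a - b) for a, b in zip(u, before)) < tol:
            return u, count
    return u, max_sweeps


u_old_values, sweeps_old = solve(jacobi_sweep)
u_current_values, sweeps_current = solve(gauss_seidel_sweep)
```

A loop parallelised despite the dependency sees a mixture of old and current neighbour values, which sits between these two cases. For this kind of iteration only the number of sweeps changes, not the converged answer.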

Sergei Chernyshenko January 26, 2000 12:27

Re: PC-cluster
>my understanding is that it is a distributed memory machine which
>can present a shared memory model to the user.

Kghm, This is from

The SGI Origin 2000 System ... The system utilises the Scalable Shared-memory Multi-Processing (S2MP) architecture from Silicon Graphics, permitting both shared memory and distributed memory programming models.

And this is from

The Origin 2000 is a distributed shared memory machine, that is each node has its own memory which every other node has access to through the global address space. Each node consists of two processors, and the nodes are inter-connected in such a way as to create an augmented hypercube...

From my experience, certain features, for example the memory requirement being (size of the code+data) x (number of processors), make it look distributed. Indeed, if the memory is shared, why are so many copies of it needed?

So, in fact Origin is somewhere in between.

Which profiling tool did you use?

Rgds, Sergei

andy January 26, 2000 12:50

Re: PC-cluster
I am afraid I cannot recall which tools I used to profile the codes. I set a flag on the SGI compilers to generate annotated lists of code to find out which loops failed and why (the important thing to know). I probably then used some derivative of prof, although it is possible I just ran the code, since it would have had calls to timing routines from previous exercises on other parallel machines not as sophisticated as the SGI.
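
A minimal sketch of what hand-placed timing calls can look like, assuming nothing about the original routines. The names `timed`, `timings` and `parallel_efficiency`, and the label "demo loop", are hypothetical. It also shows the usual definition of parallel efficiency (speedup divided by processor count) behind a figure such as the 75% quoted earlier in the thread:

```python
import time
from contextlib import contextmanager

timings = {}  # accumulated wall-clock seconds per labelled code section


@contextmanager
def timed(label):
    """Add the wall-clock time spent inside the with-block to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + (time.perf_counter() - start)


def parallel_efficiency(t_serial, t_parallel, nprocs):
    """Speedup divided by processor count; 1.0 means ideal scaling."""
    return (t_serial / t_parallel) / nprocs


# Wrap the section of interest; the loop body here is only a stand-in workload.
with timed("demo loop"):
    total = sum(i * i for i in range(100000))
```

Comparing the accumulated times of a one-processor run and an N-processor run section by section points at the loops that are not scaling.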

Sergei Chernyshenko January 26, 2000 15:12

Re: PC-cluster
>to generate annotated lists of code

This I did, too, of course, but it is just parallelization and not optimization for a specific architecture.

Well, maybe I'll try prof. Thanks.


N. C. Reis January 27, 2000 19:40

Re: PC-cluster

I believe you are right. I think the SGI Origin is what they call a Shared Distributed Memory machine. I guess it uses an SGI technology called NUMA (non-uniform memory access) so that the user can 'see' the entire memory as a shared memory. But the machine is 'aware' that the time to access each piece of memory is different, since each piece of memory is on a different node. In theory, the system knows that, and your application is supposed to use the fastest bandwidth possible.

I guess, people at SGI or MCC (Manchester Computing Centre - they have one big Origin) can be more specific about that.


All times are GMT -4.