andrewburns February 4, 2008 20:49

I'm just learning how to run in parallel. So far I haven't been able to run a solution across two machines, but I have run potentialFoam on two cores within one machine.

However, during the run I didn't get any of the text that I would usually get in the console telling me what's going on (current time step, iterations, residuals, etc.). The cores just loaded up for a few minutes and then dropped back to 0 when the solution was done. That's good for a start, but I'd really like to be able to read what's going on while it happens.

The command I used to execute the solution was:

mpirun -hostfile <machines> <root> <case> -parallel > log &

I assume it has something to do with the > log & part, but I don't know what the options for this are.

martin February 4, 2008 20:59

> log & means that the output is redirected to the file 'log' (and the trailing & runs the job in the background).

Type tail -f log to read this file during the simulation.
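
As a rough illustration of the redirect and of following the file, here is a small sketch. The fake_solver function and the log_tee file name are made up for the demo and simply stand in for the real mpirun command; tee is a standard tool that shows the output on screen and writes it to a file at the same time.

```shell
#!/bin/sh
# Hypothetical stand-in for the solver: prints three "time steps".
fake_solver() {
    for step in 1 2 3; do
        echo "Time = $step"
    done
}

# Work in a scratch directory so the demo cannot overwrite a real log file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Same pattern as in the question: send output to 'log', run in the background.
fake_solver > log 2>&1 &
wait        # block until the background job has finished

# The output is in the file, not on the console.
# During a real run you would follow it with: tail -f log
cat log

# Alternative: see the output on screen AND keep a copy in a file.
fake_solver 2>&1 | tee log_tee
```

With the real command, the tee variant would replace the "> log &" part, at the cost of keeping the terminal busy until the run ends.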


andrewburns February 4, 2008 21:02

Thanks for the very rapid reply, I'll give this a shot!

mike_jaworski February 4, 2008 23:19

I had a lot of trouble getting OpenMPI running across our network myself. The symptom was that I would start a parallel program and the machine would just sit there and hang. mpirun has a debug switch that gives very verbose output to help you spot what's going on.

For me, it turned out that OpenMPI is very particular about how the network is set up. That is, if you have firewalls on either machine, the run will probably hang. An easy test is to disable the firewalls and then try a parallel run to see if that's the problem.

Otherwise, the OpenMPI website is probably the best place to get support. There are some "hello world" programs that let you test mpirun so you can isolate it from OpenFOAM as well.

Good luck,
Mike J.

andrewburns February 4, 2008 23:53

Thanks for the tips. I'm afraid it will be fairly difficult/annoying to get a parallel run going with the systems I currently have. The problem is that they're on a network I have very little insight into or control over, and they run different versions of Linux (one uses tcsh and the other bash, if this matters).

The current error I'm getting is:

orte: command not found

There are a few others before OpenMPI drops out.
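
One common cause of a "command not found" error when mpirun starts processes on another machine is that the OpenMPI executables are not on the PATH of the non-interactive shell there. Assuming the message refers to OpenMPI's orted helper daemon (the post only shows "orte", so this is a guess), a quick check might look like the sketch below. The check_on_path helper is hypothetical, and <host> is a placeholder for the remote machine's name.

```shell
#!/bin/sh
# Hypothetical helper: report whether a command can be found on the current PATH.
check_on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found on PATH"
        return 1
    fi
}

# Check the local machine first (|| true keeps the script going on a miss).
check_on_path mpirun || true
check_on_path orted  || true

# The same question for the remote machine, asked through a non-interactive
# login, which is roughly how mpirun reaches it. Not run here because <host>
# is only a placeholder:
#   ssh <host> which orted
```

If the remote check finds nothing, the PATH is usually set in a startup file that non-interactive shells do not read, and tcsh and bash use different startup files, so the shell difference may well matter.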

mike_jaworski February 5, 2008 00:07

I would recommend finding the "hello world" program on the OpenMPI website. I think you can also find it by googling OpenMPI and "hello world". You code it and compile it yourself, so you can make sure things are installed correctly. It also lets you isolate OpenMPI from OpenFOAM, which I found helpful, being new to both myself. I recommend reading their tutorials as well; they're pretty informative.

Good luck,
Mike J.

