Parallel cluster solving with OpenFoam? P2P Cluster?

Old   December 4, 2010, 11:21
Default Parallel cluster solving with OpenFoam? P2P Cluster?
  #1
New Member
 
Join Date: Aug 2009
Posts: 12
Rep Power: 16
hornig is on a distinguished road
Hi,

I'm new to OpenFOAM and to this forum. I worked with ANSYS Fluent and Gambit for some time and came across OpenFOAM when a PC-pool neighbour at university told me about it.

I had the opportunity to run some of my models on a Fluent cluster system, and now I'm curious to know if and how OpenFOAM does this.
Is cluster support available directly in OpenFOAM, or does it need other software for that?

The background of my question is whether it would be possible to build some kind of distributed cluster whose nodes are connected via the internet.

We started a distributed computing project called Constellation, which will be a platform for aerospace-related simulations. We are using BOINC as the system, but the connected clients can't connect to each other. All workunits are distributed via a central server, and routing the exchange between clients through it would be really slow.
My idea is to connect some clients into one group via P2P or another protocol, so that they are able to "see" each other and work together on one workunit without going through the central server.
Of course there will be some issues to solve, but I have some solutions for those as well.
The only thing I can't do is to code that. I'm not a programmer.
It would be really great to have such a system, so that we could broaden our platform to CFD and other simulations that need split meshes.

What do you think? Is that possible?

Andreas Hornig
--
http://aerospaceresearch.net/constellation/

Old   December 4, 2010, 11:53
Default
  #2
New Member
 
Tino
Join Date: Dec 2010
Posts: 3
Rep Power: 14
tkrks is an unknown quantity at this point
Quote:
The only thing I can't do is to code that. I'm not a programmer.
I see. What are you? A manager? You'll be a very good one some day. And many competent people will suffer.

Old   December 4, 2010, 15:58
Default
  #3
New Member
 
Join Date: Aug 2009
Posts: 12
Rep Power: 16
hornig is on a distinguished road
Quote:
Originally Posted by tkrks View Post
I see. What are you? ...
And you are a joker, right?

I can code, but I would never call myself a programmer, because there are more capable coders out there. And I can judge my own level quite well.

And I'm just asking what OpenFOAM is able to do and whether my idea is possible to achieve. And your answer is not even close to being useful for that.

I still want to know what the rest of you think about it.

Best regards,

Andreas

Old   December 4, 2010, 16:43
Default
  #4
New Member
 
Tino
Join Date: Dec 2010
Posts: 3
Rep Power: 14
tkrks is an unknown quantity at this point
Quote:
I still want to know what the rest of you think about it.
The rest of us think the same. Hire someone to teach you the limitations of CFD on high-latency clusters. I already feel sorry for him.

Old   December 4, 2010, 17:51
Default
  #5
Retired Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,975
Blog Entries: 45
Rep Power: 128
wyldckat is a name known to all
@Tino:
Quote:
Originally Posted by Sydney J. Harris
If you're not part of the solution, you're part of the problem, but the perpetual human predicament is that the answer soon poses its own problems.
--------------------------------

Hi Andreas and welcome to the OpenFOAM part of the cfd-online forum!

Disclaimer: my experience with clusters is rather limited, but at least I know the basics.

So, you want to build a network similar to Folding@home, but for Computational Fluid Dynamics. Theoretically it can be done, but it would most likely be a waste of energy in most CFD-related scenarios. "Why?" you ask? Because, like Tino implied, CFD problems usually require high-speed data communication between the running parallel processes, in terms of both bandwidth and latency.

The only two scenarios I can see OpenFOAM or any other usual CFD software working properly on slow networks would be:
  1. Each worker machine doesn't need to share the workload with other machines during execution, and only provides occasional feedback to the user who requested the simulation. That worker machine could then be anything from a supercomputer to a regular 2GB personal PC/Mac.
    This would only be useful for people (students perhaps) who only have a netbook, thin client, smartphone or really old machine (from before 2007?) and need to run a simulation case on a modern 8-core machine with 12GB of RAM in 8 to 24 hours (versus 1 week on the netbook). In other words, "lend me your machine for a few hours to run this case".
  2. The other is for gigantic CFD simulations, where communication between machines is infrequent. For example, simulating the Atlantic Ocean. Keep in mind that each remote worker machine would need something like 24GB+ of RAM and 12+ cores, would compute each iteration of the variables (pressure, fluid velocity, temperature) for 5 minutes and would communicate for a maximum of 1 minute per iteration. But for something like this, you would still possibly need "university research grade" networks, somewhere between 10 Gbit and 10 Tbit, between workers. Also keep in mind that these are estimates off the top of my head.
As for the way OpenFOAM communicates between processes: usually OpenFOAM is built to use OpenMPI, but it can pretty much use any MPI software/library that abides by the MPI standard. So, it would just be a question of finding such P2P+MPI software, or of setting up VPN+P2P tunnelling between machines so that any MPI software works; the latter would simply be another small hack job on the communications.
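
To put rough numbers on the bandwidth and latency argument, here is a minimal sketch of a per-iteration cost model. It is not measured OpenFOAM behaviour: the function names, the message counts and the network figures are all assumptions chosen for illustration. It only shows how per-message latency adds up when a solver exchanges many small messages in every iteration.

```python
def iteration_time(compute_s, n_messages, bytes_per_message,
                   latency_s, bandwidth_bytes_per_s):
    """Wall-clock time of one iteration on one worker (simple additive model).

    Each message is assumed to cost one network latency plus its transfer time.
    """
    comm_s = n_messages * (latency_s + bytes_per_message / bandwidth_bytes_per_s)
    return compute_s + comm_s


def parallel_efficiency(compute_s, comm_s):
    """Fraction of the wall-clock time that is spent computing."""
    return compute_s / (compute_s + comm_s)


if __name__ == "__main__":
    # Scenario 2 from the post: 5 minutes of computing, 1 minute of communication.
    print("gigantic case:", parallel_efficiency(300.0, 60.0))

    # Hypothetical ordinary case: 0.05 s of computing, 200 messages of 8 kB each.
    compute = 0.05
    lan = iteration_time(compute, 200, 8000, 1e-4, 1.25e8)  # ~1 Gbit/s LAN (assumed)
    wan = iteration_time(compute, 200, 8000, 0.05, 1.25e6)  # ~10 Mbit/s internet (assumed)
    print("LAN efficiency:", compute / lan)
    print("internet efficiency:", compute / wan)
```

With these assumed figures the internet case spends almost all of its time waiting on the network, which is why only cases with long compute phases and infrequent communication (scenario 2) look viable on slow links.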



And just for fun: what if we tried to use smartphones for workers?
  1. Usually smartphones use ARM type processors, which would require doing some porting of OpenFOAM code for those processors;
  2. Even then, following the same guidelines as above, it would only be worth it if:
    • the smartphones were only used during charging times, where they worked at full power while simultaneously charging the battery (while we sleep?);
    • each one with 2-8 core 2GHz ARM processors with advanced and independent FPU/GPU units per core;
    • 0.5-1Gbit wi-fi network.
    • 2-8GB of RAM per phone.
    • 2 or more of these networked close by (apartment building or a dorm).
  3. And these futuristic values are still only compared to current desktop technologies! Because by then, real CFD will be done in machines that are about 2-4 times faster and bigger and somewhat more efficient than current ones!

But in the end, something like Amazon EC2 would probably still be the greenest+cheapest solution

Best regards,
Bruno

Old   December 4, 2010, 18:23
Default
  #6
New Member
 
Join Date: Aug 2009
Posts: 12
Rep Power: 16
hornig is on a distinguished road
Hi Bruno, and thanks for the kind welcome,

as you can see, my previous postings were about Gambit/Fluent, and I have some experience with them, even on clusters (only using them, not administrating them).

That's why I ask what OpenFOAM provides as a cluster system. I haven't used it before, but it will be one of the few software packages we could use because of its licence. And IF OpenFOAM or a side project could provide its own cluster software, we wouldn't have to write that ourselves.

I'm well aware that the internet connection will be the bottleneck. That's why we want to do this with quality of service (QoS) and certain levels that the "responsible" person can set. For example, the person can decide to use only local PCs on a LAN; then the overhead would be minimal and the task can be bigger, because the mesh data is transferred faster.
If he thinks it's a minor task that needs less bandwidth for transferring data, he can set the level to a mixture of local and internet nodes, or to internet nodes only. Then the overhead will be higher, because any internet node can "disappear" and another node has to be available fast. So a spare node would already have started the calculation in parallel, but its results would not be used until the other node is gone.
http://img215.imageshack.us/i/p2pinstantgridboinc.png/

And BOINC is generating a lot of "heat" right now, because every workunit needs a specific quorum of results to be validated. So there is overhead that can't be avoided, for validation reasons. Perhaps we will raise this a little bit, but it would be worth it.
Not everyone is able to buy their own supercluster.

Andreas

Old   December 4, 2010, 19:11
Default
  #7
Retired Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,975
Blog Entries: 45
Rep Power: 128
wyldckat is a name known to all
Hi Andreas,

Quote:
Originally Posted by hornig View Post
That's why I ask what OpenFOAM provides as a cluster system. I haven't used it before, but it will be one of the few software packages we could use because of its licence. And IF OpenFOAM or a side project could provide its own cluster software, we wouldn't have to write that ourselves.
Well, OpenFOAM doesn't come with an explicit "cluster software"; it only comes with OpenMPI and a couple of scripts to aid in launching simulations in parallel. In other words, it doesn't provide the job scheduling software. But like I said before, it can on the other hand work with an already existing "cluster software", by doing some minor tweaks to the build options for it to use the cluster's MPI library.

Mmmm... if I'm not mistaken, BOINC is something very similar to the Sun Grid Engine. If this is the case, then it should be possible, since it has already been done various times; for example, as you can see in this thread: http://www.cfd-online.com/Forums/ope...id-engine.html

Quote:
Originally Posted by hornig View Post
I'm well aware that the internet connection will be the bottleneck. That's why we want to do this with quality of service (QoS) and certain levels that the "responsible" person can set. For example, the person can decide to use only local PCs on a LAN; then the overhead would be minimal and the task can be bigger, because the mesh data is transferred faster.
OK, this seems feasible.

Quote:
Originally Posted by hornig View Post
If he thinks it's a minor task that needs less bandwidth for transferring data, he can set the level to a mixture of local and internet nodes, or to internet nodes only. Then the overhead will be higher, because any internet node can "disappear" and another node has to be available fast. So a spare node would already have started the calculation in parallel, but its results would not be used until the other node is gone.
This is veeeery bad, and when I think of the worst-case scenarios, it's impractical for CFD.
  • First of all, one of the "laws of programming" is that the programmer should always assume that the user is... well... dumb. OK, only to a certain extent, but we should never hand the user enough rope to hang himself.
  • Second of all, if a node drops, all of the other nodes will have to go back to the previous save point and start over from there, including the substitute nodes; and if this happens too often, the simulation will never be completed. And synchronising all nodes back to the previous snapshot could be devastating for the otherwise worthwhile large-case scenarios.
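
The "never gets completed" risk can be estimated with a small model. This is a sketch under loud assumptions, not a description of how OpenFOAM or BOINC handle failures: node drop-outs are taken as independent and exponentially distributed, any drop-out sends every node back to the last save point, and the restart itself costs nothing. Under those assumptions the expected time to finish one save-point-to-save-point segment of length T with total failure rate λ is (e^(λT) - 1)/λ. The function names are hypothetical.

```python
import math


def expected_segment_time(segment_s, failures_per_s):
    """Expected wall-clock time to finish one segment between two save points.

    Assumes exponentially distributed failures and that every failure
    restarts the segment from its beginning at zero extra cost.
    """
    if failures_per_s == 0:
        return segment_s
    return math.expm1(failures_per_s * segment_s) / failures_per_s


def expected_slowdown(segment_s, n_nodes, node_mtbf_s):
    """Expected time divided by the failure-free time, for n independent nodes."""
    total_rate = n_nodes / node_mtbf_s  # any single drop-out stops everybody
    return expected_segment_time(segment_s, total_rate) / segment_s


if __name__ == "__main__":
    hour, day = 3600.0, 86400.0
    # Assumed: save points one hour apart, each node drops out about once a day.
    for nodes in (8, 50, 200):
        print(nodes, "nodes -> slowdown factor", expected_slowdown(hour, nodes, day))
```

The slowdown grows exponentially with the number of unreliable nodes, so more frequent save points or fewer, more reliable nodes matter far more than raw node count.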

Quote:
Originally Posted by hornig View Post
And BOINC is generating a lot of "heat" right now, because every workunit needs a specific quorum of results to be validated. So there is overhead that can't be avoided, for validation reasons. Perhaps we will raise this a little bit, but it would be worth it.
Well, validating parallel executions seems logical to me in such a distributed scenario, and probably it's something that should never be discarded in that environment.

Quote:
Originally Posted by hornig View Post
Not everyone is able to buy their own supercluster.
Indeed! You might be also interested in seeing what other universities are up to in this sector; I suggest that you visit this Extend Project's page: OpenFOAM(R) Clusters - Overview
Er, apparently you're already registered there as well and posed the same question.

Good luck! Best regards,
Bruno

Old   December 5, 2010, 10:36
Default
  #8
New Member
 
Join Date: Aug 2009
Posts: 12
Rep Power: 16
hornig is on a distinguished road
Hi,

Quote:
Originally Posted by tkrks View Post
You made my day. "The internet" is slow. Let's do it with quality of service!
...
hmmm, you read, or at least understood, 10% of my text. try a little bit harder next time!
btw: you can also join me in your bullshit area as well.

Quote:
Originally Posted by wyldckat View Post
Hi Andreas,

Well, OpenFOAM doesn't come with an explicit "cluster software"; it only comes with OpenMPI and a couple of scripts to aid in launching simulations in parallel. In other words, it doesn't provide the job scheduling software. But like I said before, it can on the other hand work with an already existing "cluster software", by doing some minor tweaks to the build options for it to use the cluster's MPI library.
Hi Bruno,

but is there an option to do parallel computation with OpenFOAM, even if it is tricky? My personal worst case would be that I could use OpenFOAM but would need to use another software package for the cluster, and even buy it. I like solutions from a single source, because the parts are made for each other and that helps to find the source of errors if they occur.

Quote:
Originally Posted by wyldckat View Post
Mmmm... if I'm not mistaken, BOINC is something very similar to the Sun Grid Engine. If this is the case, then it should be possible, since it has already been done various times; for example, as you can see in this thread: http://www.cfd-online.com/Forums/ope...id-engine.html
I don't know Sun Grid Engine, but perhaps this can be used for BOINC as well. I will look into the details and keep an eye on it. The only problem could be that Oracle's stuff is not open source. I would prefer an open source approach.

Quote:
Originally Posted by wyldckat View Post
OK, this seems feasible.

This is veeeery bad, and when I think of the worst-case scenarios, it's impractical for CFD.
  • First of all, one of the "laws of programming" is that the programmer should always assume that the user is... well... dumb. OK, only to a certain extent, but we should never hand the user enough rope to hang himself.
  • Second of all, if a node drops, all of the other nodes will have to go back to the previous save point and start over from there, including the substitute nodes; and if this happens too often, the simulation will never be completed. And synchronising all nodes back to the previous snapshot could be devastating for the otherwise worthwhile large-case scenarios.
You're so right! But with better internet connections like VDSL50, with more and more upload bandwidth, the only limitation left is latency. That's why I want to use levels and QoS, so that you can decide what your simulation needs. If a client doesn't fulfil your level requirement and would slow down the workunit, it wouldn't be allowed to participate.
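
The level idea could look roughly like the following admission filter. This is only a sketch of the concept: `Client`, `Level`, `admit` and all threshold values are hypothetical names and numbers, not part of BOINC or OpenFOAM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Client:
    name: str
    latency_ms: float   # measured round-trip time to the group (assumed known)
    upload_mbit: float  # measured upload bandwidth (assumed known)


@dataclass(frozen=True)
class Level:
    name: str
    max_latency_ms: float
    min_upload_mbit: float


def admit(clients, level):
    """Return only the clients that satisfy the level chosen for the workunit."""
    return [
        c for c in clients
        if c.latency_ms <= level.max_latency_ms
        and c.upload_mbit >= level.min_upload_mbit
    ]


if __name__ == "__main__":
    clients = [
        Client("lab-pc", latency_ms=0.3, upload_mbit=1000.0),
        Client("vdsl-home", latency_ms=25.0, upload_mbit=10.0),
        Client("dsl-home", latency_ms=60.0, upload_mbit=1.0),
    ]
    lan_only = Level("LAN only", max_latency_ms=1.0, min_upload_mbit=100.0)
    mixed = Level("LAN + fast internet", max_latency_ms=30.0, min_upload_mbit=5.0)
    print([c.name for c in admit(clients, lan_only)])
    print([c.name for c in admit(clients, mixed)])
```

A filter like this only keeps slow clients out; it does not remove the latency that the admitted internet nodes still add to every exchange.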

Quote:
Originally Posted by wyldckat View Post
Well, validating parallel executions seems logical to me in such a distributed scenario, and probably it's something that should never be discarded in that environment.
That's necessary because results differ due to OS, hardware and software; validation reduces computational errors.
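
As an illustration of quorum validation with small platform-dependent differences, here is a minimal sketch that accepts a result once enough returned values agree within a relative tolerance. It is not BOINC's validator API; `validate_quorum`, the tolerance and the sample values are assumptions for the example, and a real CFD result would be a whole field rather than a single number.

```python
import math


def validate_quorum(results, quorum, rel_tol=1e-6):
    """Return the mean of the first group of at least `quorum` agreeing results.

    Two results agree when they match within `rel_tol`; returns None when
    no group of sufficient size exists.
    """
    for candidate in results:
        agreeing = [r for r in results if math.isclose(r, candidate, rel_tol=rel_tol)]
        if len(agreeing) >= quorum:
            return sum(agreeing) / len(agreeing)
    return None


if __name__ == "__main__":
    # Hypothetical drag coefficients returned by three clients for one workunit.
    print(validate_quorum([0.3120001, 0.3120002, 0.3500000], quorum=2))
    print(validate_quorum([0.31, 0.33, 0.35], quorum=2))
```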

Quote:
Originally Posted by wyldckat View Post

Indeed! You might be also interested in seeing what other universities are up to in this sector; I suggest that you visit this Extend Project's page: OpenFOAM(R) Clusters - Overview
Er, apparently you're already registered there as well and posed the same question.

Good luck! Best regards,
Bruno
I already did, because OpenFOAM doesn't have a forum of its own, so I found that website and posted there, too. And I posted it here because I have used this forum before and thought some of your fine users would know a solution.

Andreas

Old   December 5, 2010, 16:06
Default
  #9
Retired Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,975
Blog Entries: 45
Rep Power: 128
wyldckat is a name known to all
Hi Andreas,

Quote:
Originally Posted by hornig View Post
but is there an option to do parallel computation with OpenFOAM, even if it is tricky? My personal worst case would be that I could use OpenFOAM but would need to use another software package for the cluster, and even buy it. I like solutions from a single source, because the parts are made for each other and that helps to find the source of errors if they occur.
Er, there are so many Linux distributions prepared for clusters! Why waste money?

If you build OpenFOAM with the default options, it will also build OpenMPI. With the full default OpenFOAM installation on all of the machines or shared among the machines (as long as the OS is compatible among them), it becomes as easy as it is explained here: Running a decomposed case
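
For orientation, the usual sequence for a decomposed case is `decomposePar`, then the solver under `mpirun` with the `-parallel` flag, then `reconstructPar`. The sketch below only builds those command lines so they can be inspected; the helper name `parallel_run_commands`, the solver `icoFoam` and the hostfile name `machines` are example choices, not something prescribed in this thread. It assumes the commands are run from inside the case directory and that `numberOfSubdomains` in `system/decomposeParDict` matches the process count.

```python
def parallel_run_commands(solver, n_procs, hostfile=None):
    """Build the three command lines for a decomposed OpenFOAM run.

    Nothing is executed here; each command is returned as an argument list.
    """
    mpirun = ["mpirun", "-np", str(n_procs)]
    if hostfile is not None:
        # Open MPI reads the list of machines to use from this file.
        mpirun += ["--hostfile", hostfile]
    mpirun += [solver, "-parallel"]
    return [
        ["decomposePar"],    # split mesh and fields into processor* directories
        mpirun,              # run one solver process per subdomain
        ["reconstructPar"],  # merge the processor results back together
    ]


if __name__ == "__main__":
    for cmd in parallel_run_commands("icoFoam", 4, hostfile="machines"):
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the case and the hostfile actually exist.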

AFAIK, what you don't get with OpenMPI is the job scheduler software that clusters usually have. I expect that BOINC will fill in the gap. Note: I'm not familiar with the inner workings of BOINC, so I can only assume that it comes with a job scheduler as well.

Quote:
Originally Posted by hornig View Post
I don't know Sun Grid Engine, but perhaps this can be used for BOINC as well. I will look into the details and keep an eye on it. The only problem could be that Oracle's stuff is not open source. I would prefer an open source approach.
I expect that BOINC is something similar to the Sun Grid Engine, therefore you don't need SGE. Nonetheless, the necessary preparations and/or adaptations could also be similar between both.

Quote:
Originally Posted by hornig View Post
But with better internet connections like VDSL50, with more and more upload bandwidth, the only limitation left is latency. That's why I want to use levels and QoS, so that you can decide what your simulation needs. If a client doesn't fulfil your level requirement and would slow down the workunit, it wouldn't be allowed to participate.
Uhm, like I've said before, having workers cooperate with each other over the internet would only make sense for something like simulating the entire Atlantic Ocean in one go. In other words, this will only make sense in very special simulation scenarios, something like less than 0.05% of the cases; the normal case scenarios will always need a local cluster for themselves, even if they share the cluster with other jobs (X machines for one job, Y machines for another job and so on).

Don't get me wrong, the fast connection will still be very useful for normal cases, because jobs will take less time to set up on the remote cluster/machine and less time to be delivered back to the user. But as for using the Constellation grid as a cluster of clusters for a single CFD simulation, only very few simulations can take advantage of that. And in those cases, at least in the short term, I think the resources would be better invested in having the normal cases sent to the Constellation system, thus freeing up, for example, the university clusters/supercomputers, so that the really heavy hitters can take advantage of the spare resources there.

Best regards,
Bruno
