Parallelization and Processor Interaction

Old   January 19, 2012, 17:29
Default
  #21
Member
 
Join Date: Jul 2011
Location: US
Posts: 39
Rep Power: 5
Docfreezzzz is on a distinguished road
To cfdnewbie: I'd actually be surprised if you couldn't get the scaling past 90% with an explicit code. I would hate to imply that the code is trivial (it's not), but the problem of parallelizing an explicit-only code is very straightforward. Most of our second-semester graduate students could get excellent scaling up to a few hundred cores with an explicit code.

Implicit codes are another story and can be extremely involved. Solving a distributed matrix system is just not easy. As for optimizing codes with parameters in an implicit solver, I just don't understand what you mean. We do spend a lot of time running simulations, but we also spend a lot of time working on parallel algorithms. However, I'd hardly say that my time is dominated by the time integration portion of the code... That code is pretty dumb, in fact: a Newton loop surrounding a second-order backward difference. That's all we need.

Arguing over convergence levels actually seems pretty ridiculous, and we'd almost never have that kind of discussion in our group. If the convergence is below the scale of the flow features you are looking for, then in general it's time to call it a day. And a shouting match? Lol. I love academics sometimes, but we sure are a pain in the a$$ to deal with on other days. We are more likely to perform grid convergence studies than anything else, as the grid tends to drive the error terms.

To Martin: we do lag the BCs, but we have the capability to not lag them if we aren't running steady-state solutions. The fringe boundaries are lagged (sort of). Their values are actually updated from the most recent information available, so they are lagged by ~(half an SGS cycle) for computational efficiency reasons.
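A minimal sketch of what "lagged by half an SGS cycle" could look like, assuming SGS means symmetric Gauss-Seidel and that fringe (halo) values are exchanged between the forward and backward sweeps. The model problem (a diagonally dominant tridiagonal system split into two subdomains) and every name below are hypothetical and only illustrate the idea; this is not the solver described in the post.

```python
import numpy as np

def neighbor_sum(x, i, mid, halo_left_dom, halo_right_dom):
    """Sum of the two neighbors of cell i, reading fringe values from halos.

    halo_left_dom  : lagged copy of x[mid], seen by the left subdomain
    halo_right_dom : lagged copy of x[mid - 1], seen by the right subdomain
    """
    n = len(x)
    if i == 0:
        left = 0.0                      # Dirichlet boundary
    elif i == mid:
        left = halo_right_dom           # fringe value, possibly stale
    else:
        left = x[i - 1]
    if i == n - 1:
        right = 0.0                     # Dirichlet boundary
    elif i == mid - 1:
        right = halo_left_dom           # fringe value, possibly stale
    else:
        right = x[i + 1]
    return left + right

def lagged_sgs_cycle(x, b, diag, mid):
    """One symmetric Gauss-Seidel cycle on diag*x[i] - x[i-1] - x[i+1] = b[i].

    Cells [0, mid) and [mid, n) play the role of two processes. Halos are
    refreshed only between sweeps, so each sweep sees fringe data that is
    half a cycle old.
    """
    n = len(x)
    halo_l, halo_r = x[mid], x[mid - 1]           # exchange before forward sweep
    for i in range(n):                            # forward sweep in both subdomains
        x[i] = (b[i] + neighbor_sum(x, i, mid, halo_l, halo_r)) / diag
    halo_l, halo_r = x[mid], x[mid - 1]           # exchange at the half cycle
    for i in reversed(range(n)):                  # backward sweep in both subdomains
        x[i] = (b[i] + neighbor_sum(x, i, mid, halo_l, halo_r)) / diag
    return x

n, mid, diag = 8, 4, 2.5
b = np.ones(n)
x = np.zeros(n)
for _ in range(200):
    lagged_sgs_cycle(x, b, diag, mid)
```

Because the fringe values are only refreshed twice per cycle, the two subdomains never wait on each other inside a sweep, which is the efficiency argument; the price is a somewhat slower convergence rate than a fully coupled sweep.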

If you formulate the problem correctly, the Navier-Stokes equations can be written as a fixed-point problem (namely, with the delta Q as the variable to solve for). By doing this we can wrap a timestep in a Newton iteration and update the BCs multiple times per step, bringing them up to date with the flowfield. Clever trick, I'll give you that. I certainly didn't come up with it.
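As a rough illustration of a Newton loop wrapped around a second-order backward difference (BDF2) in delta form, here is a sketch on a scalar model problem dq/dt = -R(q) with R(q) = q^2. The model problem, the function names, and the backward-Euler start-up step are assumptions made for the example; a real flow solver would replace the scalar divide with a linear solve and could re-evaluate the boundary conditions after each update.

```python
def residual(q):
    """Stand-in for the spatial residual R(q) of the model problem dq/dt = -R(q)."""
    return q * q

def residual_jacobian(q):
    """dR/dq for the model problem."""
    return 2.0 * q

def implicit_step(q_n, q_nm1, dt, newton_iters=5, first_step=False):
    """Advance one time step by Newton iteration on the unsteady residual."""
    q = q_n                                        # initial guess: previous level
    for _ in range(newton_iters):
        if first_step:
            # Backward Euler start-up, since BDF2 needs two previous levels.
            unsteady = (q - q_n) / dt + residual(q)
            lhs = 1.0 / dt + residual_jacobian(q)
        else:
            # BDF2: (3 q^{n+1} - 4 q^n + q^{n-1}) / (2 dt) + R(q^{n+1}) = 0
            unsteady = (3.0 * q - 4.0 * q_n + q_nm1) / (2.0 * dt) + residual(q)
            lhs = 1.5 / dt + residual_jacobian(q)
        delta_q = -unsteady / lhs                  # solve for the update, not for q
        q += delta_q
        # A flow solver could refresh lagged boundary values here,
        # once per Newton iteration, before forming the next residual.
    return q

def integrate(q0, dt, n_steps):
    """March n_steps time steps from q(0) = q0 and return the final value."""
    q_nm1 = q0
    q_n = implicit_step(q0, q0, dt, first_step=True)
    for _ in range(n_steps - 1):
        q_nm1, q_n = q_n, implicit_step(q_n, q_nm1, dt)
    return q_n

q_end = integrate(q0=1.0, dt=0.01, n_steps=100)    # exact solution: q(1) = 1/2
```

As delta_q goes to zero within a step, the unsteady residual is satisfied with whatever boundary values were used in the last iteration, which is why the lag disappears once the Newton loop converges.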

Oh and on a final note my code scales at 95+%. OpenFOAM certainly doesn't give you that. So, in that regard we do perform some very clever optimizations to keep things rolling in the right direction. It's not easy but it's a thing of beauty when you really need the extra horsepower for something complicated.
__________________
CFD engineering resource

Old   January 19, 2012, 17:55
Default
  #22
Senior Member
 
cfdnewbie
Join Date: Mar 2010
Posts: 551
Rep Power: 11
cfdnewbie is on a distinguished road
Quote:
Originally Posted by Docfreezzzz View Post
To cfdnewbie: I'd actually be surprised if you couldn't get the scaling past 90% with an explicit code. I would hate to imply that the code is trivial (it's not), but the problem of parallelizing an explicit-only code is very straightforward. Most of our second-semester graduate students could get excellent scaling up to a few hundred cores with an explicit code.
I guess you are referring to weak scaling? I agree, weak scaling above 90% is more or less trivial. Strong scaling on 100k cores is definitely not, and in fact if your students can get above 90% strong scaling on 100k cores, they should submit their results - publication guaranteed (again, not implying trivial code). I have been to a few TeraGrid / XD workshops, and a strong scaling of 90% would win you a podium.

Quote:
Implicit codes are another story and can be extremely involved. Solving a distributed matrix system is just not easy. As for optimizing codes with parameters in an implicit solver, I just don't understand what you mean. We do spend a lot of time running simulations, but we also spend a lot of time working on parallel algorithms. However, I'd hardly say that my time is dominated by the time integration portion of the code... That code is pretty dumb, in fact: a Newton loop surrounding a second-order backward difference. That's all we need.
Well, I can only speak from hearsay, but the hearsay comes from people at MIT, Berkeley and similar groups. They often debate things like preconditioners, number of Newton cycles, residual tolerances and such... Again, I am no expert, but my impression from the last AIAA: go to any session which has explicit and implicit solvers. Explicit people present their results on the flow field; implicit people present their clever way of managing their matrices...

It just seems to me that some people doing implicit are overwhelmed by the additional level of complexity, and I'm wondering if it keeps them from the stuff they set out to do. I'm not denying the usefulness of implicit methods, they just seem very difficult to work with. But I'd be very interested in learning more about this. Have you any published papers describing your code?


Quote:
Oh and on a final note my code scales at 95+%.
Strong scaling? On how many cores? Publish that, seriously!

Old   January 19, 2012, 18:07
Default
  #23
Senior Member
 
cfdnewbie
Join Date: Mar 2010
Posts: 551
Rep Power: 11
cfdnewbie is on a distinguished road
Just to add to the discussion about strong / weak scaling:
http://users.ices.utexas.edu/~benkirk/talk.pdf
Look at slide 17: it seems like the guys from NASA and Sandia can't get their implicit code to scale strongly above 50 cores or so. Of course, you can't compare two codes without having their sources side by side, but I'm pretty sure those guys know what they are doing. So if you get 95+% on 10^3 or 10^4 cores, that would be awesome and indeed something to publish!
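For reference, the two efficiencies being compared can be written down in a few lines. This is a minimal sketch with made-up timings, not numbers from any code in this thread; the function names are hypothetical. The strong-scaling version takes an arbitrary baseline core count, since a case may not fit on one core.

```python
def strong_scaling_efficiency(t_base, p_base, t_p, p):
    """Fixed total problem size: the ideal run time drops by a factor p_base / p."""
    return (t_base * p_base) / (t_p * p)

def weak_scaling_efficiency(t_base, t_p):
    """Fixed work per core: the ideal run time stays constant as cores are added."""
    return t_base / t_p

# Hypothetical wall-clock times in seconds, for illustration only.
strong = strong_scaling_efficiency(t_base=100.0, p_base=16, t_p=14.0, p=128)
weak = weak_scaling_efficiency(t_base=100.0, t_p=105.0)
print(f"strong: {strong:.1%}, weak: {weak:.1%}")
```

A code can post a high weak-scaling number and still do poorly on the strong-scaling measure, because in the strong case the work per core shrinks while the communication per core does not shrink as fast.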

Old   January 19, 2012, 18:12
Default
  #24
Member
 
Join Date: Jul 2011
Location: US
Posts: 39
Rep Power: 5
Docfreezzzz is on a distinguished road
Nope... Remember, proving strong scaling for us is very hard. Being memory bound, I can show strong scaling out to several tens of cores, but past that I'm either in the cache or I can't show any kind of regression back to one proc. Weak scaling is a different story, but our problems are mostly due to I/O on huge machines. I'd never know what other evils lurk, since our datasets bog down the pipeline.

And yes, for the students I was referring to strong scaling on several hundred procs. I'd doubt that 100k is trivial at all. However, I'd be surprised if there wasn't some room for improvement. Though maybe I'm wrong here.

I do spend a great amount of time working on preconditioners, but not specifically for the implicit solver. Mostly I have issues with conditioning with h.o. methods and my discrete adjoint sensitivity code. There is really no need that I'm aware of for preconditioners at Ma > 0.4 with compressible NS solvers... That seems odd. We also mostly present on techniques for improving resolution of the physics, but some present good quality papers on flow control techniques, etc. All of us use implicit methods. I presented at AIAA in Nashville as well, and my work was on reacting flowfield methods.

I haven't published any papers describing my code in detail... I could write a whole book. I have published papers describing certain methods however.

Old   January 19, 2012, 18:20
Default
  #25
Senior Member
 
Martin Hegedus
Join Date: Feb 2011
Posts: 492
Rep Power: 9
Martin Hegedus is on a distinguished road
Quote:
Originally Posted by cfdnewbie View Post
It just seems to me that some people doing implicit are overwhelmed by the additional level of complexity, and I'm wondering if it keeps them from the stuff they set out to do. I'm not denying the usefulness of implicit methods, they just seem very difficult to work with. But I'd be very interested in learning more about this. Have you any published papers describing your code?
Implicit codes are difficult to program and unfortunately if there is a bug in the code, they do get very flaky.

Assuming the code does not have a bug and that the user knows what they are doing, implicit codes are robust. I do agree with you that people have trouble using them. Implicit codes can be unforgiving to those without the knowledge. Unfortunately the know-how from those in the "ivory towers" doesn't seem to get down to the people in the field, or maybe it is just lost in translation.

Old   January 19, 2012, 18:23
Default
  #26
Senior Member
 
cfdnewbie
Join Date: Mar 2010
Posts: 551
Rep Power: 11
cfdnewbie is on a distinguished road
Quote:
Originally Posted by Docfreezzzz View Post
Nope... Remember, proving strong scaling for us is very hard. Being memory bound, I can show strong scaling out to several tens of cores, but past that I'm either in the cache or I can't show any kind of regression back to one proc. Weak scaling is a different story, but our problems are mostly due to I/O on huge machines. I'd never know what other evils lurk, since our datasets bog down the pipeline.
All right, I understand. So I guess we are on the same page in terms of weak scaling. One could of course argue which one is more important, weak or strong, but I guess that's beside the point.

Just a related question: How do you actually prove to the people running the supercomputers that you are able to fully exploit them if you can't really show strong scaling? I'm asking because they are always pestering us about strong scaling results before they will let us use 100k or 300k procs... Is there an understanding that implicit schemes just can't show their strong scaling? If you can't show it, your code might not really benefit from running on 100k cores instead of 1k cores, right?

I'm just wondering about the policy of the people dishing out the time on the big dippers here...
Quote:
I do spend a great amount of time working on preconditioners, but not specifically for the implicit solver. Mostly I have issues with conditioning with h.o. methods and my discrete adjoint sensitivity code.
Oh, now it gets really interesting. What type of h.o. methods are you talking about? FV? DG? Something else?

Quote:
There is really no need that I'm aware of for preconditioners at Ma > 0.4 with compressible NS solvers... That seems odd. We also mostly present on techniques for improving resolution of the physics but some present good quality papers on flow control techniques, etc. All of us use implicit methods. I presented at AIAA in Nashville as well and my work was on reacting flowfield methods.
Oh, nice, I guess I missed you in that session. I'll look up the papers, though. Opryland was a big letdown, by the way.

Quote:
I haven't published any papers describing my code in detail... I could write a whole book. I have published papers describing certain methods however.
All right, would be cool if you could point me to one or two, if you like.
Thank you and cheers!

Old   January 19, 2012, 18:24
Default
  #27
Member
 
Join Date: Jul 2011
Location: US
Posts: 39
Rep Power: 5
Docfreezzzz is on a distinguished road
Quote:
Originally Posted by cfdnewbie View Post
Just to add to the discussion about strong / weak scaling:
http://users.ices.utexas.edu/~benkirk/talk.pdf
Look at slide 17: it seems like the guys from NASA and Sandia can't get their implicit code to scale strongly above 50 cores or so. Of course, you can't compare two codes without having their sources side by side, but I'm pretty sure those guys know what they are doing. So if you get 95+% on 10^3 or 10^4 cores, that would be awesome and indeed something to publish!

Hmm... That looks a bit fishy to me. That is also an FE code and not an FV code. The coupling is a bit different, so maybe there is something there. I also wonder what machine they are running on. It seems that there is an upper bound on some comm. size that they are running into. I've seen data suggesting that the BlueGene architecture scales much better than TeraGrid machines for many problem types. Maybe the interconnect is more complicated than what we use? From what I can read, they used a generalized FE framework. I'd be willing to bet that quite a few optimizations can be made for specific problem types. Of course, again, I could be wrong. Interesting results.

Thanks!

Old   January 19, 2012, 18:25
Default
  #28
Senior Member
 
cfdnewbie
Join Date: Mar 2010
Posts: 551
Rep Power: 11
cfdnewbie is on a distinguished road
Could you recommend a good book / review article for an up-to-now explicit guy? Just the basics, something to clear the jungle of all those abbreviations and such... I know basic ILU and ADI, but that's about it. Anything more modern would be nice!

Old   January 19, 2012, 18:28
Default
  #29
Senior Member
 
cfdnewbie
Join Date: Mar 2010
Posts: 551
Rep Power: 11
cfdnewbie is on a distinguished road
Maybe I should add here that the 88% strong scaling was indeed on a BlueGene, and that results on other systems were not quite as good (lower 80s), so the interconnect is a large factor in this whole game.


Found another good read from the MIT guy about scaling for you implicit guys:
http://raphael.mit.edu/darmofalpubs/...er_FEM_CFD.pdf

Just browsed through, will read tonight, sounds interesting!

Old   January 19, 2012, 18:34
Default
  #30
Senior Member
 
Martin Hegedus
Join Date: Feb 2011
Posts: 492
Rep Power: 9
Martin Hegedus is on a distinguished road
I would also love to see a practical how-to guide for implicit methods.

Something that addresses issues like this, http://www.hegedusaero.com/examples/...ifiedDADI.html, though not as technical. I still need to put it in a best/worst-practices form.

Old   January 19, 2012, 18:45
Default
  #31
Senior Member
 
cfdnewbie
Join Date: Mar 2010
Posts: 551
Rep Power: 11
cfdnewbie is on a distinguished road
Looks interesting, thank you very much. I assume you don't have a LaTeX version? All the khats make my head spin a little. But thanks again very much!

Old   January 19, 2012, 18:46
Default
  #32
Member
 
Join Date: Jul 2011
Location: US
Posts: 39
Rep Power: 5
Docfreezzzz is on a distinguished road
It's my experience that you can get time on the big machines for running cases that literally won't fit in memory. So in that regard, weak scaling is all you really need. The case keeps getting bigger (mesh size) and so does the machine. Anything else you can build a small cluster for and run in house. For instance, running a full compressor might take a whole lot of grid points, and we need the bigger machine just for the RAM. Honestly, I don't find myself in need of that kind of thing very often.

The h.o. schemes that I'm referring to are FV. Again, not anything too interesting or bleeding edge but it does cause my adjoint solver some big headaches. Hence the preconditioning.

The last two papers I was involved in were
AIAA-2011-3700
AIAA-2012-1239

I get the feeling we work in two vastly different areas of CFD but maybe that'll give you some sense for what I am involved in.

Old   January 19, 2012, 18:52
Default
  #33
Senior Member
 
cfdnewbie
Join Date: Mar 2010
Posts: 551
Rep Power: 11
cfdnewbie is on a distinguished road
Quote:
Originally Posted by Docfreezzzz View Post
The last two papers I was involved in were
AIAA-2011-3700
AIAA-2012-1239

I get the feeling we work in two vastly different areas of CFD but maybe that'll give you some sense for what I am involved in.
Thank you, I'll check them out tonight. Yes, it seems that we are on opposite ends of the CFD spectrum, but that makes it interesting!

Old   January 19, 2012, 18:54
Default
  #34
Senior Member
 
Martin Hegedus
Join Date: Feb 2011
Posts: 492
Rep Power: 9
Martin Hegedus is on a distinguished road
Quote:
Originally Posted by cfdnewbie View Post
Looks interesting, thank you very much. I assume you don't have a LaTeX version? All the khats make my head spin a little. But thanks again very much!
Sorry, best I could do with the time I had. I do agree, the khats are a pain.
