CFD Online Discussion Forums

Main CFD Forum: Do you build a serial version of your code? (https://www.cfd-online.com/Forums/main/246342-do-you-build-serial-version-your-code.html)

aerosayan November 28, 2022 04:30

Do you build a serial version of your code?
 
Do you write and maintain a separate serial version of your code?

I don't know if many do this, but I've seen some devs build a serial version of their code first, and then make a copy of it, and parallelize the copy later.

There's not really any major benefit in doing this, other than simplifying the development and debugging process, since coding for serial execution is easier.

Do you also write your code like this, or with enough experience, do you just start developing the parallel version from the beginning?

sbaffini November 28, 2022 05:33

Quote:

Originally Posted by aerosayan (Post 840137)
Do you write and maintain a separate serial version of your code?

I don't know if many do this, but I've seen some devs build a serial version of their code first, and then make a copy of it, and parallelize the copy later.

There's not really any major benefit in doing this, other than simplifying the development and debugging process, since coding for serial execution is easier.

Do you also write your code like this, or with enough experience, do you just start developing the parallel version from the beginning?

That probably makes sense for shared memory implementations. I mean, that's how I would probably do it (but I actually never did shared memory). Still, at some point, I think you have to abandon the serial version and just maintain the parallel one.

For distributed memory, absolutely not. You need to think parallel for every detail of the code, to the point that it makes no sense at all to have a serial version.

Still, the distributed memory version is also the one that, when correctly implemented, would more closely resemble a serial version (and it actually does in terms of computational steps), with all the advantages in maintenance and debugging.
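
To illustrate the point about computational steps, here is a minimal sketch (not from any poster's code) of one explicit 1D diffusion step, done once on the whole array and once on subdomains. The names `diffuse`, `serial_step` and `decomposed_step` are hypothetical, and the "halo exchange" is simulated inside a single process by reading the neighbouring subdomain directly; in a real distributed code that read would be a send/receive pair.

```python
def diffuse(u, left, right, alpha=0.25):
    """One explicit diffusion step on the cells in u, given the two ghost values."""
    padded = [left] + list(u) + [right]
    return [
        padded[i] + alpha * (padded[i - 1] - 2.0 * padded[i] + padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


def serial_step(u, bc_left=0.0, bc_right=0.0):
    """Serial version: the ghost values are simply the boundary values."""
    return diffuse(u, bc_left, bc_right)


def decomposed_step(u, nparts, bc_left=0.0, bc_right=0.0):
    """Same step on nparts non-empty subdomains (assumes len(u) >= nparts)."""
    n = len(u)
    bounds = [n * p // nparts for p in range(nparts + 1)]
    parts = [u[bounds[p]:bounds[p + 1]] for p in range(nparts)]
    new_parts = []
    for p, local in enumerate(parts):
        # Simulated halo exchange: take the neighbour's edge cell,
        # or the physical boundary value on the outermost subdomains.
        left = parts[p - 1][-1] if p > 0 else bc_left
        right = parts[p + 1][0] if p < nparts - 1 else bc_right
        # The per-subdomain work is the very same kernel as in the serial case.
        new_parts.append(diffuse(local, left, right))
    return [x for part in new_parts for x in part]
```

Each subdomain calls the same `diffuse` kernel as the serial code; only the source of the ghost values differs, which is where the parallel detail is confined in this sketch.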

flotus1 November 28, 2022 05:58

Build a serial version first: mostly yes. It can be useful as a general proof of concept. And to help during the early development stages of the parallel version.
Maintain a serial version alongside a parallel one: aw hell no! You would need a very good reason to justify the extra effort, once your parallel version works.

Generalizations are tricky here. If the parallel implementation of an algorithm requires serious effort -maybe even an implementation that is inferior to what you can do in serial- the benefits of serial implementations can be questionable.

arjun November 28, 2022 07:49

Quote:

Originally Posted by aerosayan (Post 840137)
Do you write and maintain a separate serial version of your code?

I don't know if many do this, but I've seen some devs build a serial version of their code first, and then make a copy of it, and parallelize the copy later.

There's not really any major benefit in doing this, other than simplifying the development and debugging process, since coding for serial execution is easier.

Do you also write your code like this, or with enough experience, do you just start developing the parallel version from the beginning?



Interesting question, since I have come to the conclusion that I should do it: first for the plasma-related code and then for Wildkatze.

Why so?

I came to the conclusion that many people do not try Wildkatze, or any other solver, if they have to install something. Even though with Wildkatze we avoided installation and provide an executable that one can run directly, it still needs Open MPI installed, and the user might not want to install it.

So I came to the conclusion that there has to be a serial version that the user can run without installing anything. At least this way it will be easier to try out.

So yes, in December there will be a single-processor version of Wildkatze and of another plasma project.

PS: The same code will do both, and for every release a serial version will also be compiled.

sbaffini November 28, 2022 08:06

Quote:

Originally Posted by arjun (Post 840152)
Interesting question, since I have come to the conclusion that I should do it: first for the plasma-related code and then for Wildkatze.

Why so?

I came to the conclusion that many people do not try Wildkatze, or any other solver, if they have to install something. Even though with Wildkatze we avoided installation and provide an executable that one can run directly, it still needs Open MPI installed, and the user might not want to install it.

So I came to the conclusion that there has to be a serial version that the user can run without installing anything. At least this way it will be easier to try out.

So yes, in December there will be a single-processor version of Wildkatze and of another plasma project.

PS: The same code will do both, and for every release a serial version will also be compiled.

Don't get me wrong, but:

- I expect you will be doing this with conditional compilation flags and not with a completely separate code

- This sounds like a company adding the wrong feature for business-related reasons (in the sense that you didn't see a point in this for any other reason). Legit in business, but still...

arjun November 28, 2022 08:13

Quote:

Originally Posted by sbaffini (Post 840156)
Don't get me wrong, but:

- I expect you will be doing this with conditional compilation flags and not with a completely separate code

- This sounds like a company adding the wrong feature for business-related reasons (in the sense that you didn't see a point in this for any other reason). Legit in business, but still...



Not completely new code. Just one file that is responsible for the parallel exchange. That's it.
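
The "one file" idea can be sketched as follows. This is only an illustration, not Wildkatze's actual design: `SerialExchange`, `sum_all`, `halo` and `residual_norm_sq` are hypothetical names. The assumption is that the solver talks only to a small exchange layer, so a serial build just supplies a trivial implementation of that layer.

```python
class SerialExchange:
    """Hypothetical serial stand-in for the parallel-exchange layer."""

    rank = 0
    size = 1

    def sum_all(self, value):
        # A global reduction over a single rank is just the local value.
        return value

    def halo(self, local, bc_left, bc_right):
        # No neighbouring ranks: ghost values are the physical boundary values.
        return bc_left, bc_right


def residual_norm_sq(local_residuals, exchange):
    """Solver-side code: it uses the exchange object and nothing else."""
    local = sum(r * r for r in local_residuals)
    return exchange.sum_all(local)
```

A parallel build would supply another class with the same methods wrapping the actual message passing, while `residual_norm_sq` and the rest of the solver stay unchanged.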

andy_ November 28, 2022 11:55

Quote:

Originally Posted by aerosayan (Post 840137)
Do you write and maintain a separate serial version of your code?

No. A single code for serial, shared memory, distributed or whatever, with a mixture of conditional compilation and runtime flags. I often create dummy libraries for MPI and the like for the code to link against, which generate an error if any function is called. Making the code self-contained and able to build and run anywhere is useful, particularly for computers with limited or no operating system and libraries.
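
A rough analogue of the dummy-library idea, as a sketch only: in a compiled code the stub would be a library of empty MPI symbols to link against, whereas here `DummyMPI` is a hypothetical Python object that accepts any method name and fails loudly when one is actually called. The `parallel` argument plays the role of the runtime flag, and `allreduce` is just an assumed method name for the real communicator.

```python
class DummyMPI:
    """Hypothetical stub: satisfies references to a message-passing layer,
    but raises an error as soon as any of its functions is called."""

    def __getattr__(self, name):
        def _fail(*args, **kwargs):
            raise RuntimeError(
                f"{name}() was called, but this is a serial build without MPI"
            )
        return _fail


def global_sum(local_value, comm, parallel):
    """Sum a value over all ranks; 'parallel' is the runtime flag."""
    if parallel:
        # Only reached in parallel runs; with the stub this raises.
        return comm.allreduce(local_value)
    # Serial runs never touch the communicator.
    return local_value
```

With this arrangement a serial run works without any message-passing library present, and a mistaken call into the parallel path is reported immediately instead of silently returning garbage.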

sbaffini November 28, 2022 13:59

Let me also add that, as someone with experience in distributed parallel programming in CFD (or, at least, I like to think so), I have all the tools at my disposal for most things I would probably ever try to develop, so the first thing I ask when starting something new, even trivial, is whether a distributed parallel approach would be useful. If so, I have no actual friction in using it from scratch, because I likely already have everything I might need. Probably the same would apply to someone using shared memory (yet I doubt it would lend itself to the same level of encapsulation possible with MPI).

In contrast, someone who hasn't actually developed any parallel code before will obviously experience a lot of friction when starting for the first time.

So, with respect to your question, I think it obviously matters who we are talking about. It probably makes sense to start serial for someone who never went parallel in the first place. Maybe even totally legit. But if you know the field and that it requires a parallel approach, and if you already have all the required tools at your fingertips, then, well, to me it looks like a waste of time to start serial (i.e., something with no obvious return).

aerosayan November 28, 2022 23:09

Quote:

Originally Posted by sbaffini (Post 840191)
It probably makes sense to start serial for someone who never went parallel in the first place. Maybe even totally legit. But if you know the field and that it requires a parallel approach, and if you already have all the required tools at your fingertips, then, well, to me it looks like a waste of time to start serial (i.e., something with no obvious return).

Yes, that's true. My reasoning was more about making it easier to debug and develop something, and less about parallelization being difficult. After some time parallelization becomes easy, and introducing something like a new limiter or flux becomes easy because they're not dependent on parallelism.
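
As a small sketch of why a limiter does not depend on parallelism: it is a purely local function of neighbouring differences, so it works unchanged on any array whose ghost cells have already been filled, however that was done. The minmod limiter is used here only as a familiar example, and `limited_slopes` is a hypothetical helper name.

```python
def minmod(a, b):
    """Minmod limiter: zero at local extrema, else the smaller-magnitude slope."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def limited_slopes(u_with_ghosts, limiter=minmod):
    """Limited slope for every interior cell of an array that includes ghost cells."""
    u = u_with_ghosts
    return [
        limiter(u[i] - u[i - 1], u[i + 1] - u[i])
        for i in range(1, len(u) - 1)
    ]
```

Trying a new limiter then means passing another two-argument function as `limiter`; nothing about the domain decomposition or the exchange needs to change.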

Serial code seems best for prototyping. Even Dr. Nishikawa's codes are serial and optimized for this reason.

So, I'm mainly thinking of developing my prototype serial code in Julia, and only developing the final parallel code in C++. Obviously we can design and validate our codes quickly if they're serial.

Yeah, it's double work, but it honestly seems to be the fastest way to complete my work and keep my company happy.

aerosayan November 28, 2022 23:17

Quote:

Originally Posted by arjun (Post 840152)
I came to the conclusion that many people do not try Wildkatze, or any other solver, if they have to install something. Even though with Wildkatze we avoided installation and provide an executable that one can run directly, it still needs Open MPI installed, and the user might not want to install it.

So I came to the conclusion that there has to be a serial version that the user can run without installing anything. At least this way it will be easier to try out.


Respectfully, I would caution you against that. Installing MPI is not difficult. Let them figure it out. Everyone in the scientific community knows about MPI and can find external help on how to install it. Showing them how to do it with a few tutorials would be helpful. If possible, try to build everything statically, so they don't need external libraries. Calculix CCX statically builds everything and can run everywhere. Since you're only trying to show the application to users, adding work for yourself by maintaining a separate serial code seems difficult to manage in the long run.

andy_ November 29, 2022 03:44

Quote:

Originally Posted by aerosayan (Post 840230)
So, I'm mainly thinking of developing my prototype serial code in Julia, and only developing the final parallel code in C++. Obviously we can design and validate our codes quickly if they're serial.

Yeah, it's double work, but it honestly seems to be the fastest way to complete my work and keep my company happy.

Are you prototyping GUI code or CFD modelling? If the latter and it isn't easier to do in your existing CFD code then I would suggest you might need to look at how your existing code manages the modelling. I am referring to full 3D models rather than 1D proof of concept modelling.

Why would your company be happy? When I worked in a CFD group in industry, the group leader wouldn't have permitted me to waste the company's time duplicating effort, or implementing models that he or others in the group could not pick up and work on at a later date if, like much of the speculative modelling, it didn't make it into the production version of the code. Perhaps I misunderstand your position and/or the kind of CFD code you are developing. We also didn't code in C++, which, depending on how it is being used for the main code, may be a factor in driving people to want to sort things out in a more efficient language.

arjun November 29, 2022 04:34

Since Wildkatze does not use any external library, the static linking part does not affect it. The solver does run without installing anything other than MPI, which is needed because it is compiled using MPI.

Since the code is well done, producing a single-processor version is a one-time effort of about a day (actually a few hours). After that I will also be able to compile a single-processor version for each release, and that would require 5 minutes per release.

This is why I am thinking of it: then, if someone wants to take a look, he does not even need MPI.

(Though most people already have MPI if they are running simulations.)

If it needed more effort I would not think of doing it, but as the Wildkatze source is very agile thanks to the way I coded it, things go fast.

Quote:

Originally Posted by aerosayan (Post 840231)
Respectfully, I would caution you against that. Installing MPI is not difficult. Let them figure it out. Everyone in the scientific community knows about MPI and can find external help on how to install it. Showing them how to do it with a few tutorials would be helpful. If possible, try to build everything statically, so they don't need external libraries. Calculix CCX statically builds everything and can run everywhere. Since you're only trying to show the application to users, adding work for yourself by maintaining a separate serial code seems difficult to manage in the long run.


sbaffini November 29, 2022 04:37

Quote:

Originally Posted by aerosayan (Post 840230)
Yes, that's true. My reasoning was more about making it easier to debug and develop something, and less about parallelization being difficult. After some time parallelization becomes easy, and introducing something like a new limiter or flux becomes easy because they're not dependent on parallelism.

Serial code seems best for prototyping. Even Dr. Nishikawa's codes are serial and optimized for this reason.

So, I'm mainly thinking of developing my prototype serial code in Julia, and only developing the final parallel code in C++. Obviously we can design and validate our codes quickly if they're serial.

Yeah, it's double work, but it honestly seems to be the fastest way to complete my work and keep my company happy.

OK, for prototyping I'm completely with you. I actually work a lot in MATLAB, because that's what I know and because I want a battle-tested reference to use as much as I can without implementing it myself.

But I certainly don't use it for the whole code. Sometimes I just need to study how a certain numerical expression behaves around some possible singularity. Other times I need to come up with a reference implementation of something I never did before but, still, always a very confined numerical problem. In general, if I'm doing prototyping it means I'm actually studying the matter and not really implementing, so that code doesn't really count and gets abandoned as soon as the matter is figured out, hopefully within a week or less.
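
A throwaway study of this kind might look like the sketch below. The expression (1 - cos x)/x^2, which has a removable singularity at x = 0 with limit 1/2, is chosen here purely for illustration and is not taken from the scripts mentioned in this thread; `naive` and `stable` are hypothetical names.

```python
import math


def naive(x):
    """Direct evaluation: suffers from cancellation as x approaches 0."""
    return (1.0 - math.cos(x)) / (x * x)


def stable(x):
    """Equivalent form, using 1 - cos(x) = 2 sin^2(x/2), safe near x = 0."""
    if x == 0.0:
        return 0.5  # the limit value
    h = 0.5 * x
    s = math.sin(h) / h
    return 0.5 * s * s


if __name__ == "__main__":
    # Print both forms while approaching the singularity.
    for k in range(0, 10, 2):
        x = 10.0 ** (-k)
        print(f"x = {x:.0e}  naive = {naive(x):.15f}  stable = {stable(x):.15f}")
```

Away from zero the two forms agree, while close to zero the direct form drifts away from the limit; seeing that once is usually all such a script is for, after which it can be discarded.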

Two personal examples of stuff I did in MATLAB for prototyping/understanding are these scripts on wall functions and this for non-work-related stuff.

In some other cases, or sometimes just after prototyping, I might need to figure out how certain programming aspects of a matter work. Again, I need prototyping but, as I now have to figure out the actual stuff, I have to work in the actual language where I'm going to need it. For example, you might remember when I worked on this. In this case Fortran (like most modern languages) has my back: I just need to work in a module with certain precautions, and the work is reused as soon as it is finished and tested.

The common thread here is that we are talking about things with a short time frame. I actually do my best to abandon them and make it impossible to maintain them because, well, I know I don't want to.

But a full production code is a full production code and, as we tried to figure out in this post, it's certainly not something that a single person can figure out alone without proper previous experience and, even in that case, it would take an inordinate amount of time, which is not compatible with any business. What this means is that you need a team working on it concurrently and testing it along the way. If everyone in the team were doing this two-separate-codes thing, I don't think the project would go anywhere.

Let me put it another way. In my opinion, somewhere along the line of developers being paid for a project there must be at least one who knows exactly what must be done, how it has to be done and what tools are needed, and who is possibly also able to do it alone given enough time. It is considered healthy that such a person is among those paid the most and not the least. If such a person is around, he doesn't simply throw stuff at you to figure out by yourself, unless you are in R&D and you are not expected to produce any useful result. If such a person is around, he knows your capabilities and limitations and how much you can handle alone or in a team.

So, reframing what I already wrote in my previous post, I think this is OK in a context of exploration, for example if you are paid to explore a matter for which the company you work for has limited or no knowledge at all.


All times are GMT -4.