CFD Online Discussion Forums

C++ Vs Fortran (https://www.cfd-online.com/Forums/main/16404-c-vs-fortran.html)

jughead March 2, 2009 12:56

Re: C++ Vs Fortran
 
I am more proficient in Fortran and C so it would be easy for me to say that C++ is slower and not worth using and so on.

If you want to do OOP with C or Fortran you might as well go with C++.


Diablo March 2, 2009 15:53

Re: C++ Vs Fortran
 
How about using Python? Both Fortran and C code can be called from it, and it lends itself to parallel computing and to making use of GPGPU technology.

Jed March 4, 2009 06:07

Re: C++ Vs Fortran
 
This is mostly FUD. Whether a code is object oriented or not is mostly orthogonal to performance. It is easy to write C++ code which is inefficient or for which it is difficult to reason about performance, but that does not mean that OO is necessarily slow. Having extremely fine-grained objects is often inefficient, because it may put more stress on memory bandwidth or because of function-call overhead.

As an example of excellent performance from modern C++, see Eigen2. It may be very hard to write a library like this or to reason about its performance because the compiler is doing so much manipulation, but it does use all the C++ features you claim are slow.

Assuming you are familiar with all languages involved, my advice is to write in C or C++ and use objects extensively for high-level components, but not use objects for the fine-grained stuff like individual elements of a mesh.
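
A minimal sketch of that advice, assuming a hypothetical <code>Mesh</code> class (the name and its methods are illustrative, not from any library). The mesh is a single high-level object and the nodes live in flat arrays, so there are no per-node objects:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical coarse-grained mesh: one object owns flat coordinate arrays,
// so there are no per-node objects and no per-node indirect calls.
class Mesh {
public:
    explicit Mesh(std::size_t n) : x_(n, 0.0), y_(n, 0.0) {}

    std::size_t size() const { return x_.size(); }

    void set_node(std::size_t i, double x, double y) {
        assert(i < size());
        x_[i] = x;
        y_[i] = y;
    }

    // Whole-mesh operation: a single call sweeps contiguous memory.
    void translate(double dx, double dy) {
        for (std::size_t i = 0; i < size(); ++i) {
            x_[i] += dx;
            y_[i] += dy;
        }
    }

    double x(std::size_t i) const { return x_[i]; }
    double y(std::size_t i) const { return y_[i]; }

private:
    std::vector<double> x_, y_;  // structure-of-arrays storage
};
```

A caller would build a <code>Mesh</code> once and invoke whole-mesh operations such as <code>translate</code>, so the cost of any call is amortized over all nodes.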

Jed March 4, 2009 06:18

Re: C++ Vs Fortran
 
"considering all things, a plain Fortran code usually performs a bit faster than a plain CPP code under the same conditions"

The only practical reason for this is pointer aliasing (otherwise the compiler's intermediate representation may be identical for a given algorithm implemented in C, C++, and Fortran). C99 adds the <code>restrict</code> keyword, which nullifies this issue; you can confirm this by comparing the generated assembly.

I agree that understanding the hardware (especially memory) is much more important to high performance than which language or compiler you are using. Also note that SSE intrinsics and CUDA/OpenCL are easier to use from C.

jughead March 4, 2009 09:19

Re: C++ Vs Fortran
 
Python is slow. I think it can be used to build test scripts or something on top of the Fortran/C++ program.


Tom March 5, 2009 04:52

Re: C++ Vs Fortran
 
"As an example of excellent performance from modern C++, see Eigen2. It may be very hard to write a library like this or to reason about its performance because the compiler is doing so much manipulation, but it does use all the C++ features you claim are slow."

So where's the equivalent optimized FORTRAN in those benchmarks? Also, how well does the same code perform on a non-Intel machine such as an NEC or IBM without any changes?

You're giving compilers too much credit for being able to fix/optimize the code. My example of comparing vanilla FORTRAN with a program (also in FORTRAN) using derived types and operator overloading (two aspects of OOP) demonstrates that the compiler cannot rearrange the code to be optimal. The same thing happens when you compare C and C++. This has nothing to do with "badly" written code - it's an aspect of the compiler not being able to figure out what the best rearrangement of the code is in the more complicated case.


Jed March 5, 2009 06:38

Re: C++ Vs Fortran
 
"So where's the equivalent optimized FORTRAN in those benchmarks?"

It's comparing favorably to vendor-optimized assembly. Are you claiming that Intel's MKL could be sped up by rewriting it in vanilla Fortran?

"using derived types and operator overloading (two aspects of OOP)"

Indirect function calls (virtual functions) add fewer than 10 cycles of overhead on modern hardware. If you use very fine-grained derived types and address them through the generic interface (i.e. you don't put in the extra effort to resolve the calls statically) then this overhead may be significant, but using objects for your matrix, vector, preconditioner, Krylov accelerator, Newton solver, and mesh types does not incur any measurable performance overhead. You do several (like 8 or more) orders of magnitude more work each time you make these calls. Also, a preconditioner application or matrix-multiply usually calls MPI, and every MPI implementation uses such polymorphism (they are written in C using object-oriented design), which is still negligible compared to hitting the network or performing a system call.

If you use fine-grained objects, such as for the components of velocity at a point, or for each cell/face/vertex of a mesh then you'll have to be sure that all function calls are resolved statically (at compile time) and preferably inlined, otherwise performance will suffer. Even so, storing a mesh in this form will generally use more space, hence slow the application for memory bandwidth limited operations.
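
To make the distinction concrete, here is a small sketch with hypothetical names (<code>PointOp</code>, <code>Square</code>, <code>SquareFn</code>, <code>map_dynamic</code>, <code>map_static</code> are all made up for illustration). The first loop goes through a virtual call per point; the second takes the operation as a template parameter, so the call is resolved at compile time and can be inlined:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dynamic dispatch: apply() is resolved at run time through the vtable.
struct PointOp {
    virtual ~PointOp() = default;
    virtual double apply(double u) const = 0;
};

struct Square final : PointOp {
    double apply(double u) const override { return u * u; }
};

// One indirect call per point unless the compiler manages to devirtualize.
inline void map_dynamic(const PointOp& op, const std::vector<double>& u,
                        std::vector<double>& out) {
    assert(out.size() == u.size());
    for (std::size_t i = 0; i < u.size(); ++i) out[i] = op.apply(u[i]);
}

// Static dispatch: Op is a compile-time type, so op(u[i]) can be inlined.
struct SquareFn {
    double operator()(double u) const { return u * u; }
};

template <class Op>
void map_static(Op op, const std::vector<double>& u, std::vector<double>& out) {
    assert(out.size() == u.size());
    for (std::size_t i = 0; i < u.size(); ++i) out[i] = op(u[i]);
}
```

Both versions compute the same result; whether the difference is measurable depends on how little work each call does, so it is worth checking the generated assembly rather than assuming.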

Also note that operator overloading has absolutely nothing to do with OOP and does not incur any overhead compared to the equivalent named functions (though I dislike it for other reasons). It is equivalent to function overloading which (in C++) just writes the types of the arguments into the name of the function (look at the symbols in the object file). Such functions are still statically resolved and incur zero overhead (except with shared libraries in which case the loader needs to resolve longer symbol names, but that only happens once and is irrelevant for scientific codes).

Operator overloading has a bad rap because it would appear to prevent rearranging computations so that lots of intermediate objects are not created. For instance, consider the expression <code>r = (a+b)*(c+d)</code> where <code>a,b,c,d,r</code> are vectors and the operation is to be performed pointwise. Naive operator overloading will create as many as three temporary vectors, putting obscene pressure on the memory bus when the vectors are large, hence performing very poorly. Eigen2 uses a technique called expression templates which forces the compiler to perform the necessary transformations and produce the tight loop with no temporary vectors. Expression templates are reasonably advanced use of the C++ type system and can be tricky to debug (the error messages are hideous) but they are a prime counterexample to your claim.
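
A stripped-down sketch of the idea follows. This is not Eigen2's implementation; <code>Expr</code>, <code>Vec</code>, <code>Sum</code> and <code>Prod</code> are hypothetical names for illustration. The operators build lightweight expression nodes that hold only references, and the assignment runs one fused loop:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base marking a type as a lazily evaluated vector expression.
template <class E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Concrete vector; assigning an expression to it runs a single loop.
class Vec : public Expr<Vec> {
public:
    explicit Vec(std::size_t n, double v = 0.0) : data_(n, v) {}
    std::size_t size() const { return data_.size(); }
    double operator[](std::size_t i) const { return data_[i]; }
    double& operator[](std::size_t i) { return data_[i]; }

    template <class E>
    Vec& operator=(const Expr<E>& e) {
        const E& expr = e.self();
        assert(expr.size() == size());
        for (std::size_t i = 0; i < size(); ++i) data_[i] = expr[i];
        return *this;
    }

private:
    std::vector<double> data_;
};

// Expression nodes store references only, so no temporary vector is allocated.
// Caveat: a node must not outlive the full expression that created it.
template <class L, class R>
class Sum : public Expr<Sum<L, R>> {
public:
    Sum(const L& l, const R& r) : l_(l), r_(r) { assert(l.size() == r.size()); }
    std::size_t size() const { return l_.size(); }
    double operator[](std::size_t i) const { return l_[i] + r_[i]; }
private:
    const L& l_;
    const R& r_;
};

template <class L, class R>
class Prod : public Expr<Prod<L, R>> {
public:
    Prod(const L& l, const R& r) : l_(l), r_(r) { assert(l.size() == r.size()); }
    std::size_t size() const { return l_.size(); }
    double operator[](std::size_t i) const { return l_[i] * r_[i]; }  // pointwise
private:
    const L& l_;
    const R& r_;
};

template <class L, class R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}

template <class L, class R>
Prod<L, R> operator*(const Expr<L>& l, const Expr<R>& r) {
    return Prod<L, R>(l.self(), r.self());
}
```

With this in place, <code>r = (a + b) * (c + d);</code> evaluates <code>(a[i] + b[i]) * (c[i] + d[i])</code> element by element inside the assignment loop. A real library adds much more (aliasing checks, vectorization, storing sub-expressions safely), which is where the hideous error messages come from.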

Tom March 5, 2009 07:47

Re: C++ Vs Fortran
 
"It's comparing favorably to vendor-optimized assembly. Are you claiming that Intel's MKL could be sped up by rewriting it in vanilla Fortran?"

The Intel MKL routines are not in assembly language; they're written in C, as I recall, with compiler directives inserted and some hand optimization (loop unrolling etc.) plus "trimming" of the resulting assembly language. None of the routines are written in pure machine code! And actually, for most of the operations they are benchmarking, the Intel Fortran compiler would produce code as fast (or faster) in just a few lines. Try it! Most of these operations are relatively simple to write in straight FORTRAN.

If all of these features are so efficient and compilers are so good, why do people (myself included) spend time adjusting/optimizing very large FORTRAN/C programs every few years when we purchase a new supercomputer? Running relatively small codes on PCs is clouding your judgement - you need to look at portability across different processor architectures.

If large CFD codes in FORTRAN need hand optimizing to help the compiler on different platforms how much work do you think C++ code with all the bells and whistles would take?

And if you want a "proper" OOP language you should use Ada and not C++:)


Jed March 5, 2009 09:40

Re: C++ Vs Fortran
 
"they're written in C, as I recall, with compiler directives inserted and some hand optimization"

SSE intrinsics are almost the same as assembly. Such code is definitely optimized by reading the produced assembly; it's just often easier to write and maintain such assembly by having the compiler produce it from C+intrinsics.

"the Intel Fortran compiler would produce code as fast (or faster) in just a few lines."

Yes, the BLAS is easy to implement in Fortran (or just download it from netlib), but people still use vendor BLAS because it's a hell of a lot faster than compiling the reference BLAS (F77), even with good compilers. After the usual loop transformations and tuning your block/tile sizes, you'll likely end up with something similar to ATLAS, which isn't that good. Without SSE-style intrinsics or other compiler extensions, you don't have a way to exploit alignment and vector instructions beyond what the compiler can do automatically.

Note: BLAS beyond level 1 is irrelevant for most CFD applications since we don't have large dense matrices (if you do then you fail the scalability test). Level 1 is memory bandwidth limited for large vector sizes so your choice of BLAS has little impact on performance. Eigen2 is just an example of modern C++ delivering excellent performance (possibly better than any existing Fortran implementation).

"Running relatively small codes on PCs is clouding your judgement"

My experience is limited to sub-4000 CPU machines with x86, x86_64, and POWER4/5. While it's not all possible architectures, they are the most widely used today. But, the architecture has nothing to do with language choice or OOP, provided reasonable compilers are available (they are, on every viable HPC platform for each language we are discussing). Optimizing for such architectures has everything to do with the memory architecture, which you deal with in the same way from each language. Furthermore, parallel scalability is all about the algorithm, hence the language is doubly irrelevant.

"If large CFD codes in FORTRAN need hand optimizing to help the compiler on different platforms how much work do you think C++ code with all the bells and whistles would take?"

The most relevant optimizations concern the memory hierarchy and are orthogonal to which language or design principles are being used. Good software design will make it easier to identify the hotspots and maintain hand-tuned versions for different architectures and compilers. This can extend arbitrarily deeply, for example you may have a structure (e.g. matrix, mesh, something needed by the preconditioner) which needs to be stored differently to exploit the memory architecture on the BG/P versus the XT5. With appropriate use of OOP, it is not difficult to keep both versions around and verify their correctness as the code evolves. If you are not using OO design, then many parts of your code may depend on the storage format in which case you essentially fork the project when optimizing for each architecture.
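
As a rough sketch of what keeping both versions around can look like, here are two storage layouts behind one interface. All names (<code>LinearOperator</code>, <code>RowMajorMatrix</code>, <code>ColMajorMatrix</code>) are hypothetical, and a dense matrix stands in for whatever structure actually needs tuning:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical abstract operator: callers see apply(), never the storage.
class LinearOperator {
public:
    virtual ~LinearOperator() = default;
    // Computes y = A * x.
    virtual void apply(const std::vector<double>& x,
                       std::vector<double>& y) const = 0;
};

// Layout 1: entries of a row are contiguous.
class RowMajorMatrix final : public LinearOperator {
public:
    RowMajorMatrix(std::size_t m, std::size_t n) : m_(m), n_(n), a_(m * n, 0.0) {}
    void set(std::size_t i, std::size_t j, double v) { a_[i * n_ + j] = v; }
    void apply(const std::vector<double>& x,
               std::vector<double>& y) const override {
        assert(x.size() == n_ && y.size() == m_);
        for (std::size_t i = 0; i < m_; ++i) {
            double s = 0.0;
            for (std::size_t j = 0; j < n_; ++j) s += a_[i * n_ + j] * x[j];
            y[i] = s;
        }
    }
private:
    std::size_t m_, n_;
    std::vector<double> a_;
};

// Layout 2: entries of a column are contiguous; same interface, different loops.
class ColMajorMatrix final : public LinearOperator {
public:
    ColMajorMatrix(std::size_t m, std::size_t n) : m_(m), n_(n), a_(m * n, 0.0) {}
    void set(std::size_t i, std::size_t j, double v) { a_[j * m_ + i] = v; }
    void apply(const std::vector<double>& x,
               std::vector<double>& y) const override {
        assert(x.size() == n_ && y.size() == m_);
        for (std::size_t i = 0; i < m_; ++i) y[i] = 0.0;
        for (std::size_t j = 0; j < n_; ++j)
            for (std::size_t i = 0; i < m_; ++i) y[i] += a_[j * m_ + i] * x[j];
    }
private:
    std::size_t m_, n_;
    std::vector<double> a_;
};
```

Code that only holds a <code>const LinearOperator&</code> does not change when the layout does, and a regression test can fill both variants with the same entries and compare the outputs of <code>apply</code>. The one virtual call per <code>apply</code> is amortized over the whole matrix-vector product.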

I agree with you that a weakness of "modern" C++ relative to C and Fortran is that it's more difficult to reason about performance, but this is not because of compiler inadequacies.

"And if you want a "proper" OOP language you should use Ada and not C++:)"

I'm not a fan of C++ or its object system, but it's not because of performance.

Tom March 6, 2009 06:15

Re: C++ Vs Fortran
 
My final comments on this:-

"SSE intrinsics are almost the same as assembly. Such code is definitely optimized by reading the produced assembly, it's just often easier to write and maintain such assembly by having the compiler produce it from C+intrinsics."

But you don't need to add SSE intrinsics yourself; the FORTRAN compiler will do it for you. A lot of compiler optimizations are actually just hints to the compiler (i.e. compiler directives) which look like comment lines (a bit like OpenMP). Nobody should need to write assembly language to do their scientific computing (especially with FORTRAN :).

Also the use of BLAS etc can actually cause code to run slower; i.e. if you write the code to use the BLAS library excessively you will miss many optimizations for your specific problem.

"My experience is limited to sub-4000 CPU machines with x86, x86_64, and POWER4/5"

Why bring the GHz rating into this? It's only a meaningful number if you are talking about the same processor family! I've used a 500 MHz machine whose CPUs are more than twice as fast as a 2 GHz x86 - in real terms, not benchmarks. The number's not even meaningful when comparing an x86 to a P5.

"The most relevant optimizations conc ..."

Yes, memory optimizations are important, but your example is misleading - we usually know the optimal order a priori in FORTRAN and so it rarely changes between platforms. What usually happens is we either waste some temporary memory so that the loops can be rearranged for pipelining or try to reduce it for caching. In many cases the FORTRAN compiler will try to do some of this for you.

"I agree with you that a weakness of "modern" C++ relative to C and Fortran is that it's more difficult to reason about performance, but this is not because of compiler inadequacies."

I didn't say it was an inadequacy of the compiler; I said you were expecting too much from it - each language has its strengths and weaknesses.

It is amusing, though, that you need to download all these classes and libraries to make C++ as fast as pure, out-of-the-tin FORTRAN for scientific computing.

Jed March 6, 2009 08:58

Re: C++ Vs Fortran
 
"But you don't need to add SSE intrinsics yourself; the FORTRAN compiler will do it for you."

Such intrinsics can exploit knowledge that is not available from Fortran, such as that a particular array will always be 16-byte aligned. You can also give application-aware prefetch instructions.
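
A small sketch of passing the alignment fact to the compiler. Note this uses a compiler hint, not the SSE intrinsics themselves, and it assumes GCC or Clang for <code>__builtin_assume_aligned</code>; the macro <code>ASSUME_ALIGNED_16</code> and the function <code>scale_aligned</code> are made-up names, and on other compilers the macro falls back to doing nothing:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// GCC/Clang builtin that tells the optimizer the pointer is 16-byte aligned.
// On other compilers this hypothetical macro is a no-op.
#if defined(__GNUC__) || defined(__clang__)
#define ASSUME_ALIGNED_16(p) \
    static_cast<double*>(__builtin_assume_aligned((p), 16))
#else
#define ASSUME_ALIGNED_16(p) (p)
#endif

// Scales data[0..n) in place. The caller promises 16-byte alignment;
// breaking that promise is undefined behavior, hence the debug check.
inline void scale_aligned(double* data, std::size_t n, double alpha) {
    assert(reinterpret_cast<std::uintptr_t>(data) % 16 == 0);
    double* p = ASSUME_ALIGNED_16(data);
    for (std::size_t i = 0; i < n; ++i) p[i] *= alpha;
}
```

The array passed in would be declared with something like <code>alignas(16)</code>. Whether the hint changes the generated code depends on the compiler and target, so compare the assembly with and without it.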

"Also the use of BLAS etc can actually cause code to run slower"

It's well known that BLAS is terrible for small matrices and vectors, but if your objects are large and the operation you are doing is exactly a BLAS operation (not a variant or combination), then the BLAS implementation is broken if it doesn't perform at least as well as your code.

"Why bring the GHz rating into this?"

I didn't; that was a processor count.

"we usually know the optimal order a priori in FORTRAN"

Which order is better has nothing to do with it being Fortran, C, or otherwise. A C compiler has all the same opportunities for loop optimization as the Fortran compiler has; in many cases, the intermediate representation that the optimizer works with is identical.

"each language has its strengths and weaknesses"

Yes, but in this case the differences are mostly orthogonal to performance; they have to do with maintainability (this includes who knows the language), library support, etc.

Considering that the intermediate representation is typically the same, Fortran's historic edge comes from explicitly prohibiting aliasing while C and C++ allow it. C99 adds the <code>restrict</code> qualifier which nullifies this advantage. Although <code>restrict</code> is not in the C++ standard, many compilers support it.
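
For illustration, a sketch of the same promise written in C++ using the common compiler extensions. The <code>RESTRICT</code> macro and the <code>axpy</code> function are hypothetical names; <code>__restrict__</code> (GCC/Clang) and <code>__restrict</code> (MSVC) are extensions, not standard C++, and the macro expands to nothing elsewhere:

```cpp
#include <cassert>
#include <cstddef>

// `restrict` is C99, not standard C++; pick the extension spelling if available.
#if defined(__GNUC__) || defined(__clang__)
#define RESTRICT __restrict__
#elif defined(_MSC_VER)
#define RESTRICT __restrict
#else
#define RESTRICT
#endif

// y[i] += alpha * x[i]. The qualifier promises that x and y do not overlap,
// so the compiler need not reload x[i] after each store to y[i].
// Calling this with overlapping arrays breaks the promise (undefined behavior).
inline void axpy(std::size_t n, double alpha,
                 const double* RESTRICT x, double* RESTRICT y) {
    assert(x != nullptr && y != nullptr);
    for (std::size_t i = 0; i < n; ++i) y[i] += alpha * x[i];
}
```

Compiling this with and without the qualifier and comparing the generated assembly is the check suggested earlier in the thread.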

kenn k.q. zhang March 6, 2009 14:00

Re: C++ Vs Fortran
 
i used fortran a lot last century, and i couldn't stand my friends in other fields, electrical engineering for example, laughing at me;

so i gave it up;

now i have been using c++ for 8 years, but i virtually never use oop and definitely no pointers,

and my code looks so beautiful -- at least i think so myself

kenn k.q. zhang March 6, 2009 14:19

Re: C++ Vs Fortran
 
i have been using c++ for 8 years, and

1. never used any pointers

2. virtually never used any oop

and my codes are no slower than fortran;

basically, for scientific computing, you may choose one language from the following three

1. assembly, which will always be the fastest but you have to build all simple operations/functions by yourself

2. high-level language, such as fortran77 (if this is not for numerical work, a good example could be java). a high-level language is generally easy to operate but lacks some functionality

3. middle-level language, such as c, which is a balance between functionality & ease of use

c++ is an upgrade of c, but you can turn off the ++ part of c++. as i said, i virtually never used oop

i like fortran77, it was small, neat, and strict -- so that it's less error prone;

but i hate fortran 90, 95, 2003, and so on and so forth. what's the point of these things? if you want efficiency, go to assembly, if you want simplicity for numerics, go to fortran77, if you want the balance & transition to people in other fields, go to c++

what the hell is the unique stuff of fortran 90/95???

and remember, 10% of performance advantage in scientific computing is nothing; even twice faster is NOTHING. because, they are still in the same order of magnitude!!!

that fortran is promoting ugly programming is simple to understand:

it tolerates bad programming, so it is more error prone


kenn k.q. zhang March 6, 2009 14:25

Re: C++ Vs Fortran
 
in scientific programming, even 200% of improvement in speed means nothing.

because

1. it's in the same order of magnitude

2. by choosing a better numerical method, your code could be 10 times faster. so, get that benefit first; then if you are greedy, get the faster compiler

3. easy to program, easy to debug, easy to maintain, easy to scale up, easy to communicate with others in other fields, are a lot more important. that's why the "world language" was invented long long ago by linguists but now we are all using english in scientific research

kenn k.q. zhang March 6, 2009 14:26

Re: C++ Vs Fortran
 
i forgot to say "world language" was much more scientific than english.

kenn k.q. zhang March 6, 2009 14:31

Re: C++ Vs Fortran
 
sorry, "world language" is not a good example, and lacks logical connection to what we are debating.

but my point is, ease of communication with people in other fields is a big factor in choosing a computer language to use.

kenn k.q. zhang March 6, 2009 14:34

Re: C++ Vs Fortran
 
that was A typo,

i meant fortran was invented in 1950s by ibm; c++ was invented in 1970s by sb;

but the "s" after "1970" was lost somehow

kenn k.q. zhang March 6, 2009 14:37

Re: C++ Vs Fortran
 
i heard about fortran guys converted into c++

but i never heard of c++ guys converted into fortran

it tells everything

kenn k.q. zhang March 6, 2009 14:39

Re: C++ Vs Fortran
 
please stop using fortran,

otherwise the community of cfd will continue to be discriminated against by other communities, such as ieee

kenn k.q. zhang March 6, 2009 14:50

Re: C++ Vs Fortran
 
10% improvement in speed means nothing;

even 200% improvement in speed means NOTHING

because

1. codes in scientific computing are not intended for real-time situations

2. by selecting a better numerical method, your code could be 10 times faster; for example, multigrid is a lot faster, have you used it? spectral is a lot faster, have you used it???

3. parallelization can improve the speed of your code by 10 times, 100 times -- as long as you have resources and proper algorithms/implementations

4. how to save your time during development and maintenance of codes is a lot more significant. you have to understand, most of the codes we develop will eventually be thrown into trash cans. so code development is a temporary thing; focus on how to save your own time, not the computer's.



All times are GMT -4.