CFD Online Logo CFD Online URL
www.cfd-online.com
[Sponsors]
Home > Forums > General Forums > Main CFD Forum

how do we unify cpu+gpu development?

Register Blogs Community New Posts Updated Threads Search

Like Tree4Likes
  • 1 Post By sbaffini
  • 2 Post By LuckyTran
  • 1 Post By LuckyTran

 
 
LinkBack Thread Tools Search this Thread Display Modes
Prev Previous Post   Next Post Next
Old   March 25, 2022, 02:25
Default how do we unify cpu+gpu development?
  #1
Senior Member
 
Sayan Bhattacharjee
Join Date: Mar 2020
Posts: 495
Rep Power: 8
aerosayan is on a distinguished road
Hello everyone,

Developing separate code bases for CPUs and GPUs might be a limiting factor in future.

OpenHyperflow for example, has separate branches setup for its CPU and CUDA code : https://github.com/sergeas67/OpenHyperFLOW2D

Maintaining such a codebase becomes difficult in the long run, and if possible it would be good to have the same code run on both CPUs and GPUs.

Is this a worthwhile effort?

Another dev asked this same question before : https://stackoverflow.com/questions/9631833/ and the main answer stated that due to the hardware architectural differences, the codes needs to be optimized for the device it runs on.

That's true. The differences will be even more severe as the GPU architecture evolves in the future. CPU architecture has been more or less stable for one or two decades. But GPU architecture is continuously improving and that will result in way more different coding strategies for both.

So I don't know if it's possible to have the same code run on CPUs and GPUs.

Should we just stick with MPI/OpenMP based parallelization for CPUs?

OpenMP can now technically be used for coding GPUs : https://stackoverflow.com/questions/28962655/ but since this new feature is not being widely used today, we still don't know what are the limiting factors.

Currently I can only see one strategy that might be useful : represent the equations in standard BLAS or LAPACK form, and use libraries that are either CPU/GPU accelerated.

For example, we can create separate CPU/GPU accelerated function to add two arrays of floats. Then we would just need to use the appropriate function in our code.

I'm looking for other strategies that might be good, but this is the only strategy which looks feasible to me currently. Since we need parallelization to do large amounts of work, it makes sense to write our code to allow better parallelization of these simple but very widely used mathematical methods.

Some other non BLAS/LAPACK example of such a widely used method would be Gauss-Siedel Elimination, LU decomposition, Cholesky factorization, Fast Fourier Transforms etc....

Is this the correct approach? What can we do better?

Thanks
~sayan
aerosayan is offline   Reply With Quote

 


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are Off
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
General recommendations for CFD hardware [WIP] flotus1 Hardware 18 February 29, 2024 12:48
GPU acceleration in Ansys Fluent flotus1 Hardware 63 May 12, 2023 02:48
[Resolved] GPU on Fluent Daveo643 FLUENT 4 March 7, 2018 08:02
Superlinear speedup in OpenFOAM 13 msrinath80 OpenFOAM Running, Solving & CFD 18 March 3, 2015 05:36
Star cd es-ice solver error ernarasimman STAR-CD 2 September 12, 2014 00:01


All times are GMT -4. The time now is 05:18.