CFD Online Discussion Forums

CFD Online Discussion Forums (http://www.cfd-online.com/Forums/)
-   Main CFD Forum (http://www.cfd-online.com/Forums/main/)
-   -   GPU based CFD code (http://www.cfd-online.com/Forums/main/78159-gpu-based-cfd-code.html)

harry July 14, 2010 07:19

GPU based CFD code
 
Is there any research on developing FVM-based codes for GPU computing environments?

gocarts July 14, 2010 08:59

GPU CFD
 
Yes, check out SpeedIT Toolkit 0.9 for OpenFOAM.

arjun July 14, 2010 09:13

Quote:

Originally Posted by gocarts (Post 267241)

Have you guys implemented the BiCGStab method?
If yes, how was the performance?

From the link you posted, a quick glance shows only matrix-vector product timings.

How about AMG on the GPU?


PS: I recently tried BiCGStab but did not observe much speedup (though my GPU has only 110 cores; it is a GTS 240).
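
For readers unfamiliar with the method, here is a minimal CPU sketch of unpreconditioned BiCGStab in C++. It is not SpeedIT's code or any GPU implementation: the `Tridiag` type and the `multiply`, `dot`, and `bicgstab` names are hypothetical, chosen only for illustration, and there is no breakdown handling. The matrix-vector products and dot products are the parts a GPU version would typically offload.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Hypothetical toy matrix: n x n tridiagonal with constant coefficients.
struct Tridiag {
    std::size_t n;
    double lower, diag, upper;
};

// y = A * x. Each row is independent of the others.
Vec multiply(const Tridiag& A, const Vec& x) {
    Vec y(A.n, 0.0);
    for (std::size_t i = 0; i < A.n; ++i) {
        y[i] = A.diag * x[i];
        if (i > 0)       y[i] += A.lower * x[i - 1];
        if (i + 1 < A.n) y[i] += A.upper * x[i + 1];
    }
    return y;
}

double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Unpreconditioned BiCGStab starting from x = 0.
// Returns the iteration count on convergence, 0 if b is zero, -1 otherwise.
int bicgstab(const Tridiag& A, const Vec& b, Vec& x, double tol, int max_iter) {
    const std::size_t n = A.n;
    assert(b.size() == n);
    x.assign(n, 0.0);
    Vec r = b;            // r0 = b - A*x0 with x0 = 0
    const Vec r_hat = r;  // fixed shadow residual
    Vec p(n, 0.0), v(n, 0.0), s(n), t;
    double rho = 1.0, alpha = 1.0, omega = 1.0;
    const double b_norm = std::sqrt(dot(b, b));
    if (b_norm == 0.0) return 0;

    for (int it = 1; it <= max_iter; ++it) {
        const double rho_new = dot(r_hat, r);
        const double beta = (rho_new / rho) * (alpha / omega);
        for (std::size_t i = 0; i < n; ++i)
            p[i] = r[i] + beta * (p[i] - omega * v[i]);
        v = multiply(A, p);
        alpha = rho_new / dot(r_hat, v);
        for (std::size_t i = 0; i < n; ++i) s[i] = r[i] - alpha * v[i];
        // Early exit: the half step already reached the tolerance.
        if (std::sqrt(dot(s, s)) <= tol * b_norm) {
            for (std::size_t i = 0; i < n; ++i) x[i] += alpha * p[i];
            return it;
        }
        t = multiply(A, s);
        omega = dot(t, s) / dot(t, t);
        for (std::size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i] + omega * s[i];
            r[i] = s[i] - omega * t[i];
        }
        if (std::sqrt(dot(r, r)) <= tol * b_norm) return it;
        rho = rho_new;
    }
    return -1;  // did not converge within max_iter
}
```

To try it, build a small diagonally dominant system such as `Tridiag A{20, -1.0, 4.0, -2.0}`, call `bicgstab(A, b, x, 1e-10, 200)`, and check the residual `b - multiply(A, x)` yourself. A preconditioned variant (for example with AMG, as discussed in this thread) adds two preconditioner applications per iteration.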

Dinocrack July 14, 2010 12:07

Sorry for my question, but what are the main differences between FVM-based codes written for GPU computing environments and those written for CPU computing environments?

If we want to write a code for a GPU environment, what are the main points to keep in mind?

Is there some literature?

Thanks very much

Nuno

harry July 14, 2010 19:42

Quote:

Originally Posted by gocarts (Post 267241)

This is very interesting work! It is not clear which operating systems this linear equation solver supports: Linux, Windows, or both.
I hope we can see its complete release soon.

Cheers,
Harry

arjun July 14, 2010 21:19

Quote:

Originally Posted by Dinocrack (Post 267267)
Sorry for my question, but what are the main differences between FVM-based codes written for GPU computing environments and those written for CPU computing environments?

If we want to write a code for a GPU environment, what are the main points to keep in mind?

Is there some literature?

Thanks very much

Nuno


Here are the basic rules of this game:

A) You have many cores that take up the task and do it. You divide them into troops, and each troop takes a part of the job and works on it.
Within each troop you have workers, and each worker can work on something. (In CUDA terms, the troops are thread blocks and the workers are threads.)

B) All the workers do the same thing. Imagine that you want to calculate
C[i] = A[i] + B[i];
Each worker does this; the only difference is that each one uses a different i, but the action is the same.


C) Each worker should not interfere with another worker's work. In other words, the work each worker does is independent of the others.

For example, c[i] = a[i] + b[i] does not depend on any other element's value, so it can be programmed on a GPU.

But for c[i] = c[i] + c[i-1] + c[i-2]

c[i] depends on the i-1 and i-2 values. Imagine that while the worker on i is running, the workers on i-1 and i-2 have not finished their jobs. This will cause errors in the results, so it is not directly parallelizable.

There are some more basics, but this is the main idea behind it.
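
To make rules B and C concrete, here is a small C++ sketch that emulates them on the CPU. It is only an illustration, not GPU code: the function names are hypothetical, each loop iteration stands in for one worker, and the "all at once" version models just one possible outcome of a real race (every worker reading its neighbours before any of them has been updated).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Independent work: worker i touches only element i, so the order in
// which the workers run cannot change the result.
std::vector<double> add_elementwise(const std::vector<double>& a,
                                    const std::vector<double>& b) {
    assert(a.size() == b.size());
    std::vector<double> c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {  // on a GPU: one worker per i
        c[i] = a[i] + b[i];
    }
    return c;
}

// Sequential reference for c[i] = c[i] + c[i-1] + c[i-2]:
// step i sees the already updated values at i-1 and i-2.
std::vector<double> recurrence_sequential(std::vector<double> c) {
    for (std::size_t i = 2; i < c.size(); ++i) {
        c[i] = c[i] + c[i - 1] + c[i - 2];
    }
    return c;
}

// Emulation of all workers running at the same time: each worker reads
// the old neighbour values from a snapshot, because nobody has finished yet.
std::vector<double> recurrence_all_at_once(const std::vector<double>& c_old) {
    std::vector<double> c(c_old);
    for (std::size_t i = 2; i < c.size(); ++i) {
        c[i] = c_old[i] + c_old[i - 1] + c_old[i - 2];
    }
    return c;
}
```

For the input {1, 1, 1, 1, 1}, the sequential version gives {1, 1, 3, 5, 9}, while the all-at-once emulation gives {1, 1, 3, 3, 3}. The element-wise addition gives the same answer either way, which is why it maps directly onto a GPU and the recurrence does not.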

andrea.pasquali September 14, 2010 11:53

Hi,
I'm interested in using GPUs with OpenFOAM.
I installed the SpeedIT Classic version, but I have a problem, which you can see here:

http://www.cfd-online.com/Forums/ope...-openfoam.html

Could anyone help me?

Thanks


Andrea

Lukasz May 22, 2012 10:12

Quote:

Originally Posted by arjun (Post 267246)
Have you guys implemented the BiCGStab method?
If yes, how was the performance?

From the link you posted, a quick glance shows only matrix-vector product timings.

How about AMG on the GPU?

We have just released SpeedIT 2.1 with an AMG preconditioner.
See our benchmarks at vratis.com/blog & speed-it.vratis.com for details.

arjun May 22, 2012 20:23

Quote:

Originally Posted by Lukasz (Post 362473)
We have just released SpeedIT 2.1 with an AMG preconditioner.
See our benchmarks at vratis.com/blog & speed-it.vratis.com for details.


Just one small comment. I think your comparison with GAMG is not fair. Smoothed aggregation is very fast compared to GAMG, so if you run both on CPUs you will also see that the solver with smoothed aggregation is faster.

Lukasz May 23, 2012 03:49

Could you be more specific and provide some examples where CG+AMG is indeed faster than GAMG? We did some tests with icoFoam and cavity3D, and with simpleFoam for the Ahmed body and Cabin cases: http://vratis.com/blog/?page_id=2
In all these examples GAMG seemed to outperform other methods, although we could only compare with CG+DIC/diagonal on the CPU.
Our multi-GPU solution with AMG seems to be about 60% faster than OpenFOAM (1.6x acceleration of N GPUs vs. n CPU cores, where n is the number of CPU cores and N is the number of GPU cards, measured for various cases over the first 10 iterations).

arjun May 23, 2012 04:20

Quote:

Originally Posted by Lukasz (Post 362603)
Could you be more specific and provide some examples where CG+AMG is indeed faster than GAMG? We did some tests with icoFoam and cavity3D, and with simpleFoam for the Ahmed body and Cabin cases: http://vratis.com/blog/?page_id=2
In all these examples GAMG seemed to outperform other methods, although we could only compare with CG+DIC/diagonal on the CPU.
Our multi-GPU solution with AMG seems to be about 60% faster than OpenFOAM (1.6x acceleration of N GPUs vs. n CPU cores, where n is the number of CPU cores and N is the number of GPU cards, measured for various cases over the first 10 iterations).


I cannot at the moment (because I do not use OpenFOAM), but I have implemented smoothed aggregation, both single-matrix and coupled-matrix (u, v, w and p) versions. (iNavier uses smoothed-aggregation-preconditioned BiCGStab for pressure by default.) In my experience, BiCGStab preconditioned with smoothed aggregation is more than 2 times faster than simple AMG.
The gap gets much bigger as the mesh size increases (it could even be more than 5 times). If you really want to see the difference, try 5 million or more cells.

This is based on my experience with smoothed aggregation, classical AMG and BiCGStab.

Note: CG preconditioned with AMG is not that fast (it sounds strange, but it is true in practice).



Edited to add: You are looking at the timing of the whole Navier-Stokes solution while changing solvers, which I think is not linearly related to the matrix solver timing. You should compare only the convergence of the pressure equation and the time taken by the matrix solvers.

Lukasz May 24, 2012 13:36

This is interesting, and we will remember it for the future.
FYI, we have never used BiCGStab+AMG, as most of our solver's time was spent solving the pressure equation with CG. Solving u, v, w with BiCGStab took much less time: a few iterations per time step, compared with hundreds for the pressure equation.

BTW, we are just writing a paper about ARAEL, our new Navier-Stokes solver that we implemented completely on the GPU. Profiling analysis for several tests will be added there as well. If you are interested, I could send you a camera-ready version once it is ready.

cfdnewbie May 24, 2012 13:42

This is interesting, Lukasz. May I ask what type of solver it is? Compressible or incompressible? FV or something else?

Lukasz May 24, 2012 13:49

Incompressible for steady-state and transient flows. Take a look at this presentation for more details.

cfdnewbie May 24, 2012 13:52

Thanks a lot, very impressive.

arjun May 24, 2012 16:36

Quote:

Originally Posted by Lukasz (Post 362957)
This is interesting, and we will remember it for the future.
FYI, we have never used BiCGStab+AMG, as most of our solver's time was spent solving the pressure equation with CG. Solving u, v, w with BiCGStab took much less time: a few iterations per time step, compared with hundreds for the pressure equation.

I never solve u, v, w with AMG. It is not very efficient.

Quote:

Originally Posted by Lukasz (Post 362957)
BTW, we are just writing a paper about ARAEL, our new Navier-Stokes solver that we implemented completely on the GPU. Profiling analysis for several tests will be added there as well. If you are interested, I could send you a camera-ready version once it is ready.

I wrote a full Navier-Stokes solver on the GPU in 2010, but the code is for a single GPU. I did not release that version of iNavier because I could not find the time to create GPU versions of the rest of the things (like turbulence etc.).


PS: I cannot do the AMG creation part on the GPU. I know how the CUSP library does it, though.

Please do send it to me. I am definitely very interested. Thank you.

arjun May 24, 2012 16:37

Quote:

Originally Posted by Lukasz (Post 362960)
Incompressible for steady-state and transient flows. Take a look at this presentation for more details.


I am going to.


All times are GMT -4.