
[RapidCFD] Discussion thread on how to install and use RapidCFD



Old   March 24, 2018, 14:57
Default
  #41
New Member
 
Jurado
Join Date: Nov 2017
Posts: 22
Rep Power: 8
Jurado is on a distinguished road
Quote:
Originally Posted by satish000yadav View Post
Thank you so much friend! It worked.
You're welcome. If you manage to successfully run a computation, could you let me know, and also how you did it?

Old   March 24, 2018, 15:23
Default
  #42
New Member
 
Satish Yadav
Join Date: Mar 2018
Posts: 14
Rep Power: 8
satish000yadav is on a distinguished road
Quote:
Originally Posted by Jurado View Post
You're welcome. If you manage to successfully run a computation, could you let me know, and also how you did it?
Definitely.

Old   May 5, 2018, 15:31
Default
  #43
Member
 
Ashish Magar
Join Date: Jul 2016
Location: Mumbai, India
Posts: 81
Rep Power: 9
ashishmagar600 is on a distinguished road
Hello everyone. A quick reminder for anyone wanting to install RapidCFD with CUDA v8.0: please follow the steps below. If you are using the latest clone from Git, there is no need to add the patch for mathFunctions; that error has already been fixed.

Quote:
The solver compilation errors were traced to the sources for reading and formatting STL files, which failed to compile due to the updated version of flex shipped with recent Linux distros.

Please make the following change in:

$HOME/RapidCFD/RapidCFD-dev/src/triSurface/triSurface/interfaces/STL/readSTLASCII.L#L58
$HOME/RapidCFD/RapidCFD-dev/src/surfMesh/surfaceFormats/stl/STLsurfaceFormatASCII.L#L53

Code:

// from
#if YY_FLEX_SUBMINOR_VERSION < 34
// to
#if YY_FLEX_MINOR_VERSION < 6 && YY_FLEX_SUBMINOR_VERSION < 34
Thanks to Daniel Jasiński (daniel-jasinski on GitHub) for all the help.
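To see why the extra minor-version test matters, here is a small shell sketch of the two preprocessor conditions. The version numbers (2.6.4) are illustrative, matching the flex shipped with newer distros: flex 2.6.x has a subminor version well below 34, so the old subminor-only test wrongly enables the legacy workaround.

```shell
#!/bin/sh
# Illustrative flex version: 2.6.4, as shipped with newer Linux distros.
MINOR=6
SUBMINOR=4

# Old test from the .L files: subminor-only (4 < 34 fires even on flex 2.6.x)
old=skipped
if [ "$SUBMINOR" -lt 34 ]; then
    old=enabled
fi

# Patched test: only fires for flex 2.5.x with subminor < 34
new=skipped
if [ "$MINOR" -lt 6 ] && [ "$SUBMINOR" -lt 34 ]; then
    new=enabled
fi

echo "old check: workaround $old"   # wrongly enabled for flex 2.6.x
echo "new check: workaround $new"   # correctly skipped
```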

Cheers.

Please keep this space for posts regarding RCFD only.

Old   May 29, 2018, 10:38
Default
  #44
Member
 
Eric Daymo
Join Date: Feb 2015
Location: Gilbert, Arizona, USA
Posts: 48
Rep Power: 12
edaymo is on a distinguished road
Hello,

There have been a few updates to RapidCFD which address some of the items discussed in this thread.

All of the YY_FLEX fixes discussed in this thread are now part of the code base. Thus, RapidCFD will compile without errors (save the mpi.h error if OpenMPI is not installed in a ThirdParty directory) on Ubuntu 16.04 with CUDA 8.0.

If the compute capability is changed in wmake/rules/linux64Nvcc/c++ (e.g., change -arch=sm_30 to the appropriate CC value for your GPU), the application will now execute, and will also start up quickly on cards with CC > 3.0 (a couple of posts in this thread reference the long startup time). This change was made at the expense of disabling the read-only data cache for CC >= 3.5, so that further improvement remains an open issue. However, since compilation was typically done with CC = 3.0, the read-only data cache was not used anyway when RapidCFD was compiled with -arch=sm_30.
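For reference, the architecture switch described above is a one-line substitution. A hedged sketch follows, run here on a throwaway stand-in copy of the rule line rather than a real tree; in a real install the files are RapidCFD-dev/wmake/rules/linux64Nvcc/c and c++, and sm_52 (for a CC 5.2 card) is just an example value you must replace with your own GPU's compute capability.

```shell
#!/bin/sh
# Demo of the -arch substitution on a stand-in copy of the nvcc rules file.
mkdir -p demo-rules
printf 'CC          = nvcc -Xptxas -dlcm=cg -m64 -arch=sm_30\n' > demo-rules/c++
# Replace sm_30 with the value matching your GPU's compute capability.
sed -i 's/-arch=sm_30/-arch=sm_52/' demo-rules/c++
cat demo-rules/c++
```

In a real tree the same sed line would simply be pointed at both rule files instead of the stand-in.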

RapidCFD should compile and execute with CUDA 9.x (I've successfully compiled with Ubuntu 18.04 + CUDA 9.1). Note that I receive a "device free error" with CUDA 9 (not CUDA 8, though) when an application ends. While not affecting the results in my tests, the cause of this message still needs to be tracked down.

Hopefully these updates help with your applications.

Best,

Eric Daymo
Tonkomo, LLC
http://www.tonkomo.com

Old   August 1, 2018, 07:10
Default Thanks! Working, but slowly...
  #45
New Member
 
Tom
Join Date: Dec 2015
Location: Melbourne, Australia
Posts: 8
Rep Power: 10
mesh-monkey is on a distinguished road
It's very exciting to see the progress of this program over the last year or so! Thanks to those involved in fixing the code base and for the install steps.

I have been able to compile RCFD and am able to run cases on a Quadro M4000 (CC5.2). (Woohoo, this is a big improvement from last time when I couldn't get anything running!)
However, I have found that it is slow: about 1/10th the speed of a single CPU on the motorBike test case (352,147 cells).

I will try and bump up to a larger case, but wanted to ask:
  • Has anyone actually been able to see a speedup??
  • Any particular settings to look for to make the most of CUDA?

Thanks! Tom.

Old   September 18, 2018, 09:14
Default
  #46
New Member
 
Tun
Join Date: Mar 2018
Posts: 12
Rep Power: 8
chengtun is on a distinguished road
Hi, I have run into some problems and eagerly hope to get some help. Thanks in advance!
I have installed OpenFOAM 6, CUDA 9.2, and the latest version of RapidCFD. Compilation succeeded, but when I run the case it shows the following error:
terminate called after throwing an instance of 'thrust::system::system_error'
what(): parallel_for failed: invalid device function
Could you give me a hint on solving this?
Moreover, when I tried the cavity tutorial case with icoFoam, I had to change the files to the OpenFOAM 2.x format, for example changing nu to nu nu as in https://github.com/OpenFOAM/OpenFOAM...portProperties
Also, that format does not have the noSlip boundary condition. Could I just change the case file format, or should I also install and compile with OpenFOAM 2.x? Thanks!

Old   September 18, 2018, 09:27
Default
  #47
Member
 
Eric Daymo
Join Date: Feb 2015
Location: Gilbert, Arizona, USA
Posts: 48
Rep Power: 12
edaymo is on a distinguished road
Hello - Yes, RapidCFD follows the dictionary format of OpenFOAM 2.3. I have a working copy of the icoFoam cavity case at https://github.com/TonkomoLLC/RapidCFD-Tests. I do not know the reason for your error offhand, but I have noticed with CUDA 9.1 that a "device free failed" error is thrown at the end of a case. The quick solution here (at least for CUDA 9.1) is to edit /usr/include/thrust/system/cuda/detail and comment out line 87:

Quote:
// cuda_cub::throw_on_error(status, "device free failed");
Good luck with your tests.
Eric

Old   September 18, 2018, 12:46
Smile
  #48
New Member
 
Tun
Join Date: Mar 2018
Posts: 12
Rep Power: 8
chengtun is on a distinguished road
Thank you so much for your reply! I noticed that in https://github.com/Atizar/RapidCFD-dev/issues/39 you mentioned the comment is in malloc_and_free.h.
Unfortunately, for CUDA 9.2 there is no such line in that file (see attachment), and the error I get occurs at the beginning of the simulation, not at the end.
Maybe I should try CUDA 8 / 9.1 later.
Another question: should I install OpenFOAM 2.3 and change the dictionaries to that version instead of OpenFOAM 6, or is it enough to just change the case file format?
Thanks a lot!
chengtun
Attached Files
File Type: h malloc_and_free.h (2.7 KB, 4 views)

Old   September 18, 2018, 13:26
Default
  #49
Member
 
Eric Daymo
Join Date: Feb 2015
Location: Gilbert, Arizona, USA
Posts: 48
Rep Power: 12
edaymo is on a distinguished road
Hi,

Oh yes, the file for the CUDA 9.1 "device free" error is malloc_and_free.h, as you noted. I accidentally left that important filename out of post #47 (I only gave the path). That specific path probably does not work for you because the path I gave (I should also have noted this) is for CUDA installed from the Ubuntu 18.04 repository; I did not download CUDA from Nvidia in this case. You may have to find malloc_and_free.h by searching in /usr/local/cuda, which is where I think the Nvidia download installs itself.
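Once malloc_and_free.h is located, the edit itself is just commenting out one line. A sketch of the same change done with sed is shown below, demonstrated on a stand-in file since the real header's path (and line number) varies by CUDA install:

```shell
#!/bin/sh
# Stand-in for the offending line in thrust's malloc_and_free.h.
printf '  cuda_cub::throw_on_error(status, "device free failed");\n' > demo_malloc_and_free.h
# Comment the line out rather than deleting it, so the change is reversible.
sed -i 's|^\([[:space:]]*\)cuda_cub::throw_on_error(status, "device free failed");|\1// cuda_cub::throw_on_error(status, "device free failed");|' demo_malloc_and_free.h
cat demo_malloc_and_free.h
```

Run against the real header, the same substitution leaves the line in place but disabled, matching the manual edit described above.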

I have not yet encountered the "parallel_for" error you describe, possibly because I never tried RapidCFD with CUDA 9.2. If the icoFoam cavity case that I referenced at my GitHub site does not run with RapidCFD + CUDA 9.2 due to the parallel_for error, then yes, you may want to try CUDA 8 or CUDA 9.1. If the switch to CUDA 8 or CUDA 9.1 solves the parallel_for problem, that would be nice to know.

For future reference, the parallel_for error you describe seems to be generated by the file parallel_for.h. If the error persists (or even if it is just due to some change in CUDA 9.2), troubleshooting will require figuring out which action in RapidCFD throws this error.

I am not sure if I fully understand your question about installing OpenFOAM 2.3. You need not have OpenFOAM 2.3 installed alongside RapidCFD for RapidCFD to function. You only need to have your RapidCFD case dictionaries in the right format, which sometimes won't be the same format as OpenFOAM 6 dictionaries. To restate: you can successfully run RapidCFD on a machine that only has OpenFOAM 6 installed. As you probably figured out, and as has been posted in several places on the OpenFOAM wiki and in these forums, you can create aliases in .bashrc to switch between different versions of OpenFOAM. That said, I like to have OpenFOAM 2.3.x installed on the machine where I use RapidCFD so that I can access the OpenFOAM 2.3.x tutorials, which generally have the correct dictionary format for RapidCFD (so I can use them as templates for RapidCFD cases). I hope that answers your question.
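The alias approach mentioned above can look like the following in ~/.bashrc. The install paths here are assumptions; adjust them to wherever your trees actually live.

```shell
# Hypothetical ~/.bashrc snippet: one alias per environment, so only the
# environment you source is active in a given terminal session.
alias of23x='source $HOME/OpenFOAM/OpenFOAM-2.3.x/etc/bashrc'
alias of6='source $HOME/OpenFOAM/OpenFOAM-6/etc/bashrc'
alias rcfd='source $HOME/RapidCFD/RapidCFD-dev/etc/bashrc'
```

Then typing `rcfd` in a fresh terminal activates RapidCFD, while `of6` switches to OpenFOAM 6; sourcing more than one in the same shell would mix the environments.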

I hope you will be successfully using RapidCFD soon...

Best regards,

Eric

Old   September 19, 2018, 04:37
Default
  #50
New Member
 
Tun
Join Date: Mar 2018
Posts: 12
Rep Power: 8
chengtun is on a distinguished road
Hi Eric, thank you sincerely for your generous help!
I found that file and commented out the line, but it still didn't work with my CUDA 9.2. So I will change to CUDA 8, and hopefully it will work.
Thanks for your explanation; I understand the format requirement now.
Another question is how to utilize the maximum capacity of the GPU. I have a single Tesla K20, so there is no issue of multiple GPUs. If I run 4 cases, how can I ensure they use the total capacity? Which parameter should I set before running the cases? Thanks in advance!

Old   September 19, 2018, 08:41
Default
  #51
Member
 
Eric Daymo
Join Date: Feb 2015
Location: Gilbert, Arizona, USA
Posts: 48
Rep Power: 12
edaymo is on a distinguished road
Hello, hopefully everything works after you revert to CUDA 8.
If you have a single Tesla K20, you can run multiple cases until the memory is utilized (memory usage can be checked with the "nvidia-smi" command from the Linux prompt).

So, suppose you want to run multiple icoFoam cases on a single K20 card. From each RapidCFD case directory, simply run "icoFoam" -- no mpirun needed. You can check memory usage and confirm that everything is running with the "nvidia-smi" command. As you alluded to, mpirun comes into play when you have multiple GPU cards.

I don't have a lot of experience running multiple RapidCFD cases simultaneously on the same GPU, so you may want to do some speed tests to make sure this approach does not degrade performance. As well, as has been discussed in this thread, RapidCFD tends to perform better as the case becomes larger, and 4 GB of GPU memory on a single K20 can be a barrier for larger problems. Thus I think many problems of interest for RapidCFD will be large enough that you cannot run more than one case at a time on a single GPU anyway.

Good luck with your testing.

Best regards,

Eric

Last edited by edaymo; September 19, 2018 at 11:41.

Old   September 20, 2018, 00:30
Default
  #52
New Member
 
Tun
Join Date: Mar 2018
Posts: 12
Rep Power: 8
chengtun is on a distinguished road
Thanks a lot for your detailed reply!
Unfortunately, I tried CUDA 8.0, 9.0, and 9.2, but all failed with the same error. (In every case I used the latest CUDA driver, 9.2, with which nvidia-smi works; with a lower driver version, nvidia-smi does not work.) I asked elsewhere (https://stackoverflow.com/questions/...emsystem-error) and got the reply that "invalid device function" usually means the code was compiled for an architecture that does not match the GPU you are running on.
So I tried to change the architecture; for the K20, the compute capability is 3.5.
I changed the file in wmake/rules/linux64Nvcc/c++ to sm_35, as you mentioned in #44. (Is this change enough?)
Quote:
If the compute capability is changed in wmake/rules/linux64Nvcc/c++ (e.g., change -arch=sm_30 to the appropriate CC value for your GPU)
I recompiled without error (attachment: compile1). But when running the case, it still shows the previous error after creating the mesh (attachment: pisoFoam).
Quote:
Create time

Create mesh for time = 0

Reading field p

Reading field U

Reading/calculating face flux field phi

Selecting incompressible transport model Newtonian
Selecting turbulence model type RASModel
Selecting RAS turbulence model kOmega
terminate called after throwing an instance of 'thrust::system::system_error'
what(): parallel_for failed: invalid device function
Aborted (core dumped)
I think my problem is similar to #19 and #20.
Post #39 gave a solution, but I failed at the following step:
Quote:
4) Change the architecture to adapt it to your graphics card (compute capability):

Go to RapidCFD/RapidCFD-dev/wmake/rules/linux64Nvcc,
open the files C and C++, and change
CC = nvcc -Xptxas -dlcm=cg -m64 -arch=sm_30
to CC = nvcc -Xptxas -dlcm=cg -m64 -arch=sm_61 (maybe yours is not 61, check it)
Thanks again for your help and patience!
Attached Files
File Type: txt compile1.txt (14.6 KB, 10 views)
File Type: txt pisoFoam.txt (1.1 KB, 10 views)

Old   September 20, 2018, 08:19
Default
  #53
Member
 
Eric Daymo
Join Date: Feb 2015
Location: Gilbert, Arizona, USA
Posts: 48
Rep Power: 12
edaymo is on a distinguished road
Hello,

I recommend you start over with a freshly cloned copy of the source code, or wclean the src and application directories. I interpret compile1.txt as showing that there were no changes to the source code, so the compiler skipped each segment of the code because it thought everything was up to date. So, in the end, I don't have the impression that anything was actually recompiled.

Your comment that nvidia-smi doesn't work with anything other than CUDA 9.2 is a little concerning. I did a quick check and found that nvidia-smi is included in the driver install. Anyhow, you may want to confirm that your Nvidia driver is properly installed (maybe you need to reinstall it?), and check that "nvcc --version" matches the CUDA version you want (8 or 9.1) before proceeding with the compilation described above.

Good luck with getting everything sorted out.

Best,
Eric

Old   September 21, 2018, 01:41
Default
  #54
New Member
 
Tun
Join Date: Mar 2018
Posts: 12
Rep Power: 8
chengtun is on a distinguished road
THANK you, Eric, for your help! Your suggestions are really valuable and I appreciate them!
Finally I can successfully run the case. I am using CUDA 8.0 with CUDA driver 9.2, and after recompiling RapidCFD with sm_35 (I think this was the problem) it works now! What I meant earlier is that nvidia-smi failed with CUDA 8.0 and CUDA driver 8.0, so I updated the driver to the latest version, and now there is no problem with CUDA.
I can describe my case for reference, as Tom asked in #45. I am using pisoFoam and my mesh contains 450,000 cells. Running the case in RapidCFD uses 650 MB of GPU memory, so, as Eric suggested, with a K20 I could run at most 4 GB / 0.65 GB ≈ 6 cases at the same time.
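The capacity arithmetic above can be sketched as a quick shell calculation; 650 MB per case and 4 GB of card memory are the figures reported in this post, and the K20's exact usable memory may differ slightly.

```shell
#!/bin/sh
# How many ~650 MB cases fit in ~4 GB (4096 MB) of K20 memory?
PER_CASE_MB=650
CARD_MB=4096
echo $(( CARD_MB / PER_CASE_MB ))   # integer division rounds down
```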
I ran the same case on the GPU and on the CPU at the same time (running them separately would probably be better). The acceleration ratio of GPU over CPU is approximately 2x, as the figure shows.
And I think Tom (#45) posted a good question:
Quote:
Any particular settings to look for to make the most of CUDA?
I wonder if RapidCFD provides any adjustable parameters to improve CUDA performance (like threads per block).
Thanks again for your help!
Attached Images
File Type: png acceleration ratio.png (14.8 KB, 79 views)

Last edited by chengtun; September 21, 2018 at 02:48.

Old   September 21, 2018, 08:39
Default
  #55
New Member
 
Tun
Join Date: Mar 2018
Posts: 12
Rep Power: 8
chengtun is on a distinguished road
I can describe another case for reference. I am using pisoFoam with an LES model, and my mesh contains 800,000 cells. Running the case in RapidCFD uses 1033 MB of GPU memory. The acceleration ratio of GPU over CPU is approximately 2.5-3x, as the figure shows.
Attached Images
File Type: png 2.png (12.4 KB, 48 views)

Old   September 21, 2018, 11:29
Default
  #56
Member
 
Eric Daymo
Join Date: Feb 2015
Location: Gilbert, Arizona, USA
Posts: 48
Rep Power: 12
edaymo is on a distinguished road
Hello, Chengtun.

Firstly, I'm really glad I could help. The only technical comment I have is that I suspect nvidia-smi comes from your Nvidia driver, not CUDA... but maybe CUDA 9.2 installed the Nvidia driver at the same time, and that driver persisted when you switched to CUDA 8. I have no idea for sure what happened, but if you've got things working, I would just keep moving forward.

Next, I am really happy that you are helping to answer Tom's (#45) question about speedup. Replying to Tom has been on my list for a couple of months now, and I haven't had the chance to put together a nice summary and publish a "tutorial" case that could be run on both CPU and GPU.

Are you willing to publish your test case so we can all see it and try it? There are always a ton of details... for example, you published your GPU type (K20) but not your CPU model. Also, does 1 CPU = 1 core (so 1 K20 ≈ 2 cores)? Or, if your CPU has 4 or 8 cores, does one CPU mean all 4 or 8 cores? If you ran these cases simultaneously on the K20, I am not sure whether that made a difference. Your solver settings might also be different from how I would set up the same case on my own.

I understand if you can't publish a case, and I am just as guilty for not doing so yet.

As far as settings go, I welcome others to chime in. I find that I have to fiddle with the fvSolution settings for the linear solvers to improve performance. There is some discussion above about GAMG and the number of cells in the coarsest level, I think; if not here, then on the GitHub page for RapidCFD.

Anyhow, I hope that you can publish your test case, but fully understand if you do not.

Best,

Eric

Old   September 21, 2018, 13:18
Default
  #57
New Member
 
Tun
Join Date: Mar 2018
Posts: 12
Rep Power: 8
chengtun is on a distinguished road
Hi Eric, I am grateful for your help and glad if I could help in turn. I can send my case to you by email (you can send me yours), because I tried to upload it several times but got the error:
Quote:
Your submission could not be processed because a security token was missing.

If this occurred unexpectedly, please inform the administrator and describe the action you performed before you received this error.
.
It uses pisoFoam with an LES model. My CPU is an Intel(R) Xeon(R) E5620 @ 2.40 GHz. I didn't run the CPU computation in parallel, so I suppose it only used one core. I was running several cases on the GPU and CPU at the same time, so the timing may not be accurate. But I am satisfied that the GPU clearly speeds up the computation, and I am not so concerned about the exact acceleration ratio; roughly, it is about 2-3 times faster.
Thanks again for your wonderful job and generous help!

Old   September 21, 2018, 14:59
Default
  #58
Member
 
Eric Daymo
Join Date: Feb 2015
Location: Gilbert, Arizona, USA
Posts: 48
Rep Power: 12
edaymo is on a distinguished road
Hello, Chengtun,

Thank you for the discussion over PM - your case is large (over 30 MB compressed), so it is probably too big to post in this forum. But I appreciate your being agreeable to placing it on my GitHub site. You can find the case under the "coil-pisoFoam" directory at


I have not speed tested this yet, but I did confirm that the first few time steps execute with RapidCFD on Ubuntu 18.04 and CUDA 9.1 (Nvidia TITAN Black GPU).


Thank you for your contribution to address Tom's question, above, and providing a specific test case that we can all access and refer to as a basis for discussing relative speedup with RapidCFD.

Best regards,

Eric

Old   May 9, 2019, 23:06
Default
  #59
New Member
 
José Yovany Galindo Díaz.
Join Date: Jun 2016
Posts: 2
Rep Power: 0
ozes is on a distinguished road
Hi chengtun, I am trying to install RapidCFD on Ubuntu 16.04 and I have some problems. Would you describe the steps to install? I have CUDA 8 and driver 384.

Old   June 4, 2019, 02:42
Post
  #60
New Member
 
ramune
Join Date: Jun 2019
Location: Japan
Posts: 5
Rep Power: 6
ramune is on a distinguished road
Hello,
I am in trouble with RapidCFD. I think this problem is similar to Tun's (#52).
The compilation itself succeeds, but when I try to execute a solver I get an error. I expect it to be a problem with CUDA, as the case runs well with OpenFOAM.

error contents
Quote:
terminate called after throwing an instance of 'thrust::system::system_error'
what(): parallel_for failed: invalid device function
My environment is as follows.
Ubuntu 18.04 (ppc64le)
CUDA compilation tools, release 9.2, V9.2.148
cuDNN version 7.21
The GPU I am using is a Tesla V100.

Following the previous posts, here are the commands I executed before compiling:

Quote:
# Setting of parallelization variable
export WM_NCOMPPROCS=16

# Change settings
vi ~/RapidCFD/RapidCFD-dev/etc/config/settings.sh
vi ~/RapidCFD/RapidCFD-dev/etc/config/settings.csh
In vi:
:%s/ppc64/ppc64le/g

# Change directory name
mv ~/RapidCFD/RapidCFD-dev/wmake/rules/linux64Nvcc ~/RapidCFD/RapidCFD-dev/wmake/rules/linuxPPC64Nvcc

# Setting PATH variables
export CUDA_HOME=/usr/local/cuda-9.2
export PATH=/usr/local/cuda-9.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

# Change settings for graphics card
vi ~/RapidCFD/RapidCFD-dev/wmake/rules/linuxPPC64Nvcc/c and c++
CC = nvcc -Xptxas -dlcm=cg -m64 -arch=sm_30
-> CC = nvcc -Xptxas -dlcm=cg -m64 -arch=sm_70
Thank you
