OpenFOAM Parallel Numerical Linear Algebra Post

#1 - September 3, 2018, 07:50 - Domenico Lahaye (dlahaye)
Dear all,

I posted notes on the parallel numerical linear algebra in OpenFOAM at

https://www.linkedin.com/pulse/openf...e/?published=t

Ideas on how to further develop these notes would be much appreciated.

Thanks. Domenico.

#2 - October 13, 2022, 15:13 - Klaus (klausb)
Two topics come to my mind:

1: How to do linear algebra operations for alternative preconditioners with the OpenFOAM L-D-U matrix structure, e.g. how to compute something like (I - L*D^-1) or L^T, maybe even with a scaled matrix / linear system for further improved preconditioning.

2: How to extend PCG or LGMRES to mixed precision fp64/fp32 with fp64 error correction, by adding an inner fp32 loop or an embedded solver; I suggest LGMRES here because it converges more smoothly than GMRES (similar to BiCGStab). A rough sketch of the mixed-precision idea follows below.
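To make item 2 concrete, here is a minimal sketch of the fp64/fp32 idea, kept independent of OpenFOAM's classes: the outer loop computes the residual and applies the solution update in double precision, while the correction equation is solved approximately in single precision. The dense matrix and the Jacobi inner solver are placeholders chosen only to keep the sketch self-contained; in OpenFOAM the matrix would live in the LDU structure and the inner solver would be an fp32 PCG or LGMRES.

Code:
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Placeholder inner solver: a few Jacobi sweeps carried out entirely in fp32.
// In practice this would be a single-precision PCG or LGMRES solve.
std::vector<float> innerSolveSingle(const Matrix& A, const std::vector<float>& r, int sweeps)
{
    const std::size_t n = r.size();
    std::vector<float> e(n, 0.0f);
    for (int s = 0; s < sweeps; ++s)
    {
        std::vector<float> eNew(n, 0.0f);
        for (std::size_t i = 0; i < n; ++i)
        {
            float sum = r[i];
            for (std::size_t j = 0; j < n; ++j)
            {
                if (j != i) sum -= static_cast<float>(A[i][j])*e[j];
            }
            eNew[i] = sum/static_cast<float>(A[i][i]);
        }
        e.swap(eNew);
    }
    return e;
}

// Outer fp64 iterative refinement: residual and solution update stay in double
// precision, the correction equation A*e = r is solved approximately in fp32.
std::vector<double> mixedPrecisionSolve(const Matrix& A, const std::vector<double>& b,
                                        double tol = 1e-10, int maxOuter = 50)
{
    const std::size_t n = b.size();
    std::vector<double> x(n, 0.0);

    for (int outer = 0; outer < maxOuter; ++outer)
    {
        // fp64 residual r = b - A*x
        std::vector<double> r(n, 0.0);
        double rNorm = 0.0;
        for (std::size_t i = 0; i < n; ++i)
        {
            double Ax = 0.0;
            for (std::size_t j = 0; j < n; ++j) Ax += A[i][j]*x[j];
            r[i] = b[i] - Ax;
            rNorm += r[i]*r[i];
        }
        if (std::sqrt(rNorm) < tol) break;

        // Demote the residual, solve the correction in fp32, promote and update
        std::vector<float> rSingle(r.begin(), r.end());
        std::vector<float> e = innerSolveSingle(A, rSingle, 10);
        for (std::size_t i = 0; i < n; ++i) x[i] += static_cast<double>(e[i]);
    }
    return x;
}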

#3 - October 14, 2022, 04:58 - Domenico Lahaye (dlahaye)
Very interesting.

I do understand that the LDU format is convenient for the face-based structure of OpenFOAM. I wonder, however, whether it makes sense to switch to a more versatile matrix format for the linear algebra.

Possibly using https://petsc.org/release/docs/manualpages/DMPlex/ or something similar would allow one to retain face-based addressing for the discretization and to switch to another format for faster linear algebra.

#4 - October 14, 2022, 07:20 - Klaus (klausb)
I don't think it makes sense to switch to a "more versatile matrix format for linear algebra", for several reasons:

1: There are numerous matrix storage formats, including COO, CSR, CSR5, ELL, SpELL, HYB etc. Which one gives optimal performance depends on the linear algebra operation and on the hardware: matrix-matrix operations are faster in one format, matrix-vector operations in another, and the optimal choice can differ again depending on the hardware, at least when GPUs are used. On top of that, it can be useful to store sections of the matrix separately. PETSc, for example, splits a matrix into blocks of rows and, for each block, stores the diagonal and off-diagonal entries separately in CSR format, with optimized algorithms that also leverage a range of MPI functionality. The LDU format stores the L, D and U parts of the matrix separately, essentially in COO format. Be aware that the coefficient matrix is incomplete: some elements, the "boundary contributions", are stored and handled separately.

2: Maybe more importantly, linear algebra operations are often based on L, D or U, or on transformations of one of them, so storing the complete matrix only to extract L, D or U later, as needed for a particular operation, does not bring much benefit, I think. A team of Asian researchers working on an OpenMP version of OpenFOAM ended up storing the L, D and U parts separately in CSR format for best performance.

3: OpenFOAM is fast! But there is room for improvement, particularly in the field of preconditioners and mixed precision, both on the CPU and in combination with GPUs. Simple CPU-GPU offloading, or moving to an external linear algebra library, yields little benefit in my experience.

The challenge I keep running into is the complexity that comes with OpenFOAM's inherent optimizations: algorithms leverage "cell" indexing and/or "face" indexing and/or LDU addressing, for which I have never been able to find comprehensive documentation, on top of the boundary contributions that have to be accounted for and the decomposition of the linear system for parallel computations. Understanding all of this together is what would enable me to make the extensions I have in mind. A hands-on, example-based tutorial explaining how to implement linear algebra operations and matrix (section) transformations would help me a lot; a minimal sketch of the matrix-vector product in LDU addressing is given below.
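To make the LDU addressing concrete, the following is a minimal stand-alone sketch of the matrix-vector product y = A*x for a matrix stored in LDU form, mirroring the internal-field part of what OpenFOAM's lduMatrix::Amul does. Plain std::vector containers stand in for OpenFOAM's scalarField and labelList, and the boundary contributions are deliberately left out, since OpenFOAM handles them separately.

Code:
#include <cstddef>
#include <vector>

// y = A*x for a matrix stored in LDU form.
// diag[c]  : diagonal entry of cell (row) c
// upper[f] : entry A(lowerAddr[f], upperAddr[f]) of internal face f (above the diagonal)
// lower[f] : entry A(upperAddr[f], lowerAddr[f]) of internal face f (below the diagonal)
// lowerAddr[f] < upperAddr[f] are the owner and neighbour cells of face f
std::vector<double> lduMatVec
(
    const std::vector<double>& diag,
    const std::vector<double>& lower,
    const std::vector<double>& upper,
    const std::vector<int>& lowerAddr,
    const std::vector<int>& upperAddr,
    const std::vector<double>& x
)
{
    std::vector<double> y(diag.size(), 0.0);

    // Diagonal contribution: one entry per cell
    for (std::size_t c = 0; c < diag.size(); ++c)
    {
        y[c] = diag[c]*x[c];
    }

    // Off-diagonal contributions: one upper and one lower entry per internal face
    for (std::size_t f = 0; f < lowerAddr.size(); ++f)
    {
        y[lowerAddr[f]] += upper[f]*x[upperAddr[f]];
        y[upperAddr[f]] += lower[f]*x[lowerAddr[f]];
    }

    return y;
}

In OpenFOAM itself, lduMatrix::Amul follows the same face-loop pattern, with the boundary and processor-interface contributions added on top in separate steps.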

#5 - October 14, 2022, 14:27 - Domenico Lahaye (dlahaye)
I agree.

A tutorial case (e.g. extending https://github.com/UnnamedMoose/Basi...mmingTutorials) with linear algebra aspects would be valuable.

I suggest starting from laplacianFoam in sequential mode, using information from Darwish-Moukalled-Mangani and other sources. The scope should be to clarify the following (a minimal sketch is given after the list):

1. [loop over internal faces]: how the matrix and right-hand side vector are assembled by a loop over internal faces (the laplacianScheme is assumed to require no non-orthogonal corrections);

2. [loop over boundary faces]: how Dirichlet and Neumann boundary conditions are treated, and how the matrix is stored in LDU format (U = L^T and D contains the negative row sums);

3. [Krylov solve]: how the linear system is solved with unpreconditioned Krylov methods by calling the member function solve() of the lduMatrix class, involving BLAS-1 (vector and vector-vector) and BLAS-2 (matrix-vector) routines;

4. [preconditioning]: how the convergence of the Krylov subspace method can be accelerated with a preconditioner (one call to the preconditioner per Krylov subspace iteration).
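As a first cut at items 1 to 3, a minimal sketch of the solver-side view in laplacianFoam could look as follows (assuming the usual createFields.H providing T and DT, placed inside the time loop). Assembling fvm::laplacian performs the loops over internal and boundary faces that fill the LDU coefficients and the source, and solve() hands the assembled lduMatrix to the Krylov solver and preconditioner selected in fvSolution; the Info statements are only there to show where the coefficients end up.

Code:
// Steps 1 and 2: assembling the equation loops over the internal faces and
// the boundary patches and fills the LDU coefficients and the source.
fvScalarMatrix TEqn
(
    fvm::ddt(T) - fvm::laplacian(DT, T)
);

// The assembled coefficients can be inspected directly: diag() and upper()
// come from the lduMatrix base class, source() from fvMatrix.
Info<< "diag:   " << TEqn.diag().size()   << " entries (one per cell)" << nl
    << "upper:  " << TEqn.upper().size()  << " entries (one per internal face)" << nl
    << "source: " << TEqn.source().size() << " entries (one per cell)" << endl;

// Steps 3 and 4: solve() passes the matrix and the boundary coefficients to
// the linear solver/preconditioner pair chosen in fvSolution.
TEqn.solve();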

Your input on how to further develop my notes or on taking an alternative route would be valuable here.

I disagree.

The LDU matrix format does not allow for any fill-in. ILU preconditioning with some form of fill-in, or Galerkin coarsening (in GAMG), is thus hard to accomplish.

Furthermore, OpenFOAM does not provide the transparent profiling of linear solver performance that e.g. PETSc offers (using -log_summary).

#6 - October 14, 2022, 15:38 - Klaus (klausb)
If these limitations exist, it makes sense.

I wouldn't only consider PETSc. I think it would also make sense to have a look at Trilinos and Hypre, even though Hypre preconditioners can be accessed via PETSc too. Many OpenFOAM-Trilinos specifics are covered in the thesis by Bob Dröge, titled "A software interface for fully implicit flow simulations on block-structured grids", which can be found online, in particular in chapter 5.

Another very strong but less well-known contender is GaspiLS (http://gaspils.de/). See the GaspiLS-PETSc-Hypre scalability comparisons on its main webpage.

#7 - October 17, 2022, 05:47 - Domenico Lahaye (dlahaye)
Dear Klaus,

1/ Valuable information on the implementation of boundary conditions in OpenFOAM can be found in Section 18.2 of the Moukalled-Mangani-Darwish book;

2/ I do see some information on Trilinos and OpenFOAM in the master's thesis of Dröge. Information on how to couple the two remains limited. I have previous experience using PETSc;

I will continue to expand the notes and get back here.

Kind wishes, Domenico.

#8 - October 17, 2022, 09:56 - Klaus (klausb)
Dear Domenico,

There is already the PETSc4FOAM extension.

When scaling is the objective, GaspiLS (http://gaspils.de/) is probably the better choice.

BR,

Klaus

#9 - October 17, 2022, 10:27 - Domenico Lahaye (dlahaye)
Dear Klaus,

Thank you for getting in touch.

I am aware of PETSc4FOAM. I am trying to better understand why the speedup that PETSc4FOAM delivers remains limited, and how to use more advanced features of PETSc such as DMPlex (to avoid the conversion to CSR after discretization) or FieldSplit (for Schur complement preconditioning). A rough sketch of a FieldSplit configuration is given below.
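For reference, a minimal sketch of how the Schur-complement FieldSplit could be configured programmatically (assuming a recent PETSc, 3.19 or newer, for the PetscCall/PETSC_SUCCESS error handling). The part that is left out, building the index sets isU and isP for the two blocks from the OpenFOAM decomposition, is exactly what an extended PETSc4FOAM would have to provide.

Code:
#include <petscksp.h>

// Attach a Schur-complement FieldSplit preconditioner to an existing KSP.
// isU and isP are index sets selecting the two blocks (e.g. velocity and
// pressure unknowns); building them from the OpenFOAM numbering is omitted.
PetscErrorCode configureSchurFieldSplit(KSP ksp, IS isU, IS isP)
{
    PC pc;
    PetscCall(KSPGetPC(ksp, &pc));
    PetscCall(PCSetType(pc, PCFIELDSPLIT));

    // Register the two blocks under the names "u" and "p"
    PetscCall(PCFieldSplitSetIS(pc, "u", isU));
    PetscCall(PCFieldSplitSetIS(pc, "p", isP));

    // Use a Schur-complement factorization of the 2x2 block system
    PetscCall(PCFieldSplitSetType(pc, PC_COMPOSITE_SCHUR));
    PetscCall(PCFieldSplitSetSchurFactType(pc, PC_FIELDSPLIT_SCHUR_FACT_LOWER));

    // Allow further tuning (inner solvers, Schur preconditioning) from the
    // command line, e.g. -fieldsplit_p_ksp_type cg
    PetscCall(KSPSetFromOptions(ksp));
    return PETSC_SUCCESS;
}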

Can you please elaborate on GaspiLS (or Ginkgo) versus the GPU access that PETSc (or Trilinos) provides?

Thank you. Kind wishes. Domenico.

#10 - October 17, 2022, 11:22 - Klaus (klausb)
GaspiLS uses GPI-2 (http://www.gpi-site.com/) rather than MPI for its parallel operations; GPI-2 is built around one-sided, non-blocking communication. I can't say whether GPUs are supported yet.

Ginkgo is a library designed for GPU computing, and there is an experimental OpenFOAM extension supporting both Nvidia and AMD GPUs.

PETSc provides GPU offloading via CUDA on Nvidia GPUs and via OpenCL on AMD GPUs. Why should it be a lot faster? I have heard comments that it should be, but no reasons.

Trilinos offers GPU support (see https://trilinos.github.io/mpi_x.html), but I have been struggling with the Trilinos documentation in general and have not been able to implement it. It apparently supports Nvidia, Intel and AMD hardware.

Maybe RapidCFD is the way to go when it comes to GPU computing.
