CFD Online Discussion Forums - Efficient Conversion of Local Stiffness Matrix to Global Stiffness

CFD Online Discussion Forums (https://www.cfd-online.com/Forums/)

- Main CFD Forum (https://www.cfd-online.com/Forums/main/)

- - Efficient Conversion of Local Stiffness Matrix to Global Stiffness (https://www.cfd-online.com/Forums/main/80078-efficient-conversion-local-stiffness-matrix-global-stiffness.html)

Efficient Conversion of Local Stiffness Matrix to Global Stiffness

Hi all,

I work with FEM and was once told by a professor that assembling the stiffness matrix should be roughly 10% of computation time while the solver eats the remaining computation time. This is not the case with my code as assembly for problems larger than 10^6 DOF seems to take as long as the solver. That said, I am looking to improve the section of the code which takes values at the local stiffness level and populates the global stiffness matrix. We are using sparse storage by rows so this involves repeated searching of blocks of column information to determine where in the global array local coefficients should be stored.

We use unstructured meshes so there is no order to the global node numbering but I am wondering if there is any a priori nodal renumbering?, or clever search strategy, etc? that could be employed to speed up this tedious but necessary part of the code.

Cheers

Did you allocate the pattern of the sparse matrix before usage? It sounds a bit like there are many reallocations involved. Also, you should assemble into a small dense element matrix and add the local contributions all at once to the global system to avoid as many lookups as possible. If you did both of those things, then maybe your sparse matrix implementation is in some point inefficient. Did you try different implementations, e.g. PETSc, MTL, DUNE-ISTL, ...?

This is what I do in my FEM code. It may be inefficient for problems
with that many DOF but it seems to work fine for up to 10^5 unknowns
which is what I have tried. Also the assembly process can
be made parallel fairly simply so that may be an option.

! Loop on element columns for element i
do j=1,elem_dof(i)

! Global column number
l=compr_map(elem_ft(i,j),1)

if(l.ne.0)then

! Loop on element rows
do k=1,elem_dof(i)

! Global row number
m=compr_map(elem_ft(i,k),1)

! CSC format
! Loop on row pointers for global column l

if(sparse_store.eq.0)then
add=.false.
search1:do ll=pointers(l),pointers(l+1)-1
if(indices(ll).eq.m)then
kk=ll
add=.true.
exit search1
endif
enddo search1

! CSR format
! Loop on column pointers for global column m
elseif(sparse_store.eq.1)then
add=.false.
if(m.ne.0)then
search2:do ll=irn(m),irn(m+1)-1
if(jcn(ll).eq.l)then
kk=ll
add=.true.
exit search2
endif
enddo search2
endif
endif

if(add)then
K_work(kk,1)=K_work(kk,1)+kelem(k,j)
if(iassemble.eq.0.or.iassemble.eq.1)then
Mass_work(kk,1)=Mass_work(kk,1)+melem(k,j)
if(damp_type.eq.1)then
Damp_work(kk,1)=Damp_work(kk,1)+delem(k,j)
endif
endif
endif

enddo
endif
enddo
endif