|
[Sponsors] |
Performance/Useability problem in my C++ matrix library |
|
LinkBack | Thread Tools | Search this Thread | Display Modes |
January 11, 2021, 04:12 |
Performance/Useability problem in my C++ matrix library
|
#1 |
Senior Member
Sayan Bhattacharjee
Join Date: Mar 2020
Posts: 495
Rep Power: 8 |
Summary :
I had a performance problem in my C++ matrix library. I found a basic solution. I need to make it easier to use. Body: I'm using Fortran-C++ hybrid programming model for my solver. Fortran *only* does the core calculations in the solver. C++ handles everything else like file IO, memory management, error handling. For each 2D matrix, I allocate a NROWS*NCOLS size linear contiguous block of memory using malloc in C++. I can then send the pointer to the first element of the allocated memory block to Fortran and use it directly as a NROWS*NCOLS matrix. However in C++, I have to write my own methods to calculate the array index of an element (i,j) to access the data stored at index (i,j). For higher dimensions, I have functions that calculate the array index from (i,j,k) in 3D, (i,j,k,q) in 4D etc. Unfortunately, the address/index calculation becomes notoriously complex and slow for the higher dimensions. That's a biiiiiiiiiiiiiiiiiiiiiiiiggggg problem. For 1D: Code:
/// Linear accessor /// NOTE : used when the operator() becomes slow for higher dimensions /// inline REAL& operator [] (int i) { return _head[i]; } Code:
/// Access 2D matrices /// inline REAL& operator () (int irow, int icol) { return _head[nrows*icol + irow]; } Code:
/// Access 4D matrices /// inline REAL& operator () (int i, int j, int k, int q) { return _head[ndim1*ndim2*ndim3*q + ndim1*ndim2*k + ndim1*j + i]; } This is slow. This is horrendously slow. A linear memory access shouldn't be this much complex. I hate it. I found a basic solution that works for now. Instead of accessing the elements through the operator() overloads, I'm directly calculating the index to the first element of the column (fortran style memory allignment is used) that I want to see, then I use the 1D accessor (operator [] overload) shown above to access the elements. This is fast, but I want to know if there is any other way to do it. Now, the final code looks like: Code:
/// Fast 2D matrix access *pseudocode for now* /// I haven't tested it, so the logic could have bugs, /// but kindly assume it's correct. /// for(j=0; j<NCOLS; ++j) { // We get the index to the first element of each column int col_start_index = matrix_object.get_element_index(i,j); for(i=0; i<NROWS; ++i) { cout << matrix_object[col_start_index + i] << endl; // We access the matrix linearly } } - Notice how i,j need to be defined outside of the loops (like in fortran). I don't like it. I'm solving it by doing : for(int i=0, j=0; j<NCOLS; ++j) { for(i=0; i<NROWS; ++i) { CODE(); } } EDIT : If I think about it, for(int i=0, j=0; j<NCOLS; ++j){ i=matrix_object.get_element_index(i,j); for( ; i<i+NROWS; ++i) { cout << matrix_object[i] << endl; }} could be used. - Notice how writing stencils will quickly become more complex. This will become more complex in higher dimensions. The only solution I can think of, is to use C++ iterators. I'm looking for other better options. I want to improve this. What do? Last edited by aerosayan; January 11, 2021 at 05:01. Reason: formatting is really hard |
|
Tags |
c++, matrix operations, performance |
|
|
Similar Threads | ||||
Thread | Thread Starter | Forum | Replies | Last Post |
ERROR: unable to find library | HJH | CFX | 6 | February 26, 2019 06:52 |
Is there any library for computing the inverse of a sparse matrix using parallelism? | HectorRedal | Main CFD Forum | 7 | November 23, 2018 04:07 |
Forcing a solver to use your custom library. | cdunn6754 | OpenFOAM Programming & Development | 0 | March 30, 2017 16:05 |
Problem compiling custom boundary condition as a library | leroyv | OpenFOAM Programming & Development | 2 | January 21, 2014 10:44 |
CG, BICGSTAB(2) : problem with matrix operation and boundary conditions | moomba | Main CFD Forum | 2 | February 17, 2010 03:37 |