MPI send and receive of non-primitive elements
I have a question regarding the use of MPI_Send and MPI_Recv class of functions.
I need to share a list of non-primitive elements between processors (let's say a List<face>, to fix ideas).
What I usually use to pass data between different processors is a structure of this kind with otherGlobalFaces being a List or a Field.
This structure works fine when otherGlobalFaces is a Field of scalars, vectors, tensors or any other primitive element type.
If, however, I declare otherGlobalFaces as a List<face>, I'm not able to pass the data from one processor to another, even though I replace the bufferSize entry with the proper buffer size (equal to the buffer size of the respective send).
Do you have any idea on how to solve this?
As a workaround, since my faces belong to a cuttingPlane and are all triangles, I cast the faces into vectors and the algorithm did work. I'm not really satisfied with this option, though, because I'm not sure whether the cuttingPlane faces constructed once the points are found are always triangles. Can you please confirm that the faces of a cutting plane are always triangles and, if not, under which assumptions this is true?
Thanks a lot for helping,
In this particular case, no, but I do not think the problem would be solved, since apparently the Pstream read and write do just the same things I'm doing. Do you think so too?
The direct use of MPI functions comes from the need to specify a msgType, or rather a message tag, so that multiple communications between the same processors are possible without mixing the data.
This piece of code also dates from 1.3 where, if I'm not mistaken, it was not possible to easily select the communication type for each send/receive operation (I might be wrong because I do not have the source with me; this is just from memory, so it is very likely to fail :)).
Thanks a lot,
I've had trouble with this too... It stems from the fact that faceLists are compound data types which include header information (the size of the entire faceList and the size of each face, in addition to the point labels). So the sizes of these elements need to be included as well. However, it isn't entirely clear how they should be sent, and I've run into segfaults all the time. I've resorted to packing them into regular arrays to get the job done but, as you said, it isn't a very elegant solution.
(Think about it: the size of a face is not a priori defined. So if you replace a face of size 3 at position N with a face of size 4 all the following faces would have to be shifted in memory)
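The variable-size issue described above can be illustrated with a minimal sketch in plain C++ (no MPI or OpenFOAM calls). The names `Face`, `PackedFaces`, `packFaces` and `unpackFaces` are hypothetical, chosen only for illustration. The idea is that a list of faces can be flattened into two contiguous integer arrays, one holding the size of each face and one holding all point labels, and each of those could then be sent as an ordinary primitive buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a face: a variable-length list of point labels.
using Face = std::vector<int>;

// Flattened form: both members are contiguous arrays of a primitive type.
struct PackedFaces
{
    std::vector<int> sizes;   // number of points in each face
    std::vector<int> labels;  // all point labels, face after face
};

// Flatten the faces into the two arrays.
PackedFaces packFaces(const std::vector<Face>& faces)
{
    PackedFaces packed;
    packed.sizes.reserve(faces.size());
    for (const Face& f : faces)
    {
        packed.sizes.push_back(static_cast<int>(f.size()));
        packed.labels.insert(packed.labels.end(), f.begin(), f.end());
    }
    return packed;
}

// Rebuild the faces by walking the label array with the per-face sizes.
std::vector<Face> unpackFaces(const PackedFaces& packed)
{
    std::vector<Face> faces;
    faces.reserve(packed.sizes.size());
    std::size_t pos = 0;
    for (int n : packed.sizes)
    {
        // The sizes must be consistent with the label array.
        assert(n >= 0 && pos + static_cast<std::size_t>(n) <= packed.labels.size());
        auto first = packed.labels.begin() + static_cast<std::ptrdiff_t>(pos);
        faces.emplace_back(first, first + n);
        pos += static_cast<std::size_t>(n);
    }
    return faces;
}
```

On the receiving side the `sizes` array would have to arrive first (or its length be known), since only then can the length of `labels` be computed and the buffer allocated.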
What I meant with the streams was writing the Field into the stream and reading it from another PStream on the target processor. Between the processors it would be an ASCII stream, which means some overhead (printing, parsing), but I'd first try to get the algorithm right and then worry about optimization.
You could try packing using OStringStream (note: haven't tried this). Something like:
MPI_Send( contents.begin(), contents.size());
Thanks Matjis, I tried your piece of code (there was a typo; for further reference, take it from here):
MPI_Send( contents.begin(), contents.size());
but it is not clear to me how you then receive an IStringStream and especially how you cast it back into a faceList.
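As a rough illustration of the round trip in question, here is a sketch using only the C++ standard library string streams, not OpenFOAM's OStringStream/IStringStream. The function names `writeFaces` and `readFaces` and the text layout (face count, then each face's size followed by its labels) are assumptions made for this example and are not OpenFOAM's own format. The point is that once every face's size is written into the stream, the receiver can parse the characters back into variable-length faces.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a face: a variable-length list of point labels.
using Face = std::vector<int>;

// Sender side: print the face count, then "size label label ..." per face.
std::string writeFaces(const std::vector<Face>& faces)
{
    std::ostringstream os;
    os << faces.size();
    for (const Face& f : faces)
    {
        os << ' ' << f.size();
        for (int label : f)
        {
            os << ' ' << label;
        }
    }
    return os.str();
}

// Receiver side: parse the same layout back into a list of faces.
std::vector<Face> readFaces(const std::string& text)
{
    std::istringstream is(text);
    std::size_t nFaces = 0;
    is >> nFaces;

    std::vector<Face> faces;
    for (std::size_t i = 0; i < nFaces; ++i)
    {
        std::size_t n = 0;
        if (!(is >> n))
        {
            break;  // truncated input: stop rather than invent faces
        }
        Face f(n);
        for (int& label : f)
        {
            is >> label;
        }
        faces.push_back(f);
    }
    return faces;
}
```

In a parallel run the string returned by `writeFaces` would be what travels as a plain character buffer; the receiver needs to learn its length before allocating, for example from a separate message sent first.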
Thank you Bernard for the hint. I also believe the problem is related to the receiving field, where the size of each face is in fact unknown, so even with the correct buffer size assigned it is not possible to distinguish which piece of the stream belongs to which face. Anyhow, I was not able to receive the field as a Pstream and then cast it back into a faceList either.
Anyhow, I solved the problem, at the cost of a little extra communication, using gatherList and scatterList, which seem to work with non-primitive lists too (I looked in the code for the trick but was not able to find it).
Another doubt arises for the same utility, still related to parallel communication: which of the two methods proposed below, which look equivalent to me, is actually more efficient?
// Method 1: zero-filled global list, each processor fills its own slots, then a sum-reduce
List<vector> globalPoints(globalPointSize,pTraits<vector>::zero );
globalPoints[processorStartPointPosition_ + pointi] = cut_.points()[pointi];
reduce(globalPoints, sumOp<Field<vector> >());

// Method 2: one sub-list per processor, then combine into a single flat list
List<List<point> > globalPointsList(Pstream::nProcs());
globalPointsList[Pstream::myProcNo()] = localPoints;
List<point> globalPoints = ListListOps::combine<List<point> >
It seems to me that this second method implies less inter-processor communication, but I'm not so sure about the combine method.
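One rough way to compare the two methods is to look at how much data each processor contributes to the exchange. The sketch below is plain C++ that simulates the ranks inside a single process (no Pstream or MPI calls); the names `reduceStyle`, `gatherStyle`, `reduceContribution` and `gatherContribution` are hypothetical, and scalars stand in for points. It only shows that both approaches produce the same global list and that the payload per rank differs; it says nothing about latency or about how Pstream implements either call.

```cpp
#include <cstddef>
#include <vector>

// local[p] holds the values owned by simulated rank p.
using PerRank = std::vector<std::vector<double>>;

// Total number of values over all ranks.
std::size_t globalSize(const PerRank& local)
{
    std::size_t total = 0;
    for (const auto& l : local) total += l.size();
    return total;
}

// Method 1 style: each rank holds a zero-filled array of global size,
// writes its values at its own offset, and the arrays are summed.
std::vector<double> reduceStyle(const PerRank& local)
{
    const std::size_t total = globalSize(local);
    std::vector<double> global(total, 0.0);
    std::size_t start = 0;
    for (const auto& l : local)
    {
        std::vector<double> contribution(total, 0.0);  // one rank's array
        for (std::size_t i = 0; i < l.size(); ++i)
        {
            contribution[start + i] = l[i];
        }
        for (std::size_t i = 0; i < total; ++i)
        {
            global[i] += contribution[i];               // element-wise sum
        }
        start += l.size();
    }
    return global;
}

// Method 2 style: the per-rank lists are gathered and concatenated.
std::vector<double> gatherStyle(const PerRank& local)
{
    std::vector<double> global;
    for (const auto& l : local)
    {
        global.insert(global.end(), l.begin(), l.end());
    }
    return global;
}

// Number of values a rank hands to the reduction: the full global size.
std::size_t reduceContribution(const PerRank& local, std::size_t)
{
    return globalSize(local);
}

// Number of values a rank hands to the gather: only its own list.
std::size_t gatherContribution(const PerRank& local, std::size_t rank)
{
    return local[rank].size();
}
```

Under these assumptions the reduction moves mostly zeros, since every rank contributes an array of global size, while the gather moves only the values each rank actually owns. Note that a gather followed by a scatter also sends the complete list back to every processor, so the overall balance depends on the implementation.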
Thanks a lot again,