Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
405 views
in Technique[技术] by (71.8m points)

c - MPI asynchronous broadcast from unknown source

I have a C-project that has n numbers of processors working on a kind of tree search. At any given time of the program, any of these processes may find something of interest and want to send this to all other processors asynchronously.

How can I listen for new messages on the other processes without having to loop through all possible senders each loop iteration?

I have read other questions about this, for example this one (MPI - Asynchronous Broadcast/Gather), however, all I've seen so far either doesn't handle unpredictable senders or loops through each possible sender, which I don't really fancy.

EDIT for clarification:

Sending the found value to a root rank and distributing it from there is out of the option. Zulan's answer would work, if I did not have that condition, so for others this might be of help. In my case it can (and most definitely will) happen that different ranks find something they need to share multiple times (which means that race conditions could also occur).

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

After a lot of trial and error (and other feedback) I found that the most performant solution uses asynchronous broadcasts.

My first approach was inspired by Petru's answer using multiple non blocking MPI_Isend() calls to distribute data and only one single MPI_Irecv() call (with any source) to periodically receive data. This approach turned out to be quite slow, since the non blocking MPI_ISend() calls have to be checked by the sender with MPI_Test() on every single MPI_Request handle that is created. The reason for that is, that asynchronous operations aren't truly asynchronous in the sense that they work in the background, but rather have to be checked periodically.

In my case the problem also presents the possibility that new solutions can (and will) be found multiple times, which resulted in a lot of management issues with existing, open MPI_Request handles, which had to be waited for or managed otherwise.

My second approach was using a central communicator as proposed by Zulan. This approach was very quick when there weren't many messages, but the root rank got clogged up when there were many found solutions at the same time, making it very slow in special cases. Most importantly the root rank was not as fast as the others anymore (which was to be expected), which resulted in an overall slower program in my case.

My third and final approach was using multiple non blocking MPI_Ibcast() calls, one for every possible rank that could send a message. For example that would mean that in a case with 3 different ranks that

  • rank 0 has two open MPI_Ibcasts with root 1 and 2
  • rank 1 has two open MPI_Ibcasts with root 0 and 2
  • rank 2 has two open MPI_Ibcasts with root 0 and 1

When a rank then finds a solution, it posts the last necessary MPI_Ibcast with itself as the root. At first this may seem similar to approach one, however, in this case the sender always only has to monitor a single request handle (the one from the final MPI_Ibcast) and check it periodically with MPI_Test(). The other ranks always have the same amount of open broadcasts (namely the world's size minus 1), which can be stored in an array and checked with MPI_Testany(). The difficulty with this approach is, however, that each open broadcast must operate on it's own communicator, essentially needing as many communicators as there are ranks.

So, my findings are that the second approach is acceptable sometimes, making it a quick and viable option, when there are not a lot of messages. It was the easiest to handle and implement. The third approach was faster under heavier load and also made my program termination very easy, since there always is a known amount of open connections at all time. It was the hardest and longest to implement and used up a bit more memory than the other implementations. (I don't know the reason for this, but it seems to have something to do with the broadcast's buffer management and possibly the additional communicators.) Lastly I cannot stress enough that the first approach seems easy at first, however when you really want to keep track of every single request, then it is hard to manage and slow as well, unfortunately.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...