Open
Description
If two or more SMI channels are opened toward the same destination within the same kernel, and the message is small (it fits in a single network packet), the rendezvous mechanism can cause a stall.
Example
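SMI_Channel chan_send1 = SMI_Open_send_channel(2, SMI_INT, my_rank+1, 0, comm);
SMI_Channel chan_send2 = SMI_Open_send_channel(2, SMI_INT, my_rank+1, 1, comm);
for (int i = 0; i < 2; i++) {
    int data = i;                  // placeholder payload
    SMI_Push(&chan_send1, &data);  // first push of the iteration
    SMI_Push(&chan_send2, &data);  // second push of the iteration
}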
On the receiver side, symmetric operations are applied. This breaks in the following case:
- When i=1, we first push the second data element into the first channel (chan_send1). The network packet is sent. We now have zero tokens, and the rendezvous mechanism waits for a message from my_rank+1. This prevents the execution of the second push.
- On the receiver side, we receive the network packet and perform the first pop. However, the token count is not zero, so we do not send the rendezvous message. The second pop stalls because its data will never arrive.
Possible solution
Change the token condition on push?
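To make the scenario easier to reason about, here is a small host-side toy model of the stall. It is only a sketch and not the SMI implementation: the `simulate` function, the `SimResult` struct, the token bookkeeping, and the assumption that each side starts with `COUNT` tokens per channel are all hypothetical. With `wait_after_push` set to true, the sender blocks right after the push that consumes its last token, which is the behaviour described above. With it set to false, the wait is deferred to the next push on the same channel, which is one possible reading of "change the token condition" and has not been checked against the real code.

```c
#include <stdbool.h>

#define NUM_CHANNELS 2  /* two channels toward the same destination */
#define COUNT 2         /* elements per message; assumed to fit in one packet */
#define NUM_OPS (NUM_CHANNELS * COUNT)

typedef struct {
    int pushes_done;  /* pushes the sender completed */
    int pops_done;    /* pops the receiver completed */
    bool stalled;     /* true if neither side could make progress */
} SimResult;

/* Toy model (assumption, not SMI code). Both sides alternate between
 * channel 0 and channel 1, as in the loop of the example. */
SimResult simulate(bool wait_after_push) {
    int send_tokens[NUM_CHANNELS], recv_tokens[NUM_CHANNELS];
    int buffered[NUM_CHANNELS], available[NUM_CHANNELS];
    for (int c = 0; c < NUM_CHANNELS; c++) {
        send_tokens[c] = COUNT;
        recv_tokens[c] = COUNT;
        buffered[c] = 0;   /* elements in the sender's unsent packet */
        available[c] = 0;  /* elements delivered to the receiver */
    }
    int waiting_on = -1;   /* channel the sender waits on for credit, or -1 */
    SimResult r = {0, 0, false};

    while (r.pushes_done < NUM_OPS || r.pops_done < NUM_OPS) {
        bool progress = false;

        /* Sender step: resume if the awaited credit has arrived. */
        if (waiting_on >= 0 && send_tokens[waiting_on] > 0)
            waiting_on = -1;
        if (waiting_on < 0 && r.pushes_done < NUM_OPS) {
            int c = r.pushes_done % NUM_CHANNELS;
            if (send_tokens[c] > 0) {
                send_tokens[c]--;
                if (++buffered[c] == COUNT) {  /* message complete: send packet */
                    available[c] += COUNT;
                    buffered[c] = 0;
                }
                r.pushes_done++;
                progress = true;
                if (wait_after_push && send_tokens[c] == 0)
                    waiting_on = c;  /* block until the receiver sends credit */
            }
        }

        /* Receiver step: pop only if data for the channel has arrived. */
        if (r.pops_done < NUM_OPS) {
            int c = r.pops_done % NUM_CHANNELS;
            if (available[c] > 0) {
                available[c]--;
                r.pops_done++;
                progress = true;
                if (--recv_tokens[c] == 0) {  /* credit used up: rendezvous message */
                    recv_tokens[c] = COUNT;
                    send_tokens[c] += COUNT;
                }
            }
        }

        if (!progress) {  /* both sides blocked: this is the stall */
            r.stalled = true;
            break;
        }
    }
    return r;
}
```

In this model, `simulate(true)` stalls after three pushes and one pop, matching the two bullet points above, while `simulate(false)` completes all four pushes and pops. Whether deferring the wait is safe for longer messages in the real implementation is a separate question that this sketch does not answer.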