This version uses nonblocking operations for both sending and receiving, primarily to handle the buffering issues. In this case, the sends are posted first, which often allows receiver-pull rendezvous protocols to avoid synchronization delays (though this is not guaranteed).

A separate example shows the use of nonblocking operations to express the overlap of communication and computation.