This example does allow all data transfers to take place simultaneously. Whether or not the underlying MPI implementation and hardware can do this is something that this example evaluates.

By using MPI persistent operations, the overhead of creating the internal MPI datastructures for these operations is reduced.