On systems that must implement MPI_Allreduce with point-to-point communication, the cost of an MPI_Allreduce will probably grow as log(p), where p is the number of processes. Systems that can use a special network or shared memory may offer faster reductions with different scaling.
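
To illustrate where the log(p) term can come from, here is a minimal sketch that simulates one well-known point-to-point scheme, recursive doubling, inside a single process. It is not the algorithm of any particular MPI implementation, and real libraries choose among several algorithms. The function name simulate_allreduce_sum is hypothetical, the array stands in for the per-process values, and p is assumed to be a power of two.

```c
#include <assert.h>

/*
 * Simulate a recursive-doubling allreduce (sum) over p simulated ranks.
 * values[i] plays the role of rank i's contribution. On return, every
 * entry holds the global sum. The return value is the number of
 * pairwise exchange rounds, which is log2(p).
 *
 * Assumption (illustrative only): p is a power of two.
 */
int simulate_allreduce_sum(double *values, int p)
{
    assert(p > 0 && (p & (p - 1)) == 0);

    int rounds = 0;
    for (int mask = 1; mask < p; mask <<= 1) {
        /* In each round, rank r exchanges with partner r XOR mask. */
        for (int rank = 0; rank < p; ++rank) {
            int partner = rank ^ mask;
            if (rank < partner) {
                /* Both partners end the round with the combined value. */
                double sum = values[rank] + values[partner];
                values[rank] = sum;
                values[partner] = sum;
            }
        }
        ++rounds;
    }
    return rounds;
}
```

Calling this with p = 8 takes 3 rounds and with p = 1024 takes 10, so doubling the number of simulated processes adds only one round. The sketch counts rounds only; it does not model message size, network contention, or process counts that are not powers of two.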

Some of these optimizations (in particular, special networks) apply only to MPI_COMM_WORLD or to a communicator that contains the same processes as MPI_COMM_WORLD.