Parallel Python

Part 3: Distributed map/reduce

MPI4Py provides a low-level interface for creating full MPI-style programs, but it also has a simpler API, mpi4py.futures, which lets you call submit(), the equivalent of Pool.apply, and map(), which provides the features of Pool.map and Pool.starmap all in one. You can find out a lot more about it in the documentation.

from functools import reduce

from mpi4py.futures import MPIPoolExecutor

def product(x, y):
    """Return the product of the arguments"""
    return x*y

def sum(x, y):
    """Return the sum of the arguments"""
    return x+y

if __name__ == "__main__":

    a = range(1,101)
    b = range(101, 201)

    with MPIPoolExecutor() as executor:
        results = executor.map(product, a, b)

    total = reduce(sum, results)

    print("Sum of the products equals %d" % total)
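The submit() call mentioned earlier follows the same pattern as map(). The following is a minimal sketch, not part of the original script: the helper name sum_of_products_with_submit is hypothetical, and the standard-library ThreadPoolExecutor is used as a stand-in (an assumption) so that it runs without MPI. MPIPoolExecutor follows the same concurrent.futures executor interface, so the helper should accept it unchanged when the script is launched through mpiexec.

```python
from concurrent.futures import ThreadPoolExecutor


def product(x, y):
    """Return the product of the arguments"""
    return x * y


def sum_of_products_with_submit(executor, a, b):
    """Schedule one product() call per pair with submit() and add up the results.

    Hypothetical helper: works with any concurrent.futures-style executor.
    """
    # submit() returns a Future straight away; result() blocks until it is done
    futures = [executor.submit(product, x, y) for x, y in zip(a, b)]
    return sum(future.result() for future in futures)


if __name__ == "__main__":
    # ThreadPoolExecutor is a stand-in (assumption) so the sketch runs without MPI
    with ThreadPoolExecutor() as executor:
        total = sum_of_products_with_submit(executor, range(1, 101), range(101, 201))

    print("Sum of the products equals %d" % total)
```

In practice map() is usually the more convenient choice for this kind of problem; submit() is more useful when the individual tasks are not all calls to the same function.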

You'll see that the only change from how we were running it previously is that the pool creation has changed from something like:

with ProcessPoolExecutor() as pool:
    results = pool.map(product, a, b)

to:

with MPIPoolExecutor() as executor:
    results = executor.map(product, a, b)

We've also had to import the module with from mpi4py.futures import MPIPoolExecutor.
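Because only the executor changes, one way to keep a script portable is to pass the executor class in as an argument. This is a sketch under stated assumptions rather than the course's own code: run_map_reduce is a hypothetical helper, and ThreadPoolExecutor stands in so that the example runs on a machine without MPI.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from operator import add


def product(x, y):
    """Return the product of the arguments"""
    return x * y


def run_map_reduce(executor_factory, a, b):
    """Map product() over the pairs from a and b, then reduce the results by addition.

    executor_factory is any callable returning a concurrent.futures-style
    executor, for example ThreadPoolExecutor or ProcessPoolExecutor.
    """
    with executor_factory() as executor:
        # list() collects every result before the executor is shut down
        results = list(executor.map(product, a, b))
    return reduce(add, results)


if __name__ == "__main__":
    # To run under MPI instead, import MPIPoolExecutor from mpi4py.futures,
    # pass it here in place of ThreadPoolExecutor, and launch with mpiexec.
    total = run_map_reduce(ThreadPoolExecutor, range(1, 101), range(101, 201))
    print("Sum of the products equals %d" % total)
```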

The way in which you run the script will depend on the version of MPI that you have installed on your system. This is outside the scope of this course, and the best approach is to talk to your local HPC team.

In summary, you run the script through a standard MPI tool called mpiexec, whose job is to set up the communication between the (potentially multiple) computers taking part in the calculation and to start your Python script on each. This will usually look something like:

mpiexec -n 1 -usize 17 python

which will start one process that manages the workers, plus 16 workers to run the map over. The number of workers you create should depend on the cluster that you are running on. Again, talk to your local HPC team; based on the example mpiexec line above, they'll know what to do.

The output of the script should look something like:

Sum of the products equals 843350
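If you want to confirm this value before involving MPI at all, it can be checked serially. The sketch below is an illustration only; serial_sum_of_products is a hypothetical name. It also compares the result against the closed form: the sum of i*(i + 100) for i from 1 to 100 is the sum of squares plus 100 times the sum of the integers.

```python
def serial_sum_of_products(a, b):
    """Return the sum of the pairwise products, computed in a single process."""
    return sum(x * y for x, y in zip(a, b))


expected = serial_sum_of_products(range(1, 101), range(101, 201))

# Closed form: sum(i**2) + 100 * sum(i) for i = 1..n, with n = 100
n = 100
closed_form = n * (n + 1) * (2 * n + 1) // 6 + 100 * n * (n + 1) // 2

print("Serial result: %d" % expected)        # Serial result: 843350
print("Closed form:   %d" % closed_form)     # Closed form:   843350
```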