MPI4Py provides a low-level interface for creating full MPI-style programs, but it also has a simpler API (MPIPoolExecutor, in the mpi4py.futures module) which allows you to call submit() and map, the latter providing the features of Pool.starmap all in one. You can find out more about it in the documentation.
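To make the two calls concrete, here is a minimal sketch of submit() and map on an executor. It assumes that MPIPoolExecutor offers the same executor interface as the standard library's concurrent.futures, and it uses ThreadPoolExecutor as a stand-in so that it runs without an MPI installation. The demo function and its executor_cls parameter are hypothetical names used only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def product(x, y):
    """Return the product of the arguments"""
    return x * y


def demo(executor_cls=ThreadPoolExecutor):
    """Run one submit() call and one multi-iterable map() call."""
    with executor_cls() as executor:
        # submit() schedules a single call and returns a future
        future = executor.submit(product, 6, 7)
        single = future.result()
        # map() accepts several iterables and passes one item from each
        # to the function, much like Pool.starmap over zipped arguments
        many = list(executor.map(product, [1, 2, 3], [4, 5, 6]))
    return single, many


if __name__ == "__main__":
    print(demo())
```

Passing the executor class as an argument is just a convenience for this sketch; in a real script you would normally create the executor directly.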
    from functools import reduce

    from mpi4py.futures import MPIPoolExecutor

    def product(x, y):
        """Return the product of the arguments"""
        return x*y

    def sum(x, y):
        """Return the sum of the arguments"""
        return x+y

    if __name__ == "__main__":
        a = range(1,101)
        b = range(101, 201)

        with MPIPoolExecutor() as executor:
            results = executor.map(product, a, b)

        total = reduce(sum, results)

        print("Sum of the products equals %d" % total)
You'll see that the only change from how we were running it previously is that the pool creation has changed from something like:

    with ProcessPoolExecutor() as executor:
        results = executor.map(product, a, b)

to:

    with MPIPoolExecutor() as executor:
        results = executor.map(product, a, b)
We've also had to import MPIPoolExecutor with:

    from mpi4py.futures import MPIPoolExecutor
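Since only the executor class changes, the map-and-reduce step can be sketched as a function that takes the executor class as a parameter. This is an illustrative sketch, not the course's own code: sum_of_products is a hypothetical helper, operator.mul and operator.add replace the hand-written product and sum functions, and ThreadPoolExecutor stands in so that the example runs without MPI.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from operator import add, mul


def sum_of_products(executor_cls, a, b):
    """Map mul over a and b on the given executor, then reduce with add."""
    with executor_cls() as executor:
        results = executor.map(mul, a, b)
        # Consume the results while the executor is still open
        return reduce(add, results)


if __name__ == "__main__":
    # ThreadPoolExecutor is a stand-in here; when launching under mpiexec
    # you would pass mpi4py.futures.MPIPoolExecutor instead (assumed to
    # accept the same calls, and not imported in this sketch).
    total = sum_of_products(ThreadPoolExecutor, range(1, 101), range(101, 201))
    print("Sum of the products equals %d" % total)
```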
The way in which you run the script will depend on the version of MPI that you have installed on your system. This is outside the scope of this course, and the best approach is to talk to your local HPC team.
In summary, you run the script through a standard MPI tool called mpiexec, whose job it is to set up the communication between the potentially multiple computers taking part in the calculation and to start your Python script on each. This will usually look something like:
mpiexec -n 1 -usize 17 python mapreduce.py
which will start one process to manage the workers and 16 workers to run the map over. The number of workers you create should depend on the cluster that you are running on. Again, talk to your local HPC team; based on the example mpiexec line above, they'll know what to do.
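The relationship between the worker count and the -usize value in the example line can be sketched with a small helper. This only reproduces the shape of the command shown above (one manager process plus the workers). The mpiexec_command function is a hypothetical name, and the flags your MPI installation accepts may differ.

```python
def mpiexec_command(workers, script):
    """Build an mpiexec argument list for one manager plus `workers` workers."""
    if workers < 1:
        raise ValueError("need at least one worker")
    # The universe size counts the manager process as well as the workers
    universe_size = workers + 1
    return ["mpiexec", "-n", "1", "-usize", str(universe_size), "python", script]


if __name__ == "__main__":
    # Prints the command line; it does not launch anything
    print(" ".join(mpiexec_command(16, "mapreduce.py")))
```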
The output of the script should look something like:
Sum of the products equals 843350
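As a sanity check, the same total can be computed serially with no executor at all. This is a minimal sketch for comparison; serial_sum_of_products is a hypothetical helper name.

```python
def serial_sum_of_products(a, b):
    """Sum the pairwise products of a and b without any parallelism."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    total = serial_sum_of_products(range(1, 101), range(101, 201))
    print("Sum of the products equals %d" % total)
```

If the parallel run prints a different number, the first thing to check is whether the input ranges match those used here.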