pympipool.Executor

class pympipool.Executor(max_workers: int = 1, max_cores: int = 1, cores_per_worker: int = 1, threads_per_core: int = 1, gpus_per_worker: int = 0, oversubscribe: bool = False, cwd: str | None = None, executor=None, hostname_localhost: bool = False, backend: str = 'auto', block_allocation: bool = False, init_function: callable | None = None, command_line_argument_lst: list[str] = [], disable_dependencies: bool = False, refresh_rate: float = 0.01)

Bases: object

The pympipool.Executor leverages either the message passing interface (MPI), the SLURM workload manager or, preferably, the flux framework to distribute Python functions within a given resource allocation. In contrast to the mpi4py.futures.MPIPoolExecutor, the pympipool.Executor can be executed in a serial Python process and does not require the Python script itself to be executed with MPI. It is even possible to use the pympipool.Executor directly in an interactive Jupyter notebook.
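
Like a concurrent.futures.Executor, functions are submitted with submit() and results are retrieved from the returned future objects. The following is a minimal sketch, not part of the original docstring, which assumes a working backend is available on the local machine:

>>> from pympipool import Executor
>>>
>>> def add(a, b):
...     return a + b
>>>
>>> with Executor(max_cores=1) as exe:
...     future = exe.submit(add, 1, 2)  # returns a concurrent.futures.Future
...     print(future.result())          # blocks until the function has finished
3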

Parameters:
  • max_workers (int) – for backwards compatibility with the standard library, max_workers also defines the number of cores which can be used in parallel - just like the max_cores parameter. Using max_cores is recommended, as computers have a limited number of compute cores.

  • max_cores (int) – defines the number of cores which can be used in parallel

  • cores_per_worker (int) – number of MPI cores to be used for each function call

  • threads_per_core (int) – number of OpenMP threads to be used for each function call

  • gpus_per_worker (int) – number of GPUs per worker - defaults to 0

  • oversubscribe (bool) – adds the --oversubscribe command line flag (OpenMPI and SLURM only) - default False

  • cwd (str/None) – current working directory where the parallel python task is executed

  • hostname_localhost (boolean) – use localhost instead of the hostname to establish the zmq connection. In the context of an HPC cluster this is essential to be able to communicate with an Executor running on a different compute node within the same allocation. In principle any computer should be able to resolve that its own hostname points to the same address as localhost; still, macOS >= 12 seems to disable this lookup for security reasons, so on macOS this option has to be set to True.

  • backend (str) – Switch between the different backends “flux”, “mpi” or “slurm”. Alternatively, when “auto” is selected (the default) the available backend is determined automatically.

  • block_allocation (boolean) – To accelerate the submission of a series of Python functions with the same resource requirements, pympipool supports block allocation. In this case all resources have to be defined on the executor, rather than during the submission of the individual function (see the second example below).

  • init_function (callable/None) – optional function to preset arguments for functions which are submitted later

  • command_line_argument_lst (list) – Additional command line arguments for the srun call (SLURM only)

Examples

>>> import numpy as np
>>> from pympipool import Executor
>>>
>>> def calc(i, j, k):
...     from mpi4py import MPI
...     size = MPI.COMM_WORLD.Get_size()
...     rank = MPI.COMM_WORLD.Get_rank()
...     return np.array([i, j, k]), size, rank
>>>
>>> def init_k():
...     return {"k": 3}
>>>
>>> with Executor(max_cores=2, cores_per_worker=2, init_function=init_k) as p:
...     fs = p.submit(calc, 2, j=4)
...     print(fs.result())
[(array([2, 4, 3]), 2, 0), (array([2, 4, 3]), 2, 1)]
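
As an additional illustration, not taken from the original docstring, the block_allocation option can be combined with the resource parameters documented above, so that every submitted function reuses the same MPI-parallel worker. The sketch below assumes two cores are available in the allocation; get_rank is a hypothetical helper introduced only for this example:

>>> from pympipool import Executor
>>>
>>> def get_rank():
...     from mpi4py import MPI
...     return MPI.COMM_WORLD.Get_rank()
>>>
>>> with Executor(max_cores=2, cores_per_worker=2, block_allocation=True) as exe:
...     fs = exe.submit(get_rank)
...     print(fs.result())  # one entry per MPI rank, e.g. [0, 1]
[0, 1]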

__init__(max_workers: int = 1, max_cores: int = 1, cores_per_worker: int = 1, threads_per_core: int = 1, gpus_per_worker: int = 0, oversubscribe: bool = False, cwd: str | None = None, executor=None, hostname_localhost: bool = False, backend='auto', block_allocation: bool = True, init_function: callable | None = None, command_line_argument_lst: list[str] = [], disable_dependencies: bool = False, refresh_rate: float = 0.01)

Methods

__init__([max_workers, max_cores, ...])