unite_python_mpi_4_py

Differences

This shows you the differences between two versions of the page.

-unite_python_mpi_4_py [2026/04/10 13:00] nshegunov
+unite_python_mpi_4_py [2026/04/10 13:49] (current) nshegunov
Line 1: Line 1:
-== MPI4py Ping-Pong Example for Cluster ==
 +====== MPI4py Ping-Pong Example for Cluster ======
  
-MPI4py ping-pong demonstrates point-to-point communication between rank 0 (master) and rank 1 (worker). Rank 0 sends increasingly larger messages to rank 1, which echoes them back; timing measures bandwidth.
 +This example shows a NumPy-based ping-pong benchmark using the ''mpi4py'' library on a cluster. Two MPI ranks exchange a NumPy array for several message sizes and measure round-trip time, one-way latency, and effective bandwidth.
 + 
 +=====  Instructions ===== 
 + 
 +To run this example, create the Python script and Slurm batch script provided below. 
 + 
 +The Python script uses the mpi4py library to measure MPI communication performance with the classic ping-pong benchmark. The Slurm script loads the required modules and submits the job across 2 nodes. 
 + 
 +Optional interactive testing: 
 +<code bash> 
 +srun --partition=short --ntasks=2 --gres=gpu:--time=02:00:00 --pty bash 
 +</code> 
 + 
 +Use this command to request an interactive session on a single node for experimentation.
  
 ===  Python Script (pingpong_mpi4py.py) ====
  
 <code python pingpong_mpi4py.py> <code python pingpong_mpi4py.py>
 +#!/usr/bin/env python3
 from mpi4py import MPI
 import numpy as np
-import time
 +import sys
  
 comm = MPI.COMM_WORLD
Line 14: Line 28:
 size = comm.Get_size()
  
-if size < 2:
 +if size != 2:
     if rank == 0:
-        print("Need at least 2 processes")
-    exit()
 +        print("Need exactly 2 processes!")
 +    sys.exit(1)
  
-N = 1000  # max message size
-extent = 100  # message size steps
-sbuf = np.zeros(1, dtype='d')
-rbuf = np.zeros(1, dtype='d')
 +partner = 1 - rank
 +
 +nrounds = 100
 +msg_sizes = [1, 8, 64, 512, 1024, 4096, 16384, 65536, 262144]
 +
 +for nbytes in msg_sizes:
 +    nelems = max(1, nbytes // np.dtype(np.uint8).itemsize)
 +
 +    sendbuf = np.zeros(nelems, dtype=np.uint8)
 +    recvbuf = np.empty(nelems, dtype=np.uint8)
 + 
 +    comm.Barrier() 
 + 
 +    t0 = MPI.Wtime() 
 + 
 +    for i in range(nrounds): 
 +        if rank == 0: 
 +            comm.Send(sendbuf, dest=partner, tag=100) 
 +            comm.Recv(recvbuf, source=partner, tag=200) 
 +        else: 
 +            comm.Recv(recvbuf, source=partner, tag=100) 
 +            comm.Send(recvbuf, dest=partner, tag=200) 
 + 
 +    t1 = MPI.Wtime() 
 + 
 +    if rank == 0: 
 +        total_time = t1 - t0 
 +        avg_rtt = total_time / nrounds 
 +        latency_us = (avg_rtt / 2.0) * 1.0e6 
 +        bandwidth_mb_s = nbytes / latency_us 
 + 
 +        print( 
 +            f"size={nbytes:8d} bytes | " 
 +            f"RTT={avg_rtt*1.0e6:10.2f} us | " 
 +            f"latency={latency_us:10.2f} us | " 
 +            f"bandwidth={bandwidth_mb_s:10.2f} MB/s" 
 +        )
  
 if rank == 0:
-    t0 = time.time()
-    for i in range(extent):
-        size = (i + 1) * N
-        sbuf = np.zeros(size, dtype='d') + rank
-        t1 = time.time()
-        comm.send(sbuf, dest=1, tag=i)
-        comm.Recv(rbuf, source=1, tag=i)
-        t2 = time.time()
-        latency = (t2 - t1) * 1000  # ms
-        bandwidth = (size * 8) / (t2 - t1) / 1e6  # MB/s
-        print(f"Size {size}: latency {latency:.2f}ms, BW {bandwidth:.1f} MB/s")
-    total_time = time.time() - t0
-    print(f"Total time: {total_time:.2f}s")
-elif rank == 1:
-    for i in range(extent):
-        rbuf = comm.recv(source=0, tag=i)
-        comm.send(rbuf, dest=0, tag=i)
 +    print("Ping-pong benchmark completed.")
 </code> </code>
  
Line 62: Line 93:
 mpirun -np $SLURM_NTASKS python3 ./pingpong_mpi4py.py
 </code> </code>
 +
 +===== Run =====
 +
 +<code bash>
 +sbatch slurm_pingpong.job
 +</code>
 +
 +===== Example output =====
 +
 +<code bash>
 +Loading unite/python/3.14/mpi4py
 +  Loading requirement: unite/python/3.14/python-3.14.0 unite/mpi/4.1
 +
 +size=       1 bytes | RTT=     10.08 us | latency=      5.04 us | bandwidth=      0.20 MB/s
 +size=       8 bytes | RTT=      3.80 us | latency=      1.90 us | bandwidth=      4.21 MB/s
 +size=      64 bytes | RTT=      4.49 us | latency=      2.24 us | bandwidth=     28.54 MB/s
 +size=     512 bytes | RTT=     27.33 us | latency=     13.66 us | bandwidth=     37.47 MB/s
 +size=    1024 bytes | RTT=      6.46 us | latency=      3.23 us | bandwidth=    317.16 MB/s
 +size=    4096 bytes | RTT=      8.90 us | latency=      4.45 us | bandwidth=    920.86 MB/s
 +size=   16384 bytes | RTT=     15.53 us | latency=      7.77 us | bandwidth=   2109.61 MB/s
 +size=   65536 bytes | RTT=     29.86 us | latency=     14.93 us | bandwidth=   4390.07 MB/s
 +size=  262144 bytes | RTT=     72.79 us | latency=     36.40 us | bandwidth=   7202.47 MB/s
 +Ping-pong benchmark completed.
 +
 +</code>
 +
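The arithmetic behind the output columns can be checked without MPI. The sketch below repeats the formulas from the new script (average RTT, half of it as one-way latency, bytes per microsecond as bandwidth); the function name `pingpong_metrics` and the sample timing are hypothetical and chosen only for illustration. It assumes MB means 10^6 bytes, which is what dividing bytes by microseconds yields.

```python
def pingpong_metrics(total_time_s, nrounds, nbytes):
    """Return (rtt_us, latency_us, bandwidth_mb_s) for one message size.

    Hypothetical helper mirroring the per-size arithmetic in the script.
    """
    avg_rtt_s = total_time_s / nrounds       # mean round-trip time in seconds
    rtt_us = avg_rtt_s * 1.0e6               # round-trip time in microseconds
    latency_us = rtt_us / 2.0                # one-way latency: half a round trip
    bandwidth_mb_s = nbytes / latency_us     # bytes per microsecond == 1e6 bytes/s
    return rtt_us, latency_us, bandwidth_mb_s


# Illustrative input (not measured): 100 round trips of 262144 bytes
# taking 7.279 ms in total.
rtt, lat, bw = pingpong_metrics(7.279e-3, 100, 262144)
print(f"RTT={rtt:.2f} us | latency={lat:.2f} us | bandwidth={bw:.2f} MB/s")
```

With these inputs the result should land close to the last row of the example output; small differences are expected because the table prints rounded values.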
unite_python_mpi_4_py.1775815214.txt.gz · Last modified: 2026/04/10 13:00 by nshegunov
