//examples — last modified 2025/12/30 23:27 by dimitar (previous revision: 2025/12/29 21:29)//
====Description====

This page provides examples of how to use the cluster. There are language-specific examples for **C/C++** and **Python**, which show how to compile and run applications written in those languages on the cluster. Additionally, there are examples of how to leverage the different resources of the cluster. These examples are written in **C++**, but the concepts apply to a program written in any language.
----
====PyTorch====
Consider the following simple Python test script ("pytorch_test.py"):

<code python>
import torch
  
def test_pytorch():
    print("PyTorch version:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())

    if torch.cuda.is_available():
        print("CUDA device:", torch.cuda.get_device_name(0))
        device = torch.device("cuda")
    else:
        device = torch.device("cpu")

    # Simple tensor operation
    x = torch.tensor([1.0, 2.0, 3.0], device=device)
    y = torch.tensor([4.0, 5.0, 6.0], device=device)
    z = x + y
    print("Tensor operation result:", z)

test_pytorch()
</code>

To test it on the unite cluster you can use the following sbatch script to run it:
<code bash>
#!/bin/bash
#SBATCH --job-name=pytorch_test
#SBATCH --output=pytorch_test.out
#SBATCH --error=pytorch_test.err
#SBATCH --time=00:10:00
#SBATCH --partition=a40
#SBATCH --gres=gpu:1
#SBATCH --mem=4G
#SBATCH --cpus-per-task=2

# Load necessary modules (modify based on your system)
module load python/pytorch-2.5.1-llvm-cuda-12.3-python-3.13.1-llvm

# Activate your virtual environment if needed
# source ~/your_env/bin/activate

# Run the PyTorch script
python3.13 pytorch_test.py
</code>
----
====Pandas====
Consider the following simple Python test script ("pandas_test.py"):
<code python>
import pandas as pd
import numpy as np

# Create a simple DataFrame
data = {
    'A': [1, 2, 3, 4],
    'B': [5, 6, 7, 8],
    'C': [9, 10, 11, 12]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

# Test basic operations
print("\nSum of each column:")
print(df.sum())

print("\nMean of each column:")
print(df.mean())

# Adding a new column
df['D'] = df['A'] + df['B']
print("\nDataFrame after adding new column D (A + B):")
print(df)

# Filtering rows
filtered_df = df[df['A'] > 2]
print("\nFiltered DataFrame (A > 2):")
print(filtered_df)

# Check if NaN values exist
print("\nCheck for NaN values:")
print(df.isna().sum())
</code>

You can use the following sbatch script to run it (no GPU is requested, since Pandas does not use one):
<code bash>
#!/bin/bash
#SBATCH --job-name=pandas_test
#SBATCH --output=pandas_test.out
#SBATCH --error=pandas_test.err
#SBATCH --time=00:10:00
#SBATCH --partition=a40
#SBATCH --mem=4G
#SBATCH --cpus-per-task=2

# Load necessary modules (modify based on your system)
module load python/3.13.1-llvm
module load python/3.13/pandas/2.2.3

# Activate your virtual environment if needed
# source ~/your_env/bin/activate

# Run the Pandas script
python3.13 pandas_test.py
</code>
----
====Simple C/C++ program====
The following is a simple **C/C++** program which performs element-wise addition of two vectors. It does **not** use any dependent libraries:
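As a quick sanity check for the program's output, the same element-wise addition can be expressed in a few lines of Python (a minimal sketch; the function name `vector_add` is illustrative and not part of the C/C++ program):

```python
def vector_add(a, b):
    """Element-wise addition of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [x + y for x, y in zip(a, b)]

print(vector_add([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # [5.0, 7.0, 9.0]
```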
  
----
====C++ program which uses MPI====
  
The following is an example **C/C++** application which uses **MPI** to perform element-wise addition of two vectors. Each **MPI** task computes the addition of its local region and then sends the result back to the leader. Using **MPI** with **Python** is similar, assuming that you know how to manage **Python** dependencies on the cluster, which is described in the previous section. What is important here is to understand how to manage the resources of the system.
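How each task's local region is chosen is plain block-partition arithmetic, independent of MPI itself. A minimal sketch in Python (the function name `local_range` is illustrative, not taken from the program):

```python
def local_range(n, num_tasks, rank):
    # Split n elements across num_tasks tasks; the first
    # (n % num_tasks) tasks each take one extra element.
    base, rem = divmod(n, num_tasks)
    start = rank * base + min(rank, rem)
    count = base + (1 if rank < rem else 0)
    return start, start + count

# 10 elements over 4 tasks -> [0,3), [3,6), [6,8), [8,10)
print([local_range(10, 4, r) for r in range(4)])
```

Every element belongs to exactly one task, so the leader can concatenate the partial results in rank order.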
----
  
====C++ program which uses GPU====
  
The following is an example **CUDA** application which uses an **Nvidia GPU** to perform element-wise addition of two vectors. Using **CUDA** with **Python** is similar, assuming that you know how to manage **Python** dependencies on the cluster, which is described in a previous section. What is important here is to understand how to manage the resources of the system.
An excerpt of the device-memory management from the program:
<code cpp>
    CUDA_CHECK(cudaMalloc(&d_c, bytes));

    CUDA_CHECK(cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice));

    // ...

    CUDA_CHECK(cudaFree(d_b));
    CUDA_CHECK(cudaFree(d_c));
    free(h_a);
    free(h_b);
</code>
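Launching a CUDA kernel requires choosing a grid size large enough to cover all elements; the standard ceiling-division arithmetic is sketched below in plain Python (illustrative only, not taken from the program above):

```python
def blocks_needed(n, threads_per_block=256):
    # Ceiling division: the smallest number of blocks such that
    # blocks * threads_per_block >= n.
    return (n + threads_per_block - 1) // threads_per_block

print(blocks_needed(1000))  # 4
print(blocks_needed(1024))  # 4
print(blocks_needed(1025))  # 5
```

Because the last block may be only partially used, the kernel must still guard each thread with a bounds check such as `if (i < n)`.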