User Guide

Complete guide to using the MLW Sapitwa HPC cluster effectively.

Getting Started

Access Requirements

  1. MLW institutional account
  2. Approved PI request
  3. Completed training
  4. MFA setup

Accessing the System

SSH Access

ssh username@sapitwa.mlw.mw

Note: SSH access requires a connection to the MLW network or VPN
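
If you connect often, one option is a client-side SSH config entry so you do not have to retype the hostname and username. This is a minimal optional sketch; the alias "sapitwa" and the placeholder username are illustrative, not cluster requirements.

# Append a host alias to your SSH client configuration (client-side only)
cat >> ~/.ssh/config <<'EOF'
Host sapitwa
    HostName sapitwa.mlw.mw
    User username
EOF

# Afterwards you can connect with just:
ssh sapitwa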

Web Portal Access

  1. Open the Sapitwa web portal in your browser
  2. Click "Log in"
  3. Enter your institutional credentials. Access for partner institutions may be granted at the discretion of management.
  4. Complete MFA verification
  5. Set up your Open OnDemand environment by adding TurboVNC to your PATH:

echo 'export PATH="/usr/local/turbovnc/bin:/opt/TurboVNC/bin:$PATH"' >> ~/.bashrc && source ~/.bashrc
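After updating your PATH, you can confirm that the TurboVNC directory is visible to your shell. The check below assumes TurboVNC installs a vncserver command under one of the paths added above.

# Confirm the TurboVNC directory was added to PATH
echo "$PATH" | tr ':' '\n' | grep -i turbovnc

# Confirm the vncserver command is found (assumed to be provided by TurboVNC)
command -v vncserver || echo "vncserver not found -- check the paths in ~/.bashrc"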

Environment Modules

Basic Module Commands

module avail # List available modules
module list # Show loaded modules
module load name # Load a module
module unload name # Unload a module
module purge # Unload all modules
module spider name # Search for a module
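
A typical workflow is to search for a package, note the versions reported, and then load one. The sketch below uses the R module that appears later in this guide; the versions available on Sapitwa may differ.

# Find out which versions of R are installed and how to load them
module spider R

# Load a specific version (use one listed by module spider)
module load R/4.4.1

# Confirm what is currently loaded
module list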

Job Management

Submitting Jobs to CPU Nodes

#!/bin/bash
#SBATCH --job-name=CPU-Test
#SBATCH --output=CPU-Test%j.out
#SBATCH --error=CPU-Test%j.err
#SBATCH -p cpu-nodes        # CPU partition
#SBATCH -N 1               # Number of nodes
#SBATCH -n 16              # Number of cores
#SBATCH --mem=64G          # Memory request
#SBATCH --time=4:00:00     # Maximum runtime (HH:MM:SS)
#SBATCH --mail-type=ALL    # Mail notifications for job status
#SBATCH --mail-user=your-email@mlw.mw

# Load required modules
module load R/4.4.1

# Run R script
Rscript test.R
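
To run this script, save it (for example as cpu_test.sh, a name chosen here for illustration) and submit it with sbatch. You can then monitor the job and inspect its output file once it finishes.

# Submit the script; sbatch prints the job ID
sbatch cpu_test.sh

# Check the job's state while it is queued or running
squeue -u $USER

# After completion, view the output file (%j in the script is replaced by the job ID)
ls CPU-Test*.out
cat CPU-Test*.out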

Submitting Jobs to GPU Nodes

#!/bin/bash
#SBATCH --job-name=GPU-Test
#SBATCH --output=GPU-Test%j.out
#SBATCH --error=GPU-Test%j.err
#SBATCH -p gpu-nodes        # GPU partition
#SBATCH --gres=gpu:1g.10g:1 # Request 1 GPU with 10GB memory
#SBATCH -N 1               # Number of nodes
#SBATCH -n 10              # Number of CPU cores
#SBATCH --mem=64G          # Memory request
#SBATCH --time=4:00:00     # Maximum runtime (HH:MM:SS)
#SBATCH --mail-type=ALL    # Mail notifications for job status
#SBATCH --mail-user=your-email@mlw.mw

# Load required modules
module load CUDA/11.7.0

# Run Python script
python3 c2.py
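
If you want to confirm inside the job that a GPU was actually allocated, you can print the devices assigned before launching your program. The lines below are a common optional check to add to the script above; CUDA_VISIBLE_DEVICES is typically set by Slurm when a GPU is granted.

# Optional check: show which GPU(s) were assigned to this job
echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-unset}"
nvidia-smi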

SLURM Script Parameters Explained

  • -p cpu-nodes/gpu-nodes: Specifies the partition (CPU or GPU)
  • --gres=gpu:1g.10g:1: Request 1 A100 GPU with 10GB memory
  • -N 1: Request one compute node
  • -n 10/16: Request CPU cores (10 for GPU, 16 for CPU jobs)
  • --mem=64G: Request 64GB of RAM
  • --time=4:00:00: Job time limit of 4 hours
  • --mail-type=ALL: Email notifications for job start, end, and failure
  • --output/--error: Output and error files (includes %j for job ID)
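
Before choosing a partition, core count, or time limit, it can help to see what each partition offers. The sinfo query below lists the two partitions named above; the exact columns depend on site configuration.

# List the CPU and GPU partitions with their time limits and node states
sinfo -p cpu-nodes,gpu-nodes

# Longer format with more detail per partition
sinfo -l -p cpu-nodes,gpu-nodes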

Job Control Commands

sbatch script.sh # Submit a job
squeue -u $USER # List your jobs ($USER expands to your username)
scancel jobid # Cancel a job
sacct # View job history
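
For more detail on a single job, sacct accepts a job ID and a format string; the fields below are standard Slurm accounting fields, shown here as an illustration.

# Detailed accounting for one job (replace jobid with the ID printed by sbatch)
sacct -j jobid --format=JobID,JobName,Partition,State,Elapsed,MaxRSS

# Full scheduler view of a pending or running job
scontrol show job jobid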

Storage Guidelines

Storage Locations

  • /head/NFS/$USER - Home directory (quota limited)
  • /scratch - Temporary storage (periodically purged)

Important: Do not store sensitive data on the cluster
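
To stay within your quota, check how much space your files use and how full the scratch filesystem is. These are generic filesystem commands; if Sapitwa provides a dedicated quota tool, prefer that.

# Size of your home directory contents
du -sh /head/NFS/$USER

# Free space on the scratch filesystem
df -h /scratch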

Data Transfer

Copy Files to Cluster

scp localfile username@sapitwa.mlw.mw:/head/NFS/$USER/

Copy Files from Cluster

scp username@sapitwa.mlw.mw:/head/NFS/$USER/file local/

Sync Directories (Recommended for Large Transfers)

rsync -av source/ username@sapitwa.mlw.mw:/head/NFS/$USER/dest/

Transfer Tips

  • Use rsync for large transfers - it can resume interrupted transfers
  • For multiple files, compress them first using tar or zip, as shown below
  • Always check available space before large transfers
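
For example, to bundle a results directory before copying it and to resume a transfer that was interrupted, something like the following works; the directory and archive names are placeholders.

# Bundle many small files into one archive before transferring
tar czf results.tar.gz results/

# Copy the archive; --partial --progress lets rsync resume and show progress
rsync -av --partial --progress results.tar.gz username@sapitwa.mlw.mw:/head/NFS/$USER/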

Best Practices

  • Always use job scheduler for computations
  • Clean up temporary files after job completion (see the sketch after this list)
  • Monitor your storage quota
  • Use appropriate resources in job requests
  • Document your workflows
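
One way to follow the clean-up practice is to create a per-job working directory under /scratch and remove it at the end of the job script. A minimal sketch, assuming your job writes its intermediate files there and /scratch/$USER is writable:

# Inside a job script: work in a job-specific scratch directory, then clean it up
WORKDIR=/scratch/$USER/$SLURM_JOB_ID
mkdir -p "$WORKDIR"
cd "$WORKDIR"

# ... run your computation here, writing temporary files to $WORKDIR ...

# Remove the temporary files once results have been copied back
cd ~ && rm -rf "$WORKDIR"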

Getting Help