How to work

Hardware and software environment of the HybriLIT cluster

The HybriLIT heterogeneous computing cluster contains computation nodes with multi-core Intel processors, NVIDIA graphics processors (GPU) and Intel Xeon Phi coprocessors (see more details in the "Hardware" section).

Types of main computation nodes:

  1. Nodes with multi-core CPUs and Intel Xeon Phi coprocessors.
  2. Nodes with multi-core CPUs and 3 NVIDIA Tesla K40 graphics accelerators (GPU).
  3. Nodes with multi-core CPUs and 2 (4) NVIDIA Tesla K80 GPUs.
  4. A mixed blade with multi-core CPUs, an Intel Xeon Phi coprocessor and an NVIDIA Tesla K20 GPU.

The heterogeneous computing cluster runs under the Scientific Linux 6.7 OS; it also includes the SLURM job manager and specific installed software – compilers and packages for the development, debugging and profiling of parallel applications (including the Modules package).

Modules package

The Modules 3.2.10 package for dynamic modification of environment variables is installed on the cluster. This package allows users to change the set of compilers for developing applications in the basic programming languages (C/C++, FORTRAN, Java) and parallel programming technologies (OpenMP, MPI, OpenCL, CUDA), and to use the program packages installed on the cluster. Users need to load the required modules before compiling an application.

Main commands for work with modules:

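A sketch of the standard Environment Modules commands (module names are placeholders):

    module avail            # show all available modules
    module list             # show currently loaded modules
    module add <name>       # load a module (same as module load)
    module rm <name>        # unload a module (same as module unload)
    module purge            # unload all loaded modules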

Loaded modules are not saved from session to session. If you need to use the same set of modules, please use the commands below:
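One common approach is to register the modules in your shell start-up configuration with the Modules init* subcommands (a sketch; module names are placeholders):

    module initadd <name>    # add "module load <name>" to your shell start-up files
    module initlist          # show the modules loaded at login
    module initrm <name>     # remove a module from the start-up list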

You can also load compilers and packages installed in the cvmfs (CernVM File System).

CernVM File System

The CernVM-FS system is added to the list of installed program packages and gives users access to the software installed at CERN. The list of available packages can be seen by means of the command below:
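For example (the subdirectory shown is illustrative):

    ls /cvmfs/sft.cern.ch/
    ls /cvmfs/sft.cern.ch/lcg/releases/    # e.g. list the available software releases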

It should be noted that the /cvmfs/sft.cern.ch/ directory is mounted dynamically on request to its content; however, after some time of inactivity it may disappear from the list of available directories. You can mount it again using the same command shown above.

In order to use the compilers and program packages, it is necessary to execute the command below:
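In general, this means sourcing the corresponding environment script, e.g. (the path is a placeholder):

    source /cvmfs/sft.cern.ch/<path-to-package>/setup.sh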

The directory tree in cvmfs has a special structure. Let's examine it using the ROOT package as an example. The full path to the directory will look like this:
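    /cvmfs/sft.cern.ch/lcg/releases/ROOT/6.07.02-f644e/x86_64-slc6-gcc49-opt/

(the intermediate lcg/releases directories are an assumption based on the usual layout of the sft.cern.ch repository)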

where 6.07.02-f644e is the version of the ROOT package, and the name of the directory x86_64-slc6-gcc49-opt indicates that: x86_64 – the 64-bit version of the package is supported; slc6 – the package was compiled from source under Scientific Linux 6; gcc49 – the package was compiled using gcc 4.9.
Files with environment settings may have one of two names:

  • setup.sh

For example, the command for using the gcc 4.9.3 compiler will look like the following:
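    source /cvmfs/sft.cern.ch/lcg/contrib/gcc/4.9.3/x86_64-slc6/setup.sh

(the exact path is an assumption; locate the setup.sh for your gcc version under /cvmfs/sft.cern.ch/)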

  • [PACKAGE_NAME]-env.sh

For example, for the ROOT 6.07.0 package, the command will look like this:
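    source /cvmfs/sft.cern.ch/lcg/releases/ROOT/6.07.02-f644e/x86_64-slc6-gcc49-opt/ROOT-env.sh

(again, the intermediate directories are an assumption; the file name follows the [PACKAGE_NAME]-env.sh pattern)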

Getting started: remote access to the cluster

Remote access to the HybriLIT heterogeneous cluster is available only via the SSH protocol at the following address: hydra.jinr.ru

Please see more detailed instructions for different operating systems below.

For users under Linux / Mac OS X

Start the Terminal and type in the following:
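    ssh USERNAME@hydra.jinr.ru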

where USERNAME – is the login that you received after registration on the cluster and hydra.jinr.ru  – is the server address.
When asked to enter the password, please enter the password for your  account on the cluster.
Once authorized successfully on the cluster, you will see the following command line on the screen:

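    [USERNAME@hydra ~]$

(the exact appearance of the prompt may differ)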
This means you have connected to the cluster and are in your home directory.
At the first attempt to access the cluster, you will be notified that the IP address you are trying to connect to is unknown. Please type in "yes" and press Enter. Once done, this address will be added to the list of known hosts.
In order to launch apps with GUI, please start the Terminal and enter:
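    ssh -X USERNAME@hydra.jinr.ru    # the -X key enables X11 forwarding for graphical applications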


For users under Windows

In order to connect to the cluster under Windows, it is necessary to use a special program – an SSH client, for example, PuTTY.

To install PuTTY on your computer, please download the putty.exe file at http://the.earth.li/~sgtatham/putty/latest/x86/putty.exe and run it.

Please see a step-by-step guide for setting up PuTTY to get access to the cluster:

  • In the field Host Name (or IP address), enter the server address: hydra.jinr.ru
  • In the field Saved Sessions, enter the connection name (e.g. hydra.jinr.ru).
  • To connect a remote X11 graphic interface, switch to the tab Connection > SSH > X11 and select the field "Enable X11 forwarding".
  • Check that the field "Local ports accept connection from other hosts" is selected in the tab Connection > SSH > Tunnels.


  • Then switch back to the tab “Sessions” and press Save to save all changes.
  • Press Open to connect to the HybriLIT cluster and enter login/password that we sent you after registration.


Once authorized successfully on the cluster, you will see the same command prompt as described above.
This means you have connected to the cluster and are in your home directory.
At the first attempt to access the cluster, you will be notified that the IP address you are trying to connect to is unknown. Please type in "yes" and press Enter. Once done, this address will be added to the list of known hosts.
In order to launch apps with GUI, make sure X11 forwarding is enabled as described above and enter the application's command at the prompt.

SLURM task manager

SLURM – is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system that provides three key functions:

  • it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work;
  • it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes
  • it arbitrates contention for resources by managing a queue of pending work.

1. Main commands

Main commands of SLURM include: sbatch, scancel, sinfo, squeue, scontrol.

sbatch – is used to submit a job script for later execution. The script will typically contain one or more srun commands to launch parallel tasks.
Once submitted, the application receives a jobid, by which it can be found in the list of launched applications (squeue). The results are written to the slurm-jobid.out file.

Example of using sbatch:
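For instance (the script name and the reported job id are illustrative):

    $ sbatch script_cpu
    Submitted batch job 141980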

squeue – reports the state of jobs or job steps. It has a wide variety of filtering, sorting, and formatting options;
By default, it reports the running jobs in priority order and then the pending jobs in priority order. A launched application may have one of the following states:
RUNNING (R) – under execution;
PENDING (PD) – in the queue;
COMPLETING (CG) – under termination (in this case you may need the help of the system administrator to remove the terminated application from the queue).

Example of using squeue:
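The output looks roughly as follows (all values are illustrative):

    $ squeue
     JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
    141980       cpu   my_job   user01  R       5:12      2 blade[01-02]
    141981       gpu  gpu_job   user02 PD       0:00      1 (Resources)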

sinfo – reports the state of partitions and nodes managed by Slurm. It has a wide variety of filtering, sorting, and formatting options.
Computation nodes may be in one of the following states:
idle – node is free;
alloc – node is allocated (fully in use);
mix – node is used partially;
down, drain, drng – node is blocked.

Example of using sinfo:
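The output looks roughly as follows (node and partition details are illustrative):

    $ sinfo
    PARTITION    AVAIL  TIMELIMIT  NODES  STATE NODELIST
    interactive*    up    1:00:00      1    mix blade01
    cpu             up   infinite      4   idle blade[02-05]
    gpu             up   infinite      3  alloc blade[06-08]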

scancel – is used to cancel a pending or running job or job step. It can also be used to send an arbitrary signal to all processes associated with a running job or job step.
Example of using scancel to cancel a pending application with jobid 141980:
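    scancel 141980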

scontrol – is the administrative tool used to view and/or modify Slurm state. Note that many scontrol commands can only be executed as user root.
Example of using scontrol to see specifications of the launched application:
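    scontrol show job 141980    # 141980 is the job id from the example above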

The list of specifications contains such parameters as:

Parameter      Function
UserId         Username
JobState       Application state
RunTime        Computation time
Partition      Partition used
NodeList       Nodes used
NumNodes       Number of used nodes
NumCPUs        Number of used processor cores
Gres           Number of used graphic accelerators and coprocessors
MinMemoryCPU   Amount of used RAM
Command        Location of the file for launching the application
StdErr         Location of the file with error logs
StdOut         Location of the file with output data

Example of using scontrol to see nodes’ specifications:
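    scontrol show node             # specifications of all nodes
    scontrol show node blade01     # a particular node (the node name is illustrative)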

The list of specifications contains such parameters as:
NodeName – hostname of a computation node;
CPUAlloc – number of loaded computation cores;
CPUTot – total number of computation cores per node;
CPULoad – loading of computation cores;
Gres – number of graphic accelerators and coprocessors available for computations;
RealMemory – total amount of RAM per node;
AllocMem – amount of loaded RAM;
State – node state.


2. Partitions

The process of launching a job begins with the job being queued in one of the partitions. Since HybriLIT is a heterogeneous cluster, different partitions were created for using different kinds of resources.

Currently HybriLIT includes 6 partitions:

    • interactive* – includes 1 computation node with 2 12-core Intel Xeon E5-2695 v2 processors, 1 NVIDIA Tesla K20X and 1 Intel Xeon Phi coprocessor 5110P (* denotes the default partition). This partition is suitable for running test programs; computation time in it is limited to 1 hour;
    • cpu – includes 4 computation nodes with 2 12-core Intel Xeon E5-2695 v2 processors each. This partition is suitable for applications that use central processing units (CPU) for computations;
    • gpu – includes 3 computation nodes with 3 NVIDIA Tesla K40 (Atlas) each. This partition is suitable for applications that use graphic accelerators (GPU) for computations;
    • gpuK80 – includes 3 computation nodes with 2 NVIDIA Tesla K80 each. This partition is suitable for applications that use GPU for computations;
    • phi – includes 1 computation node with 2 Intel Xeon Phi coprocessors 7120P. This partition is suitable for applications that use coprocessors for computations;
    • long – includes 1 computation node with NVIDIA Tesla GPUs. This partition is suitable for applications that require time-consuming computations (up to 14 days).

3. Description and examples of script-files

In order to run applications by means of the sbatch command, you need to use a script-file. Normally, a script-file is a common bash file that meets the following requirements:
The first line includes #!/bin/sh (or #!/bin/bash), which allows the script to be run as a bash script;
Lines beginning with # are comments;
Lines beginning with #SBATCH set parameters for the SLURM job manager;
All SLURM parameters must be set BEFORE the command that launches the application;
The script-file includes a command for launching the application.

SLURM has a large number of parameters (https://computing.llnl.gov/linux/slurm/sbatch.html). Please see the required and recommended parameters for work on HybriLIT below:
-p – partition used. If this parameter is absent, your job will be sent to the interactive partition, whose execution time is limited to 1 hour. Depending on the type of resources used, the application may be executed in one of the available partitions: cpu, phi, gpu, gpuK80;
-n – number of processes (tasks) used;
-t – allocated computation time. This parameter must be set. The following formats are available: minutes, minutes:seconds, hours:minutes:seconds, days-hours, days-hours:minutes, days-hours:minutes:seconds;
--gres – number of allocated NVIDIA graphic accelerators and Intel Xeon Phi coprocessors. This parameter is necessary if your application uses GPUs OR Intel Xeon Phi coprocessors;
--mem – allocated RAM (in MB). This parameter is optional; however, consider setting it if your application uses a large amount of RAM;
-N – number of nodes used. This parameter should be set ONLY if your application needs more resources than 1 node possesses;
-o – name of the output file. By default, all results are written to the slurm-jobid.out file.

Please see below examples of scripts that use various resources of the HybriLIT cluster:

For computations using CPU:
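A minimal sketch (the number of processes and the time limit are illustrative):

    #!/bin/sh
    #SBATCH -p cpu        # cpu partition
    #SBATCH -n 24         # number of processes
    #SBATCH -t 60         # time limit in minutes
    ./a.out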

For computations using GPU:
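A minimal sketch (the --gres request asks for one GPU; values are illustrative):

    #!/bin/sh
    #SBATCH -p gpu            # gpu partition (NVIDIA Tesla K40)
    #SBATCH -n 1
    #SBATCH --gres=gpu:1      # request one graphic accelerator
    #SBATCH -t 60             # time limit in minutes
    ./a.out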

For computations using Intel Xeon Coprocessor:
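A minimal sketch (the gres name "mic" for the coprocessor is an assumption; values are illustrative):

    #!/bin/sh
    #SBATCH -p phi            # phi partition (Intel Xeon Phi)
    #SBATCH -n 1
    #SBATCH --gres=mic:1      # request one coprocessor; the gres name is an assumption
    #SBATCH -t 60             # time limit in minutes
    ./a.out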

Examples of using script-files for different programming technologies will be given below in the corresponding sections.

5 basic steps to carry out computations on the cluster

We can point out 5 basic steps that describe the workflow on the cluster: connect to the cluster via SSH; load the required modules; compile the application; prepare a script-file and submit the job with sbatch; monitor the job with squeue and collect the results from the slurm-jobid.out file.

Compilation and launch of OpenMP-applications

OpenMP (Open Multi-Processing) is an application programming interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++ and Fortran. It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior and are meant for the development of multi-threaded applications on multi-processor systems with shared memory. The programming model is the Fork-Join model (Pic. 1):

Any program starts in Thread 0 (the master thread). Then, by means of compiler directives, Thread 0 creates a set of other threads (FORK), which are executed in parallel. Once the created threads finish their work in the parallel region, all threads synchronize (JOIN) and the program continues its work in the master thread.


Pic. 1. Fork-Join program model.


Compilation

A standard set of compilers with support for OpenMP is used. GNU and Intel compilers are available; the GNU compilers with OpenMP support are available by default. Before compilation using the Intel compilers, it is necessary to load the corresponding modules:
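For example (the module name is an assumption; check module avail for the exact names used on the cluster):

    module avail              # find the Intel compiler modules
    module add intel          # illustrative name; it may carry a version suffix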

Please see below basic commands for compilation of programs written in С, С++ or Fortran for different compilers:

           Intel                    GNU                        PGI
C          icc -openmp hello.c      gcc -fopenmp hello.c       pgcc -mp hello.c
C++        icpc -openmp hello.cpp   g++ -fopenmp hello.cpp     pgc++ -mp hello.cpp
Fortran    ifort -openmp hello.f    gfortran -fopenmp hello.f  pgfortran -mp hello.f

In case of successful compilation, a binary executable file is created. By default, the name of the binary file for all compilers is a.out. You can set another name for that file using the -o option. For example, if we use the following command:
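    gcc -fopenmp hello.c -o hello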

the name of the binary file will be hello.


Launch

Launching of OpenMP-applications is carried out by means of a script-file (see the recommended example below).

The use of these settings optimizes the distribution of threads over the computation cores and, as a rule, gives shorter computation time than a run without them.

Number of OMP threads may be set using environment variable OMP_NUM_THREADS before executing the program in the command line:
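    export OMP_NUM_THREADS=threads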

where threads – number of OMP-threads.

Thus, the recommended script-file for OpenMP-applications (here with the number of threads chosen as an example) looks like this:
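A sketch with illustrative values (partition, time limit and 12 threads are chosen as an example; --cpus-per-task reserves the cores for the threads):

    #!/bin/sh
    #SBATCH -p cpu                 # partition (illustrative)
    #SBATCH -t 60                  # time limit in minutes (illustrative)
    #SBATCH --cpus-per-task=12     # reserve 12 cores for the OpenMP threads
    export OMP_NUM_THREADS=12
    ./a.out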

Use the following command to launch:
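    sbatch script_omp    # script_omp is the (illustrative) name of your script-file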

Compilation and launch of MPI-applications

Message Passing Interface (MPI) is a standardized and portable message-passing system for passing information between the processes that execute one job.

GNU and Intel compilers are available for work with MPI.


GNU compiler

MPI-programs can be compiled using GNU compilers with the OpenMPI library. GNU compilers are installed on HybriLIT by default. In order to get access to the OpenMPI libraries, it is necessary to add one of the required OpenMPI modules: version 1.6.5, 1.8.8, 1.10.4 or 2.0.1. The latest stable release installed on the cluster is version 2.0.1.
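For example (the module name is an assumption; check module avail for the exact names):

    module add openmpi/2.0.1    # versions 1.6.5, 1.8.8 and 1.10.4 can be loaded the same way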

 

Compilation

Please see below basic commands for compilation of programs written in С, С++ or Fortran for GNU-compiler:

Programming languages   Compilation commands
C                       mpicc
C++                     mpiCC / mpic++ / mpicxx
Fortran 77              mpif77 / mpifort (*)
Fortran 90              mpif90 / mpifort (*)

(*) It is recommended to use the mpifort command instead of mpif77 or mpif90, as the latter are considered out-of-date. By means of mpifort it is possible to compile any Fortran programs that use either "mpif.h" or "use mpi" as an interface.

Example of program compilation in С programming language:
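    mpicc hello.c    # the source file name is illustrative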

If you do not set a custom name to the binary executable file, the name after successful compilation will be a.out by default.

To start the program using OpenMPI modules use this script:
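A sketch with illustrative values:

    #!/bin/sh
    #SBATCH -p cpu                # partition (illustrative)
    #SBATCH -n 24                 # number of MPI processes (illustrative)
    #SBATCH -t 60                 # time limit in minutes (illustrative)
    module add openmpi/2.0.1      # the same OpenMPI module used for compilation (name illustrative)
    mpirun ./a.out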


Intel-compiler

In order to use MPI with the Intel compiler, it is necessary to load a module:
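For example (the module name is an assumption; check module avail):

    module add intel    # illustrative name; it may carry a version suffix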

This module includes MPI-library.

Compilation

Please see below basic commands for compilation of programs written in С, С++ or Fortran for Intel-compiler:

Programming languages   Compilation commands
C                       mpiicc
C++                     mpiicpc
Fortran                 mpiifort

Example of program compilation in Fortran:
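    mpiifort hello.f    # the source file name is illustrative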


Options of compilation optimization:

Option                        Purpose
-O0                           Without optimization; used by the GNU compiler by default
-O2                           Used by the Intel compiler by default
-O3                           May be efficient for a particular range of programs
-march=native; -march=core2   Adjustment for the processor architecture (using optional capabilities of Intel processors)

Initiation of tasks

In order to initiate a task, use the following command of SLURM:
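    sbatch script_mpi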

where script_mpi – is the name of the previously prepared script-file that contains the task parameters.

Examples of script-files for launching MPI-applications on two computation nodes

The following two approaches describe distribution of MPI processes over computational nodes.

Example of using the combination of keys --tasks-per-node and -n: 10 processes (-n 10), 5 processes per computation node (--tasks-per-node=5). Thus, the computations are distributed over 2 computation nodes:
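A sketch with illustrative partition and time limit:

    #!/bin/sh
    #SBATCH -p cpu                  # partition (illustrative)
    #SBATCH -n 10                   # 10 MPI processes in total
    #SBATCH --tasks-per-node=5      # 5 processes per node, i.e. 2 nodes are used
    #SBATCH -t 60                   # time limit in minutes (illustrative)
    mpirun ./a.out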

Example of using the combination of keys --tasks-per-node and -N: 5 processes per computation node (--tasks-per-node=5) and 2 nodes (-N 2). Thus, 10 processes are distributed over 2 computation nodes:
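Similarly, a sketch for the second approach (illustrative partition and time limit):

    #!/bin/sh
    #SBATCH -p cpu                  # partition (illustrative)
    #SBATCH -N 2                    # 2 computation nodes
    #SBATCH --tasks-per-node=5      # 5 processes per node, i.e. 10 processes in total
    #SBATCH -t 60                   # time limit in minutes (illustrative)
    mpirun ./a.out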

Additional nodes can be ordered if the task requires more than 24 parallel processes.
Example of a simple script-file in which the most important SLURM-directives are present:
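A sketch consistent with the description below (the time limit is illustrative):

    #!/bin/sh
    #SBATCH -p cpu        # send the task to the cpu partition
    #SBATCH -n 7          # the task requires 7 cores
    #SBATCH -t 60         # time limit in minutes (illustrative)
    mpirun ./a.out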

The executable file a.out (prepared by the compiler) is sent to the input queue of tasks aimed at computations on the cpu partition of the HybriLIT cluster. The task requires 7 cores. The task will be given a unique id in the input queue – let it be 1234. The task listing will then be written to slurm-1234.out in the same directory that contains the executable file a.out.


In order to call MPI procedures in Fortran programs, the include 'mpif.h' statement is used. This statement can be substituted with a more powerful option – the mpi.mod module, which is loaded by means of the use mpi statement.

Launching tasks

    • While launching tasks, it is necessary to take into consideration limitations on using the resources of the cluster;
    • it is advisable to load the same module (i.e. the same environment variables) with which the program was compiled;
    • it is important not to delete the executable file or change the input data until the task is complete.