The education and testing polygon is an important part of the HybriLIT heterogeneous platform. It is intended for exploring the capabilities of new computing architectures and new IT solutions, and for running tutorials on parallel programming technologies and on modern tools for developing, debugging and profiling parallel applications.
The HybriLIT heterogeneous platform has a unified two-level structure that covers both the education and testing polygon and the “Govorun” supercomputer.
Hardware
Characteristics of the nodes
Node | CPU model | CPU sockets | Cores per socket | Hyperthreading | Total logical cores | GPU model | GPUs | RAM (GB) | Network (Mb/s)
blade01 | Intel Xeon E5-2695 v2 2.40GHz | 2 | 12 | yes | 48 | - | - | 128 | 1000
blade02 | Intel Xeon E5-2695 v2 2.40GHz | 2 | 12 | yes | 40(48) | Tesla K20X | 1 | 128 | 1000
blade03 | Intel Xeon E5-2695 v2 2.40GHz | 2 | 12 | yes | 40(48) | - | - | 128 | 1000
blade04 | Intel Xeon E5-2695 v2 2.40GHz | 2 | 12 | yes | 48 | Tesla K40 | 3 | 128 | 1000
blade05 | Intel Xeon E5-2695 v2 2.40GHz | 2 | 12 | yes | 48 | Tesla K40 | 3 | 128 | 1000
blade06 | Intel Xeon E5-2695 v2 2.40GHz | 2 | 12 | yes | 40(48) | Tesla K40 | 3 | 128 | 1000
blade07 | Intel Xeon E5-2695 v2 2.40GHz | 2 | 12 | yes | 40(48) | Tesla K40 | 3 | 128 | 1000
blade09 | Intel Xeon E5-2695 v3 2.30GHz | 2 | 14 | yes | 56 | Tesla K80 | 2 | 512 | 1000
space12-24 | Intel Xeon E5-2680 v3 2.50GHz | 1 | 6 | no | 6 | - | - | 2 | 10000
n01p016 | Intel Xeon Phi 7290 1.50GHz | 1 | 72 | yes | 288 | - | - | 96 | 1000, 100000
n01p017 | Intel Xeon Phi 7290 1.50GHz | 1 | 72 | yes | 288 | - | - | 96 | 1000, 100000
n01p018 | Intel Xeon Phi 7290 1.50GHz | 1 | 72 | yes | 288 | - | - | 96 | 1000, 100000
n01p019 | Intel Xeon Phi 7290 1.50GHz | 1 | 72 | yes | 288 | - | - | 96 | 1000, 100000
n01p020 | Intel Xeon Phi 7290 1.50GHz | 1 | 72 | yes | 288 | - | - | 96 | 1000, 100000
n01p021 | Intel Xeon Phi 7290 1.50GHz | 1 | 72 | yes | 288 | - | - | 96 | 1000, 100000
zfs | ~300 TB
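The "Total logical cores" column in the table above can be cross-checked with a little arithmetic. The sketch below is only an illustration: it assumes that the core counts in the table are per socket, and that hyperthreading gives 2 hardware threads per core on the Xeon E5 nodes and 4 on the Xeon Phi 7290 nodes. Neither threads-per-core figure is stated in the table. The function name `logical_cores` and the `nodes` dictionary are hypothetical, and the rows listed as 40(48) are left out because the table does not explain that notation.

```python
def logical_cores(sockets: int, cores_per_socket: int, threads_per_core: int = 1) -> int:
    """Return the number of logical cores: sockets x cores x hardware threads."""
    return sockets * cores_per_socket * threads_per_core


# (sockets, cores per socket, threads per core) for a few rows of the table.
# The threads-per-core values are assumptions, not taken from the table.
nodes = {
    "blade01": (2, 12, 2),     # Xeon E5-2695 v2, hyperthreading on
    "blade09": (2, 14, 2),     # Xeon E5-2695 v3, hyperthreading on
    "space12-24": (1, 6, 1),   # hyperthreading off
    "n01p016": (1, 72, 4),     # Xeon Phi 7290, 4 threads per core assumed
}

totals = {name: logical_cores(*spec) for name, spec in nodes.items()}

for name, total in totals.items():
    print(f"{name}: {total} logical cores")
```

Under these assumptions the computed totals agree with the table: 48, 56, 6 and 288 logical cores respectively.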
Software environment of the HybriLIT heterogeneous platform:
- OS: Scientific Linux 7.5 (Nitrogen)
- SLURM is installed as the task manager
- NFS and EOS file systems
- CernVM-FS: a specialized file system that allows software to be shared
- The MODULES package is used to set the environment variables needed for a specific task, so that the required software (compilers, libraries, applied software packages, etc.) becomes available
- Libraries and packages for running parallel applications on various computing architectures.
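To show how SLURM and the MODULES package typically fit together, here is a minimal sketch that builds the text of a batch script. It is not taken from the HybriLIT documentation: the helper `make_batch_script`, the partition name `cpu`, the module name `openmpi` and the executable `./a.out` are all hypothetical placeholders. Only the `#SBATCH` options shown (`--job-name`, `--partition`, `--ntasks`, `--time`) and the `module load` and `srun` commands are standard.

```python
def make_batch_script(job_name, partition, ntasks, time_limit, modules, command):
    """Build the text of a SLURM batch script that loads modules before running."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",
        f"#SBATCH --ntasks={ntasks}",
        f"#SBATCH --time={time_limit}",
        "",
    ]
    # Each module sets the environment variables for one piece of software.
    lines += [f"module load {name}" for name in modules]
    lines.append(command)
    return "\n".join(lines) + "\n"


script = make_batch_script(
    job_name="mpi_test",
    partition="cpu",           # hypothetical partition name
    ntasks=4,
    time_limit="00:10:00",
    modules=["openmpi"],       # hypothetical module name
    command="srun ./a.out",    # hypothetical executable
)
print(script)
```

The generated text could be saved to a file and submitted with `sbatch`. The actual partition and module names have to be taken from the cluster itself.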
Information environment of the HybriLIT heterogeneous platform
The information environment is a set of services that help users organize their work more efficiently and give them access to the information they need while working on the cluster. Some of these services provide information about the cluster itself and about upcoming events held by the HybriLIT team. Such services include:
- HybriLIT web-page.
- GitLab: a service for collaborative, parallel development of applications. It is a version control system that makes it possible to track changes in project code. Its functionality is broad and includes access control among users, a task management system, a wiki, code review, etc.
- Monitoring of cluster usage: with the sharp growth in the number of users, it became necessary to monitor data on the following:
  - resources used,
  - types of running tasks,
  - users who run tasks,
  - computation time of particular tasks, etc.
- Indico: the HybriLIT team uses this system to organize conferences, seminars and meetings dedicated to parallel programming technologies. The system makes it possible to create events that will take place at the Institute. Each event page provides basic information about the event, including its time and place. Lecture materials can also be uploaded so that every user can download them.
- HybriLIT User Support: a project developed in the Project Management Service system that makes it possible to answer users' questions, upload useful materials, publish news, etc. It was developed for more efficient interaction between users and the HybriLIT team. The system is used to distribute information about upcoming events and the current state of the cluster. Users can also create tasks for upgrades and debugging. As a result, interaction between cluster users and the developers is quick and efficient.