Ecosystem for tasks of machine learning, deep learning and data analysis

Description of the ML/DL ecosystem

The active adoption of the neural network approach and of machine learning and deep learning (ML/DL) methods and algorithms for solving a wide range of problems is driven by many factors. Chief among them are the development of computing architectures (especially graphics accelerators used to train convolutional neural networks with DL methods), the development of libraries in which various algorithms are implemented, and frameworks that allow building different neural network models. To provide all the possibilities both for developing mathematical models and algorithms and for carrying out resource-intensive computations, including on graphics accelerators that significantly reduce computation time, an ecosystem for ML/DL and data analysis tasks has been created for HybriLIT platform users and is being actively developed.


Video “Introduction into ML/DL/HPC Ecosystem”

Useful links:

  • Oksana Streltsova, Deputy Head of the Group, LIT JINR (in Russian)
  • Video by A.S. Vorontsov


The created ecosystem has three components (Fig.1):

  • the first component is designed for developing models and algorithms on the basis of JupyterHub, a multi-user platform for working with Jupyter Notebook (IPython with the possibility of working in a web browser) – https://jhub.jinr.ru;
  • the second component is aimed at carrying out resource-intensive, massively parallel tasks, for example, neural network training using NVIDIA graphics accelerators – https://jhub2.jinr.ru;
  • the third component, JLabHPC, is designed for calculations on the compute nodes of the HybriLIT platform, application development and scientific visualization – https://jlabhpc.jinr.ru.
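A quick way to tell whether the component you have logged into provides graphics accelerators is to check for the NVIDIA driver tool from a notebook cell. The sketch below is only an illustration (it relies on `nvidia-smi` being on the PATH of GPU-equipped nodes, which is an assumption, not a documented guarantee of the platform):

```python
import shutil
import subprocess

def gpu_available():
    """Return True if nvidia-smi is present and responds without error."""
    # nvidia-smi ships with the NVIDIA driver; its absence suggests a CPU-only node.
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        subprocess.run(["nvidia-smi"], capture_output=True, check=True)
        return True
    except (subprocess.CalledProcessError, OSError):
        return False

print("NVIDIA GPU visible:", gpu_available())
```

On the first component (https://jhub.jinr.ru) this would be expected to report no GPU, while on the second component (https://jhub2.jinr.ru) the accelerators should be visible.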
FIGURE 1: THREE-COMPONENT ECOSYSTEM FOR TASKS OF ML/DL AND DATA ANALYSIS

The virtual machine (VM) parameters for the first and third components, as well as those of the servers with NVIDIA Volta graphics processors, are presented in Fig. 1.
The most frequently used libraries and frameworks installed on the components for solving ML/DL and data analysis tasks are listed below.
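Before relying on a particular library, it is convenient to check from a notebook cell which packages are actually present on the component and in which versions. The snippet below uses only the standard library; the package list is an example, not an exhaustive inventory of what is installed on HybriLIT:

```python
import importlib.util
from importlib.metadata import version, PackageNotFoundError

# Example set of libraries commonly used for ML/DL and data analysis.
packages = ["numpy", "scipy", "pandas", "matplotlib", "tensorflow", "torch"]

def check_packages(names):
    """Map each package name to its version string, 'unknown', or None if absent."""
    found = {}
    for name in names:
        if importlib.util.find_spec(name) is None:
            found[name] = None           # not importable on this component
            continue
        try:
            found[name] = version(name)  # installed distribution version
        except PackageNotFoundError:
            found[name] = "unknown"      # importable but no package metadata
    return found

for name, ver in check_packages(packages).items():
    print(f"{name}: {ver or 'not installed'}")
```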

FIGURE 2: ECOSYSTEM FOR ML/DL TASKS BUILT ON THE JupyterHub MULTI-USER SERVICE (a multi-user version of Jupyter Notebook).

Work within the ML/DL/HPC ecosystem

To get started you need to:
  1. Log in with your HybriLIT account in GitLab:

https://gitlab-hybrilit.jinr.ru/

  2. Enter the components (authorization is done via GitLab):
  • Development component (without graphics accelerators) – https://jhub.jinr.ru
  • Component for carrying out resource-intensive calculations (with NVIDIA graphics accelerators) – https://jhub2.jinr.ru
  • Component for HPC on the HybriLIT platform nodes and data analysis (JupyterHub and SLURM) – https://jlabhpc.jinr.ru/

Jupyter Notebook

After authorization, the Jupyter Notebook interactive environment opens:

FIGURE 3: THE Jupyter Notebook INTERACTIVE ENVIRONMENT

Home directories are available for users; the NFS, ZFS and EOS file systems are mounted.
How to create a directory or a file is shown on the screenshot above.
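Besides the Notebook file browser, a directory and a file can also be created programmatically from a notebook cell. A minimal sketch (the names `ml_work` and `notes.txt` are just examples, not paths prescribed by the platform):

```python
from pathlib import Path

# Create a working directory in the user's home directory.
work_dir = Path.home() / "ml_work"
work_dir.mkdir(exist_ok=True)

# Create a small text file inside it and read it back.
note = work_dir / "notes.txt"
note.write_text("First experiment notes\n")
print(note.read_text())
```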