Ecosystem for tasks of machine learning, deep learning and data analysis

Description of the ML/DL ecosystem

The active adoption of neural networks and of machine learning and deep learning (ML/DL) methods and algorithms for a wide range of problems is driven by several factors. The main ones are the development of computing architectures, which matters especially when DL methods are used to train convolutional neural networks; the development of libraries that implement various algorithms; and the development of frameworks for building different neural network models. An ecosystem for ML/DL and data analysis tasks has been created for HybriLIT platform users and is under active development. It supports both the development of mathematical models and algorithms and resource-intensive computations, including computations on graphics accelerators, which significantly reduce calculation time.

The ecosystem has two components (Fig. 1):

  • the first component is designed for developing models and algorithms and is based on JupyterHub, a multi-user platform for working with Jupyter Notebook (formerly known as IPython Notebook) in a web browser;
  • the second component is intended for resource-intensive, massively parallel tasks, for example, neural network training on NVIDIA graphics accelerators.
FIGURE 1: TWO-COMPONENT ECOSYSTEM FOR TASKS OF ML/DL AND DATA ANALYSIS

The parameters of the virtual machine (VM) for the first component and of the servers with NVIDIA Volta graphics processors are presented in Fig. 1.
The most frequently used libraries and frameworks installed on the components for ML/DL and data analysis tasks are given below.

FIGURE 2: ECOSYSTEM FOR ML/DL TASKS BUILT ON THE JupyterHub MULTI-USER SERVICE (a multi-user version of the Notebook).
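
Which libraries are actually available in a given notebook kernel can be checked from a cell. The following is a minimal sketch, not a list of what is installed: the module names in CANDIDATES are a hypothetical shortlist chosen for illustration, and the real set on each component may differ.

```python
from importlib import metadata
from importlib.util import find_spec

# Hypothetical shortlist: import name -> distribution name.
# The libraries actually installed on each component may differ.
CANDIDATES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "pandas": "pandas",
    "sklearn": "scikit-learn",
    "tensorflow": "tensorflow",
    "torch": "torch",
}


def installed_versions(candidates):
    """Return {import_name: version}, with None for modules that are absent."""
    report = {}
    for import_name, dist_name in candidates.items():
        if find_spec(import_name) is None:
            report[import_name] = None  # not importable in this kernel
            continue
        try:
            report[import_name] = metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            # Importable, but no distribution metadata under that name.
            report[import_name] = "unknown"
    return report


for name, version in installed_versions(CANDIDATES).items():
    print(f"{name:12s} {version if version is not None else 'not installed'}")
```

The check uses find_spec rather than a plain import so that heavy frameworks are not loaded just to learn whether they are present.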

Work within the ML/DL ecosystem

To get started, you need to:
  1. Log in to GitLab with your HybriLIT account:

https://gitlab-hybrilit.jinr.ru/

  2. Enter the components (authorization is done via GitLab):
     • Development component (without graphics accelerators): https://jhub.jinr.ru
     • Component for carrying out resource-intensive calculations (with NVIDIA graphics accelerators): https://jhub2.jinr.ru
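
Once logged in to either component, a notebook cell can confirm whether NVIDIA accelerators are visible to the session. The sketch below assumes that the nvidia-smi utility is on the PATH of the GPU component, which this page does not state; the function name visible_nvidia_gpus is hypothetical. On a machine without the utility it simply returns an empty list.

```python
import shutil
import subprocess


def visible_nvidia_gpus():
    """Return one description line per GPU reported by nvidia-smi, or []."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # no NVIDIA driver tools found on this machine
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=30,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return []  # tool present but failed, e.g. driver not loaded
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = visible_nvidia_gpus()
if gpus:
    for index, description in enumerate(gpus):
        print(f"GPU {index}: {description}")
else:
    print("No NVIDIA GPU visible in this session")
```

An empty result is expected on the development component, which has no graphics accelerators.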

Jupyter Notebook

After authorization, the Jupyter Notebook interactive environment opens:

pic.3.

Home directories are available to users, and the NFS, ZFS, and EOS file systems are preinstalled.
The screenshot above shows how to create a directory or a file.
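
A directory or a file can also be created from a notebook cell instead of the web interface. The following is a minimal sketch using the Python standard library; the helper name create_workspace and the directory and file names in the usage line are hypothetical.

```python
from pathlib import Path


def create_workspace(base, dirname, filename):
    """Create base/dirname (if missing) and an empty file inside it.

    Existing directories and files are left untouched.
    Returns the path of the file.
    """
    folder = Path(base) / dirname
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / filename
    target.touch(exist_ok=True)
    return target
```

For example, create_workspace(Path.home(), "ml_project", "notes.txt") would create the directory ml_project in the home directory and an empty file notes.txt inside it.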