AI Network Worker

To run an AIN worker, you need an Ubuntu environment with a GPU. The required GPU specifications differ depending on the model to be served; the requirements for each model are listed in the Model List section. In this tutorial, we will run a worker that serves the gpt-2-large-torch-serving model on Ubuntu 18.04 with a Tesla K80 installed.

Run AIN Worker

1. Check Graphics Driver

Before running a worker, you should check the requirements. First, let's check if the graphics driver is installed correctly. Please enter the following command:

$ nvidia-smi

The results will be printed in the following form, and you can check the CUDA version supported by your driver.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00002DE1:00:00.0 Off |                    0 |
| N/A   44C    P0    69W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

If the driver is not installed, or the supported CUDA version is lower than 10.1, refer to the Install Graphics Driver section in the Appendix below.
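The CUDA version matters again later when choosing a model, so it can be handy to check it in a script. Below is a small parsing sketch; the helper name and approach are ours, not part of AIN Worker:

```shell
# check_cuda_version: read `nvidia-smi` output on stdin and print the
# CUDA version the driver reports. Parsing sketch only, not an AIN tool.
check_cuda_version() {
  grep -o 'CUDA Version: [0-9.]*' | awk '{ print $3 }'
}

# Usage on a real machine:
#   nvidia-smi | check_cuda_version
# Exit non-zero if the driver's CUDA version is below the 10.1 minimum:
#   nvidia-smi | check_cuda_version | awk '{ exit !($1 >= 10.1) }'
```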

2. Check Nvidia Docker

The next step is to check whether Docker and Nvidia Docker are installed; Nvidia Docker is what allows containers to use the GPU. Please enter the following command:

$ sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

After you run the above command, you should see something similar to this:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00002DE1:00:00.0 Off |                    0 |
| N/A   44C    P0    69W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

If you're having trouble with the installation, refer to the Install Nvidia Docker section in the Appendix below.
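If the command above fails before it even reaches the GPU, the usual cause is that one of the binaries is missing from PATH. A tiny, generic helper sketch (nothing AIN-specific) for checking prerequisites:

```shell
# have_cmd: succeed if the named binary is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Example: report which prerequisites are missing on this machine.
for cmd in docker nvidia-smi; do
  have_cmd "$cmd" || echo "missing: $cmd"
done
```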

3. Start Running a Worker

Once the graphics driver and Nvidia docker are installed, you're now ready to run the AIN worker. First of all, download the latest AIN Worker docker image using the docker pull command.

$ sudo docker pull ainblockchain/worker-docker

After that, run AIN Worker with the command below:

$ sudo docker run -d --name ain-worker \
-v {/PATH/TO/CONFIG}:/server/env.json \
-v /var/run/docker.sock:/var/run/docker.sock \
--network host ainblockchain/worker-docker

{/PATH/TO/CONFIG} is the path to the config file holding the parameters required to run a worker. Create a file in the form below, then replace {/PATH/TO/CONFIG} with its path. Keep the config file safe even after the worker is running, since it contains your Ethereum address and AIN private key.

{
  // Ethereum wallet address to receive rewards
  "ETH_ADDRESS": {INPUT YOUR ETHEREUM WALLET ADDRESS HERE},
  // Model name you want to serve on the worker (see the Model List section)
  "MODEL_NAME": {INPUT MODEL NAME ON MODEL LIST},
  // GPU device number to be used by the worker
  "GPU_DEVICE_NUMBER": "0",
  // (Optional) AIN private key. If omitted, one will be created automatically.
  "AIN_PRIVATE_KEY": {INPUT YOUR AIN PRIVATE KEY}
}
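As a concrete sketch, the file can be created with a heredoc. Every value below is a placeholder or an assumption to replace with your own, and it is written as strict JSON without the explanatory comments, since strict JSON parsers reject comments:

```shell
# Create a worker config file in the current directory. All values are
# placeholders: substitute your own address and (optionally) private key.
cat > ./ain-worker-config.json <<'EOF'
{
  "ETH_ADDRESS": "0xYOUR_ETHEREUM_ADDRESS",
  "MODEL_NAME": "gpt-2-large-torch-serving",
  "GPU_DEVICE_NUMBER": "0"
}
EOF

# The file's path then replaces {/PATH/TO/CONFIG} in the docker run command:
#   -v "$(pwd)/ain-worker-config.json":/server/env.json
```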

Model List

The list of models currently supported by AIN Worker is as follows:

Model Name                  Supported GPU List   GPU Memory Requirement   Minimum CUDA version
gpt-2-large-torch-serving   Tesla T4, K80        5 GB                     10.1
gpt-2-trump-torch-serving   Tesla T4, K80        2 GB                     10.1

Once the worker has started, you can check whether it is running normally through the docker logs. Follow the worker's logs with the following command:

$ sudo docker logs -f ain-worker

If the following log is displayed, the worker has started successfully and is preparing to serve the model. This step can take about 15 to 25 minutes.

2020-12-14T04:21:23.362Z [manager/worker] info: [+] Start to create Job Container. It can take a long time.

After that, once the following messages are displayed, the model is ready and being served.

2020-12-14T04:37:03.498Z [manager/worker] info: [+] Success to create Job Container.
2020-12-14T04:38:03.654Z [manager/worker] info: [+] Start to listen Job
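Since the preparation step can take 15 to 25 minutes, it may be convenient to wait for that success line in a script. A small sketch, where the helper name is ours, that simply greps the log text for the marker the worker prints:

```shell
# log_ready: succeed once the log text on stdin contains the line the
# worker prints when the Job Container has been created successfully.
log_ready() {
  grep -q 'Success to create Job Container'
}

# Usage on the worker host (poll every 30 seconds until ready):
#   until sudo docker logs ain-worker 2>&1 | log_ready; do sleep 30; done
```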

4. Exit a Worker

To terminate the AIN Worker, remove both the worker container and the job container it created (named after the model) with the following command:

$ sudo docker rm -f ain-worker {INPUT MODEL NAME ON MODEL LIST}

Appendix

Install Graphics Driver

Let's install the Nvidia graphics driver. The graphics driver's version must be at least 418.39. Execute the following commands in order:

$ sudo apt-get update -y
$ sudo apt purge nvidia-*
$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt update

You can find the appropriate driver version in the following way:

$ sudo apt install ubuntu-drivers-common
$ ubuntu-drivers devices
...
vendor : NVIDIA Corporation
model : GK210GL [Tesla K80]
driver : nvidia-driver-440-server - distro non-free
driver : nvidia-driver-390 - distro non-free
driver : nvidia-driver-410 - third-party free
driver : nvidia-driver-415 - third-party free
driver : nvidia-driver-418-server - distro non-free
driver : nvidia-driver-455 - third-party free recommended
driver : nvidia-driver-450-server - distro non-free
driver : nvidia-driver-450 - distro non-free
driver : xserver-xorg-video-nouveau - distro free builtin
...

Find the version tagged 'recommended' in the list of drivers. In the example above, nvidia-driver-455 is tagged with 'recommended'. Now you can install the appropriate graphics driver with the following command.
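To make this step scriptable, the 'recommended' line can be extracted with a short parsing sketch (the helper name is ours, not an ubuntu-drivers feature):

```shell
# find_recommended: read `ubuntu-drivers devices` output on stdin and
# print the driver package name on the line tagged "recommended".
find_recommended() {
  awk '/recommended/ { print $3 }'
}

# Usage:
#   ubuntu-drivers devices | find_recommended
```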

# Replace 455 with the version recommended for your system.
$ sudo apt install nvidia-driver-455

After installation is complete, reboot the system.

$ sudo reboot

Use the nvidia-smi command to confirm that the driver installation was successful.

$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00002DE1:00:00.0 Off |                    0 |
| N/A   44C    P0    69W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Install Nvidia Docker

Once the graphics driver installation is complete, install Nvidia Docker so that AIN Worker can use the GPU. This guide follows the Nvidia docs: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker

First, run the following command to install docker.

$ curl https://get.docker.com | sh \
&& sudo systemctl start docker \
&& sudo systemctl enable docker

After that, add the Nvidia Docker package repository and its signing key:

$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
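For reference, the $distribution variable above simply concatenates the ID and VERSION_ID fields of /etc/os-release ("ubuntu18.04" in this tutorial's environment), which selects the matching package-list URL. A small sketch of the derivation, with the helper name ours:

```shell
# os_release_id: concatenate the ID and VERSION_ID fields of an os-release
# file, reproducing the $distribution value used in the repository setup.
# Sourcing happens in a subshell so the caller's variables stay untouched.
os_release_id() {
  ( . "$1"; echo "$ID$VERSION_ID" )
}

# Usage:
#   os_release_id /etc/os-release    # on Ubuntu 18.04, prints ubuntu18.04
```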

Finally, install the nvidia-docker2 package and restart Docker:

$ sudo apt-get update
$ sudo apt-get install -y nvidia-docker2
$ sudo systemctl restart docker

The Nvidia Docker installation is now complete. To verify it, run the command below and make sure you see output similar to the following:

$ sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   34C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+