To run an AIN Worker, you need an Ubuntu environment with a GPU. The required GPU specifications depend on the model to be served; the requirements for each model are listed in the Model List section. In this tutorial, we will run a worker that serves the gpt-2-large-torch-serving model on Ubuntu 18.04 with a Tesla K80 installed.
Before running a worker, you should check the requirements. First, let's check if the graphics driver is installed correctly. Please enter the following command:
$ nvidia-smi
The results will be printed in the following form, and you can check the CUDA version supported by your driver.
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00002DE1:00:00.0 Off |                    0 |
| N/A   44C    P0    69W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
If the driver is not installed or the supported CUDA version is lower than 10.1, follow the graphics driver installation steps later in this guide.
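The CUDA version check can also be scripted. The following is a minimal sketch, assuming the nvidia-smi header contains a `CUDA Version: X.Y` field as in the output shown above. The function names `cuda_version_from_smi` and `version_ge` are illustrative only and are not part of AIN Worker or the Nvidia tools.

```shell
#!/usr/bin/env bash
# Sketch: read the CUDA version from nvidia-smi output and compare it with 10.1.
# The function names are hypothetical helpers, not part of any AIN tooling.

# Print the CUDA version found in nvidia-smi output read from stdin.
cuda_version_from_smi() {
  sed -n 's/.*CUDA Version: \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeed if version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Usage (requires an installed driver):
#   cuda=$(nvidia-smi | cuda_version_from_smi)
#   if version_ge "$cuda" 10.1; then echo "CUDA $cuda is supported"; fi
```

The comparison relies on `sort -V` (version sort), so that 10.10 is treated as newer than 10.2.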
The next step is to check whether Docker and the Nvidia docker, which lets you use the GPU in Docker containers, are installed. Please enter the following command:
$ sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
After you run the above command, you should see something similar to this:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00002DE1:00:00.0 Off |                    0 |
| N/A   44C    P0    69W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
If you're having trouble, follow the Nvidia docker installation steps later in this guide.
Once the graphics driver and Nvidia docker are installed, you're now ready to run the AIN worker. First of all, download the latest AIN Worker docker image using the docker pull command.
$ sudo docker pull ainblockchain/worker-docker
After that, run AIN Worker with the command below:
$ sudo docker run -d --name ain-worker \
  -v {/PATH/TO/CONFIG}:/server/env.json \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --network host ainblockchain/worker-docker
{/PATH/TO/CONFIG} is the path to the config file that holds the parameters required to run a worker. Create a file in the form below, then replace {/PATH/TO/CONFIG} with the path to that file. After the worker is running, keep the config file safe, since it contains your Ethereum address and AIN private key.
{
  // Ethereum wallet address to receive rewards
  "ETH_ADDRESS": {INPUT YOUR ETHEREUM WALLET ADDRESS HERE},
  // Model name you want to serve on the worker
  "MODEL_NAME": {INPUT MODEL NAME ON MODEL LIST},
  // GPU device number to be used by the worker
  "GPU_DEVICE_NUMBER": "0",
  // (Optional) If it doesn't exist, it will be created automatically.
  "AIN_PRIVATE_KEY": {INPUT YOUR AIN PRIVATE KEY}
}
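Because the config file holds sensitive values, it can help to create it with restrictive permissions. The following is a minimal sketch, assuming the worker accepts plain JSON without comments (this guide does not confirm that). The function name `write_worker_config` is hypothetical. The optional AIN_PRIVATE_KEY is left out, since it is described above as being created automatically when missing.

```shell
#!/usr/bin/env bash
# Sketch: write the worker config and make it readable by its owner only.
# write_worker_config is a hypothetical helper, not part of AIN Worker.
# Arguments: <config path> <ETH address> <model name> [GPU device number]
# The body runs in a subshell, so its variables and umask do not leak out.
write_worker_config() (
  config_path=$1
  eth_address=$2
  model_name=$3
  gpu_device=${4:-0}
  umask 077  # new files get no group/other permissions
  cat > "$config_path" <<EOF
{
  "ETH_ADDRESS": "$eth_address",
  "MODEL_NAME": "$model_name",
  "GPU_DEVICE_NUMBER": "$gpu_device"
}
EOF
  chmod 600 "$config_path"  # also tighten a file that already existed
)

# Usage (the address is a placeholder):
#   write_worker_config "$HOME/env.json" "<YOUR ETH ADDRESS>" gpt-2-large-torch-serving
```

If the container cannot read a file with these permissions in your setup, loosen them as needed.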
The list of models currently supported by AIN Worker is as follows:
Model Name                | Supported GPU List | GPU Memory Requirement | Minimum CUDA version |
gpt-2-large-torch-serving | Tesla T4, K80      | 5 GB                   | 10.1                 |
gpt-2-trump-torch-serving | Tesla T4, K80      | 2 GB                   | 10.1                 |
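The memory requirements in this table can be checked against the GPU before starting a worker. The sketch below hard-codes the two rows above and treats 1 GB as 1024 MiB, which is an assumption about how the requirement is measured. The function names are illustrative. It compares total memory, not free memory.

```shell
#!/usr/bin/env bash
# Sketch: compare a GPU's total memory with the model's requirement.
# Both functions are hypothetical helpers; the values come from the table above.

# Print the GPU memory requirement (in GB) for a model name, or fail if unknown.
required_gpu_memory_gb() {
  case "$1" in
    gpt-2-large-torch-serving) echo 5 ;;
    gpt-2-trump-torch-serving) echo 2 ;;
    *) return 1 ;;
  esac
}

# Succeed if total memory $1 (MiB) covers requirement $2 (GB, taken as 1024 MiB).
gpu_memory_sufficient() {
  [ "$1" -ge $(( $2 * 1024 )) ]
}

# Usage (requires an installed driver; reads the first GPU only):
#   total=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1)
#   need=$(required_gpu_memory_gb gpt-2-large-torch-serving)
#   gpu_memory_sufficient "$total" "$need" && echo "enough GPU memory"
```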
Once the docker run command has finished, you can check whether the worker is running normally by viewing its Docker logs with the following command:
$ sudo docker logs -f ain-worker
If the following log is displayed, the worker has been successfully started and is in the process of preparing to provide a model. This step can take about 15 to 25 minutes.
2020-12-14T04:21:23.362Z [manager/worker] info: [+] Start to create Job Container. It can take a long time.
After that, once the following message is displayed, the model is ready and is being served.
2020-12-14T04:37:03.498Z [manager/worker] info: [+] Success to create Job Container.
2020-12-14T04:38:03.654Z [manager/worker] info: [+] Start to listen Job
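Since preparation can take a while, you may prefer to poll the logs instead of watching them. The following is a minimal sketch, assuming the "Start to listen Job" message shown above is a reliable readiness signal. The function name `worker_is_ready` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: detect the readiness message in worker logs.
# worker_is_ready is a hypothetical helper, not part of AIN Worker.

# Succeed if the log text on stdin contains the "Start to listen Job" message.
worker_is_ready() {
  grep -q 'Start to listen Job'
}

# Usage (polls the container logs every 30 seconds until the message appears):
#   until sudo docker logs ain-worker 2>&1 | worker_is_ready; do sleep 30; done
#   echo "worker is serving the model"
```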
To terminate the AIN Worker, enter the following command:
$ sudo docker rm -f ain-worker {INPUT MODEL NAME ON MODEL LIST}
Let's install the Nvidia graphics driver. The graphics driver's version must be at least 418.39. Execute the following commands in order:
$ sudo apt-get update -y
$ sudo apt purge 'nvidia-*'
$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt update
You can find the appropriate driver version in the following way:
$ sudo apt install ubuntu-drivers-common
$ ubuntu-drivers devices
...
vendor   : NVIDIA Corporation
model    : GK210GL [Tesla K80]
driver   : nvidia-driver-440-server - distro non-free
driver   : nvidia-driver-390 - distro non-free
driver   : nvidia-driver-410 - third-party free
driver   : nvidia-driver-415 - third-party free
driver   : nvidia-driver-418-server - distro non-free
driver   : nvidia-driver-455 - third-party free recommended
driver   : nvidia-driver-450-server - distro non-free
driver   : nvidia-driver-450 - distro non-free
driver   : xserver-xorg-video-nouveau - distro free builtin
...
Find the driver tagged 'recommended' in the list. In the example above, nvidia-driver-455 is tagged 'recommended'. Now you can install the appropriate graphics driver with the following command.
# Change `455` to the number recommended for your system.
$ sudo apt install nvidia-driver-455
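Picking the package tagged 'recommended' can also be scripted. The following is a minimal sketch, assuming the `ubuntu-drivers devices` output uses the `driver : <package> - ... recommended` layout shown above. The function name `recommended_driver` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: extract the recommended driver package from `ubuntu-drivers devices`.
# recommended_driver is a hypothetical helper, not part of ubuntu-drivers.

# Print the first package name on a "driver" line tagged "recommended" (stdin).
recommended_driver() {
  awk '/^driver[[:space:]]*:/ && /recommended/ { print $3; exit }'
}

# Usage:
#   pkg=$(ubuntu-drivers devices | recommended_driver)
#   [ -n "$pkg" ] && sudo apt install "$pkg"
```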
After installation is complete, reboot the system.
$ sudo reboot
Use the nvidia-smi command to confirm that the driver installation was successful.
$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00002DE1:00:00.0 Off |                    0 |
| N/A   44C    P0    69W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
When the graphics driver installation is complete, you need to install the Nvidia docker to run AIN Worker. This guide is based on the Nvidia docs: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker
First, run the following command to install docker.
$ curl https://get.docker.com | sh \
  && sudo systemctl start docker \
  && sudo systemctl enable docker
After that, set up the package repository for the Nvidia container toolkit.
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
  && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
  && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
Finally, install the Nvidia docker and restart Docker.
$ sudo apt-get update
$ sudo apt-get install -y nvidia-docker2
$ sudo systemctl restart docker
Nvidia docker installation is complete. To check if it's installed properly, run the command below and make sure you see an output similar to the following.
$ sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   34C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+