
Easy way to build a virtual cluster with KVM

Before getting our hands dirty

This is my first post written in English, and it is written for my “underboss” and for general Linux users who have a little experience with virtual machines. This time, we are going to build a virtual cluster that resembles old-school supercomputers, which means our configuration will look somewhat outdated and ugly compared to the latest fancy clusters, but it is much closer to the mainstream supercomputers found in the real world.

Architecture

To let you have a better understanding of what we are going to build, I will introduce the architecture first.

  • Physical Server (Host)
    • Public IP: 10.20.x.x (Assigned by Campus)
    • Private IP: 192.168.122.1 (Static)
    • KVM
      • Login Node
        • Public IP: 10.20.y.y (Assigned by Campus)
        • Private IP: 192.168.122.10 (Static)
      • Computing Node 1-4
        • Private IP: 192.168.122.11-14 (Static)
  • My Laptop
    • Public IP: 10.20.z.z (Assigned by Campus)

The static IPs are the ones we assign ourselves.

The list below describes my hardware and software configuration.

  • Server
    • CPU: Intel Xeon E5-2697 v4
    • OS: Ubuntu 18.04 LTS
  • Virtual Machines
    • OS: Ubuntu 20.04 LTS
    • Job Scheduler: IBM LSF
    • Network Storage: NFS

Set up KVM hypervisor on Host

You have probably heard of VMware, Hyper-V, or VirtualBox. KVM is similar to them and is also capable of creating highly isolated virtual environments.

Make sure Virtualization support is enabled

Some manufacturers disable CPU virtualization support for security reasons, so please check your BIOS settings and enable it. Besides, the KVM kernel module is enabled on Ubuntu by default. If you are unsure whether everything is correct, this article will guide you through some extra checks.
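
If you want a quick sanity check from the shell, the following commands are enough on Ubuntu (kvm-ok comes from the cpu-checker package; exact output wording may vary):

$ grep -Ec '(vmx|svm)' /proc/cpuinfo   # non-zero means VT-x / AMD-V is exposed to the OS
$ sudo apt install cpu-checker
$ sudo kvm-ok                          # should report that KVM acceleration can be used
$ lsmod | grep kvm                     # kvm_intel or kvm_amd should be loaded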

Install necessary software

The following simple command will install everything we need, including the KVM hypervisor. Note that apt is the package manager for Ubuntu, just like yum for CentOS.

$ sudo apt install virt-manager
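
Depending on your setup, you may also want to add your user to the libvirt group so that virt-manager and virsh can talk to the daemon without sudo, then verify the daemon is reachable. A quick check, assuming Ubuntu's default group name:

$ sudo adduser $USER libvirt                 # log out and back in for the group change to take effect
$ virsh -c qemu:///system list --all         # an empty list (and no error) means libvirt is working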

Launch GUI VM Manager

Actually, it is quite hard to create virtual machines from the command line interface (CLI), since you have to specify a bunch of arguments manually. Luckily, there is a powerful graphical tool for managing virtual machines, and there is a simple way to display a graphical application running on the remote server on your own computer.

For Linux / Mac User

By appending the -X flag to the ssh command, X11 forwarding will be enabled. In brief, the graphical interface's data will be forwarded to and rendered on your computer.

local$ ssh -X 10.20.x.x # Your Server's IP
remote$ virt-manager

Then you will notice that virt-manager shows up on your computer.

For Windows User

Unfortunately, Windows doesn’t have a built-in X11 library, so additional software called VcXsrv is required. After installing it, open it, follow the path Multiple windows -> Start a program -> Start a program on remote computer, and fill in your information.

Note that Remote program can be either xterm (if you have it) or virt-manager. xterm is a terminal emulator, from which you can launch virt-manager directly.

Create login node

Create the virtual machine with virt-manager, just as you would with VMware or similar tools. But there are several points you should pay attention to.

CPU and Memory

I recommend checking the boxes Copy host CPU configuration, which helps the guest OS identify the CPU instruction set correctly, and Manually set CPU topology. My configuration is one socket, four cores, and one thread per core for each node. As for memory, under most circumstances, two gigabytes per core should be enough.
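
For reference, these two options end up as a <cpu> element in the domain XML, which you can inspect with virsh edit. A rough sketch (the exact mode and attributes depend on your virt-manager version):

<cpu mode='host-model' check='partial'>
  <topology sockets='1' cores='4' threads='1'/>
</cpu>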

Disk

By default, the Storage format is qcow2. Despite the full virtual size you specify, a qcow2 image only consumes disk space as the virtual machine actually writes new data, so it is a fine choice; formats such as vmdk behave similarly and are mainly useful for compatibility with VMware tools.

The best Disk bus is VirtIO, which is the default option for Ubuntu virtual machines since Ubuntu has native VirtIO support, and it provides the best performance among the alternatives.
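
If you want to see the thin provisioning for yourself, you can create and inspect a qcow2 image manually. A sketch only; the path and the 40G size are placeholders, and virt-manager normally does this step for you:

$ sudo qemu-img create -f qcow2 /var/lib/libvirt/images/node-login.qcow2 40G
$ sudo qemu-img info /var/lib/libvirt/images/node-login.qcow2   # compare "virtual size" with "disk size"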

Network

Every node will have a network interface card (NIC) that only accesses our private network, so Network source and Device model should be NAT and VirtIO respectively. Again, VirtIO is the best option, for the same reason as above.
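
To confirm the private subnet, you can inspect libvirt's default NAT network on the host; the 192.168.122.0/24 range used throughout this post comes from this definition:

$ sudo virsh net-list --all        # the "default" network should be listed and active
$ sudo virsh net-dumpxml default   # shows the 192.168.122.0/24 subnet and its DHCP range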

Display VNC

If you cannot type anything into the virtual machine’s console, try changing the display server type from Spice to VNC.

Install Guest OS

During installation, some settings should be modified.

Necessary software

SSH must be installed because a lot of distributed programs rely on it. Other software is optional.

Change Software Source (Optional)

You can change the software source during installation if you have trouble fetching packages in China. https://mirrors.tuna.tsinghua.edu.cn is recommended. Additionally, you can download the OS image from this site.

Network

The IP should be assigned manually and belong to the same private subnet. For example, by default the DHCP server will automatically assign your node an IP like 192.168.122.233. That IP does belong to your private subnet, but it is picked at random. Instead, we set 192.168.122.10 as the login node’s address.

Clone nodes

After installing the OS, this first virtual machine becomes our login node, which also takes on the storage node’s responsibility and provides shared storage. The other nodes will be made by cloning it, and virt-clone is a good utility for that. Here is the usage.

$ virt-clone -o existing_node_name -n new_node_name --auto-clone

For now, we just need to clone one node; in other words, we should have the login node and node 1 after this step. Since some settings will be shared by all computing nodes, we will create the remaining nodes after configuring node 1.
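
For example, assuming the original VM was named node-login in virt-manager (shut the VM down before cloning):

$ virt-clone -o node-login -n node1 --auto-clone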

Make login node accessible

So far, only a private IP is assigned to the login node, which means our own computer cannot connect to it without the server’s help. The solution is straightforward: add a new NIC with a public IP. Click the Add Hardware button and select Network. This time, Network source should be one of the physical NICs, specifically the one the server uses to connect to the campus network. Source mode should be Bridge, which allows the campus network infrastructure to recognize your virtual machines directly. As for VEPA, it is similar to Bridge, but it requires an advanced physical switch that supports the “hairpin” mode and theoretically improves network performance.

Note that one known flaw is that the server cannot access the virtual machine via its public IP address, and vice versa. This is caused by the macvtap driver and can be regarded as a trade-off between functionality and performance. The server can still access the virtual machine via its private IP, though. If you cannot accept this flaw, consider the old approach: setting up network bridges (virtual switches) and tun/tap devices (virtual NICs) manually.
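
For reference, the bridged NIC appears in the domain XML as a "direct" (macvtap) interface. A rough sketch, where eno1 stands for whatever physical NIC your server actually uses:

<interface type='direct'>
  <source dev='eno1' mode='bridge'/>
  <model type='virtio'/>
</interface>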

Configure shared network storage

On many supercomputers in the real world, there are a few special nodes called storage nodes. They are responsible for storing all the user data, such as the /home directory, and sharing it within the cluster. Since user programs will be executed by many nodes simultaneously, every node should be able to access the program data. That’s why shared network storage is deployed on the cluster.

Although many production clusters use far more advanced and complex storage systems, even distributed ones, NFS already satisfies our needs. Moreover, it is simple.

On Login Node

First, create the /opt directory, which will store shared software such as Intel Parallel Studio.

$ sudo mkdir -p /opt   # -p: no error if the directory already exists

Then modify the file /etc/exports (sudo required) and add the following lines.

/home 192.168.122.0/24(rw,sync,no_root_squash,no_subtree_check)
/opt 192.168.122.0/24(rw,sync,no_root_squash,no_subtree_check)

Note that 192.168.122.0/24 should be replaced with your subnet address if your private IP doesn’t begin with 192.168.122.

Then restart the NFS service.

$ sudo service nfs-kernel-server restart
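
To double-check that the exports are active, you can list them on the login node (showmount is part of the nfs-common package, which the NFS server pulls in):

$ sudo exportfs -v              # lists the active exports and their options
$ showmount -e 192.168.122.10   # what clients will see; run from any node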

On Computing Node
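
The fstab entries below mount NFS shares, which requires the client utilities; on Ubuntu they come from the nfs-common package, which may not be present on a minimal install:

$ sudo apt install nfs-common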

Add the following lines to /etc/fstab (sudo required), then reboot.

192.168.122.10:/home /home nfs auto,nofail,noatime,nolock,intr,tcp,actimeo=1800 0 0
192.168.122.10:/opt /opt nfs auto,nofail,noatime,nolock,intr,tcp,actimeo=1800 0 0

Note that this change is permanent. Although a mistake here is unlikely to destroy the whole system, the commands below let you test whether you can mount the NFS shares before rebooting.

$ mkdir /tmp/home
$ mkdir /tmp/opt
$ sudo mount 192.168.122.10:/home /tmp/home
$ sudo mount 192.168.122.10:/opt /tmp/opt
$ sudo touch /tmp/home/test
$ sudo touch /tmp/opt/test

If you can see the test files in the /home and /opt directories on the login node, everything works correctly.

Clone computing nodes

Refer to the Clone nodes section and clone the remaining nodes.

Change Hostname and IP

Now all the nodes share the same hostname, which looks weird. hostnamectl is the tool to change it. I used the names node-login, node1, node2, … for my cluster.

$ sudo hostnamectl set-hostname your_new_name

Furthermore, all the nodes currently use the same IP address, which will break communication between them. So open the directory /etc/netplan/, find the YAML file inside, and open it; you should see something similar to the following.

# This is the network config written by 'subiquity'
network:
  ethernets:
    ens3:
      dhcp4: true
  version: 2

And replace it with the following code.

# This is the network config written by 'subiquity'
network:
  ethernets:
    ens3:
      dhcp4: false
      addresses: [192.168.122.11/24] # Your new static IP address
      gateway4: 192.168.122.1 # Your server's private IP
      nameservers:
        addresses: [172.18.1.92, 172.18.1.93] # DNS Servers
  version: 2

Then apply the new configuration.

$ sudo netplan apply

Enter the command ip a to check whether the new IP address has been applied.
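
A quick connectivity check from node 1, using the addresses from the plan above:

$ ping -c 3 192.168.122.1    # the host (our gateway)
$ ping -c 3 192.168.122.10   # the login node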

Write list of machines

/etc/hosts

~/machinefile
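
Based on the hostnames and addresses used above, the two files would look roughly like this (a sketch; adjust the names and IPs to your own plan and to whatever launcher consumes the machinefile):

# /etc/hosts (identical on every node)
192.168.122.10 node-login
192.168.122.11 node1
192.168.122.12 node2
192.168.122.13 node3
192.168.122.14 node4

# ~/machinefile (one hostname per line, consumed by MPI launchers)
node1
node2
node3
node4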

(to be continued)

Set up password-less login

Set up password-less sudo

(to be continued)

Install extra utilities (Optional)

Parallel-SSH

Environment Modules

IBM Spectrum LSF

(to be continued)