Before getting our hands dirty
This is my first post written in English, and it is written for my “underboss” and for general Linux users who have a little experience with virtual machines. This time, we are going to build a virtual cluster that resembles old-school supercomputers, which means our configuration will look somewhat outdated and ugly compared to the latest fancy clusters, but it is much closer to the mainstream supercomputers used in the real world.
Architecture
To give you a better understanding of what we are going to build, I will introduce the architecture first.
- Physical Server (Host)
  - Public IP: 10.20.x.x (Assigned by Campus)
  - Private IP: 192.168.122.1 (Static)
  - KVM
    - Login Node
      - Public IP: 10.20.y.y (Assigned by Campus)
      - Private IP: 192.168.122.10 (Static)
    - Computing Node 1-4
      - Private IP: 192.168.122.11-14 (Static)
- My Laptop
  - Public IP: 10.20.z.z (Assigned by Campus)
The static IPs are assigned by ourselves.
The list below describes my configuration of software and hardware.
- Server
  - CPU: Intel Xeon E5-2697 v4
  - OS: Ubuntu 18.04 LTS
- Virtual Machines
  - OS: Ubuntu 20.04 LTS
  - Job Scheduler: IBM LSF
  - Network Storage: NFS
Set up KVM hypervisor on Host
You have probably heard of VMware, Hyper-V, or VirtualBox. KVM is similar to them and is also capable of creating highly isolated virtual environments.
Make sure Virtualization support is enabled
Some manufacturers disable CPU virtualization support by default for security reasons. Please check your BIOS settings and enable CPU virtualization support. Besides, the KVM kernel module is activated on Ubuntu by default. If you are unsure whether everything is correct, this article will guide you through some extra checks.
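If you want a quick check from the shell, the commands below confirm that the CPU exposes virtualization extensions and that the KVM modules are loaded; kvm-ok comes from the cpu-checker package, which may need to be installed first.

$ egrep -c '(vmx|svm)' /proc/cpuinfo    # non-zero means VT-x / AMD-V is exposed
$ sudo apt install cpu-checker
$ kvm-ok                                # should report that KVM acceleration can be used
$ lsmod | grep kvm                      # kvm_intel or kvm_amd should be loaded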
Install necessary software
The following simple command will install everything we need, including the KVM hypervisor. Note that apt is the package manager for Ubuntu, just like yum for CentOS.
$ sudo apt install virt-manager
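After installation, a quick sanity check doesn't hurt. The commands below assume the Ubuntu packaging, where the libvirt daemon is called libvirtd and unprivileged management goes through the libvirt group.

$ systemctl status libvirtd        # the libvirt daemon should be active (running)
$ sudo adduser $USER libvirt       # optional: manage VMs without sudo (log out and back in afterwards)
$ virsh list --all                 # lists defined VMs; an empty list is fine at this point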
Launch GUI VM Manager
Creating virtual machines from the command-line interface (CLI) is quite tedious because you have to specify a bunch of arguments manually. Luckily, there is a powerful graphical tool to manage virtual machines, and there is also a simple way to display a graphical application running on the remote server on your own computer.
For Linux / Mac Users
By appending the -X flag to the ssh command, X11 forwarding will be enabled. In brief, the graphical interface's data will be forwarded to and rendered on your computer.
local$ ssh -X 10.20.x.x   # Your Server's IP
Then run virt-manager in that session, and its window will show up on your computer.
For Windows Users
Unfortunately, Windows doesn't have a built-in X11 library, so an additional piece of software called VcXsrv is required. After installing it, open it, follow the path Multiple windows -> Start a program -> Start a program on remote computer, and fill in your information.
Note that Remote program can be either xterm (if you have it) or virt-manager. xterm is a terminal emulator, and you can use it to launch virt-manager directly.
Create login node
Create the virtual machine with virt-manager, just like you would with VMware or any other hypervisor, but there are several points you should pay attention to.
CPU and Memory
I recommend checking the box Copy host CPU configuration, which helps the guest OS identify the CPU instruction set correctly, as well as Manually set CPU topology. My configuration for each node is one socket, four cores, and one thread per core. As for memory, two gigabytes per core should be enough under most circumstances.
Disk
By default, Storage format is qcow2, which is fine but may preallocate a lot of space. In contrast, vmdk is a format that only consumes disk space when the virtual machine actually stores new data.
The best Disk bus is VirtIO, which is the default option for Ubuntu virtual machines since Ubuntu has native support for VirtIO. It provides the best performance compared to the alternatives.
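You can compare an image's virtual size with the space it actually occupies using qemu-img; the path below is only a guess at the default libvirt storage pool location, so adjust it to your own image.

$ qemu-img info /var/lib/libvirt/images/node-login.qcow2   # shows "virtual size" vs "disk size"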
Network
Every node will own a network interface card (NIC) that only accesses our private network, so Network source and Device model should be NAT and VirtIO respectively. Again, VirtIO is the best option for the same reason.
Display VNC
If you cannot type anything into the virtual machine, try changing the display type from Spice to VNC.
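For reference, the same choices can also be expressed on the command line with virt-install. The sketch below is just an illustration: the VM name, memory size, disk size, and ISO path are placeholders, default is libvirt's standard NAT network, and you can switch format=qcow2 to vmdk if you prefer the behaviour described above.

$ virt-install \
    --name node-login \
    --memory 8192 \
    --vcpus 4,sockets=1,cores=4,threads=1 \
    --cpu host \
    --disk size=40,format=qcow2,bus=virtio \
    --network network=default,model=virtio \
    --graphics vnc \
    --cdrom /path/to/ubuntu-20.04-live-server-amd64.iso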
Install Guest OS
During installation, some settings should be modified.
Necessary software
An SSH server must be installed because a lot of distributed programs rely on it. Other software is optional.
Change Software Source (Optional)
You can change the software source during installation if you have trouble fetching packages in China. https://mirrors.tuna.tsinghua.edu.cn is recommended. Additionally, you can download the OS image from this site.
Network
IPs should be assigned manually and belong to the same private subnet. By default, the DHCP server will automatically give your node a new IP such as 192.168.122.233. That address does belong to your private subnet, but it is a random one, so we set 192.168.122.10 as the login node's IP address instead.
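If you are not sure which subnet and DHCP range the NAT network uses, you can inspect libvirt's default network on the host (default is the name of the standard NAT network that virt-manager uses).

$ virsh net-dumpxml default    # run on the host; look at <ip address=...> and the <dhcp> range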
Clone nodes
After installing the OS, it is time to create the nodes by cloning. In our cluster, the login node also takes on the storage node's responsibility and provides shared storage. virt-clone is a handy utility for cloning virtual machines. Here is the usage.
$ virt-clone -o existing_node_name -n new_node_name --auto-clone
For now, we just need to clone one node; in other words, we should end up with the login node and node 1. Since some settings will be shared by all computing nodes, we will create the remaining nodes after configuring node 1.
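For example, assuming the machine you just installed is named node-login in virt-manager (the names here are mine), cloning node 1 from it looks like this; the source VM has to be shut off before cloning.

$ virsh shutdown node-login    # make sure the source VM is powered off first
$ virt-clone -o node-login -n node1 --auto-clone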
Make login node accessible
So far, only one private IP has been assigned to the login node, which means our own computer cannot connect to it without the server's help. The solution is straightforward: add a new NIC with a public IP. Click the Add Hardware button and select Network. This time, Network source should be one of the physical NICs, namely the one the server uses to connect to the campus network. Source mode had better be Bridge, which allows the campus network infrastructure to recognize your virtual machines directly. As for VEPA, it is similar to Bridge, but it requires an advanced physical switch that supports "hairpin" mode and theoretically improves network performance.
Note that one known flaw is that the server cannot access the virtual machine via its public IP address, and vice versa. This is caused by the macvtap driver and can be regarded as a trade-off between functionality and performance. The server can still access the virtual machine via its private IP. If you cannot accept this flaw, you should consider the old approach: setting up network bridges (virtual switches) and tuntap devices (virtual NICs) manually.
Configure shared network storage
On many real-world supercomputers, there are a few special nodes called storage nodes. They are responsible for storing all the user data, such as the /home directory, and sharing it within the cluster. Since user programs will be executed by plenty of nodes simultaneously, all nodes should be capable of accessing the program data. That's why shared network storage is deployed on the cluster.
Although many production clusters use a much more advanced and complex storage system, or even a distributed file system, NFS already satisfies our needs. Moreover, it is simple.
On Login Node
First, create the /opt directory, which can store shared software such as Intel Parallel Studio.
$ sudo mkdir -p /opt   # -p in case /opt already exists
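If the NFS server is not installed on the login node yet, install it first; nfs-kernel-server is the Ubuntu package that provides the service we restart below.

$ sudo apt install nfs-kernel-server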
Then modify the file /etc/exports (sudo required) and add the following lines.
/home 192.168.122.0/24(rw,sync,no_root_squash,no_subtree_check)
/opt  192.168.122.0/24(rw,sync,no_root_squash,no_subtree_check)
Note that 192.168.122.0/24 should be replaced with your subnet address if your private IPs don't begin with 192.168.122. Then restart the NFS service.
$ sudo service nfs-kernel-server restart
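You can verify that the directories are actually being exported with exportfs.

$ sudo exportfs -v    # should list /home (and /opt) with the options from /etc/exports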
On Computing Node
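The NFS client utilities must be present on every computing node before it can mount the share; on Ubuntu they come from the nfs-common package.

$ sudo apt install nfs-common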
Add the following lines to /etc/fstab (sudo required), then reboot.
192.168.122.10:/home /home nfs auto,nofail,noatime,nolock,intr,tcp,actimeo=1800 0 0
192.168.122.10:/opt  /opt  nfs auto,nofail,noatime,nolock,intr,tcp,actimeo=1800 0 0
Note that this change is permanent. Although a mistake here is unlikely to destroy the whole system, the commands to test whether you can mount the NFS share are still listed below.
$ mkdir /tmp/home
$ sudo mount -t nfs 192.168.122.10:/home /tmp/home   # temporarily mount the share to verify it works
If you can observe test files in the /opt and /home directories, everything should work correctly.
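If you have not created any test files yet, a simple check is to write a marker file on the login node and look for it on a computing node; this assumes the same user exists on both nodes, and the file name is arbitrary.

login$ touch /home/$USER/nfs-test
node1$ ls /home/$USER/nfs-test    # the file should be visible here as well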
Clone computing nodes
Refer to the Clone nodes section and clone the remaining nodes.
Change Hostname and IP
Now all the nodes have the same hostname, which looks weird. hostnamectl is the tool to change the hostname. I chose the names node-login, node1, node2, and so on for my cluster.
$ sudo hostnamectl set-hostname your_new_name
Furthermore, all the nodes currently use the same IP address, which will disturb communication. So open the /etc/netplan/ directory, find the YAML file inside, and open it. It should start with a comment like the one below.
# This is the network config written by 'subiquity'
Replace its contents with a configuration that assigns this node its own static IP address.
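Here is a minimal sketch of what such a static netplan file can look like. It assumes the guest's NIC shows up as enp1s0 (check the real name with ip a) and uses node 1's address 192.168.122.11 as an example; the gateway and nameserver point at the host's NAT address 192.168.122.1. Substitute each node's own private IP.

# This is the network config written by 'subiquity'
network:
  version: 2
  ethernets:
    enp1s0:                        # assumed interface name; use the one `ip a` reports
      dhcp4: no
      addresses:
        - 192.168.122.11/24        # this node's own private IP
      gateway4: 192.168.122.1      # the host's NAT gateway
      nameservers:
        addresses: [192.168.122.1]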
Then apply the new configuration.
$ sudo netplan apply
Enter the command ip a to check whether the new IP address has been applied.
Write list of machines
/etc/hosts
~/machinefile
(to be continued)
Set up password-less login
Set up password-less sudo
(to be continued)
Install extra utilities (Optional)
Parallel-SSH
Environment Modules
IBM Spectrum LSF
(to be continued)