Hi all. In today's article we will talk about Open Grid Scheduler/Grid Engine, walk through its installation, and show the steps to set up a Grid Engine cluster on Ubuntu 15.04. Open Grid Scheduler/Grid Engine is a commercially supported batch-queuing system that manages and schedules the allocation of distributed resources such as processors, memory, disk space, and software licenses. It aims for good utilization of resources and prevents a single user from monopolizing the system while other users are waiting for their jobs to run.
Grid Engine is typically used on a high-performance computing cluster, where it is responsible for accepting, scheduling, dispatching, and managing the remote and distributed execution of large numbers of standalone, parallel or interactive user jobs.
Grid Engine Master Server Setup
First we will prepare the Ubuntu server that will act as the master for the Grid Engine installation. Log in to the server as root and refresh the package lists.
# apt-get update
1) Create New User
We will create a new user that will be used for the Grid Engine setup and the NFS shares. Run the command below and fill in the requested information.
# adduser gsadmin --uid 500
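Because the home directory of this user is later shared over NFS, and NFS with default settings typically matches file ownership by numeric UID, it can help to confirm the UID right after creating the account. A minimal sketch follows; the helper names uid_of and has_uid are hypothetical and not part of any package.

```shell
# Hypothetical helper: print the numeric UID of an account (fails if it does not exist).
uid_of() {
    id -u "$1" 2>/dev/null
}

# Hypothetical helper: succeed only if the account exists with the expected UID.
has_uid() {
    [ "$(uid_of "$1")" = "$2" ]
}

# Example, to be run on every node after adduser:
#   has_uid gsadmin 500 && echo "gsadmin has UID 500"
```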
2) Download Grid Engine Package
You can find the latest and older releases of the Grid Engine scheduler packages on the SourceForge page.
Alternatively, download the package directly on your Ubuntu server with the wget utility, using the complete package path as shown.
# wget http://downloads.sourceforge.net/project/gridscheduler/GE2011.11p1/GE2011.11p1.tar.gz
3) Extract Package
We will extract the Open Grid package, move it into the home directory of the newly created user, and give that user ownership of it. To perform these steps let's use the following commands.
# tar -zxvf GE2011.11p1.tar.gz
# mv GE2011.11p1 /home/gsadmin/
# chown -R gsadmin:gsadmin /home/gsadmin/
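It can be worth checking what the archive actually contains before relying on it, for example whether it ships prebuilt binaries under a bin/ directory or only source code that still has to be compiled. A small sketch, assuming a gzip-compressed tar archive; the helper name archive_has is hypothetical.

```shell
# Hypothetical helper: succeed if the archive lists a path containing the given text.
archive_has() {
    # -t lists the contents without extracting, -z handles gzip, -f names the archive
    tar -tzf "$1" 2>/dev/null | grep -q -- "$2"
}

# Example:
#   archive_has GE2011.11p1.tar.gz '/bin/linux-x64/' || echo "no prebuilt linux-x64 binaries found"
```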
Setup NFS Server on Master Host
Now install the NFS Server package and configure it with the home directory of the new user "gsadmin" for sharing.
Let's run the command below.
# apt-get install nfs-kernel-server
Before the installation starts you will be asked to confirm with Yes/No. Once you press "Y" to continue, a number of steps will be performed as shown below.
Creating config file /etc/idmapd.conf with new version
Creating config file /etc/default/nfs-common with new version
Adding system user `statd' (UID 109) ...
Adding new user `statd' (UID 109) with group `nogroup' ...
Not creating home directory `/var/lib/nfs'.
invoke-rc.d: gssd.service doesn't exist but the upstart job does. Nothing to start or stop until a systemd or init job is present.
invoke-rc.d: idmapd.service doesn't exist but the upstart job does. Nothing to start or stop until a systemd or init job is present.
nfs-utils.service is a disabled or a static unit, not starting it.
Setting up nfs-kernel-server (1:1.2.8-9ubuntu8.1) ...
Creating config file /etc/exports with new version
Creating config file /etc/default/nfs-kernel-server with new version
Processing triggers for libc-bin (2.21-0ubuntu4) ...
Processing triggers for ureadahead (0.100.0-19) ...
Processing triggers for systemd (219-7ubuntu3) ...
At this point the NFS kernel server has been successfully installed. Now we will use the few commands shown below to add the share to the default exports file and then restart the NFS service.
# echo "/home/gsadmin *(rw,sync,no_subtree_check)" >> /etc/exports
# exportfs -a
# service nfs-kernel-server restart
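Note that >> appends a new line every time the command runs, so repeating this step leaves duplicate entries in /etc/exports. One way to avoid that is to append only when the exact line is missing. The sketch below uses a hypothetical helper, append_once, and takes the file as a parameter so it can be tried on a scratch file first.

```shell
# Hypothetical helper: append a line to a file only if that exact line is not already there.
append_once() {
    line=$1
    file=$2
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (as root on the master):
#   append_once "/home/gsadmin *(rw,sync,no_subtree_check)" /etc/exports
#   exportfs -ra    # re-export everything listed in /etc/exports
```

The wildcard * exports the directory to any host that can reach the server; restricting it to the cluster's own addresses is usually the safer choice.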
Setup OpenJDK-8 On Master Node
Let's set up OpenJDK on the master node alongside Open Grid Engine so that Java-based jobs can run on the cluster.
Simply run the command below to install Java.
# apt-get install openjdk-8-jdk
Press "Y" to continue the Java installation process as shown below.
0 to upgrade, 179 to newly install, 0 to remove and 40 not to upgrade.
Need to get 108 MB of archives.
After this operation, 478 MB of additional disk space will be used.
Do you want to continue? [Y/n] Y
Java Home Setup
Now set JAVA_HOME and SGE_ROOT, and add the matching directories to the PATH, both for the current shell and in ~/.bashrc for later logins.
root@ubuntu-15:~# echo "export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/" >> ~/.bashrc
root@ubuntu-15:~# export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
root@ubuntu-15:~# echo 'export PATH=$PATH:/usr/lib/jvm/java-8-openjdk-amd64/bin' >> ~/.bashrc
root@ubuntu-15:~# export PATH=$PATH:/usr/lib/jvm/java-8-openjdk-amd64/bin
root@ubuntu-15:~# echo "export SGE_ROOT=/home/gsadmin/GE2011.11p1" >> ~/.bashrc
root@ubuntu-15:~# export SGE_ROOT=/home/gsadmin/GE2011.11p1
root@ubuntu-15:~# export PATH=$PATH:$SGE_ROOT/bin/linux-x64:$SGE_ROOT
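When a line is written to ~/.bashrc with echo, the quoting decides whether $PATH is expanded immediately or left for the shell that later reads the file. The short sketch below shows the difference; it writes to a scratch file (the variable name rc_file is only for illustration) rather than to the real ~/.bashrc.

```shell
# Write to a scratch file so the real ~/.bashrc is left untouched.
rc_file=$(mktemp)

# Double quotes: $PATH is expanded now, so the current value is frozen into the file.
echo "export PATH=$PATH:/usr/lib/jvm/java-8-openjdk-amd64/bin" >> "$rc_file"

# Single quotes: the literal text $PATH is written and expanded each time the file is sourced.
echo 'export PATH=$PATH:/usr/lib/jvm/java-8-openjdk-amd64/bin' >> "$rc_file"

# Show both lines for comparison.
cat "$rc_file"
```

The single-quoted form is usually what is wanted in a startup file, since it keeps working if PATH changes before the file is read.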
The PATH variables must also be set on the nodes, so before starting the installation of the Open Grid Scheduler packages we will first configure the client nodes.
Setup Open Grid Engine Client Node
As on the master node, we will create a new user on the client node for the NFS share so that it can access the Grid Engine files and other shared data.
1) Adding New User
To create the new user on the client node, run adduser with the same user name and UID that were used on the master (gsadmin, UID 500). Give it a password and fill out the other information if required, otherwise leave the defaults.
2) Installing NFS Package
Now install the NFS client package, nfs-common, on the client node using apt-get.
3) Mount Share
After installing the nfs-common package, we will mount the server node's share on the client node and make the mount permanent using the commands below.
root@ubuntu-node:~# mount 17.5.10.71:/home/gsadmin /home/gsadmin
root@ubuntu-node:~# echo "17.5.10.71:/home/gsadmin /home/gsadmin nfs defaults 0 0" >> /etc/fstab
root@ubuntu-node:~# echo "export SGE_ROOT=/home/gsadmin/GE2011.11p1" >> ~/.bashrc
root@ubuntu-node:~# export SGE_ROOT=/home/gsadmin/GE2011.11p1
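To confirm that the share is really mounted, the kernel's mount table can be inspected. The sketch below assumes the usual /proc/mounts layout (device, mount point, filesystem type, options); the helper name is_nfs_mounted is hypothetical, and its optional second argument only exists so the check can be tried against a sample file.

```shell
# Hypothetical helper: succeed if the given directory is listed as an NFS mount.
is_nfs_mounted() {
    dir=$1
    table=${2:-/proc/mounts}
    # Column 2 is the mount point, column 3 the filesystem type (nfs, nfs4, ...)
    awk -v d="$dir" '$2 == d && $3 ~ /^nfs/ { found = 1 } END { exit (found ? 0 : 1) }' "$table"
}

# Example on the client node:
#   is_nfs_mounted /home/gsadmin && echo "share is mounted"
#   mount -a    # mounts everything in /etc/fstab that is not mounted yet,
#               # a quick way to test the new fstab entry without rebooting
```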
Setup SSH Keys
Now we have to allow direct SSH access from the server to the client and vice versa. To accomplish this we will configure both nodes for SSH access without entering a password. Run the command below on the client node and on the server node to generate an RSA key pair on each.
root@client-node:~# ssh-keygen
root@server-node:~# ssh-keygen
To copy the SSH key from the client node to the master node, run the command below.
root@ubuntu-node:~# ssh-copy-id master_server_IP
We have successfully added the key from the client machine to the server, so we can now access the server node from the client without entering the root password. Use the command below to connect to the other server using the SSH key.
root@ubuntu-node:~# ssh 17.5.10.71
The same steps, run from the server node against the client, allow password-less connections in the other direction as well.
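Since the key exchange has to work in both directions, a quick non-interactive test on each node can confirm it. The sketch below relies on the standard ssh options BatchMode and ConnectTimeout; the helper name and the SSH_BIN variable are hypothetical, and SSH_BIN is only there so the helper can be exercised without a real cluster.

```shell
# Command used to connect; overridable for a dry run.
SSH_BIN=${SSH_BIN:-ssh}

# Hypothetical helper: succeed if key-based login to the host works without any prompt.
can_login_without_password() {
    # BatchMode=yes makes ssh fail instead of asking for a password
    "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 "$1" true >/dev/null 2>&1
}

# Example, run on the client against the master and on the master against each client:
#   can_login_without_password 17.5.10.71 || echo "key login to 17.5.10.71 failed"
```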
Open Grid Scheduler/Grid Engine Installation
All the basic parameters have been set up, so we will now go through the installation of Open Grid Scheduler/Grid Engine on the master server node.
There are two options: use the graphical installer if you are running an X Window System desktop, or follow the command-line installation process.
For the GUI installation, change to the location of the installation script, give it execute permission on the server, and run the GUI installer.
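root@ubuntu-15:~# cd /home/gsadmin/GE2011.11p1/source/clients/gui-installer/templates/
root@ubuntu-15:/home/gsadmin/GE2011.11p1/source/clients/gui-installer/templates# chmod +x start_gui_installer.sh
root@ubuntu-15:/home/gsadmin/GE2011.11p1/source/clients/gui-installer/templates# ./start_gui_installer.sh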
You can also install the Grid Engine master package from the Ubuntu repositories using apt-get as shown below.
root@ubuntu-15:~# apt-get install gridengine-master
The installation may fail if any of the prerequisites are unmet, so make sure of the following points.
- Hostname is properly configured.
- Postfix service is up and running.
- bsd-mailx package is installed.
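The three points above can be checked from the shell before starting the installation. The sketch below is one possible way to do it, using getent, service and dpkg as found on Ubuntu; the helper names check and check_prerequisites are hypothetical.

```shell
# Hypothetical helper: run a command quietly and report whether it succeeded.
check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "ok: $label"
    else
        echo "missing: $label"
        return 1
    fi
}

# Hypothetical wrapper that runs the three checks listed above.
check_prerequisites() {
    failed=0
    check "hostname resolves" getent hosts "$(hostname)" || failed=1
    check "postfix is running" service postfix status || failed=1
    # dpkg -s also succeeds for a removed package that left config files behind
    check "bsd-mailx is installed" dpkg -s bsd-mailx || failed=1
    return "$failed"
}

# Example:
#   check_prerequisites || echo "fix the items marked missing before installing"
```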
During the installation process you will be asked to configure some more parameters. Choose the appropriate general type of mail configuration for Postfix from the available options.
Then choose the system mail name as instructed on the window.
In the next steps you will be prompted to configure the gridengine common settings. If you want SGE to apply the default configuration automatically, select "Yes" and press OK to move on to the next settings.
At the end of the installation process, the parameters of the cluster initialization setup are displayed.
Verify Installation
To verify that everything went fine during the installation, let's run the commands below on the master server to check its services.
root@ubuntu-15:~# netstat -anput | grep master
root@ubuntu-15:~# ps aux | grep "sge"
On the client node, check for the exec service.
root@ubuntu-node:~# netstat -anp | grep exec
tcp 0 0 0.0.0.0:6445 0.0.0.0:* LISTEN 1353/sge_execd
root@ubuntu-node:~# netstat -anput | grep exec
tcp 0 0 0.0.0.0:6445 0.0.0.0:* LISTEN 1353/sge_execd
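The netstat output can also be reduced to just the listening port, which is handy when checking several nodes. The sketch below assumes the column layout shown above (state in column 6, PID/program in column 7); the helper name listening_port_of is hypothetical.

```shell
# Hypothetical helper: read netstat-style lines on stdin and print the local port
# on which the named daemon is listening.
listening_port_of() {
    awk -v name="$1" '
        $6 == "LISTEN" && $7 ~ ("/" name "$") {
            n = split($4, parts, ":")   # local address looks like 0.0.0.0:6445
            print parts[n]
        }'
}

# Example on a client node:
#   netstat -anpt | listening_port_of sge_execd
```

Ports 6444 and 6445 are the ones conventionally used by sge_qmaster and sge_execd, so a different value here may point to a customized setup.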
Conclusion
Here we learned the installation and configuration steps to set up an Open Grid Scheduler/Grid Engine cluster on Ubuntu 14.04/15.04. The Sun Grid Engine queuing system is useful when you have a batch of tasks to run and want to distribute them over a cluster of nodes. You can now use it to schedule your tasks, balance load, and monitor all submitted jobs and the cluster nodes they are running on.
hello
What is the difference between SGE clustering and OGS clustering, and which one is best for Ubuntu 14.04?
There seems to be an omission in the tutorial commands. You need to create the .ssh keys so that the head node can reach the workers, but also that the workers can reach the head.
This can be done by copying them over with your sudo user using the ssh-copy-id command, then copying the contents of your user's (gsadmin) ~/.ssh directory to /root/.ssh/
You go through the process of downloading the tar package, then you install from an apt repo. Which makes most of the previous steps with user addition useless. This does not install on ubuntu from the tar -- there are no binaries. Otherwise, great screenshots of code!