Friday, January 19, 2007

Notes on using PBS systems on clusters and supercomputers

These short notes may help you (and me, after I forget it :) ) make use of large supercomputers with a PBS system installed. I'll explain how to submit, monitor and delete jobs using PBS. At the moment I'm using the computer located at ACK CYFRONET AGH in Kraków, where Torque PBS is installed, and a cluster with OpenPBS located at the Interdisciplinary Centre for Mathematical and Computational Modelling of the University of Warsaw. Drawing on my (short) experience with PBS, I'll show you how to put jobs in a queue, monitor them and delete them; how to change the resources requested for a job; and a sample of a typical PBS script will be provided here. It is assumed that you already know the basics of Unix/Linux and have an account on a computer with a PBS system installed.


The Portable Batch System is a queueing system that allows optimal management of computer resources (memory, CPU, etc.) on supercomputers and clusters used by many people for large calculations. Most big computers consist of several nodes, each of which has several processor units. Normally, you run a command from a workstation or terminal just by typing its name and wait until it finishes. That command will use up to 100% of the CPU time of one processor unit. If you need to run two commands at the same time, you open two terminal windows and run both; sharing one processor, each gets about 50% CPU. But when we want to make use of parallel calculations across several nodes and CPUs, special software is needed, and this is exactly what PBS provides.

It works a bit like a cron daemon or the Task Scheduler in Windows. You specify a program to be run and the resources to be used (i.e. the number of nodes and processors, the amount of memory, etc.) and put it into a queue with the following command:

qsub script.sh

where script.sh is a shell script containing the path to the program and the resources to be used. I'll post an example of such a script below.
You may also specify resources on the command line. For details, see:

man qsub
man pbs_resources
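For instance, resources can be requested directly with qsub's -l option instead of (or in addition to) directives inside the script. A sketch, assuming the standard Torque/OpenPBS resource names (the limits actually allowed are site-specific):

```shell
# Request 1 node with 4 processors and 24 hours of wall-clock time
# on the command line when submitting script.sh.
qsub -l nodes=1:ppn=4 -l walltime=24:00:00 script.sh
```

This only works on a machine where a PBS server is running, of course.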

Sometimes you have to submit to a particular queue (some sites set up a separate queue for each node):

qsub -q queue_name script.sh

For details, consult the documentation provided by your system administrator. After submitting a job you'll get output like:

98982.supercomp

Here 98982 is the job identifier and "supercomp" is the server's hostname. You can run the following command to see the status of this job:

qstat -f 98982
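If you submit jobs from a script, it is handy to capture the job identifier from qsub's output so that later qstat or qdel calls can reuse it. A minimal sketch, using the sample output shown above:

```shell
# Sample qsub output from above; on a real system you would use:
#   jobid_full=$(qsub script.sh)
jobid_full="98982.supercomp"
# Strip everything from the first dot onward to keep only the numeric ID.
jobid=${jobid_full%%.*}
echo "$jobid"   # -> 98982
```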

To see the status of all jobs just type

qstat -a

Or you may want to see only your own jobs:

qstat -u your_loginname

In order to delete the job with id=98982 you may type:

qdel 98982

To change the resources given in script.sh, you may use the command qalter. I will not discuss all the possible options of qsub, qalter, qdel and qstat, because you can find them by typing man command_name. I'll give you the minimum skills required and recommend reading the manuals and literature for further details. Let us look inside script.sh:

#!/bin/sh
#PBS -N WaterStone
#PBS -l nodes=1:ppn=8
#PBS -l walltime=72:00:00
cd $PBS_O_WORKDIR/
~/siesta/Src/siesta < input.fdf > screen.out

This is a usual Unix shell script, where special commands recognized by the PBS system are placed in comments beginning with #PBS. The #PBS -N directive specifies the name of the job (WaterStone), which will appear in the statistics (qstat). The #PBS -l nodes directive requests the number of nodes (1) and the number of processors per node (8). You should consult the documentation provided by your system administrator to learn how to set these parameters. For example, on the computers I'm using at present I always have to set nodes=1, because I have access to only one node, and the maximum number of processors is restricted to 8 for my account. Note also that on most servers jobs which request fewer resources (less memory and fewer processors) have priority, so sometimes it's useful to set ppn=2 or ppn=1, because such a job will start earlier than one with ppn=8. Of course, this depends on the policy of the system administrators.

The walltime directive specifies the amount of time requested for the job. After this period the job will be automatically terminated, so if you need more time you should change this parameter later with the qalter command.
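A sketch of such a qalter call, using the sample job ID from above (whether you are allowed to extend the walltime of an already-running job depends on site policy):

```shell
# Ask for 96 hours of walltime for job 98982 instead of the original 72.
qalter -l walltime=96:00:00 98982
```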

The cd $PBS_O_WORKDIR line makes the directory from which the job was submitted (where script.sh is located) the working directory, where all the program output will be stored, along with the files WaterStone.o98982 and WaterStone.e98982 (job name plus the job ID). The former will hold all "screen" output, and possible errors will be stored in the latter. I strongly recommend that you neither use spaces in the name of the working directory nor create the working directory inside a folder whose name or path contains spaces! This may cause problems, for example the job failing to start.

The last line runs the program, reading input from input.fdf and storing output in screen.out. Note that in this case nothing will be written to the job's standard-output file (WaterStone.o98982), because the command puts no output on the "screen". That file is created only after the job finishes, whereas screen.out is written continuously while the job runs, so I recommend redirecting all output to a file as I did, so that you can monitor it the whole time.
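To monitor the redirected output while the job runs, you can use tail. The snippet below fakes a screen.out file so it is self-contained; on the cluster, the running job itself produces this file:

```shell
# Create a stand-in screen.out so the example runs anywhere;
# a real job would be writing to this file continuously.
printf 'SCF iteration 1\nSCF iteration 2\n' > screen.out
# Show the last line; use "tail -f screen.out" to follow the file live.
tail -n 1 screen.out
```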

Of course, there are many other options which can be defined in the script. For example, you may specify the amount of memory for the program:

#PBS -l mem=1gb

Or ask the PBS system to e-mail you about the status of the job:

#PBS -M noddeat@leavejournal.com
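On the Torque/OpenPBS systems described here, -M only sets the recipient address; a separate -m directive chooses when mail is actually sent. A typical combination (the address is the sample one from above):

```shell
#PBS -M noddeat@leavejournal.com
#PBS -m abe
# a = mail when the job is aborted, b = when it begins, e = when it ends
```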

Not all of these options may work; this depends on the settings of the particular system.

NOTE: This article is a draft. I'll add more information here as soon as I have time. You may ask questions or share your own experience with PBS here in the comments.

Sorry for my bad English; you may correct this article and post a corrected version in the comments. I also ask that you post comments in English only, because I may share a link to this article within the English-speaking scientific community.

 
