SGE
Sun Grid Engine
Job Control System


SGE is a job control system that accepts jobs from multiple users and assigns them for execution on one or more of the computers it manages. In particular, if a user wishes to run a program in parallel with MPI, SGE can find the necessary machines and manage the parallel execution of the job.

A job that is to be controlled by SGE must be "noninteractive". If the program would normally read from the keyboard, then a file of input must be prepared beforehand, and SGE must be told to use that file for input. If the program would normally display results to the terminal, then SGE must be told to save such results in an output file.
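The idea of a noninteractive run can be sketched in plain shell, outside of SGE. In this minimal sketch, the file names input.txt and output.txt are hypothetical, and "sort -n" merely stands in for the user's program. Under SGE itself, the corresponding requests are normally made with the -i (input file) and -o (output file) options of qsub.

```shell
# Prepare, ahead of time, the input that would otherwise be typed at the keyboard.
printf '3\n1\n2\n' > input.txt

# Run the program noninteractively: standard input comes from input.txt,
# and everything that would appear on the terminal goes to output.txt.
# "sort -n" is only a stand-in for the user's own program.
sort -n < input.txt > output.txt 2>&1

# Inspect the saved results after the run.
cat output.txt
```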

A job to be controlled by SGE is described by a shell script. The script can be written for the C shell, the BASH shell, or another shell. For instance, a C shell script might have the name hello.csh. The commands to be executed appear in the body of the script, in the usual way. Job control parameters are passed to SGE in special comment lines; by default, SGE reads the lines that begin with the characters "#$".
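As a rough illustration, a BASH counterpart of such a script might look like the sketch below. The file name hello.sh, the job name, and the output file name are hypothetical choices, not taken from the HELLO example. The lines beginning with "#$" are ordinary comments to the shell, but SGE reads them as qsub options; the options shown (-S, -N, -cwd, -o, -j) are standard ones, though a particular site may require others.

```shell
#!/bin/bash
# hello.sh: a hypothetical BASH job script for SGE.
#
#$ -S /bin/bash
#$ -N hello
#$ -cwd
#$ -o hello.out
#$ -j y
#
# -S   shell used to interpret the script
# -N   name of the job
# -cwd run the job in the directory from which it was submitted
# -o   file that receives standard output
# -j y merge standard error into standard output

# The body of the script: the commands the job actually runs.
message="Hello"
echo "$message"
```

Because the "#$" lines are comments, the same script can also be run directly from the command line as a quick check before it is submitted.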

To be executed, the script file must be submitted to SGE with the qsub command. For instance, the hello.csh script would be submitted by the command:


        qsub hello.csh
      
The system should respond with a message like:

        Your job 270 ("hello.csh") has been submitted
      
The number 270 is an identification number that can be used later to identify your job, in case you need to cancel it, for instance.
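When jobs are submitted from another script, it can be convenient to capture the job number automatically. The following sketch extracts it from a response of the form shown above; here the response text is supplied as a literal string so that the example runs without SGE, and the variable names are hypothetical.

```shell
# On a system with SGE, the response would come from the command itself:
#   response=$(qsub hello.csh)
# Here the sample message is used directly so the sketch runs anywhere.
response='Your job 270 ("hello.csh") has been submitted'

# The job number is the third blank-separated word of the message.
job_id=$(printf '%s\n' "$response" | awk '{print $3}')
echo "$job_id"

# The number can then be passed to other SGE commands, for example
# to cancel the job:
#   qdel "$job_id"
```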

Typically, the job will spend some time waiting in a queue before it begins execution. You can check the status of your job using the qstat command:


        qstat
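
The listing printed by qstat includes a state code for each job, such as "qw" for a job waiting in the queue or "r" for a running job. The sketch below picks out the state of one job from such a listing. The sample listing is made up for illustration and is not real output, and the column layout (job number in column 1, state in column 5) is an assumption that should be checked against the qstat output on the actual system.

```shell
# Illustrative qstat-style listing, made up for this sketch (not real output).
# On a system with SGE it would instead be obtained with:  sample=$(qstat)
sample='job-ID  prior   name       user   state submit/start at     queue        slots
--------------------------------------------------------------------------------
    270 0.55500 hello.csh  alice  qw    03/07/2007 10:15:00              1
    271 0.55500 other.csh  alice  r     03/07/2007 10:16:00 all.q@node1  1'

job_id=270

# Skip the two header lines, find the row whose first column is the job
# number, and print the fifth column, assumed here to hold the state code.
state=$(printf '%s\n' "$sample" | awk -v id="$job_id" 'NR > 2 && $1 == id {print $5}')
echo "Job $job_id state: $state"
```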
      

Related Data and Programs:

C_SHELL shows some examples of the use of the C shell to write scripts.

CONDOR is a job control system which is used at SCS in a secondary role, after SGE.

MPI is a Message Passing Interface that makes it possible to write programs that run in parallel on many computers. Information about MPI, for users of a specific programming language, is available in a C version, a C++ version, a FORTRAN77 version, or a FORTRAN90 version.

Reference:

  1. http://www.rocksclusters.org/roll-documentation/sge/4.2.1/,
    The ROCKS documentation;
  2. http://gridengine.sunsource.net/documentation.html,
    The SUN SGE site;
  3. https://www.scs.fsu.edu/twiki/bin/view/Computing/ScsSGE,
    The SCS SGE page.

Examples and Tests:

HELLO is a simple example that shows how to run a job through the system. The job does nothing more than print a "Hello" message.



Last revised on 07 March 2007.