This document describes the "mechanics" of running an OpenMP program on one of Virginia Tech's SGI systems. It assumes that you already have an OpenMP program written, and that you have an account on the SGI systems.
This document will simply walk you through a typical series of steps that start with an OpenMP program on your "home" machine, transfer it to an SGI system, compile it, run it, and bring the results back home.
At the end of this document are instructions on how to actually carry out these steps, using sample files available on the web.
For this simple introduction, we'll make a number of assumptions.
(One file, simple name): We'll assume you have a program, already written, which uses OpenMP, that the program consists of a single file, written in Fortran90, and that this file is called prog.f90.
(No input/Only standard output): We'll also assume for now that the program needs no input, and that the output of the program is entirely directed to the standard output device. In other words, the executing program does not read from or write to any auxiliary files.
(Source code on home machine): We'll assume this source code file is sitting in the source_code subdirectory on your home machine home_mac.
Our goal then is to transfer the file to one of the SGI machines, compile it, run it, and retrieve the output.
There are currently three SGI systems available at Virginia Tech through the Advanced Research Computing Facility:
| Name | Address | Total Processors | User Limit |
|---|---|---|---|
| inferno | inferno.arc.vt.edu | 20 CPUs | 2 CPUs |
| inferno2 | inferno2.arc.vt.edu | 128 CPUs | 10 CPUs |
| cauldron | cauldron.arc.vt.edu | 64 CPUs | 6 CPUs |
To compile your program, you will need to transfer the source code of your OpenMP program to one of these systems. This can be done with the secure FTP program sftp. Here is a typical session, which suggests how you might transfer the file to inferno2. We are assuming here that you have already set up a subdirectory on inferno2 called work_directory.
home_mac: sftp inferno2.arc.vt.edu
inferno2: Password for user: xxxxx
inferno2: cd work_directory
inferno2: lcd source_code
inferno2: put prog.f90
inferno2: ls
inferno2: prog.f90
inferno2: quit
home_mac:
Note that the commands cd, pwd and ls are carried out on the remote machine (inferno2 in this case) while the corresponding commands lcd, lpwd and lls will be carried out on the local machine (home_mac in this example). The put command moves files from the local to the remote machine, while the get command brings files from the remote machine to the local one. If multiple files are to be transferred, the mget and mput commands can be used instead.
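The same transfer can be scripted rather than typed. The following is a minimal sketch, assuming that the sftp on your home machine is the OpenSSH version (which accepts a batch file through the -b option) and that you have set up non-interactive authentication such as SSH keys, since batch mode cannot prompt for a password. The file name upload.batch is an arbitrary choice made for this illustration.

```shell
# Write the sftp commands from the interactive session into a batch file.
# The name upload.batch is hypothetical; any file name will do.
cat > upload.batch <<'EOF'
cd work_directory
lcd source_code
put prog.f90
ls
quit
EOF

# To run the batch (assumes OpenSSH sftp and key-based login), uncomment:
# sftp -b upload.batch inferno2.arc.vt.edu
```

Replacing the put line with a get line would give the corresponding batch file for bringing results back.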
Once the source code file has been transferred to the SGI system, you can log in and compile the program.
To log in interactively, we use the Secure Shell program, ssh.
home_mac: ssh inferno2.arc.vt.edu
inferno2: Password for user: xxxxx
inferno2: cd work_directory
inferno2: ifort -fpp -openmp prog.f90
inferno2: mv a.out prog
The Fortran compiler being used is ifort, the Intel Fortran compiler. The compiler assumes the program is written in Fortran90 based on its file extension of .f90. The switches -fpp and -openmp are necessary in order that the OpenMP directives be processed correctly.
If the compilation fails, you will need to revise your program. You can either edit the program on your home machine and transfer it again, or make the changes directly on the copy on the SGI system.
In our example, we assume the compilation was successful. We allowed the compiler to assign the default name of a.out to the executable program it created, and then we renamed it to prog. We're now ready to run the program, so we stay logged in.
If the program was written in Fortran77, then the file extension should be simply .f and the compile command would be:
ifort -fpp -openmp prog.f
If the program was written in C, then the Intel C compiler would be used. This compiler is named icc. A C program has a file extension of .c. When compiling an OpenMP program, the -openmp switch is needed:
icc -openmp prog.c
If the program was written in C++, then the Intel C++ compiler would be used. This compiler is named icpc. A C++ program has a file extension of .cc, .cpp, .cxx or .C. When compiling an OpenMP program, the -openmp switch is needed:
icpc -openmp prog.C
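The choice of compiler by file extension can be summarized in a small shell function. This is only a sketch: the function name compile_cmd is made up for this illustration, and it prints the compile command rather than running it, so you can check the result before using it.

```shell
# Print the compile command this document uses for a given source file.
# compile_cmd is a hypothetical helper name, not part of any installed tool.
compile_cmd() {
    src=$1
    case "$src" in
        *.f90|*.f)            echo "ifort -fpp -openmp $src" ;;
        *.c)                  echo "icc -openmp $src" ;;
        *.cc|*.cpp|*.cxx|*.C) echo "icpc -openmp $src" ;;
        *)                    echo "unrecognized extension: $src" >&2
                              return 1 ;;
    esac
}

# Usage: show the command for the sample Fortran90 file.
compile_cmd prog.f90
```

Shell pattern matching is case sensitive, so prog.c is treated as C and prog.C as C++, matching the extensions listed above.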
The first time you run your program, you probably should run it sequentially and interactively, just to make sure that everything is OK.
Unless you say otherwise, your program will run sequentially. So for a first test, we will not request parallel processing.
The compiler names your executable program a.out by default. You can rename it to something more memorable, such as prog, and we'll assume you've done that. To run the executable program, you just type its name. Because the current directory is usually not on the shell's search path, you normally need to include the prefix ./ on the name:
./prog
You can halt the execution of your program by typing Control-C. This is useful if your program seems to have entered an infinite loop, or if your program naturally runs for a long time, and you just wanted to start it and run it briefly, to make sure it was OK.
Generally, you need to set an environment variable called OMP_NUM_THREADS in order to specify that your OpenMP program should be run on multiple processors.
On inferno2, the maximum number of processors you can request is 10. To request a number of processors, you issue a command like
export OMP_NUM_THREADS=10
if you are using the Bourne, Korn, or Bash shells; the corresponding command for the C and T shells is
setenv OMP_NUM_THREADS 10
Once you've requested processors, you can run your OpenMP program. In the areas of your program that you have marked as being parallel, your program will have access to up to OMP_NUM_THREADS processors.
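Because each system has its own per-user limit, it can be convenient to cap the request automatically. The sketch below assumes a Bourne-family shell (Bash or Korn); the function names user_limit and set_threads are invented for this illustration, and the limits are simply copied from the table of systems earlier in this document.

```shell
# Per-user CPU limits, copied from the table of SGI systems.
# user_limit and set_threads are hypothetical helper names.
user_limit() {
    case "$1" in
        inferno)  echo 2 ;;
        inferno2) echo 10 ;;
        cauldron) echo 6 ;;
        *)        return 1 ;;
    esac
}

# Export OMP_NUM_THREADS, capped at the limit for the named system.
set_threads() {
    host=$1
    want=$2
    limit=$(user_limit "$host") || {
        echo "unknown system: $host" >&2
        return 1
    }
    if [ "$want" -gt "$limit" ]; then
        want=$limit
    fi
    export OMP_NUM_THREADS=$want
}

# Usage: ask for 16 threads on inferno2; the request is capped at 10.
set_threads inferno2 16
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```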
To be clear about this, here's how you might run a short program on 2, 4, and 8 processors, using the time utility to return the number of seconds required for execution.
export OMP_NUM_THREADS=2
time ./prog
export OMP_NUM_THREADS=4
time ./prog
export OMP_NUM_THREADS=8
time ./prog
If your program has been properly compiled, and is using OpenMP well, you can hope that the wall clock time (the "real" time reported by the time utility) drops by roughly a factor of 2 each time the number of processors is doubled.
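The three timed runs can also be written as a loop. This is a sketch under stated assumptions: the function name time_scaling is made up, it needs a Bourne-family shell, and it measures elapsed time with date +%s, which is widely available but only has one-second resolution, so it is useful only for programs that run for a while.

```shell
# Run a program once per thread count and report elapsed wall clock seconds.
# time_scaling is a hypothetical helper; date +%s gives whole seconds only.
time_scaling() {
    prog=$1
    shift
    for n in "$@"; do
        export OMP_NUM_THREADS=$n
        start=$(date +%s)
        "$prog" > /dev/null        # discard program output during timing
        end=$(date +%s)
        echo "threads=$n seconds=$((end - start))"
    done
}

# Usage (assumes the executable ./prog exists in the current directory):
# time_scaling ./prog 2 4 8
```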
Now we'll assume that your program takes a fair amount of time to execute, even when it's running in parallel. You surely don't want to try to run the program interactively. That would mean you need to stay logged in to the SGI system, and you can't even issue a new command until your program finishes!
The first step toward running the program unattended is to end the command line with an ampersand &. This says that the command you have just entered should be run "in the background", allowing you to enter new commands. So your command might look like this:
export OMP_NUM_THREADS=2
./prog &
Although you can now issue new commands, the one thing you can't do is log out, because then your background job will automatically be stopped and discarded. To be able to log out while the job keeps running, you need to prefix your command with the nohup command.
export OMP_NUM_THREADS=2
nohup ./prog &
Now if your program normally prints information to the screen, you're not going to be logged in and connected to the job to see it. When the output is not redirected, nohup appends it to a file called nohup.out. If you have output and want to save it under a name of your own choosing, you can send the output of the program to a file yourself. An example of such a command is:
export OMP_NUM_THREADS=2
nohup ./prog > prog_output.txt &
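It is often useful to record the process ID of the background job, so that you can check on it later, and to capture error messages in the same file as the regular output. The following sketch assumes a Bourne-family shell; the function name launch_detached is invented for this illustration.

```shell
# Start a program under nohup in the background, send both standard output
# and standard error to a file, and print the process ID of the job.
# launch_detached is a hypothetical helper name.
launch_detached() {
    prog=$1
    out=$2
    nohup "$prog" > "$out" 2>&1 &
    echo $!
}

# Usage (assumes the executable ./prog exists in the current directory):
# export OMP_NUM_THREADS=2
# pid=$(launch_detached ./prog prog_output.txt)
# kill -0 "$pid" 2>/dev/null && echo "job $pid is still running"
```

The command kill -0 sends no signal; it only reports whether the process still exists, which is a simple way to tell whether the job has finished.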
Once the program has finished execution, we may want to retrieve the output file, and possibly other files that the program created.
Doing this is essentially the "inverse" of the process we went through in copying the source code program to inferno2. So one way to do this would begin by connecting to inferno2 using the sftp program:
home_mac: sftp inferno2.arc.vt.edu
inferno2: Password for user: xxxxx
inferno2: cd work_directory
inferno2: lcd source_code
inferno2: get prog_output.txt
inferno2: lls
inferno2: prog.f90 prog_output.txt
inferno2: quit
home_mac:
Sample files are available, so that you can try out the procedures for file transfer, compilation, job submission, and output file recovery.