OpenMP Glossary


Introduction

This document is a brief outline of some of the features of OpenMP.


OpenMP Environment Variables:

OpenMP allows the user to define certain environment variables. If these variables are set by the user, their values will be read when the program is run, changing the default behavior. The program does not need to be recompiled.

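The four original OpenMP environment variables are OMP_DYNAMIC, OMP_NESTED, OMP_NUM_THREADS and OMP_SCHEDULE; later versions of the standard added more.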

Assuming you are using a Unix system, then the way you set one of these variables depends on what shell you are in. To determine what shell you are in, type


        echo $SHELL
      

The answer to this command will be the name of your shell, possibly preceded by a prefix. The "interesting" part of the answer, that is, the last component of the path, might be sh (Bourne), bash (Bourne Again), ksh (Korn), csh (C) or tcsh (T).

If you are using the Bourne, Bourne Again, or Korn shell, then you set a variable using the following syntax. (Here we are demonstrating how to set the maximum number of threads to 4):


        export OMP_NUM_THREADS=4
      

If you are using the C or T shell, then you set a variable using the following syntax:


        setenv OMP_NUM_THREADS 4
      

In either case, you can check what you did by typing


        echo $OMP_NUM_THREADS
      
and, if you did everything right, the response should be 4.
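A program can also look at the variable directly. The following is only a sketch: the helper name requested_threads is hypothetical, and it relies on nothing from OpenMP, just the standard std::getenv, to report the value the user placed in OMP_NUM_THREADS.

```cpp
#include <cstdlib>

// Hypothetical helper: return the thread count requested through the
// OMP_NUM_THREADS environment variable, or 0 if the variable is unset
// or does not begin with a positive integer.
int requested_threads ( )
{
  const char *value = std::getenv ( "OMP_NUM_THREADS" );
  if ( value == nullptr )
  {
    return 0;
  }
  int n = std::atoi ( value );
  return ( n > 0 ) ? n : 0;
}
```

After export OMP_NUM_THREADS=4 (or the setenv equivalent), a call to requested_threads ( ) would be expected to return 4.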


OpenMP Functions:

dynamic_truefalse = omp_get_dynamic ( )
t_max = omp_get_max_threads ( )
nested_truefalse = omp_get_nested ( )
p_num = omp_get_num_procs ( )
t_num = omp_get_num_threads ( )
t_id = omp_get_thread_num ( )
wtime = omp_get_wtime ( )
in_parallel_truefalse = omp_in_parallel ( )
 omp_set_dynamic ( dynamic_truefalse )
 omp_set_nested ( nested_truefalse )
 omp_set_num_threads ( t_num )

The most useful functions: call omp_get_num_procs to get the number of processors. Use this value as your input to omp_set_num_threads, which declares how many threads of execution you want to use. In any parallel section of the code, you can call omp_get_thread_num to get the ID of the thread, allowing you to assign work to particular threads.
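As a rough illustration of that recipe, the sketch below asks for one thread per processor and then lets each thread claim every t_num-th work item. The function name owner_of_each_item is hypothetical. The tests on _OPENMP, a symbol that compilers define when OpenMP is enabled, are there only so that the sketch also builds, and runs with a single "thread" 0, when OpenMP is switched off.

```cpp
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Record which thread handled each of n work items.
// Without OpenMP, every item is handled by "thread" 0.
std::vector<int> owner_of_each_item ( int n )
{
  std::vector<int> owner ( n, -1 );
#ifdef _OPENMP
  // Request one thread per available processor.
  omp_set_num_threads ( omp_get_num_procs ( ) );
#endif
# pragma omp parallel
  {
    int t_id = 0;
    int t_num = 1;
#ifdef _OPENMP
    t_id = omp_get_thread_num ( );
    t_num = omp_get_num_threads ( );
#endif
    // Thread t_id takes items t_id, t_id + t_num, t_id + 2 * t_num, ...
    for ( int i = t_id; i < n; i = i + t_num )
    {
      owner[i] = t_id;
    }
  }
  return owner;
}
```

The number of threads is read back inside the parallel region with omp_get_num_threads rather than assumed, since the system is not obliged to supply as many threads as were requested.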

The function omp_get_num_threads can tell you how many threads of execution there are in the current team. Inside a parallel region, that is the value you chose by setting the environment variable OMP_NUM_THREADS, or by calling omp_set_num_threads, or else the number you got by default, which depends on the implementation. Outside a parallel region, the function returns 1.

The function omp_get_wtime returns the elapsed wall clock time in seconds, as a value of type double or double precision. It is normally called in pairs, once before and once after a block of significant parallel work. The difference in the two readings gives the amount of wall clock time required to carry out the work.
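The paired calls might look like the following sketch. The function time_sine_sum and the work it does are made up for illustration; the only OpenMP routine used is omp_get_wtime. The tests on _OPENMP let the code build without OpenMP, in which case no timing is available and -1.0 is returned.

```cpp
#include <cmath>
#ifdef _OPENMP
#include <omp.h>
#endif

// Sum sin(0.001*i) for i = 0..n-1 in parallel and store it in *total.
// Returns the elapsed wall clock time in seconds, or -1.0 if the
// program was compiled without OpenMP.
double time_sine_sum ( int n, double *total )
{
  double wtime = -1.0;
  double sum = 0.0;
#ifdef _OPENMP
  wtime = omp_get_wtime ( );          // first reading
#endif
# pragma omp parallel for reduction ( + : sum )
  for ( int i = 0; i < n; i++ )
  {
    sum = sum + std::sin ( 0.001 * i );
  }
#ifdef _OPENMP
  wtime = omp_get_wtime ( ) - wtime;  // second reading minus first
#endif
  *total = sum;
  return wtime;
}
```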

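The functions omp_set_dynamic and omp_get_dynamic allow you to set or query the dynamic adjustment of threads. When this option is enabled, the runtime system is allowed to use fewer threads than you requested for a parallel region. This is not the same thing as dynamic scheduling of loop iterations, which is requested with a schedule clause on the loop directive. In dynamic scheduling, the iterations of the loop are not assigned in advance. Instead, each thread gets an initial "chunk" of iterations to perform. A thread that completes its chunk is then dynamically assigned the next available chunk, and so on.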
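Dynamic scheduling of loop iterations, with chunks handed out on demand, is requested on the loop directive itself through the schedule clause. The sketch below is illustrative only: the function divisor_counts is hypothetical, and the chunk size of 8 is an arbitrary choice. The work per iteration grows with k, which is the kind of uneven loop where dynamic scheduling tends to help.

```cpp
#include <vector>

// For each k in 0..n-1, count the divisors of k + 1.
// Later iterations cost more, so chunks of 8 iterations are handed
// out to threads on demand instead of being assigned in advance.
std::vector<int> divisor_counts ( int n )
{
  std::vector<int> count ( n, 0 );
# pragma omp parallel for schedule ( dynamic, 8 )
  for ( int k = 0; k < n; k++ )
  {
    int c = 0;
    for ( int d = 1; d <= k + 1; d++ )
    {
      if ( ( k + 1 ) % d == 0 )
      {
        c = c + 1;
      }
    }
    count[k] = c;
  }
  return count;
}
```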

The functions omp_set_nested and omp_get_nested allow you to request nested parallelism. Nested parallelism is disabled by default, and is not guaranteed to be available on all implementations of OpenMP. If available, it allows a single thread, upon encountering a parallel region nested inside the current parallel region, to create more threads.

The function omp_in_parallel returns TRUE if it is called from within a parallel region. Why is this useful? It can actually be hard to tell whether a line of code is executing "in parallel" in cases where the line of code is in a function that may have been called from either a parallel or sequential part of the code.
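A small sketch of that situation follows. Both function names are hypothetical. The first can be called from anywhere and reports what omp_in_parallel says; the second calls it from inside a parallel region and counts how many threads got the answer "true". Without OpenMP the answer is always "false".

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Returns true only when called from inside an active parallel region.
bool called_in_parallel ( )
{
#ifdef _OPENMP
  return omp_in_parallel ( ) != 0;
#else
  return false;
#endif
}

// Count the threads of a parallel region for which
// called_in_parallel ( ) is true. The count is 0 when the region
// is not active, for instance when it runs on a single thread.
int count_parallel_callers ( )
{
  int count = 0;
# pragma omp parallel reduction ( + : count )
  {
    if ( called_in_parallel ( ) )
    {
      count = count + 1;
    }
  }
  return count;
}
```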

The function omp_get_max_threads returns the maximum number of threads available for executing parallel regions encountered after this line. This is a somewhat technical routine, which is intended for cases where nested parallelism is used.


OpenMP Directives
Formatting Rules:

An OpenMP directive is a line of text which begins with a special marker, and which is placed immediately before the section of the program which it controls. In the most common case, this section of code is a loop. A simple directive may also be required to denote the end of the program section to be controlled.

For C and C++ programs, the special marker is actually a preprocessor directive. Every OpenMP directive begins with the string # pragma omp. Thus, a directive that indicates the beginning of a for loop that is to be handled in parallel is:

# pragma omp parallel for (more information possible)

Some OpenMP directive lines can become quite long, and it may be convenient to break a long line into shorter pieces. Let's imagine our long line looks like this:

# pragma omp parallel for clause1 clause2 clause3 clause4 clause5
Using the backslash character as the last character on the first line, we can "escape" the newline character, so that, essentially, we can type two lines but have the preprocessor and compiler treat them as one line:
# pragma omp parallel for clause1 clause2 clause3 \
clause4 clause5
In C and C++, this is the only way to continue a directive. Simply repeating the marker on a second line does not work, because each line that begins with # pragma omp is treated as a separate directive.
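As a concrete instance of the backslash form, here is a sketch in which real clauses take the place of clause1, clause2 and so on. The function scale is hypothetical; the directive is spread over three source lines but is seen by the compiler as one.

```cpp
// Set y[i] = a * x[i] for i = 0..n-1.
// The single OpenMP directive is split across three source lines.
void scale ( int n, double a, const double x[], double y[] )
{
  int i;
# pragma omp parallel for \
    private ( i ) \
    shared ( n, a, x, y )
  for ( i = 0; i < n; i++ )
  {
    y[i] = a * x[i];
  }
}
```

Nothing, not even a blank, may follow the backslash on a continued line.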

For FORTRAN77 programs, the special marker is c$omp while FORTRAN90 programs use !$omp. A directive that indicates the beginning of a parallel do loop in FORTRAN90 is:

!$omp parallel do (more information possible)
FORTRAN programs may also place a terminating directive immediately following such a loop. For a parallel do loop this directive is optional, but including it makes the extent of the parallel loop clear. In this case it would be
!$omp end parallel do

In FORTRAN77, the text of a directive can only appear in columns 7 to 72. This makes it important to be able to deal with long directive lines. If we suppose our long directive line looks like this:

c$omp parallel do clause1 clause2 clause3 clause4 clause5
we can use the FORTRAN77 continuation rule to rewrite this as:
c$omp parallel do clause1 clause2 clause3
c$omp&clause4 clause5
Here the ampersand in column 6 indicates that this line is to be treated as a continuation of the previous one. Any character other than a blank or a zero in column 6 has the same effect. Simply "stacking" a second c$omp line with a blank in column 6 does not work, because such a line begins a new directive.

In FORTRAN90 free-form source, a line may be up to 132 characters long, so the limit is less pressing. Long lines can be continued by ending them with an ampersand. If our long directive looks like this:

!$omp parallel do clause1 clause2 clause3 clause4 clause5
we can use the FORTRAN90 continuation rule to rewrite this as:
!$omp parallel do clause1 clause2 clause3 &
!$omp clause4 clause5
Note that the continuation line must itself begin with the !$omp marker. Without the trailing ampersand on the first line, the second line would be read as a separate directive.


OpenMP Directives
the Shared and Private clauses:

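The most used and most useful OpenMP clauses are shared and private. These clauses allow you to tell the compiler which of the variables in a section of code need the special treatment given to "private" variables, for which every thread gets its own copy, and which are "shared", with a single copy visible to all the threads.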

If possible, you should simply look at every variable inside the parallel section, and declare it in a shared, private or reduction clause. The reduction clause is described later; it is used for special variables that handle summation, maximums and so on.

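If you do not declare a variable's status, the default rule makes an undeclared variable shared, except that the index of the parallel loop itself, and any variable declared inside the parallel section, is private.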

The private clause has the form


        private ( variable1, variable2, ..., variable#n   )
      
Each variable in the list will be treated as a private variable in the loop.

The shared clause has the form


        shared ( variable1, variable2, ..., variable#n   )
      
Each variable in the list will be treated as a shared variable in the loop.

Every variable that appears within the parallel code section may appear in a private or shared clause (but not both!).

Certain variables may appear in a special reduction clause. Such variables must not also appear in a private or shared clause.

Names of functions or subroutines which are called inside a parallel section are not declared in a private or shared clause; however, the dummy arguments of such functions and subroutines may be declared.
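The sketch below classifies every variable in a loop, as recommended above. The function polynomial is hypothetical. The scratch variable t is written by every iteration, so each thread needs its own copy; the arrays and the loop bound are only read or written at distinct entries, so one shared copy suffices. The default ( none ) clause is an addition not discussed in this document; it asks the compiler to reject any variable that has been left unclassified.

```cpp
// Set y[i] = x[i]^4 + x[i]^2 for i = 0..n-1.
void polynomial ( int n, const double x[], double y[] )
{
  int i;
  double t;   // scratch value; must be private to each thread

# pragma omp parallel for default ( none ) private ( i, t ) shared ( n, x, y )
  for ( i = 0; i < n; i++ )
  {
    t = x[i] * x[i];
    y[i] = t * t + t;
  }
}
```

If t were listed as shared instead, two threads could overwrite each other's value of t between the two statements of the loop body, and the results would be unpredictable.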


OpenMP Directives
the Reduction clause:

In a section of code that will be executed in parallel by OpenMP, there may be a variable whose usage indicates that it is not really private or shared. For example, if the purpose of a loop is to sum the entries of a vector into a variable called total, then total can't be private because we need one variable to store all the values. However we can't allow total to be shared, because then its value will be "written" by multiple processors.

Variables for which we only want to have one copy, but which need to be modifiable by all the processors are called reduction variables. It is possible to carry out a reduction operation in a safe and orderly fashion, as long as OpenMP knows the name of the variable, and the type of operation being carried out.

An OpenMP reduction clause has the form

reduction ( op : variable )
If several variables in a loop are associated with the same reduction operator, they can be included in the clause, their names separated by commas. If several reduction operators are used in the loop, then each operator and its associated variable must be declared in a separate clause.

A variable which is declared in a reduction clause must not also appear in a private or shared clause for the same loop.

In C and C++, the list of operator symbols and their meanings includes:

        Symbol  Meaning
        +       summation
        -       summation with negation
        *       product
        &       bitwise AND
        |       bitwise OR
        ^       bitwise exclusive OR
        &&      logical AND
        ||      logical OR

C++ example:

        total = 0.0;
      # pragma omp parallel for private ( i, p ) shared ( n, x ) reduction ( +: total )
        for ( i = 0; i < n; i++ )
        {
          p = ( ( x[i] - 7 ) * x[i] + 4 ) * x[i] - 83;
          total = total + p;
        }
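When a loop involves two different reduction operators, each one goes in its own clause. The following sketch, with a hypothetical function sum_and_check, sums the entries of a vector and, in the same loop, uses a logical AND reduction to record whether every entry was positive.

```cpp
// Return the sum of x[0..n-1]; set *all_positive to 1 if every entry
// is greater than zero, and to 0 otherwise.
double sum_and_check ( int n, const double x[], int *all_positive )
{
  double total = 0.0;
  int positive = 1;

# pragma omp parallel for reduction ( + : total ) reduction ( && : positive )
  for ( int i = 0; i < n; i++ )
  {
    total = total + x[i];
    positive = positive && ( x[i] > 0.0 );
  }

  *all_positive = positive;
  return total;
}
```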
      

In FORTRAN, the list of operator symbols and their meanings includes:

        Symbol  Meaning
        +       summation
        -       summation with negation
        *       product
        .and.   logical AND
        .or.    logical OR
        .eqv.   logical equality
        .neqv.  logical nonequality
        max     maximum
        min     minimum
        iand    bitwise AND
        ior     bitwise OR
        ieor    bitwise exclusive OR

FORTRAN90 example:

        p_max = - huge ( p_max )
      !$omp parallel do private ( i, p ) shared ( n, x ) reduction ( max: p_max )
        do i = 1, n
          p = x(i)**3 - 7 * x(i)**2 + 4 * x(i) - 83
          p_max = max ( p_max, p )
        end do
      !$omp end parallel do
      

OpenMP Include File:

In C or C++ programs, you should add the statement


        # include <omp.h>
      

In any FORTRAN77 routine that references one of the OpenMP functions, you can either declare the type of the function yourself, or have it done by invoking the OpenMP include file:


      include 'omp_lib.h'
      

In any FORTRAN90 routine that references one of the OpenMP functions, you can either declare the type of the function yourself, or have it done by invoking the OpenMP include file:


        include 'omp_lib.h'
      
or you can invoke the OpenMP module:

        use omp_lib
      


OpenMP Compilation:

If you are using a GNU compiler (gcc, g++ or gfortran), then to activate the OpenMP directives you simply include the -fopenmp switch.

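If you are using an Intel compiler, then to activate the OpenMP directives you include the -openmp switch (spelled -qopenmp in more recent versions). The -parallel switch, which requests automatic parallelization, is a separate feature and is not needed for OpenMP. FORTRAN programs need the -fpp switch only if they rely on the FORTRAN preprocessor.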


Conditional Compilation:

By design, the OpenMP directives look like comments. Therefore, these directives will be ignored if the program is compiled in "sequential" mode, that is, if the compiler is not instructed to activate them by, for instance, the -fopenmp switch on the GNU compilers.

It would be nice if, when possible, a single program could run in sequential or parallel mode, depending only on the compiler switch. For the general OpenMP program, this is not quite true, because the user may have accessed an include file, or called one of the OpenMP functions.

It's still possible to try to make a single code in which even the include file and function calls are hidden when the compiler is invoked in sequential mode.

Method 1: use the preprocessor. Although the preprocessor may not be familiar to FORTRAN programmers, this method should work as well for C, C++, FORTRAN77 and FORTRAN90 programs. When the compiler is invoked with OpenMP enabled, the symbol _OPENMP is defined. Therefore it is possible to have the preprocessor selectively activate or remove pieces of code by delimiting them with the appropriate preprocessor directives.

For instance, we can make the call to omp_get_wtime() "disappear" unless we are using OpenMP by rewriting code like this:


        wtime = omp_get_wtime ( );
      # pragma omp parallel for private ( i ) shared ( x, y )
        for ( i = 0; i < n; i++ )
        {
          y[i] = heffalump ( x[i] );
        }
        wtime = omp_get_wtime ( ) - wtime;
        cout << "Wallclock time was " << wtime << " seconds.\n";
      
so that it now looks like this:

      # ifdef _OPENMP
        wtime = omp_get_wtime ( );
      # endif

      # pragma omp parallel for private ( i ) shared ( x, y )
        for ( i = 0; i < n; i++ )
        {
          y[i] = heffalump ( x[i] );
        }

      # ifdef _OPENMP
        wtime = omp_get_wtime ( ) - wtime;
        cout << "Wallclock time was " << wtime << " seconds.\n";
      # else
        cout << "Wallclock time not available.\n";
      # endif
      

The preprocessor is automatically invoked for C and C++ programs. It may or may not be invoked for FORTRAN programs. (The switch "-fpp" used for the Intel FORTRAN compilers is doing exactly this). FORTRAN users who try to use the preprocessor may have to consult their compiler's documentation to find out how to invoke the preprocessor if it is not getting called for them.
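A variation on Method 1, offered here only as a sketch, is to confine the preprocessor tests to one small helper so that the rest of the program contains no # ifdef lines. The name seconds_now is hypothetical, and the fallback to the standard C++ steady clock is an assumption of this sketch, not something OpenMP provides.

```cpp
#include <chrono>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: a wall clock reading in seconds.
// With OpenMP enabled it uses omp_get_wtime; otherwise it falls back
// to the standard C++ steady clock.
double seconds_now ( )
{
#ifdef _OPENMP
  return omp_get_wtime ( );
#else
  std::chrono::duration<double> t =
    std::chrono::steady_clock::now ( ).time_since_epoch ( );
  return t.count ( );
#endif
}
```

A timing is then taken as the difference of two calls to seconds_now ( ), in sequential and parallel builds alike.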

Method 2 (FORTRAN only): use a special OpenMP conditional compilation marker. The user essentially "comments out" the statements specific to OpenMP. The compiler, when OpenMP is activated, "uncomments" those statements. The conditional compilation markers are "!$" or "*$" or "c$". In free-form source, only "!$" is recognized, and it must be followed by a space.


        wtime = omp_get_wtime ( )
      !$omp parallel do private ( i ) shared ( x, y )
        do i = 1, n
          y(i) = heffalump ( x(i) )
        end do
        wtime = omp_get_wtime ( ) - wtime
        write ( *, * ) 'Wallclock time was ', wtime, ' seconds.'
      
would be rewritten as

      !$ wtime = omp_get_wtime ( )
      !$omp parallel do private ( i ) shared ( x, y )
        do i = 1, n
          y(i) = heffalump ( x(i) )
        end do
      !$ wtime = omp_get_wtime ( ) - wtime
      !$ write ( *, * ) 'Wallclock time was ', wtime, ' seconds.'
      

If you use this second option, note that it is not possible to set up statements that will only execute in sequential mode. Look in the previous example at how the preprocessor's else statement let us use one output statement sequentially and a different one for OpenMP.



Last revised on 15 August 2008.