Goals and activities

Notes on precision and portability in Fortran

Nearly all contemporary computers support at least portions of the IEEE-754 floating-point standard, also known as IEC 60559. This standard specifies two different precisions for floating-point computation: single and double. Single-precision variables occupy 32 bits of memory and carry 23 bits (approximately 7 decimal digits) of precision. Double-precision variables occupy 64 bits of memory and carry 52 bits (approximately 15 decimal digits) of precision. Single-precision variables can represent real numbers whose magnitude ranges from approximately 10-38 to 10+38, plus 0; double precision, from 10-307 to 10+307, plus 0.

Since the 1990 standard, the Fortran language has offered portable ways to specify the precision of floating-point variables. The language defines the kind of a given variable, which can be used as follows:

module precision
integer, parameter:: SINGLE=kind(1.0) ! single-precision kind
integer, parameter:: DOUBLE=kind(1.0d0) ! double-precision kind
end module precision
Here, SINGLE and DOUBLE are compiler-specific integer constants whose values can be used in declarations and constants in the following way:
use precision
real(SINGLE):: sngl
real(DOUBLE):: dbl
real(DOUBLE),parameter:: PI=3.141592653589793238_DOUBLE
...
In this code fragment, sngl is declared as a single-precision value. Both dbl and PI are double precision; the latter is a named constant. Notice how the suffix _DOUBLE can be appended to constant literals to designate the type.

One common convention nowadays is to specify the working precision of a given section of code. That is, single precision may suffice for some applications, but double precision may be required for others. To handle both possibilities with the same code base, you can write

use precision
integer, parameter:: WP=DOUBLE ! the working precision
real(WP),parameter:: PI=3.141592653589793238_WP
...
In this way, you can alter the floating-point precision of any block of code by changing the definition of WP and recompiling.

Important. The actual values of the parameters SINGLE and DOUBLE are compiler-specific. Many compilers use 4 and 8, respectively, to denote single and double precision. Others use 1 and 2, for example. On a compiler that uses the 4/8 convention, code like

real(4):: sngl
will compile and run (and specify a single-precision variable), but such code is not portable: in particular, it will not compile on a compiler that uses the 1/2 convention (or may specify, say, quadruple precision instead of single precision). Portable code must not depend on the actual values of the kind parameters. Instead, use the kind function as illustrated above.

Remarks on exponential notation. Prior to Fortran 90, the only portable way to specify precision in Fortran was to use the keywords real (single) or double precision. They are still part of the standard, but the kind mechanism makes changing the precision much simpler. (Moreover, the kind mechanism accommodates quadruple precision, base-10 arithmetic, etc., each of which has hardware support on newer computers, without the need to define new language keywords.)

Prior to kind, one could denote the precision of explicit constants only by exponential notation: 1.0e2, for instance, indicates a single-precision constant equal to 1.0×102. The notation 1.0d2 denotes the double-precision equivalent. Floating-point constants without an explicit exponent are single precision; for instance, 3.14 and 100.0 are both single precision. The _WP suffix illustrated above, with the kind convention, allows the precision of manifest constants to remain consistent with the other variables.

Newton's method

Newton's method is a fast and efficient way to find the roots of smooth functions, but it is not guaranteed to converge. Given f and a starting guess for the solution, x0, Newton's method generates the following sequence of values:

x1 = x0 - f(x0)/ f '(x0)
x2 = x1 - f(x1)/ f '(x1),
and in general,
xn+1 = xn - f(xn)/ f '(xn).
For example, to find the square root of 2 using only elementary arithmetic, we can define f(x) = x2 -2, f '(x)=2x and take x0=2. Then Newton's method gives the following sequence of values:
x0 = 2
x1 = 1.5
x2 = 1.4166666666667
x3 = 1.4142156862745
x4 = 1.4142135623747
x5 = 1.4142135623731
x6 = 1.4142135623731
Notice that the answer is correct to 14 significant digits after only four iterations. (Calculators often use Newton's method to compute square roots.)

Some Fortran coding exercises

Exercise 1. Write a Fortran subroutine, newton(f,df,x,tol,ierr) that attempts to find the root of a double-precision function f(x) given its derivative df(x). Do not store all the x's. On entry, x is the initial guess of the root, and on successful return, x holds the estimate of the root. Stop the iteration after 100 steps, when two successive iterates are less than tol apart, or when the absolute value of any iterate exceeds, say, the square root of the largest representable value, whichever occurs first. (Suggestion: huge(x) is an intrinsic function that returns the largest representable floating-point number whose type and kind the same as those of x.)

Set the integer argument ierr to 0 if the routine converges and to a nonzero value of your choosing otherwise.

Here's some code to get you started.

Your newton subroutine should not print anything. Its communication with the rest of your program is solely through the dummy arguments. Handle errors by setting ierr to an appropriate value and returning. Here's a sample program that you might use for testing. Then modify the sample program to find the real roots of ex-x.

Code design issue. The reason why you usually don't want a Newton-type subroutine to print anything is that algorithms like this often are part of the inner loop of a much larger simulation. (For example, a Newton method can be used to generate and follow a bifurcation curve for a set of differential equations.) The root finder may be buried deep in an inner loop of the program, where it might be called thousands of times. A failure to converge may not be an error per se but simply an indication that, say, the bifurcated solution ceases to exist at a certain point; whether the error is fatal or expected is something that can be assessed only at a higher level of the program. For this reason, a Newton root-finder that generates a message like

Warning! Newton method failed to converge - please enter a new initial guess
and halts for input from the terminal is simply a disaster. The best design is to silently set a return code and let the caller decide what to do from there.

Quick survey of Makefiles

At this point, you have three Fortran source code files:
1. precision.f90, which defines the floating-point kinds;
2. newton.f90, which contains the Newton root-finding code;
3. test_newton.f90, which contains a main program and functions on which to test the Newton subroutine.
The files must be compiled in this order, because each requires module information from its predecessor.

The make utility keeps track of the order in which files need to be compiled and linked to form executable programs. make is another example of a little programming language.

The easiest way to explain make is to look at a simple example. Typically, make is invoked simply by typing

make
at the command line. When invoked in this way, make looks for a file called makefile or a file called Makefile (in that order). Often the latter name is used on Unix/Linux systems, as ls lists files starting with capital letters before files starting with lower-case letters. In this way, the makefile stands out in the directory listing, as source code files typically start with lower-case letters. The makefile lists the dependencies of the source and object-code files, together with the commands that make should use to synchronize the source and object-code files.

Here is a sample makefile. (The leading numbers are not part of the makefile; they simply line numbers to simplify the discussion.)

1 #Sample Makefile for compiling the Newton program.
2
3 F90 = pgf95
4 FFLAGS = -c
5 LIBS =
6 OBJECTS = test_newton.o newton.o precision.o
7
8 .SUFFIXES: .o .f .f90
9
10 .f.o:
11      \$(F90) \$(FFLAGS) \$*.f
12 .f90.o:
13      \$(F90) \$(FFLAGS) \$*.f90
14
15 newton: \$(OBJECTS)
16      \$(F90) \$(OBJECTS) \$(LIBS) -o newton
17
18 test_newton.o: newton.o
19 newton.o: precision.o
20 precision.o:
Comments begin with # and extend to the end of the line. Blank lines (e.g., 2 and 7) may be used to aid readability. Lines 3-6 define makefile variables. By convention, such variables are in all capital letters. As with shell variables, makefile variables are strings, consisting of all the characters to the right of the equals sign. Each line starts a new makefile statement. (A statement may be continued onto a second and subsequent line by placing a backslash as the last character in the line.) In this particular example, LIBS is an empty string (but that will change later).

Line 8 isn't usually needed for C and C++ programs, as make usually understands that such programs typically are suffixed with .c. However, Fortran programs have a variety of suffixes, depending on the implementation, and make usually does not understand Fortran.

So lines 3 and 8 are necessary. Line 3 specifies the name of the command that invokes the Fortran compiler: in this case, pgf95.

Line 8 says that object code (.o) files are to be created from files suffixed with either .f or .f90. make looks for files with the corresponding suffix in this order. Lines 11 and 13 tell make what to do when it sees such a file: in this case, to invoke pgf95 with the flag -c (compile only, don't link). Given a source file named abc.f, for example, pgf95 creates an object file named abc.o when compilation is successful.

Line 15 specifies the object files needed to make an executable program called newton. In this case, make expands the value of OBJECTS as defined on Line 6. (Keeping OBJECTS as a variable makes it easy to add more filenames as the program is developed.) If all the files named in OBJECTS exist and are newer than their corresponding source files, then make invokes the command given on Line 16 to link them.

When we first start out, however, there are no such object files. So make looks for a file named test_newton.f or test_newton.f90 in that order, and runs pgf95 on the first one that it finds. (If no files by these names exist, then make exits with an error message.)

However, Line 18 lists a dependency: for test_newton.o to be created, newton.o must exist first. Therefore, make checks whether newton.o can be located in the current directory. If not (as will be the case initially), then make looks for any dependencies for newton.o. In this case, one is specified on Line 19. If precision.o does not exist, then make looks for a suitable entry, which is found on Line 20, and so on.

precision.o does not depend on any other code other than the corresponding source file. (This dependency is implied by Line 8.) So make looks for precision.f, then precision.f90, and compiles the latter. If compilation fails at this point (because of a syntax error in precision.f90, for example), then make halts here. If compilation is successful, then make backtracks, compiling newton.f90 next and finally test_newton.f90. Once all the object files are created successfully, then make invokes the shell command

pgf95 test_newton.o newton.o precision.o -o newton
by expanding Line 16 to create the executable program.

Since make automatically keeps track of dependencies, Lines 18, 19, and 20 may appear in any order in the makefile.

It is most convenient to have Line 15 appear where it does. This is because, when invoked simply as make, the first target in the makefile is created. That target is newton in this case.

If Lines 15-16 were to appear after lines Lines 18-20, then it would be necessary to invoke make with the command

make newton
Otherwise, the first target is test_newton.o, so all that a simple make command would do in this case is to create the object file; the remaining object files would not be linked together to form the executable program.

Command lines, such as those on lines 11, 13, and 16, are lines that start with tabs, not spaces. This distinction is crucial. Failure to start command lines with tabs leads to confusing error messages, as tabs and spaces look alike on the screen. (It's an error to start a command line with a space and a tab, so beware. Command lines must begin with a tab character!)

Typically, as programs are written, many errors in compilation and execution occur. make checks the shell exit code of each command that it executes. If the exit code is nonzero, then make assumes that an error has occurred and halts further processing. After you correct the problem, it is merely necessary to type

make
to re-build the program starting where things left off.

For additional information, here is a Makefile tutorial.

Note: here is the Makefile source without the line numbers. Save this file to disk using the Save As function of your browser. (Don't cut and paste with the mouse from the browser window into another window. Doing so converts tabs to spaces.)

LAPACK

The LAPACK library is a comprehensive set of Fortran subroutines for numerical linear algebra. They are built on the Basic Linear Algebra Subroutines, or BLAS. Most every manufacturer of general-purpose microprocessors provides an optimized implementation of the BLAS as well as interfaces that are callable from C, C++, Java, and other languages.

There are several versions of LAPACK; the newest is fully compatible with Fortran 95 and, arguably, is the most convenient to use. For today's exercise, however, we'll use the Fortran-77 calling interface to LAPACK, because it's still the most widely used.

Both LAPACK and BLAS are available on the machines in ECA 221 as object code libraries, liblapack.a and libblas.a, respectively, in /usr/local/lib.

Exercise 2. Write a subroutine,

linsolve(a,n,b,info)
that solves the system Ax=b for a given n×n matrix A and n-vector b. Here a and b are intent(inout) variables that, on input, define the n×n matrix A and n-vector b, respectively. The input argument n gives the size of the problem. On output, b should hold the solution x; your routine is free to overwrite the contents of a. The output argument info is an integer flag that should be set to zero unless an error is encountered in solving the system; in this case, set info to any nonzero value of your choice.

Put another way, linsolve overwrites b with the quantity A-1b, assuming that A is invertible. It is much more efficient to use a routine like linsolve than to form A-1 explicitly, particularly when the system is large.

Add this subroutine to those already included in the module in newton.f90. The declarations will look like

subroutine linsolve(a,n,b,info)
integer,intent(in):: n ! the size of the problem
real(DOUBLE),intent(inout):: a(n,n) ! overwritten by DGESV
real(DOUBLE),intent(inout):: b(n) ! holds RHS in input and solution on output
integer,intent(out):: info ! return code
A use precision statement is not necessary, since one appears at the top of the module. (A use precision statement is necessary if linsolve appears as a standalone subroutine outside of a module.)

Your routine will call the LAPACK library routine dgesv, with the Fortran 77 calling sequence (i.e., the first one listed on the manual page). Your routine must provide the necessary workspace for a problem of arbitrary size. Your routine must not print anything; all communication is through its arguments (for similar reasons as for the Newton subroutine).

Exercise 3. Write a subroutine,

multinewt(f,df,n,x,epsilon,info)
that implements Newton's method for an arbitrary differentiable vector-valued function f. Newton's method for vector-valued functions looks just like the scalar case above, except that the division by f ' is replaced by Df(xk)-1 f(xk), where Df is the Jacobian matrix of partial derivatives of f, evaluated at xk.

In the Fortran code, f and df are subroutines that define the function and its Jacobian. The input argument n gives the size of the problem (i.e., the number of component functions in f). The argument x on input is a starting guess for the solution; on return, x holds an estimate of the solution when the routine is successful. The output argument info is an integer flag that is set to zero if the routine completes successfully and to any nonzero value of your choice otherwise. The input argument epsilon defines the stopping criterion. Your routine should halt after 100 iterations or when the difference between two successive iterates is less than epsilon, whichever occurs first. You may define the difference between two vectors x and y as

maxk |xk - yk|,
that is, the maximum absolute value of the difference between any two corresponding components.

The first two arguments, which require an interface block, define subroutines that evaluate f and Df, respectively. They are invoked by

call f(x,n,fcn)
call df(x,n,dfcn)
respectively. In each case, x is an input argument that specifies the current point. The output argument fcn is an n-vector whose contents are f(x). The output argument dfcn is an n×n matrix whose contents are Df(x).

Use linsolve to evaluate Df(xk)-1 f(xk). Do not form Df(xk)-1 explicitly!

Here is a sample code to get you started.

For testing purposes, try solving the system

x2y - xy2 = -6
x3y + xy3 = -10

Write a program akin to test_newton.f90 that defines this function and its Jacobian matrix. Read in the starting point and tolerance from the standard input, as in test_newton.f90. There are two solutions. Which one you find depends on the starting guess that you supply.

You will need to link with external libraries to obtain the object code for dgesv and the BLAS. Modify the LIBS variable in your makefile so that it reads

LIBS = -llapack -lblas
This tells the compiler drive to look for the LAPACK library and the BLAS library. The -l options must be listed in the order shown, because LAPACK depends on BLAS, and the linker processes libraries in the order in which they are given on the command line (from left to right). No spaces may appear between -l and the library names.

Edit the remainder of the previous makefile appropriately and use it to compile and link your test program.