BLAS
BLAS is an acronym for Basic Linear Algebra Subprograms. As the name
indicates, it contains subprograms for basic operations on vectors and
matrices. BLAS was designed to be used as a building block in other codes, for
example LAPACK. The source code for BLAS is available through Netlib.
However, many computer vendors will have a special version of BLAS
tuned for maximal speed and efficiency on their computer. This is
one of the main advantages of BLAS: the calling sequences are
standardized so that programs that call BLAS will work on any
computer that has BLAS installed. If you have a fast version of BLAS,
you will also get high performance on all programs that call BLAS.
Hence BLAS provides a simple and portable way to achieve high performance
for calculations involving linear algebra.
LAPACK is a higher-level package built on the same ideas.
Levels and naming conventions
The BLAS subroutines can be divided into three levels:
- Level 1: Vector-vector operations. O(n) data and O(n) work.
- Level 2: Matrix-vector operations. O(n^2) data and O(n^2) work.
- Level 3: Matrix-matrix operations. O(n^2) data and O(n^3) work.
Each BLAS and LAPACK routine comes in several versions, one for each
precision (data type). The first letter of the subprogram name indicates
the precision used:
S Real single precision.
D Real double precision.
C Complex single precision.
Z Complex double precision.
Complex double precision is not part of the Fortran 77 standard,
but most compilers will accept one of the following declarations:
double complex list-of-variables
complex*16 list-of-variables
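As a small illustration of the naming convention, here is a sketch that declares a double precision complex vector and scales it with ZSCAL, the Z version of the xSCAL routine. It assumes your compiler accepts the complex*16 declaration and the dcmplx intrinsic (both are common extensions, not standard Fortran 77), and that the program is linked with a BLAS library. The program name zdemo is made up for this example.

```fortran
      program zdemo
c     Sketch: scale a double complex vector by the imaginary unit.
c     Assumes complex*16 and dcmplx are accepted by the compiler
c     and that a BLAS library is linked in.
      integer n
      parameter (n = 3)
      complex*16 z(n), alpha
      integer i
      external ZSCAL

c     z = (1, 2, 3) with zero imaginary parts
      do 10 i = 1, n
         z(i) = dcmplx(dble(i), 0.0d0)
   10 continue

c     Multiply every element by i = (0,1); z becomes (i, 2i, 3i)
      alpha = dcmplx(0.0d0, 1.0d0)
      call ZSCAL(n, alpha, z, 1)

      write(*,*) (z(i), i = 1, n)
      end
```

The single precision complex version would use complex z(n), the cmplx intrinsic, and CSCAL instead.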
BLAS 1
Some of the BLAS 1 subprograms are:
- xCOPY - copy one vector to another
- xSWAP - swap two vectors
- xSCAL - scale a vector by a constant
- xAXPY - add a multiple of one vector to another
- xDOT - inner product
- xASUM - 1-norm of a vector
- xNRM2 - 2-norm of a vector
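- IxAMAX - find the index of the entry with largest absolute value
The first letter (x) can be any of the letters S,D,C,Z
depending on the precision. A few routines deviate slightly for
complex data: the complex inner products are called xDOTU and xDOTC,
and the complex 2-norms are SCNRM2 and DZNRM2.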
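The following sketch shows how a few of these BLAS 1 routines might be combined in single precision. The program name and the test data are made up for illustration, and we assume the program is linked with a BLAS library. Note that SNRM2 and ISAMAX are functions, so their types must be declared.

```fortran
      program blas1
c     Sketch: y <- y + alpha*x, then the 2-norm of y and the
c     index of its largest entry in absolute value.
      integer n
      parameter (n = 4)
      real x(n), y(n), alpha, nrm
      integer i, imax
c     BLAS functions must be declared with their return types.
      real SNRM2
      integer ISAMAX
      external SAXPY, SNRM2, ISAMAX

c     x = (1, 2, 3, 4) and y = (1, 1, 1, 1)
      do 10 i = 1, n
         x(i) = real(i)
         y(i) = 1.0
   10 continue
      alpha = 2.0

c     y = y + alpha*x, so y becomes (3, 5, 7, 9)
      call SAXPY(n, alpha, x, 1, y, 1)
c     nrm = sqrt(164), roughly 12.8
      nrm = SNRM2(n, y, 1)
c     imax = 4, the position of the entry 9
      imax = ISAMAX(n, y, 1)

      write(*,*) 'y    = ', (y(i), i = 1, n)
      write(*,*) 'norm = ', nrm
      write(*,*) 'imax = ', imax
      end
```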
A quick reference to BLAS 1 can be found at
http://www.netlib.org/blas/blasqr.ps
BLAS 2
Some of the BLAS 2 subprograms are:
- xGEMV - general matrix-vector multiplication
- xGER - general rank-1 update
- xSYR2 - symmetric rank-2 update
- xTRSV - solve a triangular system of equations
A detailed description of BLAS 2 can be
found at
http://www.netlib.org/blas/blas2-paper.ps.
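As an illustration, here is a minimal sketch of a matrix-vector product y = A*x using SGEMV. The array sizes and data are made up for the example, and we assume a BLAS library is linked in. The array A is deliberately declared with more rows (lda) than are actually used (m), since the leading dimension is a separate argument.

```fortran
      program gemv
c     Sketch: y = A*x for an m-by-n matrix stored in an array
c     with leading dimension lda.
      integer lda, m, n
      parameter (lda = 5, m = 2, n = 3)
      real A(lda,n), x(n), y(m)
      integer i, j
      external SGEMV

c     A(i,j) = i + j for the m-by-n part, x = (1, 1, 1)
      do 20 j = 1, n
         do 10 i = 1, m
            A(i,j) = real(i + j)
   10    continue
         x(j) = 1.0
   20 continue

c     y = 1.0*A*x + 0.0*y; 'N' means A is not transposed.
c     The row sums give y = (9, 12).
      call SGEMV('N', m, n, 1.0, A, lda, x, 1, 0.0, y, 1)

      write(*,*) (y(i), i = 1, m)
      end
```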
BLAS 3
Some of the BLAS 3 subprograms are:
- xGEMM - general matrix-matrix multiplication
- xSYMM - symmetric matrix-matrix multiplication
- xSYRK - symmetric rank-k update
- xSYR2K - symmetric rank-2k update
The more advanced matrix operations, like solving a linear system of equations,
are contained in LAPACK. A detailed description of BLAS 3 can be
found at
http://www.netlib.org/blas/blas3-paper.ps.
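A minimal sketch of a matrix-matrix product C = A*B with SGEMM might look as follows. The sizes and data are made up for the example, and we assume a BLAS library is linked in. Here the arrays are declared with exactly the sizes used, so the leading dimensions are m, k and m.

```fortran
      program gemm
c     Sketch: C = A*B where A is m-by-k and B is k-by-n.
      integer m, n, k
      parameter (m = 2, n = 2, k = 3)
      real A(m,k), B(k,n), C(m,n)
      integer i, j
      external SGEMM

c     A(i,j) = i, and B is filled with ones
      do 20 j = 1, k
         do 10 i = 1, m
            A(i,j) = real(i)
   10    continue
   20 continue
      do 40 j = 1, n
         do 30 i = 1, k
            B(i,j) = 1.0
   30    continue
   40 continue

c     C = 1.0*A*B + 0.0*C; neither matrix is transposed.
c     Every entry in row i of C equals k*i, i.e. 3 and 6.
      call SGEMM('N', 'N', m, n, k, 1.0, A, m, B, k, 0.0, C, m)

      do 50 i = 1, m
         write(*,*) (C(i,j), j = 1, n)
   50 continue
      end
```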
Examples
Let us first look at a very simple BLAS routine, SSCAL.
The call sequence is
call SSCAL ( n, a, x, incx )
Here x is the vector, n is the length (number of
elements in x we wish to use), and a is the scalar
by which we want to multiply x.
The last argument incx is the increment.
Usually, incx=1 and the vector x corresponds directly
to the one-dimensional Fortran array x.
For incx>1 it specifies how many elements in the array we
should "jump" between each element of the vector x.
For example, incx=2 means we scale only every other
element (note: the physical dimension of the array x must then be
at least 2n-1; in general, at least 1+(n-1)*incx). Consider these examples where x has been
declared as real x(100).
call SSCAL(100, a, x, 1)
call SSCAL( 50, a, x(51), 1)
call SSCAL( 50, a, x(2), 2)
The first line will scale all 100 elements of x by a.
The next line will only scale the last 50 elements of x by a.
The last line will scale all elements of x with even indices by a.
Observe that the array x will be overwritten by the new values.
If you need to preserve a copy of the old x, you have to
make a copy first, e.g., by using SCOPY.
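Put together as a complete program, this could look like the sketch below. The program name, the copy array xold and the test values are made up for illustration, and we assume a BLAS library is linked in.

```fortran
      program scal
c     Sketch: save a copy of x with SCOPY, then scale the
c     elements of x with even indices using SSCAL.
      integer n
      parameter (n = 100)
      real x(n), xold(n), a
      integer i
      external SCOPY, SSCAL

      do 10 i = 1, n
         x(i) = 1.0
   10 continue
      a = 3.0

c     Keep a copy of the original vector in xold
      call SCOPY(n, x, 1, xold, 1)
c     Scale x(2), x(4), ..., x(100) by a
      call SSCAL(50, a, x(2), 2)

c     Odd entries are still 1.0, even entries are now 3.0
      write(*,*) x(1), x(2), x(99), x(100)
c     The copy is unchanged: xold(2) is 1.0
      write(*,*) xold(2)
      end
```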
Now consider a more complicated example. Suppose you
have two 2-dimensional arrays A and B, and you are asked to
find the (i,j) entry of the product A*B. This is
easily done by computing the inner product of row i from A and
column j of B. We can use the BLAS 1 subroutine SDOT. The only
difficulty is to figure out the correct indices and increments.
SDOT is a function that returns a real value. Its call sequence is
SDOT ( n, x, incx, y, incy )
Suppose the array declarations were
real A(lda,lda)
real B(ldb,ldb)
but in the program you know that the actual size of A is m-by-p
and that of B is p-by-n. The i'th row of A starts at
the element A(i,1). But since Fortran stores 2-dimensional
arrays down columns, the next row element A(i,2) will
be stored lda elements later in memory (since lda is the
length of a column). Hence we set incx = lda.
For the column in B there is no such problem: the elements are
stored consecutively, so incy = 1. The length of the
inner product calculation is p. Hence the answer is
SDOT ( p, A(i,1), lda, B(1,j), 1 )
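The sketch below places this call in a small complete program. The sizes, the test data and the variable name cij are made up for illustration, and we assume a BLAS library is linked in. Since SDOT is a function, it must be declared real in the calling program.

```fortran
      program dotij
c     Sketch: compute the (i,j) entry of A*B with SDOT, where
c     A is m-by-p and B is p-by-n inside larger arrays.
      integer lda, ldb, m, p, n
      parameter (lda = 4, ldb = 5, m = 2, p = 3, n = 2)
      real A(lda,lda), B(ldb,ldb), cij
      integer i, j, k, l
      real SDOT
      external SDOT

c     A(k,l) = k + l for the m-by-p part
      do 20 l = 1, p
         do 10 k = 1, m
            A(k,l) = real(k + l)
   10    continue
   20 continue
c     B(k,l) = k*l for the p-by-n part
      do 40 l = 1, n
         do 30 k = 1, p
            B(k,l) = real(k*l)
   30    continue
   40 continue

c     Row 2 of A is (3, 4, 5) and column 2 of B is (2, 4, 6),
c     so the (2,2) entry of A*B is 6 + 16 + 30 = 52.
      i = 2
      j = 2
      cij = SDOT(p, A(i,1), lda, B(1,j), 1)

      write(*,*) cij
      end
```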
How to get the BLAS
First of all you should check if you already have BLAS on your
system. If not, you can find it on Netlib at
http://www.netlib.org/blas.
Documentation
The BLAS routines are almost self-explanatory. Once you know which
routine you need, fetch it and read the header section that explains
the input and output parameters in detail. We will look at an example
in the next section when we address the LAPACK routines.