subroutine sgetv0 & ( ido, bmat, itry, initv, n, j, v, ldv, resid, rnorm, & ipntr, workd, ierr ) ! !! SGETV0 generates a random initial residual vector for the Arnoldi process. ! !----------------------------------------------------------------------- !\BeginDoc ! !\Name: sgetv0 ! !\Description: ! SGETV0 generates a random initial residual vector for the Arnoldi process. ! Force the residual vector to be in the range of the operator OP. ! !\Usage: ! call sgetv0 ! ( IDO, BMAT, ITRY, INITV, N, J, V, LDV, RESID, RNORM, ! IPNTR, WORKD, IERR ) ! !\Arguments ! IDO Integer. (INPUT/OUTPUT) ! Reverse communication flag. IDO must be zero on the first ! call to sgetv0. ! ------------------------------------------------------------- ! IDO = 0: first call to the reverse communication interface ! IDO = -1: compute Y = OP * X where ! IPNTR(1) is the pointer into WORKD for X, ! IPNTR(2) is the pointer into WORKD for Y. ! This is for the initialization phase to force the ! starting vector into the range of OP. ! IDO = 2: compute Y = B * X where ! IPNTR(1) is the pointer into WORKD for X, ! IPNTR(2) is the pointer into WORKD for Y. ! IDO = 99: done ! ------------------------------------------------------------- ! ! BMAT Character*1. (INPUT) ! BMAT specifies the type of the matrix B in the (generalized) ! eigenvalue problem A*x = lambda*B*x. ! B = 'I' -> standard eigenvalue problem A*x = lambda*x ! B = 'G' -> generalized eigenvalue problem A*x = lambda*B*x ! ! ITRY Integer. (INPUT) ! ITRY counts the number of times that sgetv0 is called. ! It should be set to 1 on the initial call to sgetv0. ! ! INITV Logical variable. (INPUT) ! .TRUE. => the initial residual vector is given in RESID. ! .FALSE. => generate a random initial residual vector. ! ! N Integer. (INPUT) ! Dimension of the problem. ! ! J Integer. (INPUT) ! Index of the residual vector to be generated, with respect to ! the Arnoldi process. J > 1 in case of a "restart". ! ! V Real N by J array. (INPUT) ! The first J-1 columns of V contain the current Arnoldi basis ! if this is a "restart". ! ! LDV Integer. (INPUT) ! Leading dimension of V exactly as declared in the calling ! program. ! ! RESID Real array of length N. (INPUT/OUTPUT) ! Initial residual vector to be generated. If RESID is ! provided, force RESID into the range of the operator OP. ! ! RNORM Real scalar. (OUTPUT) ! B-norm of the generated residual. ! ! IPNTR Integer array of length 3. (OUTPUT) ! ! WORKD Real work array of length 2*N. (REVERSE COMMUNICATION). ! On exit, WORK(1:N) = B*RESID to be used in SSAITR. ! ! IERR Integer. (OUTPUT) ! = 0: Normal exit. ! = -1: Cannot generate a nontrivial restarted residual vector ! in the range of the operator OP. ! !\EndDoc ! !----------------------------------------------------------------------- ! !\BeginLib ! !\Local variables: ! xxxxxx real ! !\References: ! 1. D.C. Sorensen, "Implicit Application of Polynomial Filters in ! a k-Step Arnoldi Method", SIAM J. Matr. Anal. Apps., 13 (1992), ! pp 357-385. ! 2. R.B. Lehoucq, "Analysis and Implementation of an Implicitly ! Restarted Arnoldi Iteration", Rice University Technical Report ! TR95-13, Department of Computational and Applied Mathematics. ! !\Routines called: ! svout ARPACK utility routine for vector output. ! slarnv LAPACK routine for generating a random vector. ! sgemv Level 2 BLAS routine for matrix vector multiplication. ! scopy Level 1 BLAS that copies one vector to another. ! sdot Level 1 BLAS that computes the scalar product of two vectors. ! snrm2 Level 1 BLAS that computes the norm of a vector. ! !\Author ! Danny Sorensen Phuong Vu ! Richard Lehoucq CRPC / Rice University ! Dept. of Computational & Houston, Texas ! Applied Mathematics ! Rice University ! Houston, Texas ! !\SCCS Information: @(#) ! FILE: getv0.F SID: 2.7 DATE OF SID: 04/07/99 RELEASE: 2 ! !\EndLib ! !----------------------------------------------------------------------- ! ! %----------------------------------------------------% ! | Include files for debugging and timing information | ! %----------------------------------------------------% ! include 'debug.h' include 'stat.h' ! ! %------------------% ! | Scalar Arguments | ! %------------------% ! character bmat*1 logical initv integer ido, ierr, itry, j, ldv, n Real & rnorm ! ! %-----------------% ! | Array Arguments | ! %-----------------% ! integer ipntr(3) Real & resid(n), v(ldv,j), workd(2*n) ! ! %------------% ! | Parameters | ! %------------% ! Real & one, zero parameter (one = 1.0E+0, zero = 0.0E+0) ! ! %------------------------% ! | Local Scalars & Arrays | ! %------------------------% ! logical first, inits, orth integer idist, iseed(4), iter, msglvl, jj Real & rnorm0 save first, iseed, inits, iter, msglvl, orth, rnorm0 ! ! %----------------------% ! | External Subroutines | ! %----------------------% ! external slarnv, svout, scopy, sgemv ! ! %--------------------% ! | External Functions | ! %--------------------% ! Real & sdot, snrm2 external sdot, snrm2 ! ! %---------------------% ! | Intrinsic Functions | ! %---------------------% ! intrinsic abs, sqrt ! ! %-----------------% ! | Data Statements | ! %-----------------% ! data inits /.true./ ! ! %-----------------------% ! | Executable Statements | ! %-----------------------% ! ! ! %-----------------------------------% ! | Initialize the seed of the LAPACK | ! | random number generator | ! %-----------------------------------% ! if (inits) then iseed(1) = 1 iseed(2) = 3 iseed(3) = 5 iseed(4) = 7 inits = .false. end if ! if (ido == 0) then ! ! %-------------------------------% ! | Initialize timing statistics | ! | & message level for debugging | ! %-------------------------------% ! call cpu_time (t0) msglvl = mgetv0 ! ierr = 0 iter = 0 first = .FALSE. orth = .FALSE. ! ! %-----------------------------------------------------% ! | Possibly generate a random starting vector in RESID | ! | Use a LAPACK random number generator used by the | ! | matrix generation routines. | ! | idist = 1: uniform (0,1) distribution; | ! | idist = 2: uniform (-1,1) distribution; | ! | idist = 3: normal (0,1) distribution; | ! %-----------------------------------------------------% ! if (.not.initv) then idist = 2 call slarnv (idist, iseed, n, resid) end if ! ! %----------------------------------------------------------% ! | Force the starting vector into the range of OP to handle | ! | the generalized problem when B is possibly (singular). | ! %----------------------------------------------------------% ! call cpu_time (t2) if (bmat == 'G') then nopx = nopx + 1 ipntr(1) = 1 ipntr(2) = n + 1 call scopy (n, resid, 1, workd, 1) ido = -1 go to 9000 end if end if ! ! %-----------------------------------------% ! | Back from computing OP*(initial-vector) | ! %-----------------------------------------% ! if (first) go to 20 ! ! %-----------------------------------------------% ! | Back from computing B*(orthogonalized-vector) | ! %-----------------------------------------------% ! if (orth) go to 40 ! if (bmat == 'G') then call cpu_time (t3) tmvopx = tmvopx + (t3 - t2) end if ! ! %------------------------------------------------------% ! | Starting vector is now in the range of OP; r = OP*r; | ! | Compute B-norm of starting vector. | ! %------------------------------------------------------% ! call cpu_time (t2) first = .TRUE. if (bmat == 'G') then nbx = nbx + 1 call scopy (n, workd(n+1), 1, resid, 1) ipntr(1) = n + 1 ipntr(2) = 1 ido = 2 go to 9000 else if (bmat == 'I') then call scopy (n, resid, 1, workd, 1) end if ! 20 continue ! if (bmat == 'G') then call cpu_time (t3) tmvbx = tmvbx + (t3 - t2) end if ! first = .FALSE. if (bmat == 'G') then rnorm0 = sdot (n, resid, 1, workd, 1) rnorm0 = sqrt(abs(rnorm0)) else if (bmat == 'I') then rnorm0 = snrm2(n, resid, 1) end if rnorm = rnorm0 ! ! %---------------------------------------------% ! | Exit if this is the very first Arnoldi step | ! %---------------------------------------------% ! if (j == 1) go to 50 ! ! %---------------------------------------------------------------- ! | Otherwise need to B-orthogonalize the starting vector against | ! | the current Arnoldi basis using Gram-Schmidt with iter. ref. | ! | This is the case where an invariant subspace is encountered | ! | in the middle of the Arnoldi factorization. | ! | | ! | s = V^{T}*B*r; r = r - V*s; | ! | | ! | Stopping criteria used for iter. ref. is discussed in | ! | Parlett's book, page 107 and in Gragg & Reichel TOMS paper. | ! %---------------------------------------------------------------% ! orth = .TRUE. 30 continue ! call sgemv ('T', n, j-1, one, v, ldv, workd, 1, & zero, workd(n+1), 1) call sgemv ('N', n, j-1, -one, v, ldv, workd(n+1), 1, & one, resid, 1) ! ! %----------------------------------------------------------% ! | Compute the B-norm of the orthogonalized starting vector | ! %----------------------------------------------------------% ! call cpu_time (t2) if (bmat == 'G') then nbx = nbx + 1 call scopy (n, resid, 1, workd(n+1), 1) ipntr(1) = n + 1 ipntr(2) = 1 ido = 2 go to 9000 else if (bmat == 'I') then call scopy (n, resid, 1, workd, 1) end if ! 40 continue ! if (bmat == 'G') then call cpu_time (t3) tmvbx = tmvbx + (t3 - t2) end if ! if (bmat == 'G') then rnorm = sdot (n, resid, 1, workd, 1) rnorm = sqrt(abs(rnorm)) else if (bmat == 'I') then rnorm = snrm2(n, resid, 1) end if ! ! %--------------------------------------% ! | Check for further orthogonalization. | ! %--------------------------------------% ! if (msglvl > 2) then call svout (logfil, 1, rnorm0, ndigit, & '_getv0: re-orthonalization ; rnorm0 is') call svout (logfil, 1, rnorm, ndigit, & '_getv0: re-orthonalization ; rnorm is') end if ! if (rnorm > 0.717*rnorm0) go to 50 ! iter = iter + 1 if (iter <= 5) then ! ! %-----------------------------------% ! | Perform iterative refinement step | ! %-----------------------------------% ! rnorm0 = rnorm go to 30 else ! ! %------------------------------------% ! | Iterative refinement step "failed" | ! %------------------------------------% ! do 45 jj = 1, n resid(jj) = zero 45 continue rnorm = zero ierr = -1 end if ! 50 continue ! if (msglvl > 0) then call svout (logfil, 1, rnorm, ndigit, & '_getv0: B-norm of initial / restarted starting vector') end if if (msglvl > 3) then call svout (logfil, n, resid, ndigit, & '_getv0: initial / restarted starting vector') end if ido = 99 ! call cpu_time (t1) tgetv0 = tgetv0 + (t1 - t0) ! 9000 continue return ! ! %---------------% ! | End of sgetv0 | ! %---------------% ! end !----------------------------------------------------------------------- !\BeginDoc ! !\Name: slaqrb ! !\Description: ! SLAQRB computes eigenvalues and Schur decomposition of an upper ! Hessenberg submatrix in rows and columns ILO to IHI. Only the ! last component of the Schur vectors are computed. ! ! This is mostly a modification of the LAPACK routine slahqr. ! !\Usage: ! call slaqrb ! ( WANTT, N, ILO, IHI, H, LDH, WR, WI, Z, INFO ) ! !\Arguments ! WANTT Logical variable. (INPUT) ! = .TRUE. : the full Schur form T is required; ! = .FALSE.: only eigenvalues are required. ! ! N Integer. (INPUT) ! The order of the matrix H. N >= 0. ! ! ILO Integer. (INPUT) ! IHI Integer. (INPUT) ! It is assumed that H is already upper quasi-triangular in ! rows and columns IHI+1:N, and that H(ILO,ILO-1) = 0 (unless ! ILO = 1). SLAQRB works primarily with the Hessenberg ! submatrix in rows and columns ILO to IHI, but applies ! transformations to all of H if WANTT is .TRUE.. ! 1 <= ILO <= max(1,IHI); IHI <= N. ! ! H Real array, dimension (LDH,N). (INPUT/OUTPUT) ! On entry, the upper Hessenberg matrix H. ! On exit, if WANTT is .TRUE., H is upper quasi-triangular in ! rows and columns ILO:IHI, with any 2-by-2 diagonal blocks in ! standard form. If WANTT is .FALSE., the contents of H are ! unspecified on exit. ! ! LDH Integer. (INPUT) ! The leading dimension of the array H. LDH >= max(1,N). ! ! WR Real array, dimension (N). (OUTPUT) ! WI Real array, dimension (N). (OUTPUT) ! The real and imaginary parts, respectively, of the computed ! eigenvalues ILO to IHI are stored in the corresponding ! elements of WR and WI. If two eigenvalues are computed as a ! complex conjugate pair, they are stored in consecutive ! elements of WR and WI, say the i-th and (i+1)th, with ! WI(i) > 0 and WI(i+1) < 0. If WANTT is .TRUE., the ! eigenvalues are stored in the same order as on the diagonal ! of the Schur form returned in H, with WR(i) = H(i,i), and, if ! H(i:i+1,i:i+1) is a 2-by-2 diagonal block, ! WI(i) = sqrt(H(i+1,i)*H(i,i+1)) and WI(i+1) = -WI(i). ! ! Z Real array, dimension (N). (OUTPUT) ! On exit Z contains the last components of the Schur vectors. ! ! INFO Integer. (OUPUT) ! = 0: successful exit ! > 0: SLAQRB failed to compute all the eigenvalues ILO to IHI ! in a total of 30*(IHI-ILO+1) iterations; if INFO = i, ! elements i+1:ihi of WR and WI contain those eigenvalues ! which have been successfully computed. ! !\Remarks ! 1. None. ! !----------------------------------------------------------------------- ! !\BeginLib ! !\Local variables: ! xxxxxx real ! !\Routines called: ! slabad LAPACK routine that computes machine constants. ! slamch LAPACK routine that determines machine constants. ! slanhs LAPACK routine that computes various norms of a matrix. ! slanv2 LAPACK routine that computes the Schur factorization of ! 2 by 2 nonsymmetric matrix in standard form. ! slarfg LAPACK Householder reflection construction routine. ! scopy Level 1 BLAS that copies one vector to another. ! srot Level 1 BLAS that applies a rotation to a 2 by 2 matrix. ! !\Author ! Danny Sorensen Phuong Vu ! Richard Lehoucq CRPC / Rice University ! Dept. of Computational & Houston, Texas ! Applied Mathematics ! Rice University ! Houston, Texas ! !\Revision history: ! xx/xx/92: Version ' 2.4' ! Modified from the LAPACK routine slahqr so that only the ! last component of the Schur vectors are computed. ! !\SCCS Information: @(#) ! FILE: laqrb.F SID: 2.2 DATE OF SID: 8/27/96 RELEASE: 2 ! !\Remarks ! 1. None ! !\EndLib ! !----------------------------------------------------------------------- ! subroutine slaqrb ( wantt, n, ilo, ihi, h, ldh, wr, wi, & z, info ) ! !! SLAQRB computes eigenvalues and Schur decomposition. ! ! %------------------% ! | Scalar Arguments | ! %------------------% ! logical wantt integer ihi, ilo, info, ldh, n ! ! %-----------------% ! | Array Arguments | ! %-----------------% ! Real & h( ldh, * ), wi( * ), wr( * ), z( * ) ! ! %------------% ! | Parameters | ! %------------% ! Real & zero, one, dat1, dat2 parameter (zero = 0.0E+0, one = 1.0E+0, dat1 = 7.5E-1, & dat2 = -4.375E-1) ! ! %------------------------% ! | Local Scalars & Arrays | ! %------------------------% ! integer i, i1, i2, itn, its, j, k, l, m, nh, nr Real & cs, h00, h10, h11, h12, h21, h22, h33, h33s, & h43h34, h44, h44s, ovfl, s, smlnum, sn, sum, & t1, t2, t3, tst1, ulp, unfl, v1, v2, v3 Real & v( 3 ), work( 1 ) ! ! %--------------------% ! | External Functions | ! %--------------------% ! Real & slamch, slanhs external slamch, slanhs ! ! %----------------------% ! | External Subroutines | ! %----------------------% ! external scopy, slabad, slanv2, slarfg, srot ! ! %-----------------------% ! | Executable Statements | ! %-----------------------% ! info = 0 ! ! %--------------------------% ! | Quick return if possible | ! %--------------------------% ! if( n==0 ) & return if( ilo==ihi ) then wr( ilo ) = h( ilo, ilo ) wi( ilo ) = zero return end if ! ! %---------------------------------------------% ! | Initialize the vector of last components of | ! | the Schur vectors for accumulation. | ! %---------------------------------------------% ! z(1:n-1) = zero z(n) = one ! nh = ihi - ilo + 1 ! ! %-------------------------------------------------------------% ! | Set machine-dependent constants for the stopping criterion. | ! | If norm(H) <= sqrt(OVFL), overflow should not occur. | ! %-------------------------------------------------------------% ! unfl = slamch( 'safe minimum' ) ovfl = one / unfl call slabad( unfl, ovfl ) ulp = slamch( 'precision' ) smlnum = unfl*( nh / ulp ) ! ! %---------------------------------------------------------------% ! | I1 and I2 are the indices of the first row and last column | ! | of H to which transformations must be applied. If eigenvalues | ! | only are computed, I1 and I2 are set inside the main loop. | ! | Zero out H(J+2,J) = ZERO for J=1:N if WANTT = .TRUE. | ! | else H(J+2,J) for J=ILO:IHI-ILO-1 if WANTT = .FALSE. | ! %---------------------------------------------------------------% ! if( wantt ) then i1 = 1 i2 = n do 8 i=1,i2-2 h(i1+i+1,i) = zero 8 continue else do 9 i=1, ihi-ilo-1 h(ilo+i+1,ilo+i-1) = zero 9 continue end if ! ! %---------------------------------------------------% ! | ITN is the total number of QR iterations allowed. | ! %---------------------------------------------------% ! itn = 30*nh ! ! ------------------------------------------------------------------ ! The main loop begins here. I is the loop index and decreases from ! IHI to ILO in steps of 1 or 2. Each iteration of the loop works ! with the active submatrix in rows and columns L to I. ! Eigenvalues I+1 to IHI have already converged. Either L = ILO or ! H(L,L-1) is negligible so that the matrix splits. ! ------------------------------------------------------------------ ! i = ihi 10 continue l = ilo if( iilo ) then ! ! %------------------------% ! | H(L,L-1) is negligible | ! %------------------------% ! h( l, l-1 ) = zero end if ! ! %-------------------------------------------------------------% ! | Exit from loop if a submatrix of order 1 or 2 has split off | ! %-------------------------------------------------------------% ! if( l>=i-1 ) & go to 140 ! ! %---------------------------------------------------------% ! | Now the active submatrix is in rows and columns L to I. | ! | If eigenvalues only are being computed, only the active | ! | submatrix need be transformed. | ! %---------------------------------------------------------% ! if( .not.wantt ) then i1 = l i2 = i end if ! if( its==10 .or. its.eq.20 ) then ! ! %-------------------% ! | Exceptional shift | ! %-------------------% ! s = abs( h( i, i-1 ) ) + abs( h( i-1, i-2 ) ) h44 = dat1*s h33 = h44 h43h34 = dat2*s*s ! else ! ! %-----------------------------------------% ! | Prepare to use Wilkinson's double shift | ! %-----------------------------------------% ! h44 = h( i, i ) h33 = h( i-1, i-1 ) h43h34 = h( i, i-1 )*h( i-1, i ) end if ! ! %-----------------------------------------------------% ! | Look for two consecutive small subdiagonal elements | ! %-----------------------------------------------------% ! do 40 m = i - 2, l, -1 ! ! %---------------------------------------------------------% ! | Determine the effect of starting the double-shift QR | ! | iteration at row M, and see if this would make H(M,M-1) | ! | negligible. | ! %---------------------------------------------------------% ! h11 = h( m, m ) h22 = h( m+1, m+1 ) h21 = h( m+1, m ) h12 = h( m, m+1 ) h44s = h44 - h11 h33s = h33 - h11 v1 = ( h33s*h44s-h43h34 ) / h21 + h12 v2 = h22 - h11 - h33s - h44s v3 = h( m+2, m+1 ) s = abs( v1 ) + abs( v2 ) + abs( v3 ) v1 = v1 / s v2 = v2 / s v3 = v3 / s v( 1 ) = v1 v( 2 ) = v2 v( 3 ) = v3 if( m==l ) & go to 50 h00 = h( m-1, m-1 ) h10 = h( m, m-1 ) tst1 = abs( v1 )*( abs( h00 )+abs( h11 )+abs( h22 ) ) if( abs( h10 )*( abs( v2 )+abs( v3 ) )<=ulp*tst1 ) & go to 50 40 continue 50 continue ! ! %----------------------% ! | Double-shift QR step | ! %----------------------% ! do 120 k = m, i - 1 ! ! ------------------------------------------------------------ ! The first iteration of this loop determines a reflection G ! from the vector V and applies it from left and right to H, ! thus creating a nonzero bulge below the subdiagonal. ! ! Each subsequent iteration determines a reflection G to ! restore the Hessenberg form in the (K-1)th column, and thus ! chases the bulge one step toward the bottom of the active ! submatrix. NR is the order of G. ! ------------------------------------------------------------ ! nr = min( 3, i-k+1 ) if( k>m ) & call scopy( nr, h( k, k-1 ), 1, v, 1 ) call slarfg( nr, v( 1 ), v( 2 ), 1, t1 ) if( k>m ) then h( k, k-1 ) = v( 1 ) h( k+1, k-1 ) = zero if( kl ) then h( k, k-1 ) = -h( k, k-1 ) end if v2 = v( 2 ) t2 = t1*v2 if( nr==3 ) then v3 = v( 3 ) t3 = t1*v3 ! ! %------------------------------------------------% ! | Apply G from the left to transform the rows of | ! | the matrix in columns K to I2. | ! %------------------------------------------------% ! do 60 j = k, i2 sum = h( k, j ) + v2*h( k+1, j ) + v3*h( k+2, j ) h( k, j ) = h( k, j ) - sum*t1 h( k+1, j ) = h( k+1, j ) - sum*t2 h( k+2, j ) = h( k+2, j ) - sum*t3 60 continue ! ! %----------------------------------------------------% ! | Apply G from the right to transform the columns of | ! | the matrix in rows I1 to min(K+3,I). | ! %----------------------------------------------------% ! do 70 j = i1, min( k+3, i ) sum = h( j, k ) + v2*h( j, k+1 ) + v3*h( j, k+2 ) h( j, k ) = h( j, k ) - sum*t1 h( j, k+1 ) = h( j, k+1 ) - sum*t2 h( j, k+2 ) = h( j, k+2 ) - sum*t3 70 continue ! ! %----------------------------------% ! | Accumulate transformations for Z | ! %----------------------------------% ! sum = z( k ) + v2*z( k+1 ) + v3*z( k+2 ) z( k ) = z( k ) - sum*t1 z( k+1 ) = z( k+1 ) - sum*t2 z( k+2 ) = z( k+2 ) - sum*t3 else if( nr==2 ) then ! ! %------------------------------------------------% ! | Apply G from the left to transform the rows of | ! | the matrix in columns K to I2. | ! %------------------------------------------------% ! do 90 j = k, i2 sum = h( k, j ) + v2*h( k+1, j ) h( k, j ) = h( k, j ) - sum*t1 h( k+1, j ) = h( k+1, j ) - sum*t2 90 continue ! ! %----------------------------------------------------% ! | Apply G from the right to transform the columns of | ! | the matrix in rows I1 to min(K+3,I). | ! %----------------------------------------------------% ! do j = i1, i sum = h( j, k ) + v2*h( j, k+1 ) h( j, k ) = h( j, k ) - sum*t1 h( j, k+1 ) = h( j, k+1 ) - sum*t2 end do ! ! %----------------------------------% ! | Accumulate transformations for Z | ! %----------------------------------% ! sum = z( k ) + v2*z( k+1 ) z( k ) = z( k ) - sum*t1 z( k+1 ) = z( k+1 ) - sum*t2 end if 120 continue 130 continue ! ! %-------------------------------------------------------% ! | Failure to converge in remaining number of iterations | ! %-------------------------------------------------------% ! info = i return 140 continue if( l==i ) then ! ! %------------------------------------------------------% ! | H(I,I-1) is negligible: one eigenvalue has converged | ! %------------------------------------------------------% ! wr( i ) = h( i, i ) wi( i ) = zero else if( l==i-1 ) then ! ! %--------------------------------------------------------% ! | H(I-1,I-2) is negligible; | ! | a pair of eigenvalues have converged. | ! | | ! | Transform the 2-by-2 submatrix to standard Schur form, | ! | and compute and store the eigenvalues. | ! %--------------------------------------------------------% ! call slanv2( h( i-1, i-1 ), h( i-1, i ), h( i, i-1 ), & h( i, i ), wr( i-1 ), wi( i-1 ), wr( i ), wi( i ), & cs, sn ) if( wantt ) then ! ! %-----------------------------------------------------% ! | Apply the transformation to the rest of H and to Z, | ! | as required. | ! %-----------------------------------------------------% ! if( i2>i ) & call srot( i2-i, h( i-1, i+1 ), ldh, h( i, i+1 ), ldh, & cs, sn ) call srot( i-i1-1, h( i1, i-1 ), 1, h( i1, i ), 1, cs, sn ) sum = cs*z( i-1 ) + sn*z( i ) z( i ) = cs*z( i ) - sn*z( i-1 ) z( i-1 ) = sum end if end if ! ! %---------------------------------------------------------% ! | Decrement number of remaining iterations, and return to | ! | start of the main loop with new value of I. | ! %---------------------------------------------------------% ! itn = itn - its i = l - 1 go to 10 150 continue return ! ! %---------------% ! | End of slaqrb | ! %---------------% ! end !----------------------------------------------------------------------- !\BeginDoc ! !\Name: snaitr ! !\Description: ! SNAITR is a reverse communication interface for applying NP additional ! steps to a K step nonsymmetric Arnoldi factorization. ! ! Input: OP*V_{k} - V_{k}*H = r_{k}*e_{k}^T ! ! with (V_{k}^T)*B*V_{k} = I, (V_{k}^T)*B*r_{k} = 0. ! ! Output: OP*V_{k+p} - V_{k+p}*H = r_{k+p}*e_{k+p}^T ! ! with (V_{k+p}^T)*B*V_{k+p} = I, (V_{k+p}^T)*B*r_{k+p} = 0. ! ! where OP and B are as in snaupd. The B-norm of r_{k+p} is also ! computed and returned. ! !\Usage: ! call snaitr ! ( IDO, BMAT, N, K, NP, NB, RESID, RNORM, V, LDV, H, LDH, ! IPNTR, WORKD, INFO ) ! !\Arguments ! IDO Integer. (INPUT/OUTPUT) ! Reverse communication flag. ! ------------------------------------------------------------- ! IDO = 0: first call to the reverse communication interface ! IDO = -1: compute Y = OP * X where ! IPNTR(1) is the pointer into WORK for X, ! IPNTR(2) is the pointer into WORK for Y. ! This is for the restart phase to force the new ! starting vector into the range of OP. ! IDO = 1: compute Y = OP * X where ! IPNTR(1) is the pointer into WORK for X, ! IPNTR(2) is the pointer into WORK for Y, ! IPNTR(3) is the pointer into WORK for B * X. ! IDO = 2: compute Y = B * X where ! IPNTR(1) is the pointer into WORK for X, ! IPNTR(2) is the pointer into WORK for Y. ! IDO = 99: done ! ------------------------------------------------------------- ! When the routine is used in the "shift-and-invert" mode, the ! vector B * Q is already available and do not need to be ! recompute in forming OP * Q. ! ! BMAT Character*1. (INPUT) ! BMAT specifies the type of the matrix B that defines the ! semi-inner product for the operator OP. See snaupd. ! B = 'I' -> standard eigenvalue problem A*x = lambda*x ! B = 'G' -> generalized eigenvalue problem A*x = lambda*M**x ! ! N Integer. (INPUT) ! Dimension of the eigenproblem. ! ! K Integer. (INPUT) ! Current size of V and H. ! ! NP Integer. (INPUT) ! Number of additional Arnoldi steps to take. ! ! NB Integer. (INPUT) ! Blocksize to be used in the recurrence. ! Only work for NB = 1 right now. The goal is to have a ! program that implement both the block and non-block method. ! ! RESID Real array of length N. (INPUT/OUTPUT) ! On INPUT: RESID contains the residual vector r_{k}. ! On OUTPUT: RESID contains the residual vector r_{k+p}. ! ! RNORM Real scalar. (INPUT/OUTPUT) ! B-norm of the starting residual on input. ! B-norm of the updated residual r_{k+p} on output. ! ! V Real N by K+NP array. (INPUT/OUTPUT) ! On INPUT: V contains the Arnoldi vectors in the first K ! columns. ! On OUTPUT: V contains the new NP Arnoldi vectors in the next ! NP columns. The first K columns are unchanged. ! ! LDV Integer. (INPUT) ! Leading dimension of V exactly as declared in the calling ! program. ! ! H Real (K+NP) by (K+NP) array. (INPUT/OUTPUT) ! H is used to store the generated upper Hessenberg matrix. ! ! LDH Integer. (INPUT) ! Leading dimension of H exactly as declared in the calling ! program. ! ! IPNTR Integer array of length 3. (OUTPUT) ! Pointer to mark the starting locations in the WORK for ! vectors used by the Arnoldi iteration. ! ------------------------------------------------------------- ! IPNTR(1): pointer to the current operand vector X. ! IPNTR(2): pointer to the current result vector Y. ! IPNTR(3): pointer to the vector B * X when used in the ! shift-and-invert mode. X is the current operand. ! ------------------------------------------------------------- ! ! WORKD Real work array of length 3*N. (REVERSE COMMUNICATION) ! Distributed array to be used in the basic Arnoldi iteration ! for reverse communication. The calling program should not ! use WORKD as temporary workspace during the iteration !!!!!! ! On input, WORKD(1:N) = B*RESID and is used to save some ! computation at the first step. ! ! INFO Integer. (OUTPUT) ! = 0: Normal exit. ! > 0: Size of the spanning invariant subspace of OP found. ! !\EndDoc ! !----------------------------------------------------------------------- ! !\BeginLib ! !\Local variables: ! xxxxxx real ! !\References: ! 1. D.C. Sorensen, "Implicit Application of Polynomial Filters in ! a k-Step Arnoldi Method", SIAM J. Matr. Anal. Apps., 13 (1992), ! pp 357-385. ! 2. R.B. Lehoucq, "Analysis and Implementation of an Implicitly ! Restarted Arnoldi Iteration", Rice University Technical Report ! TR95-13, Department of Computational and Applied Mathematics. ! !\Routines called: ! sgetv0 ARPACK routine to generate the initial vector. ! ivout ARPACK utility routine that prints integers. ! smout ARPACK utility routine that prints matrices ! svout ARPACK utility routine that prints vectors. ! slabad LAPACK routine that computes machine constants. ! slamch LAPACK routine that determines machine constants. ! slascl LAPACK routine for careful scaling of a matrix. ! slanhs LAPACK routine that computes various norms of a matrix. ! sgemv Level 2 BLAS routine for matrix vector multiplication. ! saxpy Level 1 BLAS that computes a vector triad. ! sscal Level 1 BLAS that scales a vector. ! scopy Level 1 BLAS that copies one vector to another . ! sdot Level 1 BLAS that computes the scalar product of two vectors. ! snrm2 Level 1 BLAS that computes the norm of a vector. ! !\Author ! Danny Sorensen Phuong Vu ! Richard Lehoucq CRPC / Rice University ! Dept. of Computational & Houston, Texas ! Applied Mathematics ! Rice University ! Houston, Texas ! !\Revision history: ! xx/xx/92: Version ' 2.4' ! !\SCCS Information: @(#) ! FILE: naitr.F SID: 2.4 DATE OF SID: 8/27/96 RELEASE: 2 ! !\Remarks ! The algorithm implemented is: ! ! restart = .false. ! Given V_{k} = [v_{1}, ..., v_{k}], r_{k}; ! r_{k} contains the initial residual vector even for k = 0; ! Also assume that rnorm = || B*r_{k} || and B*r_{k} are already ! computed by the calling program. ! ! betaj = rnorm ; p_{k+1} = B*r_{k} ; ! For j = k+1, ..., k+np Do ! 1) if ( betaj < tol ) stop or restart depending on j. ! ( At present tol is zero ) ! if ( restart ) generate a new starting vector. ! 2) v_{j} = r(j-1)/betaj; V_{j} = [V_{j-1}, v_{j}]; ! p_{j} = p_{j}/betaj ! 3) r_{j} = OP*v_{j} where OP is defined as in snaupd ! For shift-invert mode p_{j} = B*v_{j} is already available. ! wnorm = || OP*v_{j} || ! 4) Compute the j-th step residual vector. ! w_{j} = V_{j}^T * B * OP * v_{j} ! r_{j} = OP*v_{j} - V_{j} * w_{j} ! H(:,j) = w_{j}; ! H(j,j-1) = rnorm ! rnorm = || r_(j) || ! If (rnorm > 0.717*wnorm) accept step and go back to 1) ! 5) Re-orthogonalization step: ! s = V_{j}'*B*r_{j} ! r_{j} = r_{j} - V_{j}*s; rnorm1 = || r_{j} || ! alphaj = alphaj + s_{j}; ! 6) Iterative refinement step: ! If (rnorm1 > 0.717*rnorm) then ! rnorm = rnorm1 ! accept step and go back to 1) ! Else ! rnorm = rnorm1 ! If this is the first time in step 6), go to 5) ! Else r_{j} lies in the span of V_{j} numerically. ! Set r_{j} = 0 and rnorm = 0; go to 1) ! EndIf ! End Do ! !\EndLib ! !----------------------------------------------------------------------- ! subroutine snaitr & (ido, bmat, n, k, np, nb, resid, rnorm, v, ldv, h, ldh, & ipntr, workd, info) ! !! SNAITR applies more steps to a K step nonsymmetric Arnoldi factorization. ! ! %----------------------------------------------------% ! | Include files for debugging and timing information | ! %----------------------------------------------------% ! include 'debug.h' include 'stat.h' ! ! %------------------% ! | Scalar Arguments | ! %------------------% ! character bmat*1 integer ido, info, k, ldh, ldv, n, nb, np Real & rnorm ! ! %-----------------% ! | Array Arguments | ! %-----------------% ! integer ipntr(3) Real & h(ldh,k+np), resid(n), v(ldv,k+np), workd(3*n) ! ! %------------% ! | Parameters | ! %------------% ! Real & one, zero parameter (one = 1.0E+0, zero = 0.0E+0) ! ! %---------------% ! | Local Scalars | ! %---------------% ! logical first, orth1, orth2, rstart, step3, step4 integer ierr, i, infol, ipj, irj, ivj, iter, itry, j, msglvl, & jj Real & betaj, ovfl, temp1, rnorm1, smlnum, tst1, ulp, unfl, & wnorm save first, orth1, orth2, rstart, step3, step4, & ierr, ipj, irj, ivj, iter, itry, j, msglvl, ovfl, & betaj, rnorm1, smlnum, ulp, unfl, wnorm ! ! %-----------------------% ! | Local Array Arguments | ! %-----------------------% ! Real & xtemp(2) ! ! %----------------------% ! | External Subroutines | ! %----------------------% ! external saxpy, scopy, sscal, sgemv, sgetv0, slabad, & svout, smout, ivout ! ! %--------------------% ! | External Functions | ! %--------------------% ! Real & sdot, snrm2, slanhs, slamch external sdot, snrm2, slanhs, slamch ! ! %---------------------% ! | Intrinsic Functions | ! %---------------------% ! intrinsic abs, sqrt ! ! %-----------------% ! | Data statements | ! %-----------------% ! data first / .true. / ! ! %-----------------------% ! | Executable Statements | ! %-----------------------% ! if (first) then ! ! %-----------------------------------------% ! | Set machine-dependent constants for the | ! | the splitting and deflation criterion. | ! | If norm(H) <= sqrt(OVFL), | ! | overflow should not occur. | ! | REFERENCE: LAPACK subroutine slahqr | ! %-----------------------------------------% ! unfl = slamch( 'safe minimum' ) ovfl = one / unfl call slabad( unfl, ovfl ) ulp = slamch( 'precision' ) smlnum = unfl*( n / ulp ) first = .false. end if ! if (ido == 0) then ! ! %-------------------------------% ! | Initialize timing statistics | ! | & message level for debugging | ! %-------------------------------% ! call cpu_time (t0) msglvl = mnaitr ! ! %------------------------------% ! | Initial call to this routine | ! %------------------------------% ! info = 0 step3 = .false. step4 = .false. rstart = .false. orth1 = .false. orth2 = .false. j = k + 1 ipj = 1 irj = ipj + n ivj = irj + n end if ! ! %-------------------------------------------------% ! | When in reverse communication mode one of: | ! | STEP3, STEP4, ORTH1, ORTH2, RSTART | ! | will be .true. when .... | ! | STEP3: return from computing OP*v_{j}. | ! | STEP4: return from computing B-norm of OP*v_{j} | ! | ORTH1: return from computing B-norm of r_{j+1} | ! | ORTH2: return from computing B-norm of | ! | correction to the residual vector. | ! | RSTART: return from OP computations needed by | ! | sgetv0. | ! %-------------------------------------------------% ! if (step3) go to 50 if (step4) go to 60 if (orth1) go to 70 if (orth2) go to 90 if (rstart) go to 30 ! ! %-----------------------------% ! | Else this is the first step | ! %-----------------------------% ! ! %--------------------------------------------------------------% ! | | ! | A R N O L D I I T E R A T I O N L O O P | ! | | ! | Note: B*r_{j-1} is already in WORKD(1:N)=WORKD(IPJ:IPJ+N-1) | ! %--------------------------------------------------------------% 1000 continue ! if (msglvl > 1) then call ivout (logfil, 1, j, ndigit, & '_naitr: generating Arnoldi vector number') call svout (logfil, 1, rnorm, ndigit, & '_naitr: B-norm of the current residual is') end if ! ! %---------------------------------------------------% ! | STEP 1: Check if the B norm of j-th residual | ! | vector is zero. Equivalent to determing whether | ! | an exact j-step Arnoldi factorization is present. | ! %---------------------------------------------------% ! betaj = rnorm if (rnorm > zero) go to 40 ! ! %---------------------------------------------------% ! | Invariant subspace found, generate a new starting | ! | vector which is orthogonal to the current Arnoldi | ! | basis and continue the iteration. | ! %---------------------------------------------------% ! if (msglvl > 0) then call ivout (logfil, 1, j, ndigit, & '_naitr: ****** RESTART AT STEP ******') end if ! ! %---------------------------------------------% ! | ITRY is the loop variable that controls the | ! | maximum amount of times that a restart is | ! | attempted. NRSTRT is used by stat.h | ! %---------------------------------------------% ! betaj = zero nrstrt = nrstrt + 1 itry = 1 20 continue rstart = .true. ido = 0 30 continue ! ! %--------------------------------------% ! | If in reverse communication mode and | ! | RSTART = .true. flow returns here. | ! %--------------------------------------% ! call sgetv0 (ido, bmat, itry, .false., n, j, v, ldv, & resid, rnorm, ipntr, workd, ierr) if (ido /= 99) go to 9000 if (ierr < 0) then itry = itry + 1 if (itry <= 3) go to 20 ! ! %------------------------------------------------% ! | Give up after several restart attempts. | ! | Set INFO to the size of the invariant subspace | ! | which spans OP and exit. | ! %------------------------------------------------% ! info = j - 1 call cpu_time (t1) tnaitr = tnaitr + (t1 - t0) ido = 99 go to 9000 end if ! 40 continue ! ! %---------------------------------------------------------% ! | STEP 2: v_{j} = r_{j-1}/rnorm and p_{j} = p_{j}/rnorm | ! | Note that p_{j} = B*r_{j-1}. In order to avoid overflow | ! | when reciprocating a small RNORM, test against lower | ! | machine bound. | ! %---------------------------------------------------------% ! call scopy (n, resid, 1, v(1,j), 1) if (rnorm >= unfl) then temp1 = one / rnorm call sscal (n, temp1, v(1,j), 1) call sscal (n, temp1, workd(ipj), 1) else ! ! %-----------------------------------------% ! | To scale both v_{j} and p_{j} carefully | ! | use LAPACK routine SLASCL | ! %-----------------------------------------% ! call slascl ('General', i, i, rnorm, one, n, 1, & v(1,j), n, infol) call slascl ('General', i, i, rnorm, one, n, 1, & workd(ipj), n, infol) end if ! ! %------------------------------------------------------% ! | STEP 3: r_{j} = OP*v_{j}; Note that p_{j} = B*v_{j} | ! | Note that this is not quite yet r_{j}. See STEP 4 | ! %------------------------------------------------------% ! step3 = .true. nopx = nopx + 1 call cpu_time (t2) call scopy (n, v(1,j), 1, workd(ivj), 1) ipntr(1) = ivj ipntr(2) = irj ipntr(3) = ipj ido = 1 ! ! %-----------------------------------% ! | Exit in order to compute OP*v_{j} | ! %-----------------------------------% ! go to 9000 50 continue ! ! %----------------------------------% ! | Back from reverse communication; | ! | WORKD(IRJ:IRJ+N-1) := OP*v_{j} | ! | if step3 = .true. | ! %----------------------------------% ! call cpu_time (t3) tmvopx = tmvopx + (t3 - t2) step3 = .false. ! ! %------------------------------------------% ! | Put another copy of OP*v_{j} into RESID. | ! %------------------------------------------% ! call scopy (n, workd(irj), 1, resid, 1) ! ! %---------------------------------------% ! | STEP 4: Finish extending the Arnoldi | ! | factorization to length j. | ! %---------------------------------------% ! call cpu_time (t2) if (bmat == 'G') then nbx = nbx + 1 step4 = .true. ipntr(1) = irj ipntr(2) = ipj ido = 2 ! ! %-------------------------------------% ! | Exit in order to compute B*OP*v_{j} | ! %-------------------------------------% ! go to 9000 else if (bmat == 'I') then call scopy (n, resid, 1, workd(ipj), 1) end if 60 continue ! ! %----------------------------------% ! | Back from reverse communication; | ! | WORKD(IPJ:IPJ+N-1) := B*OP*v_{j} | ! | if step4 = .true. | ! %----------------------------------% ! if (bmat == 'G') then call cpu_time (t3) tmvbx = tmvbx + (t3 - t2) end if ! step4 = .false. ! ! %-------------------------------------% ! | The following is needed for STEP 5. | ! | Compute the B-norm of OP*v_{j}. | ! %-------------------------------------% ! if (bmat == 'G') then wnorm = sdot (n, resid, 1, workd(ipj), 1) wnorm = sqrt(abs(wnorm)) else if (bmat == 'I') then wnorm = snrm2(n, resid, 1) end if ! ! %-----------------------------------------% ! | Compute the j-th residual corresponding | ! | to the j step factorization. | ! | Use Classical Gram Schmidt and compute: | ! | w_{j} <- V_{j}^T * B * OP * v_{j} | ! | r_{j} <- OP*v_{j} - V_{j} * w_{j} | ! %-----------------------------------------% ! ! ! %------------------------------------------% ! | Compute the j Fourier coefficients w_{j} | ! | WORKD(IPJ:IPJ+N-1) contains B*OP*v_{j}. | ! %------------------------------------------% ! call sgemv ('T', n, j, one, v, ldv, workd(ipj), 1, & zero, h(1,j), 1) ! ! %--------------------------------------% ! | Orthogonalize r_{j} against V_{j}. | ! | RESID contains OP*v_{j}. See STEP 3. | ! %--------------------------------------% ! call sgemv ('N', n, j, -one, v, ldv, h(1,j), 1, & one, resid, 1) ! if (j > 1) h(j,j-1) = betaj ! call cpu_time (t4) ! orth1 = .true. ! call cpu_time (t2) if (bmat == 'G') then nbx = nbx + 1 call scopy (n, resid, 1, workd(irj), 1) ipntr(1) = irj ipntr(2) = ipj ido = 2 ! ! %----------------------------------% ! | Exit in order to compute B*r_{j} | ! %----------------------------------% ! go to 9000 else if (bmat == 'I') then call scopy (n, resid, 1, workd(ipj), 1) end if 70 continue ! ! %---------------------------------------------------% ! | Back from reverse communication if ORTH1 = .true. | ! | WORKD(IPJ:IPJ+N-1) := B*r_{j}. | ! %---------------------------------------------------% ! if (bmat == 'G') then call cpu_time (t3) tmvbx = tmvbx + (t3 - t2) end if ! orth1 = .false. ! ! %------------------------------% ! | Compute the B-norm of r_{j}. | ! %------------------------------% ! if (bmat == 'G') then rnorm = sdot (n, resid, 1, workd(ipj), 1) rnorm = sqrt(abs(rnorm)) else if (bmat == 'I') then rnorm = snrm2(n, resid, 1) end if ! ! %-----------------------------------------------------------% ! | STEP 5: Re-orthogonalization / Iterative refinement phase | ! | Maximum NITER_ITREF tries. | ! | | ! | s = V_{j}^T * B * r_{j} | ! | r_{j} = r_{j} - V_{j}*s | ! | alphaj = alphaj + s_{j} | ! | | ! | The stopping criteria used for iterative refinement is | ! | discussed in Parlett's book SEP, page 107 and in Gragg & | ! | Reichel ACM TOMS paper; Algorithm 686, Dec. 1990. | ! | Determine if we need to correct the residual. The goal is | ! | to enforce ||v(:,1:j)^T * r_{j}|| <= eps * || r_{j} || | ! | The following test determines whether the sine of the | ! | angle between OP*x and the computed residual is less | ! | than or equal to 0.717. | ! %-----------------------------------------------------------% ! if (rnorm > 0.717*wnorm) go to 100 iter = 0 nrorth = nrorth + 1 ! ! %---------------------------------------------------% ! | Enter the Iterative refinement phase. If further | ! | refinement is necessary, loop back here. The loop | ! | variable is ITER. Perform a step of Classical | ! | Gram-Schmidt using all the Arnoldi vectors V_{j} | ! %---------------------------------------------------% ! 80 continue ! if (msglvl > 2) then xtemp(1) = wnorm xtemp(2) = rnorm call svout (logfil, 2, xtemp, ndigit, & '_naitr: re-orthonalization; wnorm and rnorm are') call svout (logfil, j, h(1,j), ndigit, & '_naitr: j-th column of H') end if ! ! %----------------------------------------------------% ! | Compute V_{j}^T * B * r_{j}. | ! | WORKD(IRJ:IRJ+J-1) = v(:,1:J)'*WORKD(IPJ:IPJ+N-1). | ! %----------------------------------------------------% ! call sgemv ('T', n, j, one, v, ldv, workd(ipj), 1, & zero, workd(irj), 1) ! ! %---------------------------------------------% ! | Compute the correction to the residual: | ! | r_{j} = r_{j} - V_{j} * WORKD(IRJ:IRJ+J-1). | ! | The correction to H is v(:,1:J)*H(1:J,1:J) | ! | + v(:,1:J)*WORKD(IRJ:IRJ+J-1)*e'_j. | ! %---------------------------------------------% ! call sgemv ('N', n, j, -one, v, ldv, workd(irj), 1, & one, resid, 1) call saxpy (j, one, workd(irj), 1, h(1,j), 1) ! orth2 = .true. call cpu_time (t2) if (bmat == 'G') then nbx = nbx + 1 call scopy (n, resid, 1, workd(irj), 1) ipntr(1) = irj ipntr(2) = ipj ido = 2 ! ! %-----------------------------------% ! | Exit in order to compute B*r_{j}. | ! | r_{j} is the corrected residual. | ! %-----------------------------------% ! go to 9000 else if (bmat == 'I') then call scopy (n, resid, 1, workd(ipj), 1) end if 90 continue ! ! %---------------------------------------------------% ! | Back from reverse communication if ORTH2 = .true. | ! %---------------------------------------------------% ! if (bmat == 'G') then call cpu_time (t3) tmvbx = tmvbx + (t3 - t2) end if ! ! %-----------------------------------------------------% ! | Compute the B-norm of the corrected residual r_{j}. | ! %-----------------------------------------------------% ! if (bmat == 'G') then rnorm1 = sdot (n, resid, 1, workd(ipj), 1) rnorm1 = sqrt(abs(rnorm1)) else if (bmat == 'I') then rnorm1 = snrm2(n, resid, 1) end if if (msglvl > 0 .and. iter .gt. 0) then call ivout (logfil, 1, j, ndigit, & '_naitr: Iterative refinement for Arnoldi residual') if (msglvl > 2) then xtemp(1) = rnorm xtemp(2) = rnorm1 call svout (logfil, 2, xtemp, ndigit, & '_naitr: iterative refinement ; rnorm and rnorm1 are') end if end if ! ! %-----------------------------------------% ! | Determine if we need to perform another | ! | step of re-orthogonalization. | ! %-----------------------------------------% ! if (rnorm1 > 0.717*rnorm) then ! ! %---------------------------------------% ! | No need for further refinement. | ! | The cosine of the angle between the | ! | corrected residual vector and the old | ! | residual vector is greater than 0.717 | ! | In other words the corrected residual | ! | and the old residual vector share an | ! | angle of less than arcCOS(0.717) | ! %---------------------------------------% ! rnorm = rnorm1 ! else ! ! %-------------------------------------------% ! | Another step of iterative refinement step | ! | is required. NITREF is used by stat.h | ! %-------------------------------------------% ! nitref = nitref + 1 rnorm = rnorm1 iter = iter + 1 if (iter <= 1) go to 80 ! ! %-------------------------------------------------% ! | Otherwise RESID is numerically in the span of V | ! %-------------------------------------------------% ! do 95 jj = 1, n resid(jj) = zero 95 continue rnorm = zero end if ! ! %----------------------------------------------% ! | Branch here directly if iterative refinement | ! | wasn't necessary or after at most NITER_REF | ! | steps of iterative refinement. | ! %----------------------------------------------% ! 100 continue ! rstart = .false. orth2 = .false. ! call cpu_time (t5) titref = titref + (t5 - t4) ! ! %------------------------------------% ! | STEP 6: Update j = j+1; Continue | ! %------------------------------------% ! j = j + 1 if (j > k+np) then call cpu_time (t1) tnaitr = tnaitr + (t1 - t0) ido = 99 do 110 i = max(1,k), k+np-1 ! ! %--------------------------------------------% ! | Check for splitting and deflation. | ! | Use a standard test as in the QR algorithm | ! | REFERENCE: LAPACK subroutine slahqr | ! %--------------------------------------------% ! tst1 = abs( h( i, i ) ) + abs( h( i+1, i+1 ) ) if( tst1==zero ) & tst1 = slanhs( '1', k+np, h, ldh, workd(n+1) ) if( abs( h( i+1,i ) )<=max( ulp*tst1, smlnum ) ) & h(i+1,i) = zero 110 continue ! if (msglvl > 2) then call smout (logfil, k+np, k+np, h, ldh, ndigit, & '_naitr: Final upper Hessenberg matrix H of order K+NP') end if ! go to 9000 end if ! ! %--------------------------------------------------------% ! | Loop back to extend the factorization by another step. | ! %--------------------------------------------------------% ! go to 1000 ! ! %---------------------------------------------------------------% ! | | ! | E N D O F M A I N I T E R A T I O N L O O P | ! | | ! %---------------------------------------------------------------% ! 9000 continue return ! ! %---------------% ! | End of snaitr | ! %---------------% ! end !----------------------------------------------------------------------- !\BeginDoc ! !\Name: snapps ! !\Description: ! Given the Arnoldi factorization ! ! A*V_{k} - V_{k}*H_{k} = r_{k+p}*e_{k+p}^T, ! ! apply NP implicit shifts resulting in ! ! A*(V_{k}*Q) - (V_{k}*Q)*(Q^T* H_{k}*Q) = r_{k+p}*e_{k+p}^T * Q ! ! where Q is an orthogonal matrix which is the product of rotations ! and reflections resulting from the NP bulge chage sweeps. ! The updated Arnoldi factorization becomes: ! ! A*VNEW_{k} - VNEW_{k}*HNEW_{k} = rnew_{k}*e_{k}^T. ! !\Usage: ! call snapps ! ( N, KEV, NP, SHIFTR, SHIFTI, V, LDV, H, LDH, RESID, Q, LDQ, ! WORKL, WORKD ) ! !\Arguments ! N Integer. (INPUT) ! Problem size, i.e. size of matrix A. ! ! KEV Integer. (INPUT/OUTPUT) ! KEV+NP is the size of the input matrix H. ! KEV is the size of the updated matrix HNEW. KEV is only ! updated on ouput when fewer than NP shifts are applied in ! order to keep the conjugate pair together. ! ! NP Integer. (INPUT) ! Number of implicit shifts to be applied. ! ! SHIFTR, Real array of length NP. (INPUT) ! SHIFTI Real and imaginary part of the shifts to be applied. ! Upon, entry to snapps, the shifts must be sorted so that the ! conjugate pairs are in consecutive locations. ! ! V Real N by (KEV+NP) array. (INPUT/OUTPUT) ! On INPUT, V contains the current KEV+NP Arnoldi vectors. ! On OUTPUT, V contains the updated KEV Arnoldi vectors ! in the first KEV columns of V. ! ! LDV Integer. (INPUT) ! Leading dimension of V exactly as declared in the calling ! program. ! ! H Real (KEV+NP) by (KEV+NP) array. (INPUT/OUTPUT) ! On INPUT, H contains the current KEV+NP by KEV+NP upper ! Hessenber matrix of the Arnoldi factorization. ! On OUTPUT, H contains the updated KEV by KEV upper Hessenberg ! matrix in the KEV leading submatrix. ! ! LDH Integer. (INPUT) ! Leading dimension of H exactly as declared in the calling ! program. ! ! RESID Real array of length N. (INPUT/OUTPUT) ! On INPUT, RESID contains the the residual vector r_{k+p}. ! On OUTPUT, RESID is the update residual vector rnew_{k} ! in the first KEV locations. ! ! Q Real KEV+NP by KEV+NP work array. (WORKSPACE) ! Work array used to accumulate the rotations and reflections ! during the bulge chase sweep. ! ! LDQ Integer. (INPUT) ! Leading dimension of Q exactly as declared in the calling ! program. ! ! WORKL Real work array of length (KEV+NP). (WORKSPACE) ! Private (replicated) array on each PE or array allocated on ! the front end. ! ! WORKD Real work array of length 2*N. (WORKSPACE) ! Distributed array used in the application of the accumulated ! orthogonal matrix Q. ! !\EndDoc ! !----------------------------------------------------------------------- ! !\BeginLib ! !\Local variables: ! xxxxxx real ! !\References: ! 1. D.C. Sorensen, "Implicit Application of Polynomial Filters in ! a k-Step Arnoldi Method", SIAM J. Matr. Anal. Apps., 13 (1992), ! pp 357-385. ! !\Routines called: ! ivout ARPACK utility routine that prints integers. ! smout ARPACK utility routine that prints matrices. ! svout ARPACK utility routine that prints vectors. ! slabad LAPACK routine that computes machine constants. ! slacpy LAPACK matrix copy routine. ! slamch LAPACK routine that determines machine constants. ! slanhs LAPACK routine that computes various norms of a matrix. ! slapy2 LAPACK routine to compute sqrt(x**2+y**2) carefully. ! slarf LAPACK routine that applies Householder reflection to ! a matrix. ! slarfg LAPACK Householder reflection construction routine. ! slartg LAPACK Givens rotation construction routine. ! slaset LAPACK matrix initialization routine. ! sgemv Level 2 BLAS routine for matrix vector multiplication. ! saxpy Level 1 BLAS that computes a vector triad. ! scopy Level 1 BLAS that copies one vector to another . ! sscal Level 1 BLAS that scales a vector. ! !\Author ! Danny Sorensen Phuong Vu ! Richard Lehoucq CRPC / Rice University ! Dept. of Computational & Houston, Texas ! Applied Mathematics ! Rice University ! Houston, Texas ! !\Revision history: ! xx/xx/92: Version ' 2.4' ! !\SCCS Information: @(#) ! FILE: napps.F SID: 2.4 DATE OF SID: 3/28/97 RELEASE: 2 ! !\Remarks ! 1. In this version, each shift is applied to all the sublocks of ! the Hessenberg matrix H and not just to the submatrix that it ! comes from. Deflation as in LAPACK routine slahqr (QR algorithm ! for upper Hessenberg matrices ) is used. ! The subdiagonals of H are enforced to be non-negative. ! !\EndLib ! !----------------------------------------------------------------------- ! subroutine snapps & ( n, kev, np, shiftr, shifti, v, ldv, h, ldh, resid, q, ldq, & workl, workd ) ! !! SNAPPS applies implicit shifts to the Arnoldi factorization. ! ! %----------------------------------------------------% ! | Include files for debugging and timing information | ! %----------------------------------------------------% ! include 'debug.h' include 'stat.h' ! ! %------------------% ! | Scalar Arguments | ! %------------------% ! integer kev, ldh, ldq, ldv, n, np ! ! %-----------------% ! | Array Arguments | ! %-----------------% ! Real & h(ldh,kev+np), resid(n), shifti(np), shiftr(np), & v(ldv,kev+np), q(ldq,kev+np), workd(2*n), workl(kev+np) ! ! %------------% ! | Parameters | ! %------------% ! Real & one, zero parameter (one = 1.0E+0, zero = 0.0E+0) ! ! %------------------------% ! | Local Scalars & Arrays | ! %------------------------% ! integer i, iend, ir, istart, j, jj, kplusp, msglvl, nr logical cconj, first Real & c, f, g, h11, h12, h21, h22, h32, ovfl, r, s, sigmai, & sigmar, smlnum, ulp, unfl, u(3), t, tau, tst1 save first, ovfl, smlnum, ulp, unfl ! ! %----------------------% ! | External Subroutines | ! %----------------------% ! external saxpy, scopy, sscal, slacpy, slarfg, slarf, & slaset, slabad, slartg ! ! %--------------------% ! | External Functions | ! %--------------------% ! Real & slamch, slanhs, slapy2 external slamch, slanhs, slapy2 ! ! %----------------------% ! | Intrinsics Functions | ! %----------------------% ! intrinsic abs, max, min ! ! %----------------% ! | Data statments | ! %----------------% ! data first / .true. / ! ! %-----------------------% ! | Executable Statements | ! %-----------------------% ! if (first) then ! ! %-----------------------------------------------% ! | Set machine-dependent constants for the | ! | stopping criterion. If norm(H) <= sqrt(OVFL), | ! | overflow should not occur. | ! | REFERENCE: LAPACK subroutine slahqr | ! %-----------------------------------------------% ! unfl = slamch( 'safe minimum' ) ovfl = one / unfl call slabad( unfl, ovfl ) ulp = slamch( 'precision' ) smlnum = unfl*( n / ulp ) first = .false. end if ! ! %-------------------------------% ! | Initialize timing statistics | ! | & message level for debugging | ! %-------------------------------% ! call cpu_time (t0) msglvl = mnapps kplusp = kev + np ! ! %--------------------------------------------% ! | Initialize Q to the identity to accumulate | ! | the rotations and reflections | ! %--------------------------------------------% ! call slaset ('All', kplusp, kplusp, zero, one, q, ldq) ! ! %----------------------------------------------% ! | Quick return if there are no shifts to apply | ! %----------------------------------------------% ! if (np == 0) go to 9000 ! ! %----------------------------------------------% ! | Chase the bulge with the application of each | ! | implicit shift. Each shift is applied to the | ! | whole matrix including each block. | ! %----------------------------------------------% ! cconj = .false. do 110 jj = 1, np sigmar = shiftr(jj) sigmai = shifti(jj) ! if (msglvl > 2 ) then call ivout (logfil, 1, jj, ndigit, & '_napps: shift number.') call svout (logfil, 1, sigmar, ndigit, & '_napps: The real part of the shift ') call svout (logfil, 1, sigmai, ndigit, & '_napps: The imaginary part of the shift ') end if ! ! %-------------------------------------------------% ! | The following set of conditionals is necessary | ! | in order that complex conjugate pairs of shifts | ! | are applied together or not at all. | ! %-------------------------------------------------% ! if ( cconj ) then ! ! %-----------------------------------------% ! | cconj = .true. means the previous shift | ! | had non-zero imaginary part. | ! %-----------------------------------------% ! cconj = .false. go to 110 else if ( jj < np .and. abs( sigmai ) > zero ) then ! ! %------------------------------------% ! | Start of a complex conjugate pair. | ! %------------------------------------% ! cconj = .true. else if ( jj == np .and. abs( sigmai ) > zero ) then ! ! %----------------------------------------------% ! | The last shift has a nonzero imaginary part. | ! | Don't apply it; thus the order of the | ! | compressed H is order KEV+1 since only np-1 | ! | were applied. | ! %----------------------------------------------% ! kev = kev + 1 go to 110 end if istart = 1 20 continue ! ! %--------------------------------------------------% ! | if sigmai = 0 then | ! | Apply the jj-th shift ... | ! | else | ! | Apply the jj-th and (jj+1)-th together ... | ! | (Note that jj < np at this point in the code) | ! | end | ! | to the current block of H. The next do loop | ! | determines the current block ; | ! %--------------------------------------------------% ! do 30 i = istart, kplusp-1 ! ! %----------------------------------------% ! | Check for splitting and deflation. Use | ! | a standard test as in the QR algorithm | ! | REFERENCE: LAPACK subroutine slahqr | ! %----------------------------------------% ! tst1 = abs( h( i, i ) ) + abs( h( i+1, i+1 ) ) if( tst1==zero ) & tst1 = slanhs( '1', kplusp-jj+1, h, ldh, workl ) if( abs( h( i+1,i ) )<=max( ulp*tst1, smlnum ) ) then if (msglvl > 0) then call ivout (logfil, 1, i, ndigit, & '_napps: matrix splitting at row/column no.') call ivout (logfil, 1, jj, ndigit, & '_napps: matrix splitting with shift number.') call svout (logfil, 1, h(i+1,i), ndigit, & '_napps: off diagonal element.') end if iend = i h(i+1,i) = zero go to 40 end if 30 continue iend = kplusp 40 continue ! if (msglvl > 2) then call ivout (logfil, 1, istart, ndigit, & '_napps: Start of current block ') call ivout (logfil, 1, iend, ndigit, & '_napps: End of current block ') end if ! ! %------------------------------------------------% ! | No reason to apply a shift to block of order 1 | ! %------------------------------------------------% ! if ( istart == iend ) go to 100 ! ! %------------------------------------------------------% ! | If istart + 1 = iend then no reason to apply a | ! | complex conjugate pair of shifts on a 2 by 2 matrix. | ! %------------------------------------------------------% ! if ( istart + 1 == iend .and. abs( sigmai ) > zero ) & go to 100 ! h11 = h(istart,istart) h21 = h(istart+1,istart) if ( abs( sigmai ) <= zero ) then ! ! %---------------------------------------------% ! | Real-valued shift ==> apply single shift QR | ! %---------------------------------------------% ! f = h11 - sigmar g = h21 ! do 80 i = istart, iend-1 ! ! %-----------------------------------------------------% ! | Contruct the plane rotation G to zero out the bulge | ! %-----------------------------------------------------% ! call slartg (f, g, c, s, r) if (i > istart) then ! ! %-------------------------------------------% ! | The following ensures that h(1:iend-1,1), | ! | the first iend-2 off diagonal of elements | ! | H, remain non negative. | ! %-------------------------------------------% ! if (r < zero) then r = -r c = -c s = -s end if h(i,i-1) = r h(i+1,i-1) = zero end if ! ! %---------------------------------------------% ! | Apply rotation to the left of H; H <- G'*H | ! %---------------------------------------------% ! do 50 j = i, kplusp t = c*h(i,j) + s*h(i+1,j) h(i+1,j) = -s*h(i,j) + c*h(i+1,j) h(i,j) = t 50 continue ! ! %---------------------------------------------% ! | Apply rotation to the right of H; H <- H*G | ! %---------------------------------------------% ! do j = 1, min(i+2,iend) t = c*h(j,i) + s*h(j,i+1) h(j,i+1) = -s*h(j,i) + c*h(j,i+1) h(j,i) = t end do ! ! %----------------------------------------------------% ! | Accumulate the rotation in the matrix Q; Q <- Q*G | ! %----------------------------------------------------% ! do j = 1, min( i+jj, kplusp ) t = c*q(j,i) + s*q(j,i+1) q(j,i+1) = - s*q(j,i) + c*q(j,i+1) q(j,i) = t end do ! ! %---------------------------% ! | Prepare for next rotation | ! %---------------------------% ! if (i < iend-1) then f = h(i+1,i) g = h(i+2,i) end if 80 continue ! ! %-----------------------------------% ! | Finished applying the real shift. | ! %-----------------------------------% ! else ! ! %----------------------------------------------------% ! | Complex conjugate shifts ==> apply double shift QR | ! %----------------------------------------------------% ! h12 = h(istart,istart+1) h22 = h(istart+1,istart+1) h32 = h(istart+2,istart+1) ! ! %---------------------------------------------------------% ! | Compute 1st column of (H - shift*I)*(H - conj(shift)*I) | ! %---------------------------------------------------------% ! s = 2.0*sigmar t = slapy2 ( sigmar, sigmai ) u(1) = ( h11 * (h11 - s) + t * t ) / h21 + h12 u(2) = h11 + h22 - s u(3) = h32 ! do 90 i = istart, iend-1 ! nr = min ( 3, iend-i+1 ) ! ! %-----------------------------------------------------% ! | Construct Householder reflector G to zero out u(1). | ! | G is of the form I - tau*( 1 u )' * ( 1 u' ). | ! %-----------------------------------------------------% ! call slarfg ( nr, u(1), u(2), 1, tau ) ! if (i > istart) then h(i,i-1) = u(1) h(i+1,i-1) = zero if (i < iend-1) h(i+2,i-1) = zero end if u(1) = one ! ! %--------------------------------------% ! | Apply the reflector to the left of H | ! %--------------------------------------% ! call slarf ('Left', nr, kplusp-i+1, u, 1, tau, & h(i,i), ldh, workl) ! ! %---------------------------------------% ! | Apply the reflector to the right of H | ! %---------------------------------------% ! ir = min ( i+3, iend ) call slarf ('Right', ir, nr, u, 1, tau, & h(1,i), ldh, workl) ! ! %-----------------------------------------------------% ! | Accumulate the reflector in the matrix Q; Q <- Q*G | ! %-----------------------------------------------------% ! call slarf ('Right', kplusp, nr, u, 1, tau, & q(1,i), ldq, workl) ! ! %----------------------------% ! | Prepare for next reflector | ! %----------------------------% ! if (i < iend-1) then u(1) = h(i+1,i) u(2) = h(i+2,i) if (i < iend-2) u(3) = h(i+3,i) end if ! 90 continue ! ! %--------------------------------------------% ! | Finished applying a complex pair of shifts | ! | to the current block | ! %--------------------------------------------% ! end if ! 100 continue ! ! %---------------------------------------------------------% ! | Apply the same shift to the next block if there is any. | ! %---------------------------------------------------------% ! istart = iend + 1 if (iend < kplusp) go to 20 ! ! %---------------------------------------------% ! | Loop back to the top to get the next shift. | ! %---------------------------------------------% ! 110 continue ! ! %--------------------------------------------------% ! | Perform a similarity transformation that makes | ! | sure that H will have non negative sub diagonals | ! %--------------------------------------------------% ! do 120 j=1,kev if ( h(j+1,j) < zero ) then call sscal( kplusp-j+1, -one, h(j+1,j), ldh ) call sscal( min(j+2, kplusp), -one, h(1,j+1), 1 ) call sscal( min(j+np+1,kplusp), -one, q(1,j+1), 1 ) end if 120 continue ! do 130 i = 1, kev ! ! %--------------------------------------------% ! | Final check for splitting and deflation. | ! | Use a standard test as in the QR algorithm | ! | REFERENCE: LAPACK subroutine slahqr | ! %--------------------------------------------% ! tst1 = abs( h( i, i ) ) + abs( h( i+1, i+1 ) ) if( tst1==zero ) & tst1 = slanhs( '1', kev, h, ldh, workl ) if( h( i+1,i ) <= max( ulp*tst1, smlnum ) ) & h(i+1,i) = zero 130 continue ! ! %-------------------------------------------------% ! | Compute the (kev+1)-st column of (V*Q) and | ! | temporarily store the result in WORKD(N+1:2*N). | ! | This is needed in the residual update since we | ! | cannot GUARANTEE that the corresponding entry | ! | of H would be zero as in exact arithmetic. | ! %-------------------------------------------------% ! if (h(kev+1,kev) > zero) & call sgemv ('N', n, kplusp, one, v, ldv, q(1,kev+1), 1, zero, & workd(n+1), 1) ! ! %----------------------------------------------------------% ! | Compute column 1 to kev of (V*Q) in backward order | ! | taking advantage of the upper Hessenberg structure of Q. | ! %----------------------------------------------------------% ! do 140 i = 1, kev call sgemv ('N', n, kplusp-i+1, one, v, ldv, & q(1,kev-i+1), 1, zero, workd, 1) call scopy (n, workd, 1, v(1,kplusp-i+1), 1) 140 continue ! ! %-------------------------------------------------% ! | Move v(:,kplusp-kev+1:kplusp) into v(:,1:kev). | ! %-------------------------------------------------% ! call slacpy ('A', n, kev, v(1,kplusp-kev+1), ldv, v, ldv) ! ! %--------------------------------------------------------------% ! | Copy the (kev+1)-st column of (V*Q) in the appropriate place | ! %--------------------------------------------------------------% ! if (h(kev+1,kev) > zero) & call scopy (n, workd(n+1), 1, v(1,kev+1), 1) ! ! %-------------------------------------% ! | Update the residual vector: | ! | r <- sigmak*r + betak*v(:,kev+1) | ! | where | ! | sigmak = (e_{kplusp}'*Q)*e_{kev} | ! | betak = e_{kev+1}'*H*e_{kev} | ! %-------------------------------------% ! call sscal (n, q(kplusp,kev), resid, 1) if (h(kev+1,kev) > zero) & call saxpy (n, h(kev+1,kev), v(1,kev+1), 1, resid, 1) ! if (msglvl > 1) then call svout (logfil, 1, q(kplusp,kev), ndigit, & '_napps: sigmak = (e_{kev+p}^T*Q)*e_{kev}') call svout (logfil, 1, h(kev+1,kev), ndigit, & '_napps: betak = e_{kev+1}^T*H*e_{kev}') call ivout (logfil, 1, kev, ndigit, & '_napps: Order of the final Hessenberg matrix ') if (msglvl > 2) then call smout (logfil, kev, kev, h, ldh, ndigit, & '_napps: updated Hessenberg matrix H for next iteration') end if ! end if ! 9000 continue call cpu_time (t1) tnapps = tnapps + (t1 - t0) ! return ! ! %---------------% ! | End of snapps | ! %---------------% ! end !\BeginDoc ! !\Name: snaup2 ! !\Description: ! Intermediate level interface called by snaupd. ! !\Usage: ! call snaup2 ! ( IDO, BMAT, N, WHICH, NEV, NP, TOL, RESID, MODE, IUPD, ! ISHIFT, MXITER, V, LDV, H, LDH, RITZR, RITZI, BOUNDS, ! Q, LDQ, WORKL, IPNTR, WORKD, INFO ) ! !\Arguments ! ! IDO, BMAT, N, WHICH, NEV, TOL, RESID: same as defined in snaupd. ! MODE, ISHIFT, MXITER: see the definition of IPARAM in snaupd. ! ! NP Integer. (INPUT/OUTPUT) ! Contains the number of implicit shifts to apply during ! each Arnoldi iteration. ! If ISHIFT=1, NP is adjusted dynamically at each iteration ! to accelerate convergence and prevent stagnation. ! This is also roughly equal to the number of matrix-vector ! products (involving the operator OP) per Arnoldi iteration. ! The logic for adjusting is contained within the current ! subroutine. ! If ISHIFT=0, NP is the number of shifts the user needs ! to provide via reverse comunication. 0 < NP < NCV-NEV. ! NP may be less than NCV-NEV for two reasons. The first, is ! to keep complex conjugate pairs of "wanted" Ritz values ! together. The second, is that a leading block of the current ! upper Hessenberg matrix has split off and contains "unwanted" ! Ritz values. ! Upon termination of the IRA iteration, NP contains the number ! of "converged" wanted Ritz values. ! ! IUPD Integer. (INPUT) ! IUPD .EQ. 0: use explicit restart instead implicit update. ! IUPD .NE. 0: use implicit update. ! ! V Real N by (NEV+NP) array. (INPUT/OUTPUT) ! The Arnoldi basis vectors are returned in the first NEV ! columns of V. ! ! LDV Integer. (INPUT) ! Leading dimension of V exactly as declared in the calling ! program. ! ! H Real (NEV+NP) by (NEV+NP) array. (OUTPUT) ! H is used to store the generated upper Hessenberg matrix ! ! LDH Integer. (INPUT) ! Leading dimension of H exactly as declared in the calling ! program. ! ! RITZR, Real arrays of length NEV+NP. (OUTPUT) ! RITZI RITZR(1:NEV) (resp. RITZI(1:NEV)) contains the real (resp. ! imaginary) part of the computed Ritz values of OP. ! ! BOUNDS Real array of length NEV+NP. (OUTPUT) ! BOUNDS(1:NEV) contain the error bounds corresponding to ! the computed Ritz values. ! ! Q Real (NEV+NP) by (NEV+NP) array. (WORKSPACE) ! Private (replicated) work array used to accumulate the ! rotation in the shift application step. ! ! LDQ Integer. (INPUT) ! Leading dimension of Q exactly as declared in the calling ! program. ! ! WORKL Real work array of length at least ! (NEV+NP)**2 + 3*(NEV+NP). (INPUT/WORKSPACE) ! Private (replicated) array on each PE or array allocated on ! the front end. It is used in shifts calculation, shifts ! application and convergence checking. ! ! On exit, the last 3*(NEV+NP) locations of WORKL contain ! the Ritz values (real,imaginary) and associated Ritz ! estimates of the current Hessenberg matrix. They are ! listed in the same order as returned from sneigh. ! ! If ISHIFT .EQ. O and IDO .EQ. 3, the first 2*NP locations ! of WORKL are used in reverse communication to hold the user ! supplied shifts. ! ! IPNTR Integer array of length 3. (OUTPUT) ! Pointer to mark the starting locations in the WORKD for ! vectors used by the Arnoldi iteration. ! ------------------------------------------------------------- ! IPNTR(1): pointer to the current operand vector X. ! IPNTR(2): pointer to the current result vector Y. ! IPNTR(3): pointer to the vector B * X when used in the ! shift-and-invert mode. X is the current operand. ! ------------------------------------------------------------- ! ! WORKD Real work array of length 3*N. (WORKSPACE) ! Distributed array to be used in the basic Arnoldi iteration ! for reverse communication. The user should not use WORKD ! as temporary workspace during the iteration !!!!!!!!!! ! See Data Distribution Note in DNAUPD. ! ! INFO Integer. (INPUT/OUTPUT) ! If INFO .EQ. 0, a randomly initial residual vector is used. ! If INFO .NE. 0, RESID contains the initial residual vector, ! possibly from a previous run. ! Error flag on output. ! = 0: Normal return. ! = 1: Maximum number of iterations taken. ! All possible eigenvalues of OP has been found. ! NP returns the number of converged Ritz values. ! = 2: No shifts could be applied. ! = -8: Error return from LAPACK eigenvalue calculation; ! This should never happen. ! = -9: Starting vector is zero. ! = -9999: Could not build an Arnoldi factorization. ! Size that was built in returned in NP. ! !\EndDoc ! !----------------------------------------------------------------------- ! !\BeginLib ! !\Local variables: ! xxxxxx real ! !\References: ! 1. D.C. Sorensen, "Implicit Application of Polynomial Filters in ! a k-Step Arnoldi Method", SIAM J. Matr. Anal. Apps., 13 (1992), ! pp 357-385. ! 2. R.B. Lehoucq, "Analysis and Implementation of an Implicitly ! Restarted Arnoldi Iteration", Rice University Technical Report ! TR95-13, Department of Computational and Applied Mathematics. ! !\Routines called: ! sgetv0 ARPACK initial vector generation routine. ! snaitr ARPACK Arnoldi factorization routine. ! snapps ARPACK application of implicit shifts routine. ! snconv ARPACK convergence of Ritz values routine. ! sneigh ARPACK compute Ritz values and error bounds routine. ! sngets ARPACK reorder Ritz values and error bounds routine. ! ssortc ARPACK sorting routine. ! ivout ARPACK utility routine that prints integers. ! smout ARPACK utility routine that prints matrices ! svout ARPACK utility routine that prints vectors. ! slamch LAPACK routine that determines machine constants. ! slapy2 LAPACK routine to compute sqrt(x**2+y**2) carefully. ! scopy Level 1 BLAS that copies one vector to another . ! sdot Level 1 BLAS that computes the scalar product of two vectors. ! snrm2 Level 1 BLAS that computes the norm of a vector. ! sswap Level 1 BLAS that swaps two vectors. ! !\Author ! Danny Sorensen Phuong Vu ! Richard Lehoucq CRPC / Rice University ! Dept. of Computational & Houston, Texas ! Applied Mathematics ! Rice University ! Houston, Texas ! !\SCCS Information: @(#) ! FILE: naup2.F SID: 2.8 DATE OF SID: 10/17/00 RELEASE: 2 ! !\Remarks ! 1. None ! !\EndLib ! !----------------------------------------------------------------------- ! subroutine snaup2 & ( ido, bmat, n, which, nev, np, tol, resid, mode, iupd, & ishift, mxiter, v, ldv, h, ldh, ritzr, ritzi, bounds, & q, ldq, workl, ipntr, workd, info ) ! !! SNAUP2 is an intermediate interface called by SNAUPD. ! ! %----------------------------------------------------% ! | Include files for debugging and timing information | ! %----------------------------------------------------% ! include 'debug.h' include 'stat.h' ! ! %------------------% ! | Scalar Arguments | ! %------------------% ! character bmat*1, which*2 integer ido, info, ishift, iupd, mode, ldh, ldq, ldv, mxiter, & n, nev, np Real & tol ! ! %-----------------% ! | Array Arguments | ! %-----------------% ! integer ipntr(13) Real & bounds(nev+np), h(ldh,nev+np), q(ldq,nev+np), resid(n), & ritzi(nev+np), ritzr(nev+np), v(ldv,nev+np), & workd(3*n), workl( (nev+np)*(nev+np+3) ) ! ! %------------% ! | Parameters | ! %------------% ! Real & one, zero parameter (one = 1.0E+0, zero = 0.0E+0) ! ! %---------------% ! | Local Scalars | ! %---------------% ! character wprime*2 logical cnorm , getv0, initv, update, ushift integer ierr , iter , j , kplusp, msglvl, nconv, & nevbef, nev0 , np0 , nptemp, numcnv Real & rnorm , temp , eps23 save cnorm , getv0, initv, update, ushift, & rnorm , iter , eps23, kplusp, msglvl, nconv , & nevbef, nev0 , np0 , numcnv ! ! %-----------------------% ! | Local array arguments | ! %-----------------------% ! integer kp(4) ! ! %----------------------% ! | External Subroutines | ! %----------------------% ! external scopy , sgetv0, snaitr, snconv, sneigh, & sngets, snapps, svout , ivout ! ! %--------------------% ! | External Functions | ! %--------------------% ! Real & sdot, snrm2, slapy2, slamch external sdot, snrm2, slapy2, slamch ! ! %---------------------% ! | Intrinsic Functions | ! %---------------------% ! intrinsic min, max, abs, sqrt ! ! %-----------------------% ! | Executable Statements | ! %-----------------------% ! if (ido == 0) then ! call cpu_time (t0) ! msglvl = mnaup2 ! ! %-------------------------------------% ! | Get the machine dependent constant. | ! %-------------------------------------% ! eps23 = slamch('Epsilon-Machine') eps23 = eps23**(2.0E+0 / 3.0E+0) ! nev0 = nev np0 = np ! ! %-------------------------------------% ! | kplusp is the bound on the largest | ! | Lanczos factorization built. | ! | nconv is the current number of | ! | "converged" eigenvlues. | ! | iter is the counter on the current | ! | iteration step. | ! %-------------------------------------% ! kplusp = nev + np nconv = 0 iter = 0 ! ! %---------------------------------------% ! | Set flags for computing the first NEV | ! | steps of the Arnoldi factorization. | ! %---------------------------------------% ! getv0 = .true. update = .false. ushift = .false. cnorm = .false. ! if (info /= 0) then ! ! %--------------------------------------------% ! | User provides the initial residual vector. | ! %--------------------------------------------% ! initv = .true. info = 0 else initv = .false. end if end if ! ! %---------------------------------------------% ! | Get a possibly random starting vector and | ! | force it into the range of the operator OP. | ! %---------------------------------------------% ! 10 continue ! if (getv0) then call sgetv0 (ido, bmat, 1, initv, n, 1, v, ldv, resid, rnorm, & ipntr, workd, info) ! if (ido /= 99) go to 9000 ! if (rnorm == zero) then ! ! %-----------------------------------------% ! | The initial vector is zero. Error exit. | ! %-----------------------------------------% ! info = -9 go to 1100 end if getv0 = .false. ido = 0 end if ! ! %-----------------------------------% ! | Back from reverse communication : | ! | continue with update step | ! %-----------------------------------% ! if (update) go to 20 ! ! %-------------------------------------------% ! | Back from computing user specified shifts | ! %-------------------------------------------% ! if (ushift) go to 50 ! ! %-------------------------------------% ! | Back from computing residual norm | ! | at the end of the current iteration | ! %-------------------------------------% ! if (cnorm) go to 100 ! ! %----------------------------------------------------------% ! | Compute the first NEV steps of the Arnoldi factorization | ! %----------------------------------------------------------% ! call snaitr (ido, bmat, n, 0, nev, mode, resid, rnorm, v, ldv, & h, ldh, ipntr, workd, info) ! ! %---------------------------------------------------% ! | ido /= 99 implies use of reverse communication | ! | to compute operations involving OP and possibly B | ! %---------------------------------------------------% ! if (ido /= 99) go to 9000 if (info > 0) then np = info mxiter = iter info = -9999 go to 1200 end if ! ! %--------------------------------------------------------------% ! | | ! | M A I N ARNOLDI I T E R A T I O N L O O P | ! | Each iteration implicitly restarts the Arnoldi | ! | factorization in place. | ! | | ! %--------------------------------------------------------------% ! 1000 continue ! iter = iter + 1 ! if (msglvl > 0) then call ivout (logfil, 1, iter, ndigit, & '_naup2: **** Start of major iteration number ****') end if ! ! %-----------------------------------------------------------% ! | Compute NP additional steps of the Arnoldi factorization. | ! | Adjust NP since NEV might have been updated by last call | ! | to the shift application routine snapps. | ! %-----------------------------------------------------------% ! np = kplusp - nev ! if (msglvl > 1) then call ivout (logfil, 1, nev, ndigit, & '_naup2: The length of the current Arnoldi factorization') call ivout (logfil, 1, np, ndigit, & '_naup2: Extend the Arnoldi factorization by') end if ! ! %-----------------------------------------------------------% ! | Compute NP additional steps of the Arnoldi factorization. | ! %-----------------------------------------------------------% ! ido = 0 20 continue update = .true. call snaitr (ido , bmat, n , nev, np , mode , resid, & rnorm, v , ldv, h , ldh, ipntr, workd, & info) ! ! %---------------------------------------------------% ! | ido /= 99 implies use of reverse communication | ! | to compute operations involving OP and possibly B | ! %---------------------------------------------------% ! if (ido /= 99) go to 9000 if (info > 0) then np = info mxiter = iter info = -9999 go to 1200 end if update = .false. if (msglvl > 1) then call svout (logfil, 1, rnorm, ndigit, & '_naup2: Corresponding B-norm of the residual') end if ! ! %--------------------------------------------------------% ! | Compute the eigenvalues and corresponding error bounds | ! | of the current upper Hessenberg matrix. | ! %--------------------------------------------------------% ! call sneigh (rnorm, kplusp, h, ldh, ritzr, ritzi, bounds, & q, ldq, workl, ierr) ! if (ierr /= 0) then info = -8 go to 1200 end if ! ! %----------------------------------------------------% ! | Make a copy of eigenvalues and corresponding error | ! | bounds obtained from sneigh. | ! %----------------------------------------------------% ! call scopy(kplusp, ritzr, 1, workl(kplusp**2+1), 1) call scopy(kplusp, ritzi, 1, workl(kplusp**2+kplusp+1), 1) call scopy(kplusp, bounds, 1, workl(kplusp**2+2*kplusp+1), 1) ! ! %---------------------------------------------------% ! | Select the wanted Ritz values and their bounds | ! | to be used in the convergence test. | ! | The wanted part of the spectrum and corresponding | ! | error bounds are in the last NEV loc. of RITZR, | ! | RITZI and BOUNDS respectively. The variables NEV | ! | and NP may be updated if the NEV-th wanted Ritz | ! | value has a non zero imaginary part. In this case | ! | NEV is increased by one and NP decreased by one. | ! | NOTE: The last two arguments of sngets are no | ! | longer used as of version 2.1. | ! %---------------------------------------------------% ! nev = nev0 np = np0 numcnv = nev call sngets (ishift, which, nev, np, ritzr, ritzi, & bounds, workl, workl(np+1)) if (nev == nev0+1) numcnv = nev0+1 ! ! %-------------------% ! | Convergence test. | ! %-------------------% ! call scopy (nev, bounds(np+1), 1, workl(2*np+1), 1) call snconv (nev, ritzr(np+1), ritzi(np+1), workl(2*np+1), & tol, nconv) ! if (msglvl > 2) then kp(1) = nev kp(2) = np kp(3) = numcnv kp(4) = nconv call ivout (logfil, 4, kp, ndigit, & '_naup2: NEV, NP, NUMCNV, NCONV are') call svout (logfil, kplusp, ritzr, ndigit, & '_naup2: Real part of the eigenvalues of H') call svout (logfil, kplusp, ritzi, ndigit, & '_naup2: Imaginary part of the eigenvalues of H') call svout (logfil, kplusp, bounds, ndigit, & '_naup2: Ritz estimates of the current NCV Ritz values') end if ! ! %---------------------------------------------------------% ! | Count the number of unwanted Ritz values that have zero | ! | Ritz estimates. If any Ritz estimates are equal to zero | ! | then a leading block of H of order equal to at least | ! | the number of Ritz values with zero Ritz estimates has | ! | split off. None of these Ritz values may be removed by | ! | shifting. Decrease NP the number of shifts to apply. If | ! | no shifts may be applied, then prepare to exit | ! %---------------------------------------------------------% ! nptemp = np do 30 j=1, nptemp if (bounds(j) == zero) then np = np - 1 nev = nev + 1 end if 30 continue ! if ( (nconv >= numcnv) .or. & (iter > mxiter) .or. & (np == 0) ) then ! if (msglvl > 4) then call svout(logfil, kplusp, workl(kplusp**2+1), ndigit, & '_naup2: Real part of the eig computed by _neigh:') call svout(logfil, kplusp, workl(kplusp**2+kplusp+1), & ndigit, & '_naup2: Imag part of the eig computed by _neigh:') call svout(logfil, kplusp, workl(kplusp**2+kplusp*2+1), & ndigit, & '_naup2: Ritz eistmates computed by _neigh:') end if ! ! %------------------------------------------------% ! | Prepare to exit. Put the converged Ritz values | ! | and corresponding bounds in RITZ(1:NCONV) and | ! | BOUNDS(1:NCONV) respectively. Then sort. Be | ! | careful when NCONV > NP | ! %------------------------------------------------% ! ! %------------------------------------------% ! | Use h( 3,1 ) as storage to communicate | ! | rnorm to _neupd if needed | ! %------------------------------------------% h(3,1) = rnorm ! ! %----------------------------------------------% ! | To be consistent with sngets, we first do a | ! | pre-processing sort in order to keep complex | ! | conjugate pairs together. This is similar | ! | to the pre-processing sort used in sngets | ! | except that the sort is done in the opposite | ! | order. | ! %----------------------------------------------% ! if (which == 'LM') wprime = 'SR' if (which == 'SM') wprime = 'LR' if (which == 'LR') wprime = 'SM' if (which == 'SR') wprime = 'LM' if (which == 'LI') wprime = 'SM' if (which == 'SI') wprime = 'LM' ! call ssortc (wprime, .true., kplusp, ritzr, ritzi, bounds) ! ! %----------------------------------------------% ! | Now sort Ritz values so that converged Ritz | ! | values appear within the first NEV locations | ! | of ritzr, ritzi and bounds, and the most | ! | desired one appears at the front. | ! %----------------------------------------------% ! if (which == 'LM') wprime = 'SM' if (which == 'SM') wprime = 'LM' if (which == 'LR') wprime = 'SR' if (which == 'SR') wprime = 'LR' if (which == 'LI') wprime = 'SI' if (which == 'SI') wprime = 'LI' ! call ssortc(wprime, .true., kplusp, ritzr, ritzi, bounds) ! ! %--------------------------------------------------% ! | Scale the Ritz estimate of each Ritz value | ! | by 1 / max(eps23,magnitude of the Ritz value). | ! %--------------------------------------------------% ! do 35 j = 1, numcnv temp = max(eps23,slapy2(ritzr(j), & ritzi(j))) bounds(j) = bounds(j)/temp 35 continue ! ! %----------------------------------------------------% ! | Sort the Ritz values according to the scaled Ritz | ! | esitmates. This will push all the converged ones | ! | towards the front of ritzr, ritzi, bounds | ! | (in the case when NCONV < NEV.) | ! %----------------------------------------------------% ! wprime = 'LR' call ssortc(wprime, .true., numcnv, bounds, ritzr, ritzi) ! ! %----------------------------------------------% ! | Scale the Ritz estimate back to its original | ! | value. | ! %----------------------------------------------% ! do j = 1, numcnv temp = max(eps23, slapy2(ritzr(j), & ritzi(j))) bounds(j) = bounds(j)*temp end do ! ! %------------------------------------------------% ! | Sort the converged Ritz values again so that | ! | the "threshold" value appears at the front of | ! | ritzr, ritzi and bound. | ! %------------------------------------------------% ! call ssortc(which, .true., nconv, ritzr, ritzi, bounds) ! if (msglvl > 1) then call svout (logfil, kplusp, ritzr, ndigit, & '_naup2: Sorted real part of the eigenvalues') call svout (logfil, kplusp, ritzi, ndigit, & '_naup2: Sorted imaginary part of the eigenvalues') call svout (logfil, kplusp, bounds, ndigit, & '_naup2: Sorted ritz estimates.') end if ! ! %------------------------------------% ! | Max iterations have been exceeded. | ! %------------------------------------% ! if (iter > mxiter .and. nconv < numcnv) info = 1 ! ! %---------------------% ! | No shifts to apply. | ! %---------------------% ! if (np == 0 .and. nconv < numcnv) info = 2 ! np = nconv go to 1100 ! else if ( (nconv < numcnv) .and. (ishift == 1) ) then ! ! %-------------------------------------------------% ! | Do not have all the requested eigenvalues yet. | ! | To prevent possible stagnation, adjust the size | ! | of NEV. | ! %-------------------------------------------------% ! nevbef = nev nev = nev + min(nconv, np/2) if (nev == 1 .and. kplusp >= 6) then nev = kplusp / 2 else if (nev == 1 .and. kplusp > 3) then nev = 2 end if np = kplusp - nev ! ! %---------------------------------------% ! | If the size of NEV was just increased | ! | resort the eigenvalues. | ! %---------------------------------------% ! if (nevbef < nev) & call sngets (ishift, which, nev, np, ritzr, ritzi, & bounds, workl, workl(np+1)) ! end if ! if (msglvl > 0) then call ivout (logfil, 1, nconv, ndigit, & '_naup2: no. of "converged" Ritz values at this iter.') if (msglvl > 1) then kp(1) = nev kp(2) = np call ivout (logfil, 2, kp, ndigit, & '_naup2: NEV and NP are') call svout (logfil, nev, ritzr(np+1), ndigit, & '_naup2: "wanted" Ritz values -- real part') call svout (logfil, nev, ritzi(np+1), ndigit, & '_naup2: "wanted" Ritz values -- imag part') call svout (logfil, nev, bounds(np+1), ndigit, & '_naup2: Ritz estimates of the "wanted" values ') end if end if ! if (ishift == 0) then ! ! %-------------------------------------------------------% ! | User specified shifts: reverse comminucation to | ! | compute the shifts. They are returned in the first | ! | 2*NP locations of WORKL. | ! %-------------------------------------------------------% ! ushift = .true. ido = 3 go to 9000 end if ! 50 continue ! ! %------------------------------------% ! | Back from reverse communication; | ! | User specified shifts are returned | ! | in WORKL(1:2*NP) | ! %------------------------------------% ! ushift = .false. ! if ( ishift == 0 ) then ! ! %----------------------------------% ! | Move the NP shifts from WORKL to | ! | RITZR, RITZI to free up WORKL | ! | for non-exact shift case. | ! %----------------------------------% ! call scopy (np, workl, 1, ritzr, 1) call scopy (np, workl(np+1), 1, ritzi, 1) end if ! if (msglvl > 2) then call ivout (logfil, 1, np, ndigit, & '_naup2: The number of shifts to apply ') call svout (logfil, np, ritzr, ndigit, & '_naup2: Real part of the shifts') call svout (logfil, np, ritzi, ndigit, & '_naup2: Imaginary part of the shifts') if ( ishift == 1 ) & call svout (logfil, np, bounds, ndigit, & '_naup2: Ritz estimates of the shifts') end if ! ! %---------------------------------------------------------% ! | Apply the NP implicit shifts by QR bulge chasing. | ! | Each shift is applied to the whole upper Hessenberg | ! | matrix H. | ! | The first 2*N locations of WORKD are used as workspace. | ! %---------------------------------------------------------% ! call snapps (n, nev, np, ritzr, ritzi, v, ldv, & h, ldh, resid, q, ldq, workl, workd) ! ! %---------------------------------------------% ! | Compute the B-norm of the updated residual. | ! | Keep B*RESID in WORKD(1:N) to be used in | ! | the first step of the next call to snaitr. | ! %---------------------------------------------% ! cnorm = .true. call cpu_time (t2) if (bmat == 'G') then nbx = nbx + 1 call scopy (n, resid, 1, workd(n+1), 1) ipntr(1) = n + 1 ipntr(2) = 1 ido = 2 ! ! %----------------------------------% ! | Exit in order to compute B*RESID | ! %----------------------------------% ! go to 9000 else if (bmat == 'I') then call scopy (n, resid, 1, workd, 1) end if ! 100 continue ! ! %----------------------------------% ! | Back from reverse communication; | ! | WORKD(1:N) := B*RESID | ! %----------------------------------% ! if (bmat == 'G') then call cpu_time (t3) tmvbx = tmvbx + (t3 - t2) end if ! if (bmat == 'G') then rnorm = sdot (n, resid, 1, workd, 1) rnorm = sqrt(abs(rnorm)) else if (bmat == 'I') then rnorm = snrm2(n, resid, 1) end if cnorm = .false. ! if (msglvl > 2) then call svout (logfil, 1, rnorm, ndigit, & '_naup2: B-norm of residual for compressed factorization') call smout (logfil, nev, nev, h, ldh, ndigit, & '_naup2: Compressed upper Hessenberg matrix H') end if ! go to 1000 ! ! %---------------------------------------------------------------% ! | | ! | E N D O F M A I N I T E R A T I O N L O O P | ! | | ! %---------------------------------------------------------------% ! 1100 continue ! mxiter = iter nev = numcnv ! 1200 continue ido = 99 ! ! %------------% ! | Error Exit | ! %------------% ! call cpu_time (t1) tnaup2 = t1 - t0 ! 9000 continue ! ! %---------------% ! | End of snaup2 | ! %---------------% ! return end subroutine snaupd & ( ido, bmat, n, which, nev, tol, resid, ncv, v, ldv, iparam, & ipntr, workd, workl, lworkl, info ) ! !! SNAUPD is an interface for the Implicitly Restarted Arnoldi iteration. ! !\BeginDoc ! !\Name: snaupd ! !\Description: ! Reverse communication interface for the Implicitly Restarted Arnoldi ! iteration. This subroutine computes approximations to a few eigenpairs ! of a linear operator "OP" with respect to a semi-inner product defined by ! a symmetric positive semi-definite real matrix B. B may be the identity ! matrix. NOTE: If the linear operator "OP" is real and symmetric ! with respect to the real positive semi-definite symmetric matrix B, ! i.e. B*OP = (OP`)*B, then subroutine ssaupd should be used instead. ! ! The computed approximate eigenvalues are called Ritz values and ! the corresponding approximate eigenvectors are called Ritz vectors. ! ! snaupd is usually called iteratively to solve one of the ! following problems: ! ! Mode 1: A*x = lambda*x. ! ===> OP = A and B = I. ! ! Mode 2: A*x = lambda*M*x, M symmetric positive definite ! ===> OP = inv[M]*A and B = M. ! ===> (If M can be factored see remark 3 below) ! ! Mode 3: A*x = lambda*M*x, M symmetric semi-definite ! ===> OP = Real_Part{ inv[A - sigma*M]*M } and B = M. ! ===> shift-and-invert mode (in real arithmetic) ! If OP*x = amu*x, then ! amu = 1/2 * [ 1/(lambda-sigma) + 1/(lambda-conjg(sigma)) ]. ! Note: If sigma is real, i.e. imaginary part of sigma is zero; ! Real_Part{ inv[A - sigma*M]*M } == inv[A - sigma*M]*M ! amu == 1/(lambda-sigma). ! ! Mode 4: A*x = lambda*M*x, M symmetric semi-definite ! ===> OP = Imaginary_Part{ inv[A - sigma*M]*M } and B = M. ! ===> shift-and-invert mode (in real arithmetic) ! If OP*x = amu*x, then ! amu = 1/2i * [ 1/(lambda-sigma) - 1/(lambda-conjg(sigma)) ]. ! ! Both mode 3 and 4 give the same enhancement to eigenvalues close to ! the (complex) shift sigma. However, as lambda goes to infinity, ! the operator OP in mode 4 dampens the eigenvalues more strongly than ! does OP defined in mode 3. ! ! NOTE: The action of w <- inv[A - sigma*M]*v or w <- inv[M]*v ! should be accomplished either by a direct method ! using a sparse matrix factorization and solving ! ! [A - sigma*M]*w = v or M*w = v, ! ! or through an iterative method for solving these ! systems. If an iterative method is used, the ! convergence test must be more stringent than ! the accuracy requirements for the eigenvalue ! approximations. ! !\Usage: ! call snaupd ! ( IDO, BMAT, N, WHICH, NEV, TOL, RESID, NCV, V, LDV, IPARAM, ! IPNTR, WORKD, WORKL, LWORKL, INFO ) ! !\Arguments ! IDO Integer. (INPUT/OUTPUT) ! Reverse communication flag. IDO must be zero on the first ! call to snaupd. IDO will be set internally to ! indicate the type of operation to be performed. Control is ! then given back to the calling routine which has the ! responsibility to carry out the requested operation and call ! snaupd with the result. The operand is given in ! WORKD(IPNTR(1)), the result must be put in WORKD(IPNTR(2)). ! ------------------------------------------------------------- ! IDO = 0: first call to the reverse communication interface ! IDO = -1: compute Y = OP * X where ! IPNTR(1) is the pointer into WORKD for X, ! IPNTR(2) is the pointer into WORKD for Y. ! This is for the initialization phase to force the ! starting vector into the range of OP. ! IDO = 1: compute Y = OP * X where ! IPNTR(1) is the pointer into WORKD for X, ! IPNTR(2) is the pointer into WORKD for Y. ! In mode 3 and 4, the vector B * X is already ! available in WORKD(ipntr(3)). It does not ! need to be recomputed in forming OP * X. ! IDO = 2: compute Y = B * X where ! IPNTR(1) is the pointer into WORKD for X, ! IPNTR(2) is the pointer into WORKD for Y. ! IDO = 3: compute the IPARAM(8) real and imaginary parts ! of the shifts where INPTR(14) is the pointer ! into WORKL for placing the shifts. See Remark ! 5 below. ! IDO = 99: done ! ------------------------------------------------------------- ! ! BMAT Character*1. (INPUT) ! BMAT specifies the type of the matrix B that defines the ! semi-inner product for the operator OP. ! BMAT = 'I' -> standard eigenvalue problem A*x = lambda*x ! BMAT = 'G' -> generalized eigenvalue problem A*x = lambda*B*x ! ! N Integer. (INPUT) ! Dimension of the eigenproblem. ! ! WHICH Character*2. (INPUT) ! 'LM' -> want the NEV eigenvalues of largest magnitude. ! 'SM' -> want the NEV eigenvalues of smallest magnitude. ! 'LR' -> want the NEV eigenvalues of largest real part. ! 'SR' -> want the NEV eigenvalues of smallest real part. ! 'LI' -> want the NEV eigenvalues of largest imaginary part. ! 'SI' -> want the NEV eigenvalues of smallest imaginary part. ! ! NEV Integer. (INPUT/OUTPUT) ! Number of eigenvalues of OP to be computed. 0 < NEV < N-1. ! ! TOL Real scalar. (INPUT) ! Stopping criterion: the relative accuracy of the Ritz value ! is considered acceptable if BOUNDS(I) .LE. TOL*ABS(RITZ(I)) ! where ABS(RITZ(I)) is the magnitude when RITZ(I) is complex. ! DEFAULT = SLAMCH('EPS') (machine precision as computed ! by the LAPACK auxiliary subroutine SLAMCH). ! ! RESID Real array of length N. (INPUT/OUTPUT) ! On INPUT: ! If INFO .EQ. 0, a random initial residual vector is used. ! If INFO .NE. 0, RESID contains the initial residual vector, ! possibly from a previous run. ! On OUTPUT: ! RESID contains the final residual vector. ! ! NCV Integer. (INPUT) ! Number of columns of the matrix V. NCV must satisfy the two ! inequalities 2 <= NCV-NEV and NCV <= N. ! This will indicate how many Arnoldi vectors are generated ! at each iteration. After the startup phase in which NEV ! Arnoldi vectors are generated, the algorithm generates ! approximately NCV-NEV Arnoldi vectors at each subsequent update ! iteration. Most of the cost in generating each Arnoldi vector is ! in the matrix-vector operation OP*x. ! NOTE: 2 <= NCV-NEV in order that complex conjugate pairs of Ritz ! values are kept together. (See remark 4 below) ! ! V Real array N by NCV. (OUTPUT) ! Contains the final set of Arnoldi basis vectors. ! ! LDV Integer. (INPUT) ! Leading dimension of V exactly as declared in the calling program. ! ! IPARAM Integer array of length 11. (INPUT/OUTPUT) ! IPARAM(1) = ISHIFT: method for selecting the implicit shifts. ! The shifts selected at each iteration are used to restart ! the Arnoldi iteration in an implicit fashion. ! ------------------------------------------------------------- ! ISHIFT = 0: the shifts are provided by the user via ! reverse communication. The real and imaginary ! parts of the NCV eigenvalues of the Hessenberg ! matrix H are returned in the part of the WORKL ! array corresponding to RITZR and RITZI. See remark ! 5 below. ! ISHIFT = 1: exact shifts with respect to the current ! Hessenberg matrix H. This is equivalent to ! restarting the iteration with a starting vector ! that is a linear combination of approximate Schur ! vectors associated with the "wanted" Ritz values. ! ------------------------------------------------------------- ! ! IPARAM(2) = No longer referenced. ! ! IPARAM(3) = MXITER ! On INPUT: maximum number of Arnoldi update iterations allowed. ! On OUTPUT: actual number of Arnoldi update iterations taken. ! ! IPARAM(4) = NB: blocksize to be used in the recurrence. ! The code currently works only for NB = 1. ! ! IPARAM(5) = NCONV: number of "converged" Ritz values. ! This represents the number of Ritz values that satisfy ! the convergence criterion. ! ! IPARAM(6) = IUPD ! No longer referenced. Implicit restarting is ALWAYS used. ! ! IPARAM(7) = MODE ! On INPUT determines what type of eigenproblem is being solved. ! Must be 1,2,3,4; See under \Description of snaupd for the ! four modes available. ! ! IPARAM(8) = NP ! When ido = 3 and the user provides shifts through reverse ! communication (IPARAM(1)=0), snaupd returns NP, the number ! of shifts the user is to provide. 0 < NP <=NCV-NEV. See Remark ! 5 below. ! ! IPARAM(9) = NUMOP, IPARAM(10) = NUMOPB, IPARAM(11) = NUMREO, ! OUTPUT: NUMOP = total number of OP*x operations, ! NUMOPB = total number of B*x operations if BMAT='G', ! NUMREO = total number of steps of re-orthogonalization. ! ! IPNTR Integer array of length 14. (OUTPUT) ! Pointer to mark the starting locations in the WORKD and WORKL ! arrays for matrices/vectors used by the Arnoldi iteration. ! ------------------------------------------------------------- ! IPNTR(1): pointer to the current operand vector X in WORKD. ! IPNTR(2): pointer to the current result vector Y in WORKD. ! IPNTR(3): pointer to the vector B * X in WORKD when used in ! the shift-and-invert mode. ! IPNTR(4): pointer to the next available location in WORKL ! that is untouched by the program. ! IPNTR(5): pointer to the NCV by NCV upper Hessenberg matrix ! H in WORKL. ! IPNTR(6): pointer to the real part of the ritz value array ! RITZR in WORKL. ! IPNTR(7): pointer to the imaginary part of the ritz value array ! RITZI in WORKL. ! IPNTR(8): pointer to the Ritz estimates in array WORKL associated ! with the Ritz values located in RITZR and RITZI in WORKL. ! ! IPNTR(14): pointer to the NP shifts in WORKL. See Remark 5 below. ! ! Note: IPNTR(9:13) is only referenced by sneupd. See Remark 2 below. ! ! IPNTR(9): pointer to the real part of the NCV RITZ values of the ! original system. ! IPNTR(10): pointer to the imaginary part of the NCV RITZ values of ! the original system. ! IPNTR(11): pointer to the NCV corresponding error bounds. ! IPNTR(12): pointer to the NCV by NCV upper quasi-triangular ! Schur matrix for H. ! IPNTR(13): pointer to the NCV by NCV matrix of eigenvectors ! of the upper Hessenberg matrix H. Only referenced by ! sneupd if RVEC = .TRUE. See Remark 2 below. ! ------------------------------------------------------------- ! ! WORKD Real work array of length 3*N. (REVERSE COMMUNICATION) ! Distributed array to be used in the basic Arnoldi iteration ! for reverse communication. The user should not use WORKD ! as temporary workspace during the iteration. Upon termination ! WORKD(1:N) contains B*RESID(1:N). If an invariant subspace ! associated with the converged Ritz values is desired, see remark ! 2 below, subroutine sneupd uses this output. ! See Data Distribution Note below. ! ! WORKL Real work array of length LWORKL. (OUTPUT/WORKSPACE) ! Private (replicated) array on each PE or array allocated on ! the front end. See Data Distribution Note below. ! ! LWORKL Integer. (INPUT) ! LWORKL must be at least 3*NCV**2 + 6*NCV. ! ! INFO Integer. (INPUT/OUTPUT) ! If INFO .EQ. 0, a randomly initial residual vector is used. ! If INFO .NE. 0, RESID contains the initial residual vector, ! possibly from a previous run. ! Error flag on output. ! = 0: Normal exit. ! = 1: Maximum number of iterations taken. ! All possible eigenvalues of OP has been found. IPARAM(5) ! returns the number of wanted converged Ritz values. ! = 2: No longer an informational error. Deprecated starting ! with release 2 of ARPACK. ! = 3: No shifts could be applied during a cycle of the ! Implicitly restarted Arnoldi iteration. One possibility ! is to increase the size of NCV relative to NEV. ! See remark 4 below. ! = -1: N must be positive. ! = -2: NEV must be positive. ! = -3: NCV-NEV >= 2 and less than or equal to N. ! = -4: The maximum number of Arnoldi update iteration ! must be greater than zero. ! = -5: WHICH must be one of 'LM', 'SM', 'LR', 'SR', 'LI', 'SI' ! = -6: BMAT must be one of 'I' or 'G'. ! = -7: Length of private work array is not sufficient. ! = -8: Error return from LAPACK eigenvalue calculation; ! = -9: Starting vector is zero. ! = -10: IPARAM(7) must be 1,2,3,4. ! = -11: IPARAM(7) = 1 and BMAT = 'G' are incompatable. ! = -12: IPARAM(1) must be equal to 0 or 1. ! = -9999: Could not build an Arnoldi factorization. ! IPARAM(5) returns the size of the current Arnoldi ! factorization. ! !\Remarks ! 1. The computed Ritz values are approximate eigenvalues of OP. The ! selection of WHICH should be made with this in mind when ! Mode = 3 and 4. After convergence, approximate eigenvalues of the ! original problem may be obtained with the ARPACK subroutine sneupd. ! ! 2. If a