SIDE
SIDE is CHARACTER*1
= 'L': apply Q or Q**T from the Left;
= 'R': apply Q or Q**T from the Right.
TRANS
TRANS is CHARACTER*1
= 'N': No transpose, apply Q;
= 'T': Transpose, apply Q**T.
M
M is INTEGER
The number of rows of the matrix C. M >= 0.
N
N is INTEGER
The number of columns of the matrix C. N >= 0.
K
K is INTEGER
The number of elementary reflectors whose product defines
the matrix Q.
M >= K >= 0.
MB
MB is INTEGER
The row block size to be used in the blocked LQ.
M >= MB >= 1.
NB
NB is INTEGER
The column block size to be used in the blocked LQ.
NB > M.
A
A is REAL array, dimension
(LDA,M) if SIDE = 'L',
(LDA,N) if SIDE = 'R'
The i-th row must contain the vector which defines the blocked
elementary reflector H(i), for i = 1,2,...,k, as returned by
SLASWLQ in the first k rows of its array argument A.
LDA
LDA is INTEGER
The leading dimension of the array A. LDA >= max(1,K).
T
T is REAL array, dimension
(LDT, M * Number of blocks), where Number of blocks = CEIL((N-K)/(NB-K)).
The blocked upper triangular block reflectors stored in compact form
as a sequence of upper triangular blocks. See below
for further details.
LDT
LDT is INTEGER
The leading dimension of the array T. LDT >= MB.
C
C is REAL array, dimension (LDC,N)
On entry, the M-by-N matrix C.
On exit, C is overwritten by Q*C or Q**T*C or C*Q**T or C*Q.
LDC
LDC is INTEGER
The leading dimension of the array C. LDC >= max(1,M).
WORK
WORK is (workspace) REAL array, dimension (MAX(1,LWORK))
LWORK
LWORK is INTEGER
The dimension of the array WORK.
If SIDE = 'L', LWORK >= max(1,NB) * MB;
if SIDE = 'R', LWORK >= max(1,M) * MB.
If LWORK = -1, then a workspace query is assumed; the routine
only calculates the optimal size of the WORK array, returns
this value as the first entry of the WORK array, and no error
message related to LWORK is issued by XERBLA.
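The documented lower bounds on LWORK can be written out directly. This is a sketch of the stated minimum only, not of the optimal size returned by a workspace query (LWORK = -1); the helper name min_lwork is hypothetical:

```python
def min_lwork(side, m, mb, nb):
    """Minimum LWORK as stated in the LWORK entry above.

    Hypothetical helper for illustration only.
    """
    if side == "L":
        return max(1, nb) * mb
    if side == "R":
        return max(1, m) * mb
    raise ValueError("SIDE must be 'L' or 'R'")
```

A caller that wants the optimal size rather than the minimum would instead call the routine once with LWORK = -1 and read the first entry of WORK.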
INFO
INFO is INTEGER
= 0: successful exit
< 0: if INFO = -i, the i-th argument had an illegal value
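The INFO convention can be illustrated by checking the constraints listed above in argument order and reporting the position of the first violation. This is a sketch based only on the documented constraints; the order in which the routine itself tests its arguments may differ, and the helper name check_arguments is hypothetical:

```python
def check_arguments(side, trans, m, n, k, mb, nb, lda, ldt, ldc):
    """Return 0 if the documented constraints hold, otherwise -i, where i
    is the 1-based position of the first offending argument in the list
    SIDE, TRANS, M, N, K, MB, NB, A, LDA, T, LDT, C, LDC, WORK, LWORK, INFO.

    Hypothetical helper for illustration only; LWORK is not checked here.
    """
    checks = [
        (1, side in ("L", "R")),
        (2, trans in ("N", "T")),
        (3, m >= 0),
        (4, n >= 0),
        (5, 0 <= k <= m),
        (6, 1 <= mb <= m),
        (7, nb > m),
        (9, lda >= max(1, k)),
        (11, ldt >= mb),
        (13, ldc >= max(1, m)),
    ]
    for position, ok in checks:
        if not ok:
            return -position
    return 0
```

For example, passing LDA = 3 with K = 4 violates LDA >= max(1,K), and LDA is the 9th argument, so the sketch returns -9.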
Short-Wide LQ (SWLQ) performs LQ by a sequence of orthogonal transformations,
representing Q as a product of other orthogonal matrices:
Q = Q(1) * Q(2) * . . . * Q(k)
where each Q(i) zeros out entries in a block of columns of A:
Q(1) zeros out the entries above the diagonal in columns 1:NB of A
Q(2) zeros out the block A(1:M,NB+1:2*NB-M)
Q(3) zeros out the block A(1:M,2*NB-M+1:3*NB-2*M)
. . .
Q(1) is computed by GELQT, which represents Q(1) by Householder vectors
stored above the diagonal of columns 1:NB of A, and by upper triangular
block reflectors, stored in array T(1:LDT,1:M).
For more information see Further Details in GELQT.
Q(i) for i>1 is computed by TPLQT, which represents Q(i) by Householder vectors
stored in columns [(i-1)*(NB-M)+M+1:i*(NB-M)+M] of A, and by upper triangular
block reflectors, stored in array T(1:LDT,(i-1)*M+1:i*M).
The last Q(k) may use fewer columns.
For more information see Further Details in TPLQT.
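The index formulas above can be tabulated to see which columns of A and of T belong to each Q(i). A minimal sketch, assuming NB > M and using 1-based inclusive ranges as in the text; M, N and NB are those of the factored matrix A, and the helper name block_layout is hypothetical:

```python
def block_layout(m, n, nb):
    """List (a_cols, t_cols) for Q(1), Q(2), ..., each a 1-based
    inclusive (first, last) column range, following the formulas above.

    Hypothetical helper for illustration only; assumes NB > M.
    """
    if nb <= m:
        raise ValueError("this sketch assumes NB > M")
    # Q(1): columns 1:NB of A, reflectors in columns 1:M of T.
    layout = [((1, min(nb, n)), (1, m))]
    i = 2
    start = nb + 1
    while start <= n:
        # Q(i), i > 1: columns (i-1)*(NB-M)+M+1 : i*(NB-M)+M of A;
        # the last block is truncated at column N.
        end = min(i * (nb - m) + m, n)
        layout.append(((start, end), ((i - 1) * m + 1, i * m)))
        start = end + 1
        i += 1
    return layout
```

With M = 4, N = 18 and NB = 8, the sketch yields four blocks covering columns 1:8, 9:12, 13:16 and 17:18 of A, the last one narrower than the rest.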
For more details of the overall algorithm, see the description of
Sequential TSQR in Section 2.2 of [1].
[1] “Communication-Optimal Parallel and Sequential QR and LU Factorizations,”
J. Demmel, L. Grigori, M. Hoemmen, J. Langou,
SIAM J. Sci. Comput., vol. 34, no. 1, 2012.