HPL_pdlaswp01T - Broadcast a column panel L and swap the row panel U.
void HPL_pdlaswp01T( HPL_T_panel * PBCST
, int * IFLAG
, HPL_T_panel * PANEL
, const int NN );
HPL_pdlaswp01T applies the NB row interchanges to NN columns of the
trailing submatrix and broadcasts a column panel.
A "Spread then roll" algorithm performs the swap :: broadcast of the
row panel U at once, resulting in a minimal communication volume and a
"very good" use of the connectivity if available. With P process
rows and assuming bi-directional links, the running time of this function can
be approximated by:
(log_2(P)+(P-1)) * lat + K * NB * LocQ(N) / bdwth
where NB is the number of rows of the row panel U, N is the global number of
columns being updated, lat and bdwth are the latency and bandwidth of the
network for double precision real words. K is a constant in (2,3] that depends
on the achieved bandwidth during a simultaneous message exchange between two
processes. An empirical optimistic value of K is typically 2.4.
- PBCST (local input/output) HPL_T_panel *
- On entry, PBCST points to the data structure containing the panel (to be broadcast) information.
- IFLAG (local input/output) int *
- On entry, IFLAG indicates whether or not the broadcast has already been
completed. If not, probing will occur, and the outcome will be contained
in IFLAG on exit.
- PANEL (local input/output) HPL_T_panel *
- On entry, PANEL points to the data structure containing the panel
- NN (local input) const int
- On entry, NN specifies the local number of columns of the trailing
submatrix to be swapped and broadcast starting at the current position. NN
must be at least zero.