Tpetra_DistributionLowerTriangularBlock.hpp
// @HEADER
// ***********************************************************************
//
// Tpetra: Templated Linear Algebra Services Package
// Copyright (2008) Sandia Corporation
//
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
// the U.S. Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// 1. Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
//
// 2. Redistributions in binary form must reproduce the above copyright
// notice, this list of conditions and the following disclaimer in the
// documentation and/or other materials provided with the distribution.
//
// 3. Neither the name of the Corporation nor the names of the
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
// PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
// LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact Michael A. Heroux (maherou@sandia.gov)
//
// ************************************************************************
// @HEADER

#ifndef __TPETRA_DISTRIBUTIONLOWERTRIANGULARBLOCK_HPP
#define __TPETRA_DISTRIBUTIONLOWERTRIANGULARBLOCK_HPP

// Needed by DistributionLowerTriangularBlock
#include "Tpetra_Distributor.hpp"

// Needed by the LowerTriangularBlock operator
#include "Tpetra_Core.hpp"
#include "Tpetra_Map.hpp"
#include "Tpetra_Operator.hpp"
#include "Tpetra_Vector.hpp"
#include "Tpetra_CrsMatrix.hpp"

namespace Tpetra
{

template <typename gno_t, typename scalar_t>
class DistributionLowerTriangularBlock : public Distribution<gno_t,scalar_t> {
// Seher Acer's lower-triangular block decomposition for triangle counting.
// See also: LowerTriangularBlockOperator below, which allows this
// distribution to be used in Tpetra SpMV.
//
// Requirements:
// Matrix must be square (undirected graph).
// Number of processors np = q(q+1)/2 for some q.
//
// Only the lower-triangular portion of the matrix is stored.
// Processors are arranged logically as follows:
// 0
// 1 2
// 3 4 5
// ...
//
// The lower-triangular part of the matrix is stored in a 2D distribution.
// For example, the dense 7x7 lower-triangular matrix below would be assigned
// to processors according to the numbers shown:
// 0 | |
// 00| |
// ---------
// 11|2 |
// 11|22 |
// 11|222|
// ---------
// 33|444|5
// 33|444|55
// (Note that we expect the matrix to be sparse. For dense matrices,
// CrsMatrix is the wrong tool.)
//
// Matrix rows are assigned to processor rows greedily to roughly balance
// (# nonzeros in processor row / # processors in processor row)
// across processor rows.
// The same cuts are used to divide rows and columns among processors
// (that is, every processor has a square block).
//
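// For example (illustrative, matching the 7x7 picture above, where the
// row/column cuts fall after rows 2 and 5): nonzero (5,3) falls in row
// chunk m=2 and column chunk n=1, so it is owned by rank
// m(m+1)/2 + n = 2*3/2 + 1 = 4 -- the "4" block in the picture. The same
// formula is implemented in procFromChunks() below.
//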
// The lower-triangular algorithm:
// 1. distribute all matrix entries via a 1D linear distribution
//    (this initial distribution is needed to avoid storing the entire
//    matrix on one processor, while providing the per-row nonzero counts
//    needed in steps 2 and 3)
// 2. (optional) sort rows in decreasing order wrt the number of nonzeros
//    per row
// 3. find "chunk cuts": divisions in row assignments such that
//    (# nonzeros in processor row / # processors in processor row) is
//    roughly equal for all processor rows
// 4. send nonzeros to their new processor assignments
//
// Known issues: (TODO)
// - The sorting in step 2 and the computation of chunk cuts in step 3 are
//   currently done in serial and require O(number of rows) storage on each
//   processor. More effort could parallelize this computation, but parallel
//   load-balancing algorithms are more appropriate in Zoltan2 than Tpetra.
// - The sorting in step 2 renumbers the rows (assigns new Global Ordinals
//   to the rows) to make them contiguous, as needed in Acer's triangle
//   counting algorithm.
//   (Acer's algorithm relies on local indexing from the chunk boundaries to
//   find neighbors needed for communication.)
//   The class currently provides a permutation matrix P describing the
//   reordering. Thus, the matrix stored in the lower-triangular block
//   distribution is actually P^T A P -- the row and column permutation of
//   the matrix A in the Matrix Market file.
//   The fact that a permuted matrix is stored complicates use of the matrix
//   in algorithms other than Acer's triangle counting. For SpMV with the
//   vector numbered according to the MatrixMarket numbering, for example,
//   P^T must be applied to the vector before SpMV, and P must be applied
//   to the result of SpMV. See LowerTriangularBlockOperator for how this
//   permutation matrix is used.
//
// Before addressing these issues, we will decide (TODO):
// - Is this Distribution general enough to be in Tpetra?
// - Should we, instead, have a separate package for distributions (one that
//   could use Zoltan2 and Tpetra without a circular dependence)?
// - Or should we allow users (such as the triangle counting algorithm) to
//   provide their own distributions (e.g., LowerTriangularBlock) that
//   inherit from Tpetra's Distribution class?
// For now, we will push this Distribution into Tpetra, but we will revisit
// this decision.

public:
  using Distribution<gno_t,scalar_t>::me;
  using Distribution<gno_t,scalar_t>::np;
  using Distribution<gno_t,scalar_t>::comm;
  using Distribution<gno_t,scalar_t>::nrows;
  using typename Distribution<gno_t,scalar_t>::NZindex_t;
  using typename Distribution<gno_t,scalar_t>::LocalNZmap_t;

  using map_t = Tpetra::Map<>;
  using matrix_t = Tpetra::CrsMatrix<scalar_t>;

  DistributionLowerTriangularBlock(size_t nrows_,
                 const Teuchos::RCP<const Teuchos::Comm<int> > &comm_,
                 const Teuchos::ParameterList &params) :
                 Distribution<gno_t,scalar_t>(nrows_, comm_, params),
                 initialDist(nrows_, comm_, params),
                 sortByDegree(false), permMatrix(Teuchos::null),
                 redistributed(false), chunksComputed(false), nChunks(0)
  {
    // Find q such that np = q(q+1)/2; q is the number of chunks.
    int npnp = 2 * np;
    nChunks = int(std::sqrt(float(npnp)));
    while (nChunks * (nChunks + 1) < npnp) nChunks++;

    TEUCHOS_TEST_FOR_EXCEPTION(nChunks * (nChunks+1) != npnp, std::logic_error,
                               "Number of processors np = " << np <<
                               " must satisfy np = q(q+1)/2 for some q" <<
                               " for the LowerTriangularBlock distribution");
    nChunksPerRow = double(nChunks) / double(nrows);

    const Teuchos::ParameterEntry *pe = params.getEntryPtr("sortByDegree");
    if (pe != NULL) sortByDegree = pe->getValue<bool>(&sortByDegree);

    pe = params.getEntryPtr("readPerProcess");
    if (pe != NULL) redistributed = pe->getValue<bool>(&redistributed);

    if (me == 0) std::cout << "\n LowerTriangularBlock Distribution: "
                           << "\n np = " << np
                           << "\n nChunks = " << nChunks
                           << std::endl;
  }
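
  // Illustrative nChunks values from the loop above (np must equal
  // q(q+1)/2): np=1 -> nChunks=1; np=3 -> 2; np=6 -> 3; np=10 -> 4.
  // Any other np (e.g., np=4) trips the TEUCHOS_TEST_FOR_EXCEPTION above.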

  enum DistributionType DistType() { return LowerTriangularBlock; }

  bool areChunksComputed() { return chunksComputed; }

  Teuchos::Array<gno_t> getChunkCuts() {
    if (chunksComputed)
      return chunkCuts;
    else {
      throw std::runtime_error("Error: Requested chunk cuts have not "
                               "been computed yet.");
    }
  }

  // Return whether this rank owns vector entry i.
  // TODO: for now, use the same vector distribution as 1DLinear;
  // TODO: think about the best distribution of Vectors
  inline bool VecMine(gno_t i) { return initialDist.VecMine(i); }

  // Return whether this rank owns nonzero (i,j).
  // Vector map and row map are the same in the 1D distribution,
  // but keep only the lower-triangular entries.
  bool Mine(gno_t i, gno_t j) {
    if (redistributed) {
      if (j > i) return false; // Don't keep any upper-triangular entries
      else return (procFromChunks(i,j) == me);
    }
    else
      return initialDist.Mine(i,j);
  }

  inline bool Mine(gno_t i, gno_t j, int p) { return Mine(i,j); }

  // How to redistribute according to the chunk-based row distribution
  void Redistribute(LocalNZmap_t &localNZ)
  {
    // Do the chunking and sorting serially for now;
    // need to gather per-row information from each processor.
    // TODO: think about a parallel implementation

    gno_t myFirstRow = initialDist.getFirstRow(me);
    gno_t nMyRows = initialDist.getNumRow(me);
    Teuchos::Array<gno_t> nnzPerRow(nMyRows, 0);

    Teuchos::Array<int> rcvcnt(np);
    Teuchos::Array<int> disp(np);
    for (int sum = 0, p = 0; p < np; p++) {
      int prows = initialDist.getNumRow(p);
      rcvcnt[p] = prows;
      disp[p] = sum;
      sum += prows;
    }

    // If sortByDegree is desired, first sort with respect to ALL entries
    // in the matrix (not only the lower-triangular entries): a decreasing
    // sort by the number of entries per row in the global matrix.
    // Generate permuteIndex for the sorted rows.

    Teuchos::Array<gno_t> permuteIndex; // This is the inverse permutation
    Teuchos::Array<gno_t> sortedOrder;  // This is the original permutation

    Teuchos::Array<gno_t> globalRowBuf;
    // TODO Dunno why there isn't a Teuchos::gatherAllv;
    // TODO for now, compute on rank 0 and broadcast
    if (me == 0) {
      globalRowBuf.resize(nrows, 0); // TODO: Ick! Need parallel
    }

    if (sortByDegree) {
      // Compute nnzPerRow; the distribution is currently 1D and
      // includes all nonzeros
      for (auto it = localNZ.begin(); it != localNZ.end(); it++) {
        gno_t I = it->first.first;
        nnzPerRow[I-myFirstRow]++;
      }

      Teuchos::gatherv<int,gno_t>(nnzPerRow.getRawPtr(), nMyRows,
                                  globalRowBuf.getRawPtr(),
                                  rcvcnt.getRawPtr(), disp.getRawPtr(),
                                  0, *comm);

      permuteIndex.resize(nrows); // TODO: Ick! Need parallel
      sortedOrder.resize(nrows);  // TODO: Ick! Need parallel

      if (me == 0) { // TODO: do on all procs once we have allgatherv

        for (size_t i = 0; i != nrows; i++) sortedOrder[i] = i;

        std::sort(sortedOrder.begin(), sortedOrder.end(),
                  [&](const size_t& a, const size_t& b) {
                    return (globalRowBuf[a] > globalRowBuf[b]);
                  });

        // Compute the inverse permutation; it is more useful for our needs
        for (size_t i = 0; i < nrows; i++) {
          permuteIndex[sortedOrder[i]] = i;
        }
      }

      Teuchos::broadcast<int,gno_t>(*comm, 0, permuteIndex(0,nrows));
      // Ick! Use a directory TODO

      // Sorting changes the global IDs associated with rows/columns.
      // To make this distribution applicable beyond triangle counting
      // (e.g., in a Tpetra operator), we need a way to map from the
      // original global IDs and back again.
      // Create a permutation matrix for use in the operator; use the
      // default Tpetra layout.
      Teuchos::Array<gno_t> myRows;
      for (size_t i = 0; i < nrows; i++) {
        if (VecMine(i)) myRows.push_back(i);
      }

      Tpetra::global_size_t dummy =
        Teuchos::OrdinalTraits<Tpetra::global_size_t>::invalid();
      Teuchos::RCP<const map_t> permMap =
        Teuchos::rcp(new map_t(dummy, myRows(), 0, comm));

      permMatrix = Teuchos::rcp(new matrix_t(permMap, 1)); // one nz per row

      Teuchos::Array<gno_t> cols(1);
      Teuchos::Array<scalar_t> vals(1); vals[0] = 1.;

      for (size_t i = 0; i < permMap->getLocalNumElements(); i++) {
        gno_t gid = permMap->getGlobalElement(i);
        cols[0] = permuteIndex[gid];
        permMatrix->insertGlobalValues(gid, cols(), vals());
      }

      permMatrix->fillComplete(permMap, permMap);
    }
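
    // Small illustration (not executed; hypothetical numbers): if nrows = 3
    // and the degree sort yields sortedOrder = {2,0,1} (old row 2 becomes
    // new row 0, etc.), then permuteIndex = {1,2,0}, and permMatrix gets one
    // unit entry per row, at (0,1), (1,2), and (2,0). Applying permMatrix
    // with Teuchos::TRANS then maps vectors from the original numbering to
    // the sorted numbering; applying it with NO_TRANS maps back.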

    // Now, to determine the chunks, we care only about the number of
    // nonzeros in the lower-triangular matrix.
    // Compute nnzPerRow; the distribution is currently 1D
    nnzPerRow.assign(nMyRows, 0);
    size_t nnz = 0;
    for (auto it = localNZ.begin(); it != localNZ.end(); it++) {
      gno_t I = (sortByDegree ? permuteIndex[it->first.first]
                              : it->first.first);
      gno_t J = (sortByDegree ? permuteIndex[it->first.second]
                              : it->first.second);
      if (J <= I) { // Lower-triangular part
        nnzPerRow[it->first.first - myFirstRow]++;
        nnz++;
      }
    }

    // TODO Dunno why there isn't a Teuchos::gatherAllv;
    // TODO for now, compute on rank 0 and broadcast

    Teuchos::gatherv<int,gno_t>(nnzPerRow.getRawPtr(), nMyRows,
                                globalRowBuf.getRawPtr(),
                                rcvcnt.getRawPtr(), disp.getRawPtr(),
                                0, *comm);

    Teuchos::Array<int>().swap(rcvcnt); // no longer needed
    Teuchos::Array<int>().swap(disp);   // no longer needed

    size_t gNnz;
    Teuchos::reduceAll(*comm, Teuchos::REDUCE_SUM, 1, &nnz, &gNnz);

    chunkCuts.resize(nChunks+1, 0);

    if (me == 0) { // TODO: when we have allgatherv, can do on all procs
                   // TODO: or better, implement a parallel version

      // Determine chunk cuts
      size_t target = gNnz / np; // target nnz per processor
      size_t targetRunningTotal = 0;
      size_t currentRunningTotal = 0;
      gno_t I = gno_t(0);
      for (int chunkCnt = 0; chunkCnt < nChunks; chunkCnt++) {
        // Chunk chunkCnt holds chunkCnt+1 processors, so its share of
        // the nonzeros is (chunkCnt+1) * target.
        targetRunningTotal = (target * (chunkCnt+1));
        currentRunningTotal = 0;
        while (I < static_cast<gno_t>(nrows)) {
          size_t nextNnz = (sortByDegree ? globalRowBuf[sortedOrder[I]]
                                         : globalRowBuf[I]);
          if (currentRunningTotal + nextNnz <= targetRunningTotal) {
            currentRunningTotal += nextNnz;
            I++;
          }
          else
            break;
        }
        chunkCuts[chunkCnt+1] = I;
      }
      chunkCuts[nChunks] = static_cast<gno_t>(nrows);
    }
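
    // Worked example with hypothetical numbers: np = 3 ranks gives
    // nChunks = 2, and with gNnz = 12 the per-rank target is 4. Chunk 0
    // (one rank) takes rows until its nnz would exceed 4; chunk 1 (two
    // ranks) takes rows until its nnz would exceed 8; so every rank ends
    // up with roughly 4 nonzeros.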

    // Free memory associated with globalRowBuf
    Teuchos::Array<gno_t>().swap(globalRowBuf);

    Teuchos::broadcast<int,gno_t>(*comm, 0, chunkCuts(0,nChunks+1));
    chunksComputed = true;

    // Determine the new owner of each nonzero; buffers for sending
    Kokkos::View<gno_t*, Kokkos::HostSpace> iOut("iOut", localNZ.size());
    Kokkos::View<gno_t*, Kokkos::HostSpace> jOut("jOut", localNZ.size());
    Kokkos::View<scalar_t*, Kokkos::HostSpace> vOut("vOut", localNZ.size());
    Teuchos::Array<int> pOut(localNZ.size());

    size_t sendCnt = 0;
    for (auto it = localNZ.begin(); it != localNZ.end(); it++) {
      iOut[sendCnt] = (sortByDegree ? permuteIndex[it->first.first]
                                    : it->first.first);
      jOut[sendCnt] = (sortByDegree ? permuteIndex[it->first.second]
                                    : it->first.second);
      if (jOut[sendCnt] <= iOut[sendCnt]) { // keep only lower-triangular entries
        vOut[sendCnt] = it->second;
        pOut[sendCnt] = procFromChunks(iOut[sendCnt], jOut[sendCnt]);
        sendCnt++;
      }
    }

    // Free memory associated with localNZ and permuteIndex
    LocalNZmap_t().swap(localNZ);
    if (sortByDegree) Teuchos::Array<gno_t>().swap(permuteIndex);

    // Use a Distributor to send nonzeros to their new processors.
    Tpetra::Distributor plan(comm);
    size_t nrecvs = plan.createFromSends(pOut(0,sendCnt));
    Kokkos::View<gno_t*, Kokkos::HostSpace> iIn("iIn", nrecvs);
    Kokkos::View<gno_t*, Kokkos::HostSpace> jIn("jIn", nrecvs);
    Kokkos::View<scalar_t*, Kokkos::HostSpace> vIn("vIn", nrecvs);

    // TODO: With more clever packing, could do only one round of communication
    auto sendIndices = std::make_pair(static_cast<size_t>(0), sendCnt);
    plan.doPostsAndWaits(Kokkos::subview(iOut, sendIndices), 1, iIn);
    plan.doPostsAndWaits(Kokkos::subview(jOut, sendIndices), 1, jIn);
    plan.doPostsAndWaits(Kokkos::subview(vOut, sendIndices), 1, vIn);

    // Put the received nonzeros back in the map
    for (size_t n = 0; n < nrecvs; n++) {
      NZindex_t nz(iIn[n], jIn[n]);
      localNZ[nz] = vIn[n];
    }

    redistributed = true;
  }

  Teuchos::RCP<matrix_t> getPermutationMatrix() const { return permMatrix; }

private:
  // Initially distribute nonzeros with a 1D linear distribution
  Distribution1DLinear<gno_t,scalar_t> initialDist;

  // Flag indicating whether the matrix should be reordered and renumbered
  // in decreasing order of the number of nonzeros per row in the full matrix
  bool sortByDegree;

  // Permutation matrix; built only when sortByDegree = true
  Teuchos::RCP<matrix_t> permMatrix;

  // Flag indicating whether redistribution has occurred yet.
  // This is true
  // i)  after Tpetra performs the redistribution, or
  // ii) when Tpetra reads already-distributed nonzeros via the
  //     readPerProcess option
  bool redistributed;

  // If we read the already-distributed nonzeros from per-process files,
  // this remains false until a triangle counting code actually computes
  // the chunks when the need arises.
  bool chunksComputed;

  int nChunks; // in the np = q(q+1)/2 layout, nChunks = q
  double nChunksPerRow;
  Teuchos::Array<gno_t> chunkCuts;

  // Find the chunk containing row/column I: start from a guess that
  // assumes uniformly sized chunks, then correct by scanning chunkCuts.
  int findIdxInChunks(gno_t I) {
    int m = I * nChunksPerRow;
    while (I < chunkCuts[m]) m--;
    while (I >= chunkCuts[m+1]) m++;
    return m;
  }

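  // Map nonzero (I,J) to its owning rank in the triangular layout:
  // p = m(m+1)/2 + n for row chunk m and column chunk n.
  // Illustrative check against the 7x7 picture above, where
  // chunkCuts = {0,2,5,7}: I=6 gives m=2 and J=3 gives n=1, so
  // p = 2*3/2 + 1 = 4, the rank shown in the "444" block.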
  int procFromChunks(gno_t I, gno_t J) {
    int m = findIdxInChunks(I);
    int n = findIdxInChunks(J);
    int p = m*(m+1)/2 + n;
    return p;
  }
};
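
// Minimal construction sketch (hypothetical driver code, not part of this
// header; assumes a communicator with np = q(q+1)/2 ranks and a known
// global row count nrows):
//
//   Teuchos::RCP<const Teuchos::Comm<int> > comm = Tpetra::getDefaultComm();
//   Teuchos::ParameterList params;
//   params.set("sortByDegree", true); // renumber rows by decreasing degree
//   Tpetra::DistributionLowerTriangularBlock<long long, double>
//     dist(nrows, comm, params);
//   // A file reader then calls dist.Mine(i,j) for each nonzero it reads,
//   // collects its local nonzeros, and calls dist.Redistribute(localNZ)
//   // to ship entries to their owners.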


// Tpetra::Operator that works with the DistributionLowerTriangularBlock

template <typename scalar_t,
          class Node = ::Tpetra::Details::DefaultTypes::node_type>
class LowerTriangularBlockOperator :
      public Tpetra::Operator<scalar_t, Tpetra::Map<>::local_ordinal_type,
                              Tpetra::Map<>::global_ordinal_type,
                              Node>
{
public:
  using map_t = Tpetra::Map<>;
  using import_t = Tpetra::Import<>;
  using export_t = Tpetra::Export<>;
  using vector_t = Tpetra::Vector<scalar_t>;
  using mvector_t = Tpetra::MultiVector<scalar_t>;
  // matrix_t and dist_t are used below but were elided in this listing;
  // these are assumed, plausible definitions:
  using matrix_t = Tpetra::CrsMatrix<scalar_t>;
  using dist_t = DistributionLowerTriangularBlock<
                   typename map_t::global_ordinal_type, scalar_t>;

  LowerTriangularBlockOperator(
    const Teuchos::RCP<const matrix_t> &lowerTriangularMatrix_,
    const dist_t &dist)
  : lowerTriangularMatrix(lowerTriangularMatrix_),
    permMatrix(dist.getPermutationMatrix())
  {
    // LowerTriangularBlockOperator requires the range map and domain map
    // to be the same. Check it here.
    TEUCHOS_TEST_FOR_EXCEPTION(
      !lowerTriangularMatrix->getRangeMap()->isSameAs(
        *lowerTriangularMatrix->getDomainMap()),
      std::logic_error,
      "The domain and range maps of the LowerTriangularBlock matrix "
      "must be the same");

    // Extract the diagonal. The entries are in row-map layout; export
    // them into the range map, summing any duplicated contributions.
    vector_t diagByRowMap(lowerTriangularMatrix->getRowMap());
    lowerTriangularMatrix->getLocalDiagCopy(diagByRowMap);
    diag = Teuchos::rcp(new vector_t(lowerTriangularMatrix->getRangeMap()));
    Tpetra::Export<> exporter(lowerTriangularMatrix->getRowMap(),
                              lowerTriangularMatrix->getRangeMap());
    diag->doExport(diagByRowMap, exporter, Tpetra::ADD);
  }

  void apply(const mvector_t &x, mvector_t &y, Teuchos::ETransp mode,
             scalar_t alpha, scalar_t beta) const
  {
    scalar_t ZERO = Teuchos::ScalarTraits<scalar_t>::zero();
    scalar_t ONE = Teuchos::ScalarTraits<scalar_t>::one();
    if (alpha == ZERO) {
      if (beta == ZERO) y.putScalar(ZERO);
      else y.scale(beta);
      return;
    }

    if (permMatrix == Teuchos::null) {

      // Only the lower triangle L (including the diagonal) is stored,
      // and the matrix is symmetric: A = L + L^T - D for diagonal D.

      // Multiply the lower-triangular part
      lowerTriangularMatrix->apply(x, y, Teuchos::NO_TRANS, alpha, beta);

      // Multiply the upper-triangular part
      lowerTriangularMatrix->apply(x, y, Teuchos::TRANS, alpha, ONE);

      // Subtract out the duplicated diagonal terms
      y.elementWiseMultiply(-alpha, *diag, x, ONE);
    }
    else {

      // With sorting, the LowerTriangularBlock distribution stores (P^T A P)
      // in the CrsMatrix, for permutation matrix P.
      // Thus, apply must compute
      // y = P (beta (P^T y) + alpha (P^T A P) (P^T x))

      mvector_t xtmp(x.getMap(), x.getNumVectors());
      mvector_t ytmp(y.getMap(), y.getNumVectors());

      permMatrix->apply(x, xtmp, Teuchos::TRANS);
      if (beta != ZERO) permMatrix->apply(y, ytmp, Teuchos::TRANS);

      // Multiply the lower-triangular part
      lowerTriangularMatrix->apply(xtmp, ytmp, Teuchos::NO_TRANS, alpha, beta);

      // Multiply the upper-triangular part
      lowerTriangularMatrix->apply(xtmp, ytmp, Teuchos::TRANS, alpha, ONE);

      // Subtract out the duplicated diagonal terms
      ytmp.elementWiseMultiply(-alpha, *diag, xtmp, ONE);

      permMatrix->apply(ytmp, y, Teuchos::NO_TRANS);
    }
  }

  Teuchos::RCP<const map_t> getDomainMap() const {
    return lowerTriangularMatrix->getDomainMap();
  }

  Teuchos::RCP<const map_t> getRangeMap() const {
    return lowerTriangularMatrix->getRangeMap();
  }

  bool hasTransposeApply() const { return true; } // Symmetric matrix

private:
  const Teuchos::RCP<const matrix_t> lowerTriangularMatrix;
  const Teuchos::RCP<const matrix_t> permMatrix;
  Teuchos::RCP<vector_t> diag;
};
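
// Usage sketch (hypothetical; assumes lowerTriMatrix is a CrsMatrix built
// with the DistributionLowerTriangularBlock dist above):
//
//   LowerTriangularBlockOperator<double> op(lowerTriMatrix, dist);
//   Tpetra::MultiVector<double> x(op.getDomainMap(), 1);
//   Tpetra::MultiVector<double> y(op.getRangeMap(), 1);
//   x.putScalar(1.0);
//   op.apply(x, y, Teuchos::NO_TRANS, 1.0, 0.0); // y = A*x for the full
//                                                // symmetric A = L + L^T - D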


} // namespace Tpetra

#endif // __TPETRA_DISTRIBUTIONLOWERTRIANGULARBLOCK_HPP