This is the PETSc to-do list; any help on these would be greatly
appreciated. Each of the first few projects could be done by a graduate
student or advanced undergraduate in a couple of months.
- When PETSc loads a matrix, rank 0 first loads its own piece of the data and then loads the pieces to be sent to the other processes.
This means rank 0 temporarily uses twice the memory it needs, which is sometimes a problem for big matrices: even when each piece of the matrix fits safely on one node, a single node may not be able to hold twice its piece.
If MatLoad() instead skipped the data for rank 0 and only stored the file offset to the beginning of it, it could load the blocks for the other processes one by one and send them off. Only at the end would it return to the saved offset and load the piece for rank 0. A sketch of this reordering follows.
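A minimal sketch of the reordered loading loop, assuming a seekable file descriptor; load_block() and send_block() are hypothetical stand-ins for the binary-viewer read and the MPI send:

    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>

    /* Hypothetical stand-ins for the binary-viewer read and the MPI send. */
    static void load_block(FILE *fd, long nbytes, char *buf)
    {
      size_t rd = fread(buf, 1, (size_t)nbytes, fd);
      (void)rd; /* a real implementation would check the read */
    }
    static void send_block(const char *buf, long nbytes, int dest, MPI_Comm comm)
    {
      MPI_Send((void *)buf, (int)nbytes, MPI_BYTE, dest, 0, comm);
    }

    /* Rank 0 skips its own block, streams everyone else's out of one
       reused scratch buffer, then seeks back for its own piece. */
    static void matload_deferred_rank0(FILE *fd, const long bytes[], int size, MPI_Comm comm)
    {
      long maxb = 0;
      for (int r = 0; r < size; r++) if (bytes[r] > maxb) maxb = bytes[r];
      char *buf = malloc((size_t)maxb);

      long rank0_offset = ftell(fd);     /* where rank 0's data begins */
      fseek(fd, bytes[0], SEEK_CUR);     /* skip it for now            */
      for (int r = 1; r < size; r++) {
        load_block(fd, bytes[r], buf);
        send_block(buf, bytes[r], r, comm);
      }
      fseek(fd, rank0_offset, SEEK_SET); /* finally read rank 0's own piece */
      load_block(fd, bytes[0], buf);
      /* ... hand buf to local matrix assembly ... */
      free(buf);
    }

With this ordering rank 0's peak footprint is roughly one block plus a scratch buffer, instead of its own block plus every outgoing block.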
- MatSOR() should use blocks when bs > 1 for AIJ matrices without inodes; a sketch of the block sweep follows.
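For reference, a minimal sketch of one forward block Gauss-Seidel/SOR sweep over a block-CSR layout; the BlockCSR fields are illustrative, not PETSc's actual AIJ internals, and the diagonal blocks are applied through precomputed inverses:

    /* One forward block-SOR sweep: x_i <- x_i + omega * D_ii^{-1} (b_i - sum_j A_ij x_j). */
    typedef struct {
      int           nb, bs;   /* number of block rows, block size         */
      const int    *bi, *bj;  /* block-CSR row pointers / block columns   */
      const double *ba;       /* block values, bs*bs per block, row-major */
      const double *dinv;     /* precomputed inverses of diagonal blocks  */
    } BlockCSR;

    static void block_sor_sweep(const BlockCSR *A, const double *b, double *x, double omega)
    {
      int    bs = A->bs;
      double r[16], dx[16];                 /* sketch: assumes bs <= 4 */
      for (int i = 0; i < A->nb; i++) {
        for (int p = 0; p < bs; p++) r[p] = b[i*bs + p];
        for (int k = A->bi[i]; k < A->bi[i+1]; k++) {  /* r = b_i - sum_j A_ij x_j */
          const double *blk = A->ba + (size_t)k*bs*bs;
          int j = A->bj[k];
          for (int p = 0; p < bs; p++)
            for (int q = 0; q < bs; q++) r[p] -= blk[p*bs + q] * x[j*bs + q];
        }
        const double *Dinv = A->dinv + (size_t)i*bs*bs; /* dx = D_ii^{-1} r */
        for (int p = 0; p < bs; p++) {
          dx[p] = 0.0;
          for (int q = 0; q < bs; q++) dx[p] += Dinv[p*bs + q] * r[q];
        }
        for (int p = 0; p < bs; p++) x[i*bs + p] += omega * dx[p];
      }
    }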
- Daniel Szyld suggests a refined way of computing the overlap for additive Schwarz: instead of taking all rows/columns connected to the block, take those with "strong" coupling. I think this could be easily added to PETSc; a sketch of the selection test follows.
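A minimal sketch of the selection test, assuming the usual strength-of-connection criterion |a_ij| >= theta * max_{k != i} |a_ik|; the threshold theta and the raw CSR arrays are assumptions, not PETSc API:

    #include <math.h>

    /* Collect the neighbors of row i that are "strongly" coupled to it:
       |a_ij| >= theta * max_{k != i} |a_ik|.  ia/ja/a are the usual CSR
       row pointers, column indices, and values. */
    static void strong_neighbors(int i, const int *ia, const int *ja,
                                 const double *a, double theta,
                                 int *strong, int *nstrong)
    {
      double amax = 0.0;
      for (int k = ia[i]; k < ia[i+1]; k++)
        if (ja[k] != i && fabs(a[k]) > amax) amax = fabs(a[k]);

      *nstrong = 0;
      for (int k = ia[i]; k < ia[i+1]; k++)
        if (ja[k] != i && fabs(a[k]) >= theta * amax)
          strong[(*nstrong)++] = ja[k];
    }

Growing the overlap would then add only the strongly coupled rows at each level instead of every connected row.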
- Add an optional scalar value argument to VecMAXPY() and fix all places in the code that depend on it; a sketch of one possible semantics follows.
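One plausible reading of the extension, as a sketch: an extra scalar beta so the operation becomes y = beta*y + sum_i alpha[i]*x[i]. The name VecMAXPYBeta() is hypothetical, and this fallback just composes existing calls; a real implementation would fuse the scaling into the kernel loop:

    #include <petscvec.h>

    /* Hypothetical extended VecMAXPY: y = beta*y + sum_i alpha[i] x[i]. */
    static PetscErrorCode VecMAXPYBeta(Vec y, PetscScalar beta, PetscInt nv,
                                       const PetscScalar alpha[], Vec x[])
    {
      PetscErrorCode ierr;
      ierr = VecScale(y, beta);CHKERRQ(ierr);         /* y = beta*y           */
      ierr = VecMAXPY(y, nv, alpha, x);CHKERRQ(ierr); /* y += sum alpha_i x_i */
      return 0;
    }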
- Develop the parallel PETSc interface for the new FFTW parallel codes being developed, based on the sequential interfaces in src/mat/impls/fftw. A sketch of the underlying FFTW MPI calls follows.
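For orientation, a minimal sketch of the raw FFTW MPI calls such an interface would wrap (plain FFTW 3.3 MPI API, not PETSc; error handling omitted):

    #include <fftw3-mpi.h>

    /* Distributed in-place 2-D complex DFT: each rank owns local_n0
       contiguous rows starting at global row local_0_start. */
    static void fft2d_mpi(ptrdiff_t n0, ptrdiff_t n1, MPI_Comm comm)
    {
      ptrdiff_t local_n0, local_0_start;
      fftw_mpi_init();
      ptrdiff_t     alloc_local = fftw_mpi_local_size_2d(n0, n1, comm, &local_n0, &local_0_start);
      fftw_complex *data        = fftw_alloc_complex(alloc_local);
      fftw_plan     plan        = fftw_mpi_plan_dft_2d(n0, n1, data, data, comm,
                                                       FFTW_FORWARD, FFTW_ESTIMATE);
      /* ... fill rows [local_0_start, local_0_start + local_n0) ... */
      fftw_execute(plan);
      fftw_destroy_plan(plan);
      fftw_free(data);
    }

The PETSc Mat wrapper mainly has to reconcile this row distribution with the Vec layouts on either side of MatMult().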
- Develop fast (FFT-based) solvers (underneath src/ksp/pc/impls) for "the model problems", sort of like a modern FISHPACK in PETSc. In parallel this requires the FFTW parallel interface. A sketch of the core solve follows.
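The core trick, sketched for the 1-D Dirichlet model problem: the discrete sine transform diagonalizes the 3-point Laplacian, so a solve is transform, scale by the eigenvalues, transform back (sequential FFTW shown; the eigenvalue formula is the standard 2 - 2cos(k*pi/(n+1))):

    #include <math.h>
    #include <fftw3.h>

    /* Solve (1/h^2) tridiag(-1, 2, -1) u = f with Dirichlet BCs via DST-I.
       FFTW's RODFT00 is the unnormalized DST-I; applying it twice
       multiplies by 2(n+1), hence the final scaling. */
    static void poisson1d_fft(int n, double h, const double *f, double *u)
    {
      double   *fhat = fftw_alloc_real(n);
      fftw_plan fwd  = fftw_plan_r2r_1d(n, (double *)f, fhat, FFTW_RODFT00, FFTW_ESTIMATE);
      fftw_plan bwd  = fftw_plan_r2r_1d(n, fhat, u, FFTW_RODFT00, FFTW_ESTIMATE);

      fftw_execute(fwd);
      for (int k = 0; k < n; k++) {   /* eigenvalues of the discrete Laplacian */
        double lambda = (2.0 - 2.0 * cos((k + 1) * M_PI / (n + 1))) / (h * h);
        fhat[k] /= lambda;
      }
      fftw_execute(bwd);
      for (int i = 0; i < n; i++) u[i] /= 2.0 * (n + 1); /* DST-I normalization */

      fftw_destroy_plan(fwd); fftw_destroy_plan(bwd); fftw_free(fhat);
    }

Higher dimensions tensorize the same idea, which is exactly what FISHPACK did.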
- Convert PetscLogViewPython() to generate JSON instead, and develop Python parsers for quickly generating nice tables of performance details from runs or groups of runs.
- Extend PetscWebServe(), which allows accessing running PETSc jobs via a browser and looking at what is going on (this uses the AMS inside). See src/sys/viewer/impls/socket/send.c
- GPU-based preconditioners
- Add PCApplyHermitianTranspose() and KSPSolveHermitianTranspose(); a sketch of the conjugation identity follows.
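A minimal sketch of the identity these could be built on: A^H x = b is equivalent to A^T conj(x) = conj(b), so a Hermitian-transpose solve reduces to the existing transpose solve plus vector conjugations (the function name here is the proposed one; a real version would live inside KSP):

    #include <petscksp.h>

    /* Solve A^H x = b via A^T conj(x) = conj(b). */
    static PetscErrorCode KSPSolveHermitianTranspose_Sketch(KSP ksp, Vec b, Vec x)
    {
      PetscErrorCode ierr;
      ierr = VecConjugate(b);CHKERRQ(ierr);              /* b <- conj(b)            */
      ierr = KSPSolveTranspose(ksp, b, x);CHKERRQ(ierr); /* x solves A^T x = conj(b) */
      ierr = VecConjugate(x);CHKERRQ(ierr);              /* x <- conj(x)            */
      ierr = VecConjugate(b);CHKERRQ(ierr);              /* restore the caller's b  */
      return 0;
    }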
- Interface to CUBLAS for dense linear algebra on GPUs
- Improve the PETSc website so that searches for, for example, MatGetSubMatrix() on Google always go to the latest PETSc docs and not to out-of-date copies on other sites.
- Interface to Elemental for modern distributed-memory dense linear algebra
- Vec and Mat classes based on pthread parallelism; a sketch of a threaded kernel follows.
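A minimal sketch of the kind of kernel such classes would sit on: a pthread-parallel AXPY that splits the index range across a small thread pool (the even chunking is illustrative; a real implementation would keep persistent threads):

    #include <pthread.h>

    typedef struct {          /* one thread's slice of y += alpha*x */
      const double *x;
      double       *y;
      double        alpha;
      int           lo, hi;
    } AxpySlice;

    static void *axpy_worker(void *arg)
    {
      AxpySlice *s = (AxpySlice *)arg;
      for (int i = s->lo; i < s->hi; i++) s->y[i] += s->alpha * s->x[i];
      return NULL;
    }

    /* y += alpha*x on n entries using nt threads (sketch: nt <= 16). */
    static void vec_axpy_pthread(int n, double alpha, const double *x, double *y, int nt)
    {
      pthread_t th[16];
      AxpySlice sl[16];
      for (int t = 0; t < nt; t++) {
        sl[t] = (AxpySlice){x, y, alpha, t * n / nt, (t + 1) * n / nt};
        pthread_create(&th[t], NULL, axpy_worker, &sl[t]);
      }
      for (int t = 0; t < nt; t++) pthread_join(th[t], NULL);
    }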
- Turn MPICH, Open MPI, and then PETSc into Apple Frameworks.
- Make a Mat subclass for tridiagonal matrices (and block tridiagonal) and write fast sequential solvers for them. Also do fast parallel tridiagonal solvers. A sketch of the sequential kernel follows.
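The sequential kernel is the standard Thomas algorithm; a minimal sketch (no pivoting, so it assumes a diagonally dominant or otherwise safely factorable matrix; b and d are overwritten as scratch):

    /* Thomas algorithm for a[i] x[i-1] + b[i] x[i] + c[i] x[i+1] = d[i],
       with a[0] and c[n-1] unused: forward elimination, then back
       substitution, O(n) total. */
    static void thomas_solve(int n, const double *a, double *b,
                             const double *c, double *d, double *x)
    {
      for (int i = 1; i < n; i++) {    /* eliminate the subdiagonal */
        double m = a[i] / b[i-1];
        b[i] -= m * c[i-1];
        d[i] -= m * d[i-1];
      }
      x[n-1] = d[n-1] / b[n-1];
      for (int i = n - 2; i >= 0; i--) /* back substitution */
        x[i] = (d[i] - c[i] * x[i+1]) / b[i];
    }

The parallel version would instead use something like cyclic reduction or a partitioned (SPIKE-style) factorization, since this recurrence is inherently sequential.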
- Convert KSPSPECEST to something like
    -ksp_type chebyshev
    -ksp_chebyshev_precompute_parameters
    -ksp_chebyshev_precompute_ksp_type cg
    -ksp_chebyshev_precompute_minfactor .95
- KSPSolve() and SNESSolve() should return bounds on the error and some measure of the sensitivity of the solution to perturbations in the problem data; a sketch of a residual-based bound follows.
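One candidate, as a sketch: the classical residual bound ||x - x*||_2 <= ||r||_2 / sigma_min(A), fed by the singular-value estimates PETSc already extracts from the Krylov iteration. The estimates are rough, so the resulting "bound" is itself approximate, and KSPSetComputeSingularValues() must be called before the solve:

    #include <petscksp.h>

    /* After KSPSolve(), estimate ||x - x_exact|| <= ||r|| / sigma_min(A).
       Requires KSPSetComputeSingularValues(ksp, PETSC_TRUE) before the solve. */
    static PetscErrorCode KSPEstimateErrorBound(KSP ksp, Mat A, Vec b, Vec x, PetscReal *bound)
    {
      PetscErrorCode ierr;
      Vec            r;
      PetscReal      rnorm, emax, emin;

      ierr = VecDuplicate(b, &r);CHKERRQ(ierr);
      ierr = MatMult(A, x, r);CHKERRQ(ierr);     /* r = A x     */
      ierr = VecAYPX(r, -1.0, b);CHKERRQ(ierr);  /* r = b - A x */
      ierr = VecNorm(r, NORM_2, &rnorm);CHKERRQ(ierr);
      ierr = KSPComputeExtremeSingularValues(ksp, &emax, &emin);CHKERRQ(ierr);
      *bound = rnorm / emin;                     /* ||e|| <= ||r|| / sigma_min */
      ierr = VecDestroy(&r);CHKERRQ(ierr);
      return 0;
    }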
- MatCreateSeqCUFFT() and MatCreateSeqFFTW() should both be made subclasses of an abstract Mat class for FFTs, allowing command-line switching between them etc. so that user code makes no mention of the particular FFT package being used. A sketch of the selection front end follows.
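A minimal sketch of what the shared front end could look like, dispatching on an options-database key; the option name -fft_package is an assumption, and the two constructors are assumed to share the (comm, ndim, dim[], &A) calling sequence of the existing sequential ones:

    #include <petscmat.h>
    #include <string.h>

    /* Hypothetical backend-neutral FFT Mat constructor: the user says
       -fft_package fftw|cufft and never names the package in code. */
    static PetscErrorCode MatCreateSeqFFT_Sketch(MPI_Comm comm, PetscInt ndim,
                                                 const PetscInt dim[], Mat *A)
    {
      PetscErrorCode ierr;
      char           pkg[64] = "fftw";   /* default backend */
      PetscBool      flg;

      ierr = PetscOptionsGetString(PETSC_NULL, "-fft_package", pkg, sizeof(pkg), &flg);CHKERRQ(ierr);
      if (!strcmp(pkg, "cufft")) {
        ierr = MatCreateSeqCUFFT(comm, ndim, dim, A);CHKERRQ(ierr);
      } else {
        ierr = MatCreateSeqFFTW(comm, ndim, dim, A);CHKERRQ(ierr);
      }
      return 0;
    }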
- Remove all mention of Sean from the PETSc repositories.