FLJONES.SC740Report02

SC740 SEMINAR REPORT 02 for Frederick L. Jones

PRESENTER: Dr. Dexuan Xie

TOPIC: New Parallel SOR Methods by Domain Decomposition

OVERVIEW

Dr. Dexuan Xie first described the models for scientific computing most commonly used, i.e. partial differential equations (PDE’s). Next, Dr. Xie described both the setup for the PDE model and several solution techniques. After discussing multigrid solution techniques for PDE linear systems of equations, Dr. Xie focused on one solution technique, namely the Successive Over-Relaxation (SOR) iterative method.

Dr. Xie then described a new parallel version of the SOR method known as (PSOR) which uses both domain partitioning and interproicessor communication techniques to achieve the parallelism. Dr. Xie compared the PSOR technique with other prior versions of parallel SOR and reported a performance improvement from using PSOR.

THE SOR METHOD

The sequential version of the Successive Over-Relaxation (SOR) iterative method is an important solution technique for a class of partial differential equation (PDE) linear systems arising from finite element and finite difference approximations. For PDE’s, SOR is a multigrid finite element and finite difference equation solution technique.

Note that according to Friedman and Kandel(1994, pp.347-351), SOR is an alternative to the Gauss-Seidel scheme which includes a relaxation factor, w , as shown here: x _k ^(m) = x _k ^(m-1) + w [ r _kk ^(m) / a _kk ]. For w > 1, the overrelaxation factor speeds the convergence of the Gauss-Seidel scheme.

While SOR speeds the convergence of Gauss-Seidel solutions, the SOR sequential, iterative technique still needs to be speeded up. SOR was chosen by Dr. Xie over several other multigrid methods, because it is a way to parallelize the Gauss-Seidel scheme.

THE PSOR METHOD

PSOR uses both domain partitioning and interprocessor communication techniques to achieve parallelism. Dr. Xie compared his PSOR technique with other prior versions of parallel SOR such as Red-Black (or multicolor) SOR, and reported a performance improvement from using PSOR.

Kumar, et.al. (1994,pp.429-433) cover the parallel implementations of the Gauss-Seidel method, the Multicolor SOR method, and the Block multicolor methods. Note that according to Kumar, et.al. (1994,p.433) "The parallelization issues and the communication costs in a parellel implementation of the SOR algorithm are the same as those for the Gauss-Seidel algorithm"

The difference between Dr. Xie’s findings and Kumar, et.al. ‘s(1994,p.433) findings would seem to be due to the exact nature of the domain partitioning and interprocessor communication techniques used. Since Dr. Xie stated that he used "striped" partitioning, then I assume that the partitioning of the domain among processors was done similar to that suggested by Kumar, et.al. (1994,pp.448-451) for partitioning methods for finite element graphs.

However, since Dr. Xie also described his interprocessor communication protocol in a way that is similar to what would be used in a block-cyclic-striped mapping (See Kumar,et.al.(1994,pp.151-153)), then the interaction of the domain partitioning and the interprocessor communication might completely account for the difference in findings.

SUMMARY AND CONCLUSIONS

The SOR and PSOR Methods were covered in this seminar by Dr. Xie, as well as PSOR’s performance improvement over other prior versions of parallel SOR such as Red-Black (or multicolor) SOR. Dr. Xie’s findings differ markedly from those of Kumar, et.al. (1994,p.433) who stated that "The parallelization issues and the communication costs in a parellel implementation of the SOR algorithm are the same as those for the Gauss-Seidel algorithm".

While the difference between Dr. Xie’s findings and Kumar, et.al.‘s (1994,p.433) findings would seem to be due to the exact nature of the domain partitioning and interprocessor communication techniques used, there is some question in my mind as to the exact definition of "domain partitioning" and "interprocessor communication techniques" used.

Dr. Xie could have been using the same type of striped partitioning of the domain among processors that is suggested by Kumar, et.al. (1994,pp.448-451) for partitioning methods for finite element graphs. However, because of Dr. Xie’s description of his interprocessor communication techniques, then he could have been using a block-cyclic-striped mapping of the domain to processors. The exact nature of Dr. Xie’s domain partitioning and interprocessor communication techniques would have to better known in order to explain his performance improvements.

REFERENCES

Friedman and Kandel(1994) Fundamentals of Computer Numerical Analysis. CRC Press, Inc. Boca Raton, Florida. 1994.

Kumar,et.al.(1994) Introduction to Parallel Computing: Design and Analysis of Algorithms. The Benjamin/Cummings Publishing Co., Inc. Redwood City, California. 1994.