Community Research and Development Information Service - CORDIS


Benchmarks were run on several recent PCs with fast and slow memory, and the results were compared with those from computers traditionally used for scientific computing, such as the Compaq Alpha and the NEC SX-5. The tests comprise a sparse matrix-vector multiplication kernel, whose theoretical peak performance is set by the memory bandwidth; a matrix-matrix multiplication kernel, which approaches peak processor performance; and two complete plasma physics programs. The results show that PC compilers have not yet reached the maturity of those for workstations or vector computers, so PC optimisations have to be performed by hand. Specifically, vector operations must have stride 1, do loops have to be kept small, and cache optimisation can strongly improve performance. Parallelisation of an application on clusters of such PCs should be started only after single-processor optimisation.
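
The abstract does not reproduce the benchmark kernels. As a minimal sketch of two of the hand optimisations it names (stride-1 access and cache optimisation), the C fragment below contrasts a matrix-matrix product whose inner loop reads one operand with stride n against a blocked variant whose inner loop is stride 1. The function names, the row-major layout and the block size parameter bs are assumptions made for illustration, and C is used here although the mention of do loops suggests the original codes were written in Fortran.

```c
#include <assert.h>
#include <stddef.h>

/* Reference kernel: c = a * b for n x n row-major matrices.
   The inner loop reads b with stride n (one element per row),
   which is the access pattern the abstract advises against. */
void matmul_strided(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];   /* b: stride n */
            c[i * n + j] = sum;
        }
    }
}

/* Blocked kernel: same product, computed tile by tile.
   bs is a hypothetical tuning parameter; it would be chosen so that
   the tiles in use fit in cache. The inner loop runs over j, so both
   b and c are accessed with stride 1. */
void matmul_blocked(size_t n, size_t bs,
                    const double *a, const double *b, double *c)
{
    assert(bs > 0);

    for (size_t i = 0; i < n * n; i++)
        c[i] = 0.0;

    for (size_t ii = 0; ii < n; ii += bs) {
        size_t imax = (ii + bs < n) ? ii + bs : n;
        for (size_t kk = 0; kk < n; kk += bs) {
            size_t kmax = (kk + bs < n) ? kk + bs : n;
            for (size_t jj = 0; jj < n; jj += bs) {
                size_t jmax = (jj + bs < n) ? jj + bs : n;
                /* Short inner loops over one tile. */
                for (size_t i = ii; i < imax; i++) {
                    for (size_t k = kk; k < kmax; k++) {
                        double aik = a[i * n + k];
                        for (size_t j = jj; j < jmax; j++)
                            c[i * n + j] += aik * b[k * n + j]; /* stride 1 */
                    }
                }
            }
        }
    }
}
```

Both functions accumulate each element of c in the same order of k, so they should return identical results; any speed difference depends on the cache sizes and compiler of the machine at hand and has to be measured there.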

Additional information

Authors: COOPER W A, École Polytechnique Fédérale de Lausanne (CH); GRUBER R, École Polytechnique Fédérale de Lausanne (CH); TRAN T-M, École Polytechnique Fédérale de Lausanne (CH)
Bibliographic Reference: An article published in: EPFL Supercomputing Review, no. 13, May 2002, pp. 33-36
Availability: The full text of this article is available online at:
Record Number: 200215021 / Last updated on: 2002-07-26
Original language: en
Available languages: en