An efficient sparse matrix-vector multiplication on CUDA-enabled graphic processing units for finite element method simulations


Altınkaynak A.

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING, vol.110, no.1, pp.57-78, 2017 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 110 Issue: 1
  • Publication Date: 2017
  • DOI: 10.1002/nme.5346
  • Journal Name: INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.57-78
  • Keywords: finite element method, sparse matrix-vector multiplication, graphic processing unit programming, CUDA, PERFORMANCE
  • Istanbul Technical University Affiliated: Yes

Abstract

The finite element method (FEM) is a well-established method for solving real-world problems that can be modeled with differential equations. As the available computational power increases, larger and more complex problems can be solved using FEM; these typically involve multiple degrees of freedom (DOF) per node, high-order elements, and an iterative solver requiring several sparse matrix-vector multiplication operations. In this work, a new storage scheme is proposed for sparse matrices arising from FEM simulations with multiple DOF per node. A sparse matrix-vector multiplication kernel and its variants using the proposed scheme are also given for CUDA-enabled GPUs. The proposed scheme and the kernels rely on the mesh connectivity data from the FEM discretization and the number of DOF per node. The performance of the proposed kernel was evaluated on seven test matrices using double-precision floating-point operations. The performance analysis showed that the proposed GPU kernel outperforms the ELLPACK (ELL) and CUSPARSE Hybrid (HYB) format GPU kernels by an average of 42% and 32%, respectively, on a Tesla K20c card. Copyright (c) 2016 John Wiley & Sons, Ltd.
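
For readers unfamiliar with the ELLPACK (ELL) format used as a baseline in the abstract, the following is a minimal CPU reference sketch of ELL sparse matrix-vector multiplication in C++. It is not the storage scheme or kernel proposed in the paper, whose details are not given in this abstract. The names `EllMatrix` and `ell_spmv`, the use of -1 to mark padding, and the column-major slot layout are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ELLPACK (ELL) storage: every row is padded to the same number of entries.
// Entry k of row r is stored at index k * num_rows + r (column-major), the
// layout commonly chosen so that neighbouring rows sit at neighbouring addresses.
// All names here are hypothetical and chosen for this sketch.
struct EllMatrix {
    std::size_t num_rows = 0;
    std::size_t max_nnz_per_row = 0;
    std::vector<int> col_idx;    // num_rows * max_nnz_per_row entries; -1 marks padding
    std::vector<double> values;  // same size as col_idx; padding slots hold 0.0
};

// Computes y = A * x. On a GPU, each iteration of the outer loop would
// typically be handled by one thread (one thread per row).
std::vector<double> ell_spmv(const EllMatrix& A, const std::vector<double>& x) {
    assert(A.col_idx.size() == A.num_rows * A.max_nnz_per_row);
    assert(A.values.size() == A.col_idx.size());

    std::vector<double> y(A.num_rows, 0.0);
    for (std::size_t r = 0; r < A.num_rows; ++r) {
        double sum = 0.0;
        for (std::size_t k = 0; k < A.max_nnz_per_row; ++k) {
            const std::size_t slot = k * A.num_rows + r;
            const int c = A.col_idx[slot];
            if (c >= 0) {  // skip padding slots
                sum += A.values[slot] * x[static_cast<std::size_t>(c)];
            }
        }
        y[r] = sum;
    }
    return y;
}
```

In this layout, rows with fewer nonzeros than `max_nnz_per_row` waste storage and work on padding. That overhead is one reason alternative formats, such as the hybrid format and the connectivity-based scheme the abstract describes, are of interest for FEM matrices.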