GPU accelerated initialization of local maximum-entropy meshfree methods for vibrational and acoustic problems

This paper presents an efficient strategy for the matrix assembly procedure in a Galerkin implementation of local maximum-entropy (LME) meshfree schemes, using graphics processing units (GPUs) as massively parallel accelerators. LME basis functions show excellent performance in the simulation of vibrational and acoustic problems described by the Helmholtz equation. However, even with a locally truncated support, their evaluation requires a significantly higher number of neighbors than finite elements and other meshfree methods, which poses several challenges for a computationally efficient allocation and filling of the required sparse matrix structures. The proposed algorithm relies on a clustering strategy and is structured to exploit the massive parallelism of GPU architectures. Numerical examples demonstrate that this strategy enables a substantial performance boost, deriving from the synergistic effect of the higher computational throughput and typically larger memory bandwidth of GPUs compared to conventional CPUs. For the more demanding stage, we report speedups of up to 1035X when using a Titan X GPU hosted in a dedicated workstation, and a more modest yet substantial acceleration of up to 91X when using a mobile workstation. This opens up the possibility of handling industrially relevant applications not only on dedicated high-performance computing infrastructures but also on commodity hardware.
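To make the cost of evaluating LME basis functions more concrete, the following is a minimal one-dimensional sketch in plain Python of the standard LME formulation (a Gaussian-weighted partition of unity whose Lagrange multiplier is found by Newton iteration). It is not the GPU algorithm or the clustering strategy of the paper; the function names, the parameter `beta`, the tolerance, and the node layout are assumptions chosen for illustration only.

```python
import math


def _lme_weights(x, nodes, beta, lam):
    """Return normalized LME weights and offsets (x - x_a) for a given multiplier."""
    d = [x - xa for xa in nodes]
    # Exponents of the unnormalized basis functions.
    e = [-beta * di * di + lam * di for di in d]
    m = max(e)  # shift by the maximum for numerical stability
    w = [math.exp(ei - m) for ei in e]
    z = sum(w)
    return [wi / z for wi in w], d


def lme_basis_1d(x, nodes, beta, tol=1e-12, max_iter=50):
    """Evaluate 1D LME basis functions at x (hypothetical helper, illustration only).

    The multiplier lam minimizes log Z(x, lam); at the minimum the residual
    r = sum_a p_a * (x - x_a) vanishes, which enforces first-order consistency.
    """
    lam = 0.0
    for _ in range(max_iter):
        p, d = _lme_weights(x, nodes, beta, lam)
        r = sum(pi * di for pi, di in zip(p, d))  # gradient of log Z
        if abs(r) < tol:
            return p
        # Second derivative of log Z with respect to lam (a weighted variance).
        j = sum(pi * di * di for pi, di in zip(p, d)) - r * r
        lam -= r / j  # Newton update
    raise RuntimeError("Newton iteration did not converge")
```

Calling `lme_basis_1d(0.4, [0.0, 0.25, 0.5, 0.75, 1.0], beta=28.8)` returns one non-negative weight per node; the weights sum to one and reproduce the evaluation point as a weighted average of the node positions. Every node within the truncation radius contributes a nonzero weight, so each evaluation point couples to many neighbors, which is the source of the dense sparsity pattern the abstract refers to.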

Cosco, F.; Greco, F.; Desmet, W.; Mundo, D.

2020-01-01

Year: 2020

Keywords: Vibrational and acoustic analysis; Maximum-entropy; Meshfree; Matrix assembly; GPU acceleration; CUDA

Publication type: 1.1 Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/304561
