A Performance Analysis of Leading Many-Core Technologies for Cellular Automata Execution

De Rango, A.; D'Ambrosio, D.; Senatore, A.; Mendicino, G.; Narasimhan, K.; Goli, M.; Burns, R.

doi:10.1007/978-3-031-50684-0_21

We extend the range of performance analyses of CUDA, OpenCL and SYCL for the execution of Cellular Automata. To this end, we apply the SciddicaT landslide model to a real event, considering two complex topographic surfaces of different granularity, which yields two simulations with different computational loads. For each technology, we developed one global-memory and two tiled implementations of SciddicaT, using the Nvidia nvcc compiler for CUDA, the Nvidia implementation of the OpenCL standard, and the CUDA back-end of the Intel DPC++ compiler for SYCL. The experiments, performed on three Nvidia accelerators, show that SYCL performance relative to CUDA ranges from good to optimal, depending on how recent the device architecture is. The Roofline analysis reveals strong cache effects and indicates that tiled implementations offer greater advantages on older architectures.
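For readers unfamiliar with the Roofline model mentioned in the abstract, the following is a minimal sketch of its standard bound: attainable performance is the smaller of the compute peak and the product of memory bandwidth and arithmetic intensity. The function names and the device figures are hypothetical, chosen only for illustration; they are not taken from the paper or from any of the accelerators it evaluates.

```python
def attainable_gflops(peak_gflops: float, bandwidth_gbs: float, intensity: float) -> float:
    """Roofline bound: min(compute peak, bandwidth * arithmetic intensity).

    intensity is the arithmetic intensity in FLOP per byte moved.
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


def ridge_point(peak_gflops: float, bandwidth_gbs: float) -> float:
    """Arithmetic intensity at which a kernel stops being memory-bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical device figures (assumptions, not measurements from the paper).
PEAK_GFLOPS = 10000.0
BANDWIDTH_GBS = 500.0

for ai in (0.25, 4.0, 64.0):
    bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, ai)
    regime = "memory-bound" if ai < ridge_point(PEAK_GFLOPS, BANDWIDTH_GBS) else "compute-bound"
    print(f"intensity={ai:6.2f} FLOP/B  bound={bound:8.1f} GFLOP/s  ({regime})")
```

Stencil-like kernels such as Cellular Automata transition functions typically have low arithmetic intensity, so they tend to sit on the bandwidth-limited slope of the roofline; this is the regime in which caching and tiling can change the effective bandwidth a kernel sees.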


2024-01-01


Year: 2024
ISBN: 9783031506833, 9783031506840
Keywords: Cellular Automata; CUDA vs OpenCL vs SYCL; Data-Parallel Structured-Grid; Fluid-Flow Simulation; Roofline
Appears in types: 4.1 Contribution in conference proceedings


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/376361
