Cellular Automata on a Multi-GPU Architecture: A Technical Overview
De Rango, A;D'Ambrosio, D;Rongo, R;Mendicino, G;Spataro, W
2024-01-01
Abstract
This work focuses on the transparent execution of Cellular Automata models on a multi-GPU architecture. Although Cellular Automata models can be easily parallelized on a single GPU, the domain size and the complexity of the transition function may require the use of multiple GPUs. Our goal is to allow modellers to be completely unaware of the parallel execution context, i.e., the code implementing the Cellular Automata model remains the same regardless of whether it is executed on a CPU, a single GPU, or a multi-GPU system. This paper provides technical insights on how to ensure both transparency and efficiency in the multi-GPU execution of Cellular Automata models. In particular, it adopts an object-oriented approach in which a transparent layer abstracts the parallelization details and enforces a strong "separation of concerns" between parallel execution issues and the model implementation. Preliminary experiments have been carried out on the multi-GPU cluster CTE-POWER, available at the Barcelona Supercomputing Center (BSC), and show good speedups despite the transparency layer introduced by our approach.
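As a rough illustration of the "separation of concerns" described above, the following Python sketch writes a transition function once and runs it unchanged through two interchangeable executors. It is not the authors' implementation: the names `life_rule`, `SerialExecutor`, `PartitionedExecutor`, and `run` are hypothetical, the Game of Life rule is only a stand-in model, and the row-strip partitioning with a one-row halo merely imitates on the CPU what a multi-GPU domain decomposition might look like.

```python
from typing import Callable, List

Grid = List[List[int]]
# A transition rule sees only a read accessor and the cell coordinates.
Rule = Callable[[Callable[[int, int], int], int, int], int]


def life_rule(get: Callable[[int, int], int], r: int, c: int) -> int:
    """Conway's Game of Life, written with no knowledge of the executor."""
    alive = sum(get(r + dr, c + dc)
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0))
    return 1 if alive == 3 or (get(r, c) == 1 and alive == 2) else 0


class SerialExecutor:
    """Hypothetical backend: updates the whole domain in one pass."""

    def step(self, grid: Grid, rule: Rule) -> Grid:
        rows, cols = len(grid), len(grid[0])

        def get(r: int, c: int) -> int:
            # Cells outside the domain are treated as dead.
            return grid[r][c] if 0 <= r < rows and 0 <= c < cols else 0

        return [[rule(get, r, c) for c in range(cols)] for r in range(rows)]


class PartitionedExecutor:
    """Hypothetical backend: splits rows into strips, one per 'device',
    and gives each strip a one-row halo copied from its neighbours."""

    def __init__(self, parts: int) -> None:
        self.parts = parts

    def step(self, grid: Grid, rule: Rule) -> Grid:
        rows, cols = len(grid), len(grid[0])
        bounds = [rows * p // self.parts for p in range(self.parts + 1)]
        result: Grid = []
        for lo, hi in zip(bounds, bounds[1:]):
            # Halo exchange stand-in: copy the strip plus boundary rows.
            h_lo, h_hi = max(lo - 1, 0), min(hi + 1, rows)
            local = [row[:] for row in grid[h_lo:h_hi]]

            def get(r: int, c: int, local=local, h_lo=h_lo, h_hi=h_hi) -> int:
                # Global coordinates are mapped into the local strip.
                if h_lo <= r < h_hi and 0 <= c < cols:
                    return local[r - h_lo][c]
                return 0

            result.extend([rule(get, r, c) for c in range(cols)]
                          for r in range(lo, hi))
        return result


def run(executor, grid: Grid, rule: Rule, steps: int) -> Grid:
    """Advance the model; the rule is identical for every executor."""
    for _ in range(steps):
        grid = executor.step(grid, rule)
    return grid
```

In this sketch, switching from `run(SerialExecutor(), grid, life_rule, n)` to `run(PartitionedExecutor(4), grid, life_rule, n)` changes only the executor argument, while `life_rule` stays untouched. A real multi-GPU layer would additionally have to handle device memory, kernel launches, and inter-device halo transfers, none of which is modelled here.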