An efficient convolution engine based on the à-trous spatial pyramid pooling
Sestito C.;Spagnolo F.;Corsonello P.;Perri S.
2020-01-01
Abstract
This paper presents an efficient hardware architecture that performs 2D dilated convolutions and is suitable for integration within modern heterogeneous embedded systems targeting semantic image segmentation. The proposed design supports multiple dilation rates and uses a limited amount of resources even when large convolution windows are processed. As a case study, the novel circuit has been integrated within a Xilinx Zynq-7000 FPSoC device to accelerate a state-of-the-art CNN model for medical image segmentation. The results obtained show higher computational capability, reduced resource utilization, and lower power consumption than competing designs in the literature.
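For readers unfamiliar with the operation being accelerated, the following is a minimal software sketch of a 2D dilated (à-trous) convolution. It is not the hardware architecture described in the paper. The function names `dilated_conv2d` and `effective_window` are hypothetical, and the sketch assumes a square kernel, no padding, unit stride, and the cross-correlation form commonly used in CNNs. It illustrates why large dilation rates lead to large convolution windows: a k x k kernel with dilation rate d spans (k - 1) * d + 1 pixels per side while still using only k * k weights.

```python
def effective_window(k, rate):
    """Side length, in pixels, covered by a k x k kernel at the given dilation rate."""
    return (k - 1) * rate + 1


def dilated_conv2d(image, kernel, rate):
    """Reference 2D dilated convolution (illustrative only, not the paper's circuit).

    Assumptions: square kernel, no padding, stride 1, cross-correlation form.
    `image` and `kernel` are lists of lists of numbers.
    """
    k = len(kernel)
    span = effective_window(k, rate)
    height, width = len(image), len(image[0])
    out = []
    for y in range(height - span + 1):
        row = []
        for x in range(width - span + 1):
            acc = 0
            # Kernel taps are spaced `rate` pixels apart in both directions.
            for i in range(k):
                for j in range(k):
                    acc += kernel[i][j] * image[y + i * rate][x + j * rate]
            row.append(acc)
        out.append(row)
    return out


# Usage: a 5x5 ramp image and a 3x3 all-ones kernel.
image = [[r * 5 + c for c in range(5)] for r in range(5)]
ones = [[1] * 3 for _ in range(3)]
dense = dilated_conv2d(image, ones, 1)   # ordinary 3x3 convolution, 3x3 output
sparse = dilated_conv2d(image, ones, 2)  # 5x5 effective window, 1x1 output
```

With rate 1 the function reduces to an ordinary convolution. With rate 2 the same nine weights sample every other pixel of a 5x5 region, so the receptive field grows without any increase in the number of multiplications per output.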