Efficient Architecture for Integral Image Computation on Heterogeneous FPGAs

IRIS

Integral image (IIM) is an intermediate image representation, employed in several computer vision algorithms. Although only simple arithmetic operations are required to compute an IIM, the total number of additions increases quadratically with the input image size. For this reason, the design of hardware architectures able to accelerate the IIM computation receives a great deal of attention. Unfortunately, existing solutions are not appropriate for the integration within high-performance embedded systems, which are currently realized within modern heterogeneous CPU-FPGA System on Chips (SoCs). In this paper, we present a novel hardware architecture for accelerating the IIM computation. The proposed design outperforms existing competitors by parallelizing operations along both rows and columns of the input image. Experiments, conducted on a Zynq-7000 XC7Z020 SoC, demonstrate that the novel accelerator achieves a speed per computation unit up to 124 times higher than prior works, saving more than 70Mbits of on-chip memory resources for 1920×1080 frame resolutions.

Efficient Architecture for Integral Image Computation on Heterogeneous FPGAs

Spagnolo Fanny;Corsonello P.;Perri Stefania

2019-01-01

Abstract

Integral image (IIM) is an intermediate image representation, employed in several computer vision algorithms. Although only simple arithmetic operations are required to compute an IIM, the total number of additions increases quadratically with the input image size. For this reason, the design of hardware architectures able to accelerate the IIM computation receives a great deal of attention. Unfortunately, existing solutions are not appropriate for the integration within high-performance embedded systems, which are currently realized within modern heterogeneous CPU-FPGA System on Chips (SoCs). In this paper, we present a novel hardware architecture for accelerating the IIM computation. The proposed design outperforms existing competitors by parallelizing operations along both rows and columns of the input image. Experiments, conducted on a Zynq-7000 XC7Z020 SoC, demonstrate that the novel accelerator achieves a speed per computation unit up to 124 times higher than prior works, saving more than 70Mbits of on-chip memory resources for 1920×1080 frame resolutions.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2019
			
	Codice ISBN
	
				978-1-7281-3549-6
			
	Parole chiave
	
				embedded vision systems; FPGA SoC; Integral image; parallel computing
			
	Appare nelle tipologie:
	
				4.1 Contributo in Atti di convegno

File in questo prodotto:

Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.11770/299406

Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni

ND

6

4

social impact