Integral image (IIM) is an intermediate image representation, employed in several computer vision algorithms. Although only simple arithmetic operations are required to compute an IIM, the total number of additions increases quadratically with the input image size. For this reason, the design of hardware architectures able to accelerate the IIM computation receives a great deal of attention. Unfortunately, existing solutions are not appropriate for the integration within high-performance embedded systems, which are currently realized within modern heterogeneous CPU-FPGA System on Chips (SoCs). In this paper, we present a novel hardware architecture for accelerating the IIM computation. The proposed design outperforms existing competitors by parallelizing operations along both rows and columns of the input image. Experiments, conducted on a Zynq-7000 XC7Z020 SoC, demonstrate that the novel accelerator achieves a speed per computation unit up to 124 times higher than prior works, saving more than 70Mbits of on-chip memory resources for 1920×1080 frame resolutions.

Efficient Architecture for Integral Image Computation on Heterogeneous FPGAs

Spagnolo Fanny;Corsonello P.;Perri Stefania
2019-01-01

Abstract

Integral image (IIM) is an intermediate image representation, employed in several computer vision algorithms. Although only simple arithmetic operations are required to compute an IIM, the total number of additions increases quadratically with the input image size. For this reason, the design of hardware architectures able to accelerate the IIM computation receives a great deal of attention. Unfortunately, existing solutions are not appropriate for the integration within high-performance embedded systems, which are currently realized within modern heterogeneous CPU-FPGA System on Chips (SoCs). In this paper, we present a novel hardware architecture for accelerating the IIM computation. The proposed design outperforms existing competitors by parallelizing operations along both rows and columns of the input image. Experiments, conducted on a Zynq-7000 XC7Z020 SoC, demonstrate that the novel accelerator achieves a speed per computation unit up to 124 times higher than prior works, saving more than 70Mbits of on-chip memory resources for 1920×1080 frame resolutions.
2019
978-1-7281-3549-6
embedded vision systems; FPGA SoC; Integral image; parallel computing
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.11770/299406
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 5
  • ???jsp.display-item.citation.isi??? 4
social impact