Approximate Down-Sampling Strategy for Power-Constrained Intelligent Systems

Fanny Spagnolo; Stefania Perri; Pasquale Corsonello
2022-01-01

Abstract

In modern power-constrained applications, such as most of those belonging to the Internet-of-Things world, custom hardware platforms are increasingly adopted to deploy artificial intelligence algorithms. In these operating environments, limiting the power dissipation as much as possible is mandatory, even at the expense of reduced computational accuracy. In this paper, we propose a novel prediction method that identifies the potentially predominant features in convolutional layers followed by down-sampling layers, thus reducing the overall number of convolution calculations. This approximate down-sampling strategy has been exploited to design a custom hardware architecture for the inference of Convolutional Neural Network (CNN) models. When applied to several benchmark CNN models, the proposed approach achieves an overall energy saving of up to 70% with an accuracy loss lower than 3% with respect to baseline designs. The performed experiments demonstrate that, when used to infer the Visual Geometry Group-16 (VGG16) network model, the proposed architecture dissipates only 680 mJ/frame on a Xilinx Z-7045 chip and 21.9 mJ/frame in the STM 28-nm process technology. In both cases, the novel design outperforms several state-of-the-art competitors in terms of the energy-accuracy drop product.
Approximate computing, convolutional neural networks, low-power hardware architectures, pooling layers.
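The abstract does not detail how the predominant feature is predicted. Purely as an illustrative aid, the sketch below shows one possible realization of the general idea for a convolution layer followed by max-pooling: a cheap, low-precision convolution is used to guess which position of each pooling window will win, and only that position receives the full-precision convolution. The function name, the truncation-based predictor, and the NumPy setting are assumptions made for illustration, not the architecture described in the paper.

import numpy as np

def predict_then_convolve(x, kernel, pool=2, predict_bits=4):
    # Hypothetical sketch: for every pool x pool window of the convolution
    # output, a low-precision convolution predicts which position would win
    # the subsequent max-pooling; only that position gets the exact
    # convolution, so the other pool*pool - 1 convolutions are skipped.
    # The truncation-based predictor is an assumption, not the paper's method.
    kH, kW = kernel.shape
    H, W = x.shape
    outH, outW = H - kH + 1, W - kW + 1          # "valid" convolution size
    pooledH, pooledW = outH // pool, outW // pool

    shift = 8 - predict_bits                     # keep only the top bits
    x_lp = x.astype(np.int32) >> shift
    k_lp = kernel.astype(np.int32) >> shift

    out = np.zeros((pooledH, pooledW), dtype=np.int64)
    for i in range(pooledH):
        for j in range(pooledW):
            # 1) cheap prediction of the predominant feature in the window
            scores = np.empty((pool, pool), dtype=np.int64)
            for di in range(pool):
                for dj in range(pool):
                    r, c = i * pool + di, j * pool + dj
                    scores[di, dj] = np.sum(x_lp[r:r + kH, c:c + kW] * k_lp)
            di, dj = np.unravel_index(np.argmax(scores), scores.shape)
            # 2) exact convolution only at the predicted winner
            r, c = i * pool + di, j * pool + dj
            window = x[r:r + kH, c:c + kW].astype(np.int64)
            out[i, j] = np.sum(window * kernel.astype(np.int64))
    return out

# Example: 8-bit feature map, signed 8-bit kernel, 2x2 max-pooling window.
rng = np.random.default_rng(0)
fmap = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
kern = rng.integers(-128, 128, size=(3, 3), dtype=np.int8)
approx_pooled = predict_then_convolve(fmap, kern)

In this toy setting, only one exact convolution per pooling window is computed instead of four, which conveys (under the stated assumptions) how skipping convolutions ahead of down-sampling trades a small accuracy loss for a large reduction in arithmetic work.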

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/328592

Citations
  • Scopus: 4
  • Web of Science (ISI): 4