Approximate Down-Sampling Strategy for Power-Constrained Intelligent Systems

Fanny Spagnolo; Stefania Perri; Pasquale Corsonello
2022-01-01

Abstract

In modern power-constrained applications, such as most of those belonging to the Internet-of-Things world, custom hardware platforms are increasingly adopted to deploy artificial intelligence algorithms. In these operating environments, limiting the power dissipation as much as possible is mandatory, even at the expense of reduced computational accuracy. In this paper, we propose a novel prediction method that identifies the potentially predominant features in convolutional layers followed by down-sampling layers, thus reducing the overall number of convolution calculations. This approximate down-sampling strategy has been exploited to design a custom hardware architecture for the inference of Convolutional Neural Network (CNN) models. When applied to several benchmark CNN models, the proposed approach achieves an overall energy saving of up to 70% with an accuracy loss lower than 3% with respect to baseline designs. The performed experiments demonstrate that, when used to infer the Visual Geometry Group-16 (VGG16) network model, the proposed architecture dissipates only 680 mJ/frame on a Xilinx Z-7045 chip and 21.9 mJ/frame in the STM 28-nm process technology. In both cases, the novel design outperforms several state-of-the-art competitors in terms of the energy-accuracy drop product.
Approximate computing, convolutional neural networks, low-power hardware architectures, pooling layers.
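The abstract does not detail how the predominant feature is predicted. Purely as an illustrative aid, the sketch below shows one possible realization of the general idea for a convolution layer followed by max-pooling: a cheap, low-precision convolution is used to guess which position of each pooling window will win, and only that position receives the full-precision convolution. The function name, the truncation-based predictor, and the NumPy setting are assumptions made for illustration, not the architecture described in the paper.

import numpy as np

def predict_then_convolve(x, kernel, pool=2, predict_bits=4):
    # Hypothetical sketch: for every pool x pool window of the convolution
    # output, a low-precision convolution predicts which position would win
    # the subsequent max-pooling; only that position gets the exact
    # convolution, so the other pool*pool - 1 convolutions are skipped.
    # The truncation-based predictor is an assumption, not the paper's method.
    kH, kW = kernel.shape
    H, W = x.shape
    outH, outW = H - kH + 1, W - kW + 1          # "valid" convolution size
    pooledH, pooledW = outH // pool, outW // pool

    shift = 8 - predict_bits                     # keep only the top bits
    x_lp = x.astype(np.int32) >> shift
    k_lp = kernel.astype(np.int32) >> shift

    out = np.zeros((pooledH, pooledW), dtype=np.int64)
    for i in range(pooledH):
        for j in range(pooledW):
            # 1) cheap prediction of the predominant feature in the window
            scores = np.empty((pool, pool), dtype=np.int64)
            for di in range(pool):
                for dj in range(pool):
                    r, c = i * pool + di, j * pool + dj
                    scores[di, dj] = np.sum(x_lp[r:r + kH, c:c + kW] * k_lp)
            di, dj = np.unravel_index(np.argmax(scores), scores.shape)
            # 2) exact convolution only at the predicted winner
            r, c = i * pool + di, j * pool + dj
            window = x[r:r + kH, c:c + kW].astype(np.int64)
            out[i, j] = np.sum(window * kernel.astype(np.int64))
    return out

# Example: 8-bit feature map, signed 8-bit kernel, 2x2 max-pooling window.
rng = np.random.default_rng(0)
fmap = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
kern = rng.integers(-128, 128, size=(3, 3), dtype=np.int8)
approx_pooled = predict_then_convolve(fmap, kern)

In this toy setting, only one exact convolution per pooling window is computed instead of four, which conveys (under the stated assumptions) how skipping convolutions ahead of down-sampling trades a small accuracy loss for a large reduction in arithmetic work.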

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/328592

Citations
  • Scopus: 4
  • Web of Science (ISI): 4