Designing Fast Convolutional Engines for Deep Learning Applications

Spagnolo, Fanny; Perri, Stefania; Frustaci, Fabio; Corsonello, Pasquale

2018-01-01

Abstract

Deep learning is rapidly becoming a strong driver of the already pervasive field of computer vision. State-of-the-art Convolutional Neural Networks reach accuracies comparable to those of human senses. However, their high computational load and low energy efficiency make them hard to implement on modern embedded systems. This paper evaluates several strategies for designing fast convolutional engines suitable for the hardware acceleration of Convolutional Neural Networks. When implemented within a complete embedded system based on a Zynq UltraScale+ SoC device, two of the proposed architectures achieve a peak performance of 131.6 GMAC/s at a running frequency of 234 MHz while occupying at most ∼13% of the DSP slices available on chip. All the proposed engines outperform state-of-the-art competitors, exhibiting a performance/DSP utilization ratio up to 29.6 times higher.

Year: 2018

ISBN: 9781538695623

Keywords: Convolutional Neural Networks; DSP slices; FPGA; MACs; SIMD architectures; Electrical and Electronic Engineering; Instrumentation

Appears in types: 4.1 Contribution in conference proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/290094
