Energy‐efficient architecture for CNNs inference on heterogeneous FPGA
Spagnolo Fanny;Perri Stefania;Frustaci F.;Corsonello P.
2020-01-01
Abstract
Owing to their huge computational and memory requirements, implementing energy-efficient, high-performance Convolutional Neural Networks (CNNs) on embedded systems remains a major challenge for hardware designers. This paper presents the complete design of a heterogeneous embedded system built on a Field-Programmable Gate Array (FPGA) System-on-Chip (SoC) and suited to accelerating CNN inference in power-constrained environments, such as IoT applications. The proposed architecture is validated by running large-scale CNNs on low-cost devices. The prototype, realized on a Zynq XC7Z045 device, achieves a power efficiency of up to 135 Gops/W. When the VGG-16 model is inferred, a frame rate of up to 11.8 fps is reached.