AM5: Bulk Logic-in-Memory Using MRAM NAND Crossbar

Garzon E.; Lanuzza M.
In press

Abstract

Application domains such as machine learning and big data analytics pose significant computational challenges for contemporary von Neumann architectures. To address this issue, logic-in-memory (LiM) has emerged as a promising alternative that performs computation within memory arrays, with the aim of alleviating the memory wall, optimizing data transfer, and enabling massive parallelism. Spin-transfer torque magnetic tunnel junction (STT-MTJ) based memory is an emerging technology that enables efficient in-memory processing. This paper proposes AM5, a novel LiM architecture that leverages MRAM NAND crossbar technology to support in-memory arithmetic operations efficiently. The proposed LiM scheme is designed using a commercial 28 nm process node and a Verilog-A-based double-barrier MTJ (DMTJ) compact model. Evaluation results show that AM5 consumes about 98 fJ per DMTJ in an evaluation cycle (4.2-4.4 ns) and 40.7 fJ per DMTJ in a write cycle (1.9 ns). Additionally, the proposed architecture includes an in-situ error correction mechanism to mitigate variability, yielding reliable arithmetic operations. Compared to other MTJ-based LiM designs, AM5 achieves better energy (∼6× lower, on average) and competitive latency (∼1.3× faster, on average). When used as a LiM unit to perform inference in adder attention Vision Transformer networks, AM5 consumes about one-tenth of the energy required by a processor-centric unit.
AI
Associative Processor
CAM
DMTJ
eNVM
LiM
MRAM
MTJ
PiM
TCAM

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/405038