Efficient addition circuits for modular design of processors-in-memory

CORSONELLO, Pasquale; PERRI, Stefania
2005-01-01

Abstract

This paper presents the design of a new dynamic modular addition circuit optimized for integration into high-speed, low-power processors-in-memory (PIMs). The proposed architecture is based on a hybrid ripple-carry/carry-lookahead/carry-bypass approach. To reach the required computational speed while limiting power dissipation, the circuit is divided into two independent submodules interfaced through dynamic latches. Furthermore, the proposed adder operates in single-instruction multiple-data (SIMD) fashion and can therefore handle different operand wordlengths. Our PIM architecture is based on slices containing 16-bit adders, so the main design specification is to minimize the speed penalty caused by cascading 16-bit blocks. Using a bulk CMOS UMC 0.18-μm 1.8-V process, the optimized 64-bit version of the proposed circuit, obtained as a rippling chain of four 16-bit blocks, shows a power-delay product of only 38.8 pJ*ns and requires fewer than 4300 transistors.
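
As a purely behavioral illustration of the SIMD operation described above, the following Python sketch models a 64-bit adder built from four cascaded 16-bit slices, where the carry ripples from one slice to the next only inside an operand lane. It says nothing about the actual circuit (dynamic logic, the hybrid carry scheme, or the latched submodules). The function name `simd_add`, its parameters, and the lane widths used in the example (16, 32, and 64 bits) are assumptions made for illustration, not details taken from the paper.

```python
SLICE_BITS = 16                      # slice width stated in the abstract
SLICE_MASK = (1 << SLICE_BITS) - 1


def simd_add(a: int, b: int, lane_bits: int, total_bits: int = 64) -> int:
    """Add packed operands lane by lane (hypothetical behavioral model).

    Each lane is made of one or more 16-bit slices. The carry out of a
    slice feeds the next slice only when both belong to the same lane;
    at a lane boundary the carry is discarded.
    """
    if lane_bits <= 0 or lane_bits % SLICE_BITS or total_bits % lane_bits:
        raise ValueError("lane_bits must be a multiple of 16 that divides total_bits")

    slices_per_lane = lane_bits // SLICE_BITS
    result = 0
    carry = 0
    for i in range(total_bits // SLICE_BITS):
        if i % slices_per_lane == 0:
            carry = 0                # new lane: do not propagate the carry
        shift = i * SLICE_BITS
        s = ((a >> shift) & SLICE_MASK) + ((b >> shift) & SLICE_MASK) + carry
        result |= (s & SLICE_MASK) << shift
        carry = s >> SLICE_BITS      # carry out of this 16-bit slice
    return result


if __name__ == "__main__":
    x = 0x0001_FFFF_0001_FFFF
    y = 0x0001_0001_0001_0001
    print(hex(simd_add(x, y, 16)))   # four independent 16-bit additions
    print(hex(simd_add(x, y, 32)))   # two independent 32-bit additions
    print(hex(simd_add(x, y, 64)))   # one 64-bit addition
```

In this model, changing `lane_bits` only changes where the inter-slice carry is blocked, which is one simple way to picture how a chain of 16-bit blocks can serve several operand wordlengths.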

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/149531