CAM4: In-Memory Viral Pathogen Genome Classification using Similarity Search Dynamic Content-Addressable Memory

Garzon, Esteban
2025-01-01

Abstract

We present CAM4, a novel similarity search content-addressable memory based on embedded dynamic storage. CAM4 is designed for in-memory computational genomics applications, particularly the identification and classification of pathogen DNA. CAM4 employs a novel gain cell design and one-hot encoding of DNA bases to address retention time variations and to mitigate potential data loss from pulldown leakage and soft errors in embedded DRAM. CAM4 features overhead-free refresh and data upload, allowing simultaneous search and refresh without performance degradation. Its approximate search adapts to a variety of industrial sequencers with different error profiles. When classifying DNA reads with a 10% error rate, it achieves, on average, a 25% higher F1 score than the MetaCache-GPU and Kraken2 DNA classification tools. Simulated at 1 GHz, CAM4 provides average speedups of 1,412× and 1,040× over MetaCache-GPU and Kraken2, respectively.
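As a rough software illustration of the two ideas named in the abstract, one-hot encoding of DNA bases and approximate (similarity) search, the sketch below encodes each base as a 4-bit one-hot code and returns the stored rows that differ from a query in at most a given number of positions. The abstract does not describe how CAM4 computes a match, so the mismatch-count criterion, the bit assignment, and all names (`ONE_HOT`, `encode`, `mismatches`, `similarity_search`) are assumptions made for illustration, not the CAM4 circuit or its interface.

```python
# Illustrative software model only. The bit assignment, the mismatch-count
# criterion, and every name here are hypothetical, not taken from CAM4.
ONE_HOT = {"A": 0b0001, "C": 0b0010, "G": 0b0100, "T": 0b1000}


def encode(seq):
    """Return the 4-bit one-hot code of each base in seq."""
    return [ONE_HOT[base] for base in seq.upper()]


def mismatches(stored, query):
    """Count positions at which two equal-length sequences differ."""
    if len(stored) != len(query):
        raise ValueError("sequences must have equal length")
    # Two one-hot codes share a set bit only when the bases are identical.
    return sum(1 for s, q in zip(encode(stored), encode(query)) if s & q == 0)


def similarity_search(rows, query, threshold):
    """Return indices of rows within `threshold` mismatches of the query."""
    return [i for i, row in enumerate(rows) if mismatches(row, query) <= threshold]


reference = ["ACGTACGT", "ACGTTCGT", "TTTTGGGG"]
hits_exact = similarity_search(reference, "ACGTACGT", threshold=0)
hits_approx = similarity_search(reference, "ACGTACGT", threshold=1)
```

With `threshold=0` the search behaves like an exact-match lookup; raising the threshold tolerates more sequencing errors per read, which is the kind of tuning the abstract alludes to for sequencers with different error profiles.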
Content Addressable Memory
GC-eDRAM
Pathogen classification
Pathogen detection
Processing in memory
Similarity search
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/384781