CardioTRAP: Design of a Retrieval Augmented System (RAG) for Clinical Data in Cardiology

Vizza, Patrizia;Indolfi, Ciro;Veltri, Pierangelo
2025-01-01

Abstract

The development of robust data management and analysis systems that leverage Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) has significantly improved the ability to extract knowledge from very large datasets. Moreover, integrating established LLMs with custom-designed RAG systems makes it possible to handle heterogeneous, complex, multidimensional data such as biomedical information. Despite these advances, health-related data demand more reliable and precise prediction mechanisms, which in turn calls for improved data models and mechanisms. This study focuses on defining a framework able to manage high-dimensional biomedical data. The implemented system employs advanced indexing techniques to store and retrieve extensive datasets efficiently, addressing the demands of comprehensive cardiology research and analysis. It acquires multidimensional biomedical data and increases their informative value and utility by combining supervised and unsupervised learning methods, aiming at both high accuracy and practical applicability. Assessed through state-of-the-art metrics, benchmarks, and practical applications, the integration of data management and RAG systems enhances the identification of biomarkers and clinical data in health-related applications such as patient risk stratification and novel biomarker discovery. Finally, CardioTRAP applications show the importance of integrating data management and RAG systems in order to apply biomedical research results effectively in clinical practice.
2025
Electronic Health Records
Large Language Models
Retrieval-Augmented Generation systems


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/388745