Gene expression visualization and clinical diagnosis

Introduction

Data-driven disease classification can be extremely useful in medical diagnosis, which depends on a complex interaction of many clinical, biological, and pathological variables. Notably, properly analyzing information drawn from different "instances", such as gene expression or other clinical data, makes it possible to discover relevant information and distinctive attributes related to specific diseases.

In general, an accurate diagnosis of a specific disease requires reviewing the patient's whole medical history. Even though many advances have been made in disease monitoring, domain experts are still required to perform direct analyses to reach a precise classification, which entails significant effort and cost.

Our Research

In our research, we aim to facilitate the task of disease classification by proposing a new automated approach, which takes advantage of data visualization techniques to convert the massive amount of information into images and improve both data readability and classification quality.

Our research consists of three main steps:

Data Reduction, using Principal Component Analysis (PCA), in order to reduce the dataset dimension;
Data Visualization for each patient, relying on:
- a heatmap, to visualize the proportion level of each attribute;
- a hot-spot map, based on Getis-Ord Gi*, to visualize the spatial information level of each attribute;
Classification using CNNs.
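
The data reduction step can be illustrated with a minimal sketch. It assumes NumPy, a patients-by-attributes matrix, and a plain projection onto the leading principal components; the function name `pca_reduce`, the random example data, and the choice of 5 components are illustrative assumptions, not the configuration used in our publication.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project a (patients x attributes) matrix onto its top principal components.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each attribute
    # SVD of the centered data: rows of Vt are the principal directions
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    variances = (S ** 2) / (X.shape[0] - 1)      # variance along each direction
    ratio = variances[:n_components] / variances.sum()
    return Xc @ components.T, components, ratio

# Synthetic example: 20 patients, 50 attributes (made-up data)
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
Z, components, ratio = pca_reduce(X, 5)
print(Z.shape, ratio.round(3))
```

The explained-variance ratio gives a rough indication of how much information is retained after the reduction, which can guide the choice of the number of components.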

Representing gene expression as an image
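
As a rough sketch of how one patient's attribute vector might become the two images described above, the code below builds a min-max scaled heatmap grid and a Getis-Ord Gi* z-score map. The row-by-row grid layout, the 3x3 binary neighbourhood (including the cell itself), and all function names are assumptions made for illustration; they are not necessarily the choices made in our publication.

```python
import math

def to_grid(values, cols):
    """Arrange a flat attribute vector row by row into a grid (assumed layout)."""
    if len(values) % cols != 0:
        raise ValueError("number of attributes must be a multiple of cols")
    return [list(values[i:i + cols]) for i in range(0, len(values), cols)]

def min_max_scale(grid):
    """Scale all cells to [0, 1] so they can be drawn as heatmap intensities."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    return [[(v - lo) / span if span else 0.0 for v in row] for row in grid]

def getis_ord_gi_star(grid):
    """Gi* z-score per cell, with binary weights over the 3x3 neighbourhood."""
    rows, cols = len(grid), len(grid[0])
    flat = [v for row in grid for v in row]
    n = len(flat)
    mean = sum(flat) / n
    s = math.sqrt(max(sum(v * v for v in flat) / n - mean * mean, 0.0))
    z = [[0.0] * cols for _ in range(rows)]
    if s == 0.0:                       # constant grid: no hot or cold spots
        return z
    for r in range(rows):
        for c in range(cols):
            neigh = [grid[rr][cc]
                     for rr in range(max(0, r - 1), min(rows, r + 2))
                     for cc in range(max(0, c - 1), min(cols, c + 2))]
            w = len(neigh)             # binary weights: sum(w) == sum(w^2) == w
            num = sum(neigh) - mean * w
            den = s * math.sqrt((n * w - w * w) / (n - 1))
            z[r][c] = num / den if den else 0.0
    return z

# Made-up 5x5 example with a cluster of high values in the top-left corner
values = [10, 10, 0, 0, 0,
          10, 10, 0, 0, 0,
           0,  0, 0, 0, 0,
           0,  0, 0, 0, 0,
           0,  0, 0, 0, 0]
heat = min_max_scale(to_grid(values, 5))
z = getis_ord_gi_star(heat)
```

Cells with a large positive z-score sit in a neighbourhood of high values (hot spots), while strongly negative scores mark cold spots; either grid can then be rendered as an image and passed to the classifier.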

Publications

[1] P. Bruno, F. Calimeri, A. S. Kitanidis, and E. De Momi, "Data reduction and data visualization for automatic diagnosis using gene expression and clinical data", Artificial Intelligence in Medicine, vol. 107, no. 101884, 2020. DOI: 10.1016/j.artmed.2020.101884
