Rafael S. Bressan a, Pedro H. Bugatti a, Priscila T.M. Saito a,b,∗
a Department of Computing, Federal University of Technology - Parana, PR, Brazil
b Institute of Computing, University of Campinas, SP, Brazil
Communicated by Yongdong Zhang
Breast cancer diagnosis
One of the cornerstones of content-based image retrieval (CBIR) for medical image diagnosis is selecting the images that are most similar to a given query image. Unlike previous efforts in the literature, the present work seamlessly integrates a powerful machine learning strategy, based on the active learning paradigm, to improve the efficacy of similarity queries in medical CBIR systems. To do so, we propose a new approach, named Medical Active leaRning and Retrieval (MARRow), to aid breast cancer diagnosis. It supports strategies that are more feasible in the medical context, given its inherent constraints. We also propose an active learning strategy that selects a small set of more informative images, using selection criteria based not only on similarity but also on certain degrees of diversity and uncertainty. To validate the approach, we performed experiments on public medical image datasets, with different descriptors for each one, and compared it against four widely applied and well-known approaches from the literature: traditional CBIR without relevance feedback, Query Point Movement (QPM), Query Expansion (QEX) and SVM Active Learning (SVM-AL). The experiments show that our approach outperforms these state-of-the-art approaches, reaching a precision gain of up to 87.3%. MARRow also presented a consistent rate of improvement along the learning iterations. Moreover, our approach can significantly reduce the expert’s involvement in the analysis and annotation process (by up to 88%). The results show that MARRow improves the precision of similarity queries. It makes the most of the experts’ intentions, which are captured during the relevance feedback process, to incrementally improve the learning model. Therefore, our approach is suitable for challenging real-world processes, such as medical contexts, and can enhance medical decision support systems (e.g. breast cancer diagnosis).
Over the last decades, medical image databases have been growing due to technological advances in data acquisition and storage devices. Hence, improved automatic retrieval and classification approaches [1–10] have become necessary to handle and organize such data. The content-based image retrieval (CBIR) process can be used to perform these tasks. It aims to retrieve images based on the similarity (or dissimilarity) between a given query image and the images in a dataset.
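As a minimal sketch of the retrieval step described above, the following assumes that each image has already been reduced to a numeric feature vector and that the Euclidean distance serves as the dissimilarity function. Both choices are illustrative assumptions, and the function names and toy vectors are hypothetical rather than taken from this work.

```python
import math


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_similar(query, dataset, k=3):
    """Return the ids of the k dataset images closest to the query.

    `dataset` maps an image id to its feature vector (hypothetical layout).
    """
    ranked = sorted(dataset, key=lambda image_id: euclidean(query, dataset[image_id]))
    return ranked[:k]


# Toy 2-D feature vectors, invented for illustration only.
dataset = {
    "img_a": [0.10, 0.20],
    "img_b": [0.90, 0.80],
    "img_c": [0.15, 0.25],
    "img_d": [0.50, 0.50],
}
query = [0.12, 0.22]
top2 = retrieve_similar(query, dataset, k=2)
print(top2)
```

Sorting the whole dataset is adequate for a sketch; a real system over a large image collection would more likely rely on an index structure.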
∗ Corresponding author. E-mail addresses: [email protected] (R.S. Bressan), [email protected] (P.H. Bugatti), [email protected] (P.T.M. Saito).
To compute these similarities, low-level features based on color, texture and/or shape are extracted from the images. Besides the set of features, the dissimilarity function (or distance function) and the interaction of the expert (e.g. a radiologist) with the CBIR process are key aspects for obtaining more precise results. Since each expert has his/her own perception and expertise, the relevance feedback (RF) paradigm can be applied to capture the expert’s intention in a coarse-grained way. It allows the expert to label the retrieved images as relevant or irrelevant in a given iteration. In other words, it leads to a query refinement. Then, when the CBIR process returns the images similar to a query image, the pipeline can be fed with the relevance degree of each retrieved image. This information is aggregated with the image features and the distance function to perform a new query that is closer to the expert’s intention. The RF process can be repeated until the expert is satisfied with the returned images.
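The feedback loop described above can be sketched with a Rocchio-style query point movement update, written here in displacement form so that the query stays within the feature space. This is a generic illustration of RF and not the MARRow method. The weights `beta` and `gamma`, the helper names, the toy vectors and the simulated expert labels are all assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_vector(vectors):
    """Component-wise mean (centroid) of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def move_query(query, relevant, irrelevant, beta=0.75, gamma=0.25):
    """Pull the query toward the relevant centroid and push it away
    from the irrelevant centroid (assumed weights beta and gamma)."""
    moved = list(query)
    if relevant:
        centroid = mean_vector(relevant)
        moved = [m + beta * (c - q) for m, c, q in zip(moved, centroid, query)]
    if irrelevant:
        centroid = mean_vector(irrelevant)
        moved = [m - gamma * (c - q) for m, c, q in zip(moved, centroid, query)]
    return moved


def feedback_round(query, dataset, expert_labels, k=2):
    """One RF iteration: retrieve the top k, read the labels, refine the query."""
    ranked = sorted(dataset, key=lambda i: euclidean(query, dataset[i]))[:k]
    relevant = [dataset[i] for i in ranked if expert_labels[i]]
    irrelevant = [dataset[i] for i in ranked if not expert_labels[i]]
    return ranked, move_query(query, relevant, irrelevant)


# Toy feature vectors and simulated expert judgements (True = relevant).
dataset = {
    "img_a": [0.2, 0.2],
    "img_b": [0.3, 0.3],
    "img_c": [0.7, 0.8],
    "img_d": [0.6, 0.6],
}
expert_labels = {"img_a": True, "img_b": True, "img_c": False, "img_d": False}

query = [0.5, 0.5]
first_ranked, new_query = feedback_round(query, dataset, expert_labels)
second_ranked, _ = feedback_round(new_query, dataset, expert_labels)
print(first_ranked, new_query, second_ranked)
```

In this toy run the first query returns one irrelevant and one relevant image, and after a single refinement both retrieved images are relevant. In practice the loop would repeat until the expert is satisfied with the returned images.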
Although there are plenty of RF methods in the literature [13–16], to the best of our knowledge, the majority of them leave the definition of the degree of relevance and irrelevance to the ex-
Fig. 1. Examples of informative (uncertain) samples from two different classes: (a) benign and (b) malignant lesions. It is possible to notice that both images (regions of interest from different classes) present a high similarity degree regarding their lesions (highlighted by dashed lines) and other tissues.