Retrieval Benchmark

See Retrieval2 for the currently running Retrieval Benchmark.

This page describes the VISCERAL Retrieval Benchmark as it was run in 2014-2015.

One of the challenges of medical information retrieval is retrieving similar cases based on multimodal data. Here, cases refer to data about specific patients (used in an anonymised form), such as medical records, radiology images and radiology reports, or to cases described in the literature or teaching files.

VISCERAL retrieval dataset

The VISCERAL Retrieval data set consists of 2311 volumes originating from three modalities (CT, MRT1, MRT2). These scans were acquired during daily clinical routine work by three different data providers (MUW, GENCAT, and UKL-HD). For a subset of these volumes, we provide anatomy-pathology terms extracted from the volume's radiologic report, in the form of CSV files. The following table gives an overview of the data set on which participants should perform the retrieval task.

Modality	Body Region	Volumes	Available A-P term files
CT	Abdomen	336	213
CT	Thorax	971	699
CT	Thorax + Abdomen	86	86
CT	Unknown	211	211
CT	Whole body	410	410
MRT1	Abdomen	167	114
MRT1	Unknown	24	24
MRT2	Abdomen	68	18
MRT2	Unknown	38	38
TOTAL		2311	1813

The anatomy-pathology term files list the pathological terms that occur in the report of a volume, each together with its anatomy. Both entities are described textually and additionally with their corresponding RadLex ID (RID). RadLex is a unified language of radiology terms that can be used for standardized indexing and retrieval of radiology information resources. Each term file lists both the pathologies that occur and the pathologies that are explicitly negated in the report.
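Because each term file mixes occurring and negated pathologies, a retrieval system will probably want to separate the two before comparing cases. The following is a minimal sketch only: the column names (pathology, pathology_rid, anatomy, anatomy_rid, negated), the 0/1 encoding of negation, and the sample rows with placeholder IDs are all assumptions made for illustration and are not taken from the actual VISCERAL file layout.

```python
import csv
import io
from typing import List, NamedTuple, TextIO, Tuple


class TermPair(NamedTuple):
    """One pathology term together with the anatomy it refers to."""
    pathology: str
    pathology_rid: str
    anatomy: str
    anatomy_rid: str


# Invented sample in an ASSUMED column layout; the real term files may differ.
# The IDs below are placeholders, not real RadLex IDs.
SAMPLE = """pathology,pathology_rid,anatomy,anatomy_rid,negated
cyst,RID-P1,liver,RID-A1,0
nodule,RID-P2,lung,RID-A2,1
"""


def split_terms(handle: TextIO) -> Tuple[List[TermPair], List[TermPair]]:
    """Return (occurring, negated) term pairs read from a CSV handle."""
    occurring: List[TermPair] = []
    negated: List[TermPair] = []
    for row in csv.DictReader(handle):
        pair = TermPair(
            row["pathology"],
            row["pathology_rid"],
            row["anatomy"],
            row["anatomy_rid"],
        )
        # Assumption: negation is flagged with the string "1".
        if row["negated"] == "1":
            negated.append(pair)
        else:
            occurring.append(pair)
    return occurring, negated


occurring, negated = split_terms(io.StringIO(SAMPLE))
```

Keeping the negated terms in a separate list, rather than discarding them, leaves open the option of using "finding explicitly absent" as a signal of its own.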

Content-based medical image retrieval

The benchmark serves the following scenario: a user is assessing a query case in a clinical setting, e.g., a CT volume, and is searching for cases that are relevant to this assessment. The algorithm has to find the relevant cases in a large database of cases. For each topic (query case) there is:

the patient 3D imaging data (CT, MRI)
3D bounding box region of interest containing the radiological signs of the pathology
binary mask of the main organ affected
anatomy-pathology terms extracted from the radiologic report, in the form of CSV files.

Participants have to develop an algorithm that, given a query case (imaging data only, or imaging and text data), finds clinically relevant cases, i.e., cases that are related or useful for differential diagnosis; it does not necessarily have to identify the final diagnosis.

To judge the quality of retrieval, medical experts will assess the relevance of the top-ranked cases returned by each approach. The evaluation measures considered are the precision of the top-ranked 10 and 30 cases (P@10, P@30), mean uninterpolated average precision (MAP), and the bpref measure. Where it is feasible to establish the number of relevant cases in the entire data set, R-precision will also be evaluated.
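The measures named above can be sketched in a few lines of Python. This is an illustrative implementation of the standard textbook definitions, not the evaluation code used by the benchmark organisers; the function names and the toy case IDs are assumptions. The bpref variant follows the commonly used formulation in which each retrieved relevant case is penalised by the number of judged non-relevant cases ranked above it.

```python
from typing import List, Sequence, Set, Tuple


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked cases that are judged relevant."""
    return sum(1 for case in ranked[:k] if case in relevant) / k


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Uninterpolated average precision for a single topic."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, case in enumerate(ranked, start=1):
        if case in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant case
    return total / len(relevant)


def mean_average_precision(topics: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """MAP: mean of the per-topic average precision values."""
    return sum(average_precision(r, rel) for r, rel in topics) / len(topics)


def r_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Precision at rank R, where R is the number of relevant cases."""
    r = len(relevant)
    return precision_at_k(ranked, relevant, r) if r else 0.0


def bpref(ranked: Sequence[str], relevant: Set[str], non_relevant: Set[str]) -> float:
    """bpref: uses judged cases only; unjudged cases are ignored."""
    num_rel, num_nonrel = len(relevant), len(non_relevant)
    if num_rel == 0:
        return 0.0
    denom = min(num_rel, num_nonrel)
    nonrel_seen = 0
    total = 0.0
    for case in ranked:
        if case in non_relevant:
            nonrel_seen += 1
        elif case in relevant:
            if denom == 0:
                total += 1.0  # no judged non-relevant cases: no penalty
            else:
                total += 1.0 - min(nonrel_seen, num_rel) / denom
    return total / num_rel


# Toy example with made-up case IDs and judgements.
ranked = ["c1", "c2", "c3", "c4", "c5"]
relevant = {"c1", "c3", "c6"}
non_relevant = {"c2", "c4"}
print(precision_at_k(ranked, relevant, 5))
print(average_precision(ranked, relevant))
print(r_precision(ranked, relevant))
print(bpref(ranked, relevant, non_relevant))
```

Unlike P@k and MAP, bpref ignores unjudged cases, which matters here because experts assess only the top-ranked cases of each approach rather than the whole data set.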

Registration and submission of results

Participants for the retrieval benchmark can register using the VISCERAL registration page. Register for the "VISCERAL Retrieval" Benchmark.

The submission guideline (v1.2 of 20141123) is available.

Important Dates

Registration starts: 17 November 2014
Database and topics released: 1 December 2014
Participant submission deadline: 28 February 2015
Results released: 29 March 2015 (at the MRMD workshop)

The results of the VISCERAL Retrieval Benchmark will be presented at the Multimodal Retrieval in the Medical Domain (MRMD) Workshop at ECIR 2015.