Automatic Annotation of Narrative Radiology Reports

Krsnik, Ivan; Glavaš, Goran; Krsnik, Marina; Miletić, Damir; Štajduhar, Ivan

doi:10.3390/diagnostics10040196

Scientific paper - Original scientific paper

Diagnostics, 10 (2020), 4; 196. https://doi.org/10.3390/diagnostics10040196

Cite this document

APA 6th Edition

Krsnik, I., Glavaš, G., Krsnik, M., Miletić, D. & Štajduhar, I. (2020). Automatic Annotation of Narrative Radiology Reports. Diagnostics, 10(4), 196. doi: 10.3390/diagnostics10040196

MLA 8th Edition

Krsnik, Ivan, et al. "Automatic Annotation of Narrative Radiology Reports." Diagnostics, vol. 10, no. 4, 2020. https://doi.org/10.3390/diagnostics10040196

Chicago 17th Edition

Krsnik, Ivan, Goran Glavaš, Marina Krsnik, Damir Miletić and Ivan Štajduhar. "Automatic Annotation of Narrative Radiology Reports." Diagnostics 10, no. 4 (2020). https://doi.org/10.3390/diagnostics10040196

Harvard

Krsnik, I., et al. (2020) 'Automatic Annotation of Narrative Radiology Reports', Diagnostics, 10(4). doi: 10.3390/diagnostics10040196

Vancouver

Krsnik I, Glavaš G, Krsnik M, Miletić D, Štajduhar I. Automatic Annotation of Narrative Radiology Reports. Diagnostics [Internet]. 2020 April 01 [cited 2025 March 25];10(4). doi: 10.3390/diagnostics10040196

IEEE

I. Krsnik, G. Glavaš, M. Krsnik, D. Miletić and I. Štajduhar, "Automatic Annotation of Narrative Radiology Reports", Diagnostics, vol. 10, no. 4, April 2020. [Online]. Available at: https://urn.nsk.hr/urn:nbn:hr:184:982562. [Accessed: 25 March 2025]

Cite this item: https://urn.nsk.hr/urn:nbn:hr:184:982562

Metadata

Title (english)	Automatic Annotation of Narrative Radiology Reports
Author	Ivan Krsnik
Author	Goran Glavaš
Author	Marina Krsnik
Author	Damir Miletić
Author	Ivan Štajduhar
Author's institution	University of Rijeka Faculty of Medicine (Department of Radiology)
Scientific / art field, discipline and subdiscipline	BIOMEDICINE AND HEALTHCARE Clinical Medical Sciences Radiology
Scientific / art field, discipline and subdiscipline	TECHNICAL SCIENCES Computing
Abstract (english)	Narrative texts in electronic health records can be efficiently utilized for building decision support systems in the clinic only if they are correctly interpreted automatically in accordance with a specified standard. This paper tackles the problem of developing an automated method of labeling free-form radiology reports, as a precursor for building query-capable report databases in hospitals. The analyzed dataset consists of 1295 radiology reports concerning the condition of a knee, retrospectively gathered at the Clinical Hospital Centre Rijeka, Croatia. Reports were manually labeled with one or more labels from a set of the 10 most commonly occurring clinical conditions. After primary preprocessing of the texts, two sets of text classification methods were compared: (1) traditional classification models, namely Naive Bayes (NB), Logistic Regression (LR), Support Vector Machine (SVM), and Random Forests (RF), coupled with Bag-of-Words (BoW) features (i.e., symbolic text representation) and (2) a Convolutional Neural Network (CNN) coupled with dense word vectors (i.e., word embeddings as a semantic text representation) as input features. We resorted to nested 10-fold cross-validation to evaluate the performance of competing methods using accuracy, precision, recall, and F1 score. The CNN with semantic word representations as input yielded the overall best performance, having a micro-averaged F1 score of 86.7%. The CNN classifier yielded particularly encouraging results for the most represented conditions: degenerative disease (95.9%), arthrosis (93.3%), and injury (89.2%). As a data-hungry deep learning model, the CNN, however, performed notably worse than the competing models on underrepresented classes with fewer training instances, such as multicausal disease or metabolic disease. LR, RF, and SVM performed comparably well, with micro-averaged F1 scores of 84.6%, 82.2%, and 82.1%, respectively.
Language	english
Publication type	Scientific paper - Original scientific paper
Publication status	Published
Peer review	Peer review - international
Publication version	Published version
Journal title	Diagnostics
Numbering	vol. 10, no. 4, 196
e-ISSN	2075-4418
DOI	https://doi.org/10.3390/diagnostics10040196
URN:NBN	urn:nbn:hr:184:982562
Publication	2020-04-01
Document URL	https://www.mdpi.com/2075-4418/10/4/196
Type of resource	Text
Access conditions	Open access
Created on	2020-04-02 16:12:57
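
The abstract reports micro-averaged F1 scores for reports that carry one or more labels. As an illustration only, the sketch below shows how micro-averaging pools true positives, false positives, and false negatives across all labels before computing precision and recall. The function name `micro_f1` and the toy reports are hypothetical; this is not the authors' evaluation code, and the label names are borrowed from the abstract purely as examples.

```python
from typing import Sequence, Set


def micro_f1(y_true: Sequence[Set[str]], y_pred: Sequence[Set[str]]) -> float:
    """Micro-averaged F1 for multi-label predictions.

    Counts are pooled over every (report, label) pair, so frequent
    labels contribute more to the score than rare ones.
    """
    tp = fp = fn = 0
    for gold, pred in zip(y_true, y_pred):
        tp += len(gold & pred)   # labels predicted and present
        fp += len(pred - gold)   # labels predicted but absent
        fn += len(gold - pred)   # labels present but missed
    if tp == 0:
        return 0.0               # avoid division by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: three reports, each with a set of labels.
gold = [
    {"injury"},
    {"arthrosis", "degenerative disease"},
    {"injury", "arthrosis"},
]
pred = [
    {"injury"},
    {"arthrosis"},
    {"injury", "metabolic disease"},
]

score = micro_f1(gold, pred)
print(f"micro-averaged F1 = {score:.3f}")
```

Because the counts are pooled, a classifier can reach a high micro-averaged score while still doing poorly on rare labels, which is consistent with the abstract's remark about underrepresented classes.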