Authors
Faizan E Mustafa, Corina Dima, Juan G Diaz Ochoa, Steffen Staab
Publication date
2024/4/24
Description
Biomedical Entity Linking (BEL) is a challenging task for low-resource languages, due to the lack of appropriate resources: datasets, knowledge bases (KBs), and pre-trained models. In this paper, we propose an approach to create a biomedical knowledge base for German BEL using UMLS information from Wikidata, which provides good coverage and can be easily extended to further languages. As a further contribution, we adapt several existing approaches for use in the German BEL setup and report on their results. The chosen methods include a sparse model using character n-grams, a multilingual biomedical entity linker, and two general-purpose text retrieval models. Our results show that a language-specific KB that provides good coverage leads to the largest improvement in entity linking performance, irrespective of the model used. The fine-tuned German BEL model, the newly created UMLS Wikidata KB, and the code to reproduce our results are publicly available.
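To make the idea of a sparse character n-gram linker concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the toy KB, the placeholder concept identifiers (`CUI-A`, `CUI-B`, `CUI-C`), the trigram size, the smoothed IDF formula, and all function names are assumptions chosen for illustration. Each KB name is turned into a TF-IDF vector over character trigrams, and a mention is linked to the concept whose name has the highest cosine similarity.

```python
import math
from collections import Counter

# Hypothetical toy KB: placeholder concept IDs mapped to German names.
# These IDs are made up for illustration and are not real UMLS CUIs.
TOY_KB = {
    "CUI-A": ["Diabetes mellitus", "Zuckerkrankheit"],
    "CUI-B": ["Bluthochdruck", "Hypertonie"],
    "CUI-C": ["Herzinfarkt", "Myokardinfarkt"],
}


def char_ngrams(text, n=3):
    """Return the character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def vectorize(text, idf):
    """Build an L2-normalised TF-IDF vector; n-grams unseen in the KB are dropped."""
    counts = Counter(char_ngrams(text))
    vec = {g: c * idf[g] for g, c in counts.items() if g in idf}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {g: v / norm for g, v in vec.items()} if norm else {}


def build_index(kb):
    """Compute smoothed IDF weights and one sparse vector per KB name."""
    entries = [(cid, name) for cid, names in kb.items() for name in names]
    doc_freq = Counter()
    for _, name in entries:
        doc_freq.update(set(char_ngrams(name)))
    n_docs = len(entries)
    idf = {g: math.log((1 + n_docs) / (1 + df)) + 1.0 for g, df in doc_freq.items()}
    vectors = [(cid, name, vectorize(name, idf)) for cid, name in entries]
    return idf, vectors


def link(mention, idf, vectors):
    """Return (concept_id, matched_name) with the highest cosine similarity."""
    query = vectorize(mention, idf)

    def score(entry):
        # Both vectors are unit length, so the dot product is the cosine.
        return sum(w * entry[2].get(g, 0.0) for g, w in query.items())

    best = max(vectors, key=score)
    return best[0], best[1]


idf, vectors = build_index(TOY_KB)
print(link("Herzinfarkte", idf, vectors))
```

Because matching works on character fragments rather than whole tokens, an inflected or slightly misspelled mention can still share most of its trigrams with the KB name, which is one reason such sparse models are a common baseline when few language-specific resources exist.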