Automatically align professional medical vocabulary with non-expert vocabulary in Swedish in order to enhance an information retrieval system.
Health care professionals and lay persons express themselves in different ways when discussing medical issues. When searching for documents on a medical topic, users are most likely interested in finding documents at different reading levels and with different vocabulary. A user may also phrase the search query in terms typical of one group while being interested in finding documents from both categories.
Språkbanken has a Swedish medical test collection in which each document is marked for its target group, Doctors or Patients. The collection could be used both for categorizing terms and for testing.
The task is to automatically align expert and non-expert terminology. The objective is to enrich an information retrieval system with links between corresponding concepts in the two sublanguages. The alignment can be done with different machine-learning techniques, such as k-nearest neighbor classifiers or support vector machines.
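As an illustration of treating alignment as a classification problem, the following minimal sketch applies a k-nearest neighbor classifier to term pairs. It is not a prescribed method for the project. The toy contexts, the training pairs, the two similarity features and all function names are assumptions made for this example. In practice the contexts would be extracted from a corpus such as the Doctors and Patients documents.

```python
import math
from collections import Counter

# Hypothetical toy data: each term maps to words assumed to co-occur with it.
CONTEXTS = {
    "myokardinfarkt": {"hjärta", "kranskärl", "bröstsmärta", "akut"},
    "hjärtattack": {"hjärta", "bröstsmärta", "akut", "ambulans"},
    "hypertoni": {"blodtryck", "högt", "kärl", "medicin"},
    "högt blodtryck": {"blodtryck", "högt", "medicin", "salt"},
    "cefalalgi": {"huvud", "smärta", "migrän", "värk"},
    "huvudvärk": {"huvud", "smärta", "värk", "trött"},
}


def dice(a, b):
    """Dice coefficient between two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def trigrams(term):
    """Character trigrams of a term, padded with spaces at the edges."""
    padded = f"  {term} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def features(expert, lay):
    """Assumed feature vector for a pair: context overlap and spelling overlap."""
    return (
        dice(CONTEXTS[expert], CONTEXTS[lay]),
        dice(trigrams(expert), trigrams(lay)),
    )


# Hypothetical labelled pairs: 1 = same concept, 0 = different concepts.
TRAIN = [
    (("myokardinfarkt", "hjärtattack"), 1),
    (("hypertoni", "högt blodtryck"), 1),
    (("myokardinfarkt", "högt blodtryck"), 0),
    (("hypertoni", "hjärtattack"), 0),
]


def knn_predict(pair, k=3):
    """Label a (expert, lay) pair by majority vote among the k nearest training pairs."""
    x = features(*pair)
    nearest = sorted(TRAIN, key=lambda item: math.dist(x, features(*item[0])))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(knn_predict(("cefalalgi", "huvudvärk")))    # unseen pair, same concept
print(knn_predict(("cefalalgi", "hjärtattack")))  # unseen pair, different concepts
```

On this toy data the first pair is classified as aligned and the second as not aligned, because the context overlap dominates the distance. A support vector machine could be trained on the same kind of pair features.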
Automatic alignment of the vocabulary of the two groups could help the user either to find documents written for a certain target group or to find documents for either group even if the query only contains terms from one.
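The two uses described above can be sketched as query expansion combined with an optional target-group filter. This is only an assumed illustration: the alignment links, the three example documents, and the functions expand_query and search are hypothetical, and a real retrieval system would use proper indexing and ranking instead of substring matching.

```python
# Hypothetical alignment links from expert terms to lay terms.
ALIGNMENT = {
    "myokardinfarkt": "hjärtattack",
    "hypertoni": "högt blodtryck",
}

# Make the links usable in both directions.
LINKS = {}
for expert, lay in ALIGNMENT.items():
    LINKS.setdefault(expert, set()).add(lay)
    LINKS.setdefault(lay, set()).add(expert)

# Hypothetical documents marked for target group.
DOCS = [
    {"id": "d1", "group": "Doctors", "text": "behandling vid akut myokardinfarkt"},
    {"id": "p1", "group": "Patients", "text": "vad gör jag vid en hjärtattack"},
    {"id": "p2", "group": "Patients", "text": "råd om högt blodtryck"},
]


def expand_query(terms):
    """Add the aligned counterpart of every query term that has one."""
    expanded = set(terms)
    for term in terms:
        expanded |= LINKS.get(term, set())
    return expanded


def search(terms, group=None):
    """Return ids of documents matching any expanded term, optionally for one group."""
    expanded = expand_query(terms)
    return [
        doc["id"]
        for doc in DOCS
        if (group is None or doc["group"] == group)
        and any(term in doc["text"] for term in expanded)
    ]


print(search(["hjärtattack"]))                   # documents for either group
print(search(["hjärtattack"], group="Doctors"))  # only documents written for doctors
```

Here a query containing only the lay term also retrieves the document written for doctors, and the group argument restricts the result to one target group.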
General knowledge of Swedish.
Some knowledge of information retrieval.
Some knowledge of machine learning.
Programming skills, for example in Python.
Karin Friberg Heppin and possibly others from Språkbanken.
Diosan, Rogozan and Pécuchet. 2009. Automatic alignment of medical terminologies with general dictionaries for an efficient information retrieval. Information retrieval in biomedicine: Natural language processing for knowledge integration.
Friberg Heppin. 2010. Resolving power of search keys in MedEval – A Swedish medical test collection with user groups: Doctors and Patients.