Automatic conceptual indexing of French pharmaceutical theses
French pharmaceutical theses are rarely cited. Language and access barriers are the main obstacles, but inadequate indexing may also be to blame: manually extracted keywords do not necessarily come from a structured thesaurus. This paper compares the manual indexing method with an automated one, "Nomindex", which is based on the UMLS (Unified Medical Language System). The automated method is improved by the addition of a relevance scoring system. The first step consists of downloading, adapting and indexing theses in electronic format. The results are then analysed and scored by relevance, and the two methods are compared using classic statistical indices (noise, silence, relevance). The manually obtained keywords were assumed to be always relevant. The silence of the manual indexing is nevertheless high: Nomindex proposes seven new keywords. Its own results are mixed (10% silence, but 50% noise). These results are promising for a first experiment on a pharmaceutical document without any lexicon improvement. The indexing is currently insufficient for real-life use, but it could easily be improved by specific updates of the lexicon.
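As an illustration of the noise and silence indices mentioned in the abstract, here is a minimal Python sketch. It assumes the usual information-retrieval definitions: noise is the share of proposed keywords that are not relevant (1 minus precision), and silence is the share of relevant keywords that were not proposed (1 minus recall). The abstract does not state the exact formulas used in the study, so these definitions, the function name `noise_and_silence` and the sample keywords are all assumptions made for the example, not data from the paper.

```python
def noise_and_silence(proposed, relevant):
    """Return (noise, silence) for one indexed document.

    Assumed definitions (not taken from the paper):
      noise   = irrelevant proposed keywords / all proposed keywords
      silence = relevant keywords not proposed / all relevant keywords
    """
    proposed = set(proposed)
    relevant = set(relevant)
    if not proposed or not relevant:
        raise ValueError("both keyword sets must be non-empty")

    hits = proposed & relevant  # keywords that are both proposed and relevant
    noise = (len(proposed) - len(hits)) / len(proposed)
    silence = (len(relevant) - len(hits)) / len(relevant)
    return noise, silence


# Hypothetical keywords, invented for illustration only.
relevant_keywords = {"aspirin", "analgesics", "pharmacokinetics", "dosage"}
proposed_keywords = {"aspirin", "analgesics", "pharmacokinetics",
                     "hospital", "france", "university"}

noise, silence = noise_and_silence(proposed_keywords, relevant_keywords)
print(f"noise = {noise:.0%}, silence = {silence:.0%}")
```

With these invented sets, three of the six proposed keywords are irrelevant and one of the four relevant keywords is missed, so the sketch reports 50% noise and 25% silence.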
Bibliographic Reference: An oral report given at: MIE2002, XVIIth International Congress of the European Federation for Medical Informatics. Organised by: European Federation for Medical Informatics (EFMI). Held at: Budapest (HU), 25-29 August 2002
Record Number: 200214520 / Last updated on: 2002-04-03
Original language: en
Available languages: en