Automated thesaurus population and management

E. Petraki; C. Kapetis; E. J. Yannakoudakis

E. Petraki, Athens University of Economics and Business, Department of Informatics, Athens 10434, Greece
C. Kapetis, Athens University of Economics and Business, Department of Informatics, Athens 10434, Greece
E. J. Yannakoudakis, Athens University of Economics and Business, Department of Informatics, Athens 10434, Greece

Abstract

FDB is a set-theoretic model that allows multilingual databases and thesauri to be defined through a universal schema. It offers administration utilities at both the data and interface levels, the definition of variable-length objects, and authority control, among other features. This paper investigates the automatic population of thesauri in an FDB database, as well as the automatic correlation between data records and thesaurus terms. The thesaurus forms part of the FDB model: more than one thesaurus can be defined in FDB, each either multilingual or monolingual, and each frame object (a data record, in traditional database terms) can easily be linked with the appropriate thesaurus terms. We first briefly describe the FDB model and then a) present algorithms that automatically link each frame object with the underlying thesaurus terms, b) define algorithms for automatically enriching the thesaurus with terms derived from the database, and c) outline research concepts and related work on automatic thesaurus creation.