Artificial Intelligence brings together numerous scientific fields with the aim of developing machines able to assist human operators in performing complex tasks, most of which demand high-level cognitive skills (e.g., learning or decision processes). Central to this quest is giving machines the ability to estimate the likeness or similarity between things in the way human beings estimate the similarity between stimuli.
In this context, this book focuses on semantic measures: approaches designed for comparing semantic entities such as units of language (e.g., words or sentences) or concepts and instances defined in knowledge bases. The aim of these measures is to assess the similarity or relatedness of such semantic entities by taking into account their semantics, i.e., their meaning. Intuitively, the words tea and coffee, which both refer to stimulating beverages, will be estimated to be more semantically similar than the words toffee (a confection) and coffee, even though the latter pair has a higher syntactic similarity. The two state-of-the-art approaches for estimating and quantifying the semantic similarity or relatedness of semantic entities are presented in detail: the first relies on corpus analysis and is based on Natural Language Processing techniques and semantic models, while the second is based on more or less formal, computer-readable and workable forms of knowledge such as semantic networks, thesauri or ontologies.
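The contrast between syntactic and semantic similarity in the tea/coffee/toffee example can be made concrete with a minimal sketch. It assumes that "syntactic similarity" is approximated by a normalized Levenshtein edit distance over characters; this choice, and the function names `levenshtein` and `string_similarity`, are illustrative assumptions and are not taken from the book.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string costs i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def string_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1]; 1.0 means identical strings.
    This is an assumed stand-in for a purely syntactic measure."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


for left, right in [("toffee", "coffee"), ("tea", "coffee")]:
    print(left, right, round(string_similarity(left, right), 3))
```

Under this character-level measure, toffee and coffee differ by a single substitution, whereas tea and coffee share almost no characters, so the spelling-based score ranks the pairs in the opposite order to their meaning. A semantic measure is expected to reverse that ranking by drawing on a corpus or a knowledge base rather than on the character strings alone.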
Semantic measures are widely used today to compare units of language, concepts, instances or even resources indexed by them (e.g., documents, genes). They are central elements of a large variety of Natural Language Processing applications and knowledge-based treatments, and have therefore naturally been the subject of intensive and interdisciplinary research efforts over recent decades. Beyond a simple inventory and categorization of existing measures, the aim of this monograph is to guide novices as well as researchers in these domains toward a better understanding of semantic similarity estimation and, more generally, of semantic measures. To this end, we propose an in-depth characterization of existing proposals by discussing their features, the assumptions on which they are based and empirical results regarding their performance in particular applications. By addressing these points and by providing a detailed discussion of the foundations of semantic measures, our aim is to give the reader the key knowledge required to: (i) select the most relevant methods for a particular usage context, (ii) understand the challenges facing this field of study, (iii) identify room for improvement in state-of-the-art approaches and (iv) stimulate creativity toward the development of new approaches. With this goal in mind, several definitions, theoretical and practical details, as well as concrete applications are presented.
Table of Contents
Introduction to Semantic Measures
Corpus-Based Semantic Measures
Knowledge-Based Semantic Measures
Methods and Datasets for the Evaluation of Semantic Measures
Conclusion and Research Directions
About the Author(s)
Sebastien Harispe, School of Mines (Ales, France)
Sebastien Harispe holds a Master’s and PhD in Computer Science from the University of Montpellier II. His research focuses on Artificial Intelligence, and more particularly on the diversity of methods that can be used to support decision making from text and knowledge base analysis, e.g., Information Extraction and knowledge inference. He has proposed several theoretical and practical contributions related to semantic measures. He is the project leader and main developer of the Semantic Measures Library, a project dedicated to the development of open source software solutions for semantic measures computation and analysis.
Sylvie Ranwez, School of Mines (Ales, France)
Sylvie Ranwez is an Associate Professor at the LGI2P Research Center at the School of Mines. Since 2000, her research has focused on one branch of Artificial Intelligence: Knowledge Engineering. Her work is dedicated to ontologies used as a guide in conceptual annotation processes and information retrieval systems, in navigation over numerous resources, and in visualization.
Stefan Janaqi, School of Mines (Ales, France)
Stefan Janaqi is a research member of the LGI2P Research Center team at the School of Mines. He holds a PhD in Computer Science from Joseph Fourier University, Grenoble (France), dealing with geometric properties of graphs. His research focuses on mathematical models for optimization, image processing, evolutionary algorithms and convexity in discrete structures such as graphs.
Jacky Montmain, School of Mines (Ales, France)
Jacky Montmain received a Master’s degree from the Ecole Nationale Superieure d’Ingenieurs Electriciens de Grenoble (France) in 1987 and a PhD from the National Polytechnic Institute in 1992, both in control theory. He was a research engineer at the French Atomic Energy Commission from 1991 to 2005, where he was appointed Senior Expert in the field of Mathematics, Computer Sciences, Software, and System Technologies in 2003. He is currently a Professor at the School of Mines. His research interests include the application of artificial intelligence techniques to model-based diagnosis and supervision, industrial performance improvement, and multicriteria and fuzzy approaches to decision-making.