Biperpedia: An Ontology for Search Applications
Rahul Gupta† Alon Halevy† Xuezhi Wang§∗ Steven Euijong Whang† Fei Wu†
§Carnegie Mellon University
Search engines make significant efforts to recognize queries that can be answered by structured data and invest heavily in creating and maintaining high-precision databases. While these databases have a relatively wide coverage of entities, the number of attributes they model (e.g., GDP, CAPITAL, ANTHEM) is relatively small. Extending the number of attributes known to the search engine can enable it to more precisely answer queries from the long and heavy tail, extract a broader range of facts from the Web, and recover the semantics of tables on the Web.
We describe Biperpedia, an ontology with 1.6M (class, attribute) pairs and 67K distinct attribute names. Biperpedia extracts attributes from the query stream, and then uses the best extractions to seed attribute extraction from text. For every attribute, Biperpedia saves a set of synonyms and text patterns in which it appears, thereby enabling it to recognize the attribute in more contexts. In addition to a detailed analysis of the quality of Biperpedia, we show that it can increase the number of Web tables whose semantics we can recover by more than a factor of 4 compared with Freebase.
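To make the stored structure concrete, the following is a minimal sketch of how a (class, attribute) pair with its synonyms and text patterns might be represented and used to recognize an attribute mention. It is an illustration only, not Biperpedia's implementation: the names AttributeEntry, Ontology, add, and recognize are hypothetical, the synonym "capital city" and the pattern string are invented examples, and matching is reduced to case-insensitive string equality.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set, Tuple


@dataclass
class AttributeEntry:
    """One (class, attribute) pair plus its synonyms and text patterns (hypothetical layout)."""
    cls: str
    name: str
    synonyms: Set[str] = field(default_factory=set)
    patterns: Set[str] = field(default_factory=set)

    def surface_forms(self) -> Set[str]:
        # The canonical name and every synonym, lowercased for matching.
        return {self.name.lower()} | {s.lower() for s in self.synonyms}


class Ontology:
    """Toy ontology keyed by (class, attribute); an assumption, not the paper's design."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], AttributeEntry] = {}

    def add(self, cls: str, name: str,
            synonyms: Iterable[str] = (),
            patterns: Iterable[str] = ()) -> AttributeEntry:
        # Create the entry on first sight, then merge in new synonyms and patterns.
        key = (cls.lower(), name.lower())
        entry = self._entries.setdefault(key, AttributeEntry(cls, name))
        entry.synonyms.update(synonyms)
        entry.patterns.update(patterns)
        return entry

    def recognize(self, cls: str, mention: str) -> Optional[str]:
        # Return the canonical attribute name for a mention within a class, or None.
        needle = mention.strip().lower()
        for (entry_cls, _), entry in self._entries.items():
            if entry_cls == cls.lower() and needle in entry.surface_forms():
                return entry.name
        return None


ontology = Ontology()
# Invented example data: the synonym and the pattern are illustrative only.
ontology.add("Country", "CAPITAL",
             synonyms={"capital city"},
             patterns={"the A of E is"})
canonical = ontology.recognize("Country", "Capital City")
unknown = ontology.recognize("Country", "anthem")
```

Here the synonym lets the mention "Capital City" resolve to the attribute CAPITAL, while "anthem" is not recognized because no such entry was added. A real system would match far more loosely than exact string equality.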
Copyright 2014 VLDB Endowment