Mining "hidden phrase" definitions from the web

Hung V. Nguyen, P. Velamuru, D. Kolippakkam, Hasan Davulcu, Huan Liu, M. Ates

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Keyword searching is the most common form of document search on the Web. Many Web publishers manually annotate the META tags and titles of their pages with frequently queried phrases in order to improve their placement and ranking. A "hidden phrase" is defined as a phrase that occurs in the META tag of a Web page but not in its body. In this paper we present an algorithm that mines the definitions of hidden phrases from Web documents. Phrase definitions allow (i) publishers to find relevant phrases with high query frequency, and (ii) search engines to test whether the content of the body of a document matches the phrases. We use co-occurrence clustering and association rule mining algorithms to learn phrase definitions from high-dimensional data sets. We also provide experimental results.
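The definition of a "hidden phrase" given in the abstract can be illustrated with a minimal Python sketch. This is not the paper's mining algorithm; it only shows the membership test implied by the definition. It assumes that META phrases come from a comma-separated `<meta name="keywords">` tag and that "occurs in the body" means a case-insensitive substring match, both of which are simplifications introduced here. The names `_MetaBodyParser` and `hidden_phrases` are hypothetical.

```python
from html.parser import HTMLParser


class _MetaBodyParser(HTMLParser):
    """Collects META keyword phrases and visible body text from one page."""

    def __init__(self):
        super().__init__()
        self.meta_phrases = []   # normalized phrases from <meta name="keywords">
        self.body_text = []      # text chunks seen inside <body>
        self._in_body = False
        self._skip_depth = 0     # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            # Assumption: phrases are comma-separated in the content attribute.
            for phrase in (attrs.get("content") or "").split(","):
                normalized = " ".join(phrase.lower().split())
                if normalized:
                    self.meta_phrases.append(normalized)
        elif tag == "body":
            self._in_body = True
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_body and not self._skip_depth:
            self.body_text.append(data)


def hidden_phrases(html):
    """Return META phrases that do not occur in the page body."""
    parser = _MetaBodyParser()
    parser.feed(html)
    parser.close()
    # Lowercase and collapse whitespace so the comparison is layout-insensitive.
    body = " ".join(" ".join(parser.body_text).lower().split())
    # Assumption: a plain substring test stands in for phrase matching.
    return [p for p in parser.meta_phrases if p not in body]


page = (
    '<html><head><title>T</title>'
    '<meta name="keywords" content="cheap flights, Travel Insurance, hotel deals">'
    '</head><body><p>Book hotel deals and cheap   flights today.</p></body></html>'
)
print(hidden_phrases(page))
```

A substring test will treat a short phrase as present when it appears inside a longer word, so a real system would likely tokenize the body first. Learning a definition for each hidden phrase, as the abstract describes, would require the clustering and rule-mining steps that this sketch does not attempt.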

Original language: English (US)
Pages (from-to): 156-165
Number of pages: 10
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2642
State: Published - Dec 1 2003

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
