Towards geospatial semantic search: Exploiting latent semantic relations in geospatial data

Research output: Contribution to journal, Article, peer-review

50 Scopus citations


This paper reports our efforts to address the grand challenge of the Digital Earth vision in terms of intelligent data discovery from vast quantities of geo-referenced data. We propose LSATTR, an algorithm that combines latent semantic analysis (LSA) with a Two-Tier Ranking algorithm based on revised cosine similarity, and use it to build a more efficient search engine, Semantic Indexing and Ranking (SIR), for semantic-enabled, more effective data discovery. In addition to handling subject-based search, SIR includes a mechanism that combines a geospatial taxonomy with Yahoo! GeoPlanet to automatically identify location information in a spatial query and filter out datasets that are not spatially related. The corpus of SIR is the ISO 19115 metadata set from NASA's Socioeconomic Data and Applications Center (SEDAC). Results show that SIR, built on LSATTR, outperforms existing keyword-matching techniques, such as Lucene, in both recall and precision. Moreover, the method discovers semantic associations among all words in the corpus, and these associations provide substantial support for automating the population of spatial ontologies. We expect this work to support the operationalization of the Digital Earth vision by advancing semantic-based geospatial data discovery.
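The LSA step named in the abstract can be illustrated with a minimal sketch. This is not the authors' LSATTR implementation: the abstract does not specify the Two-Tier Ranking or the revised cosine similarity, so the code below uses textbook LSA (truncated SVD of a term-document matrix) with standard cosine similarity. The toy corpus, the function names, and the choice of k = 2 are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy corpus standing in for dataset metadata records.
docs = [
    "population density grid census",
    "census population counts by country",
    "sea level rise coastal flood risk",
    "coastal flood exposure and sea level",
]


def build_term_doc_matrix(docs):
    """Return a raw-count term-document matrix and a term -> row index map."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            A[index[w], j] += 1.0
    return A, index


def lsa_fit(A, k):
    """Truncated SVD: keep the k largest singular triplets of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def fold_in_query(query, index, U_k, s_k):
    """Project a query into the latent space: q^T U_k S_k^{-1}."""
    q = np.zeros(U_k.shape[0])
    for w in query.split():
        if w in index:  # out-of-vocabulary terms are ignored
            q[index[w]] += 1.0
    return (q @ U_k) / s_k


def rank(query, docs, k=2):
    """Rank documents by standard cosine similarity in the latent space."""
    A, index = build_term_doc_matrix(docs)
    U_k, s_k, Vt_k = lsa_fit(A, k)
    q_hat = fold_in_query(query, index, U_k, s_k)
    doc_vecs = Vt_k.T  # one k-dimensional row per document
    q_norm = np.linalg.norm(q_hat)
    scores = []
    for j, d_vec in enumerate(doc_vecs):
        denom = q_norm * np.linalg.norm(d_vec)
        score = float(q_hat @ d_vec / denom) if denom > 0 else 0.0
        scores.append((j, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


for doc_id, score in rank("density", docs):
    print(doc_id, round(score, 3), docs[doc_id])
```

In this toy setup the query "density" appears literally only in the first record, yet the second record scores equally high because both load on the same latent dimension. A pure keyword match would miss that second record, which is the kind of latent association the abstract refers to.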

Original language: English (US)
Pages (from-to): 17-37
Number of pages: 21
Journal: International Journal of Digital Earth
Issue number: 1
State: Published - 2014


Keywords

  • Digital Earth
  • geospatial semantics
  • ontology
  • search effectiveness
  • search engine
  • similarity

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Earth and Planetary Sciences (all)

