Data Extraction and Integration for Scholar Recommendation System

Jaydeep Chakraborty, Gurusrikar Thopugunta, Srividya Bansal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Scopus citations


Recommendation systems have been an integral part of massive open online courses (MOOCs). With so much data and so many resources available, recommending scholars and professors through general reviews and academic advisor applications has become a tedious job. Finding professors and scholars relevant to a student's area of interest involves a combination of factors, such as field of study, depth of the research area, the professors' research backgrounds, and ongoing research opportunities. Because so many factors are involved, the task is complex and unreliable when done manually. In this paper, we present a content-based mining approach that goes through all relevant resources, extracts the required information, and uses it to recommend a list of scholars based on a student's area of interest. For our experimental model, we gathered information about a number of professors at our institution from various web resources, including IEEE, Springer, ACM, ScienceDirect, arXiv, and the department website. Our content-based mining approach uses topic modeling and clustering algorithms. We present a comparative analysis of three topic modeling algorithms, latent Dirichlet allocation (LDA), hierarchical Dirichlet process (HDP), and latent semantic analysis (LSA), and two clustering techniques, k-means and hierarchical clustering, in determining the most accurate recommendation list of professors or scholars.
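
As a rough illustration of the content-based idea described in the abstract, the sketch below clusters professor text profiles and recommends the members of the cluster closest to a student's stated interest. It is not the authors' implementation: it substitutes plain TF-IDF vectors for the topic models named above (LDA, HDP, LSA), uses a small deterministic k-means with cosine similarity, and runs on a made-up toy corpus. All names (PROFILES, recommend, and the helper functions) are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus: professor name -> words taken from publication text.
PROFILES = {
    "Prof. A": "semantic web ontology linked data knowledge graph ontology",
    "Prof. B": "ontology reasoning semantic web linked data integration",
    "Prof. C": "robot motion planning control robot navigation",
    "Prof. D": "robot control navigation sensor motion",
}

def tokenize(text):
    return text.lower().split()

def build_idf(docs):
    # Smoothed inverse document frequency over the tokenized profiles.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(tokens, idf):
    # Sparse TF-IDF vector; terms outside the vocabulary are ignored.
    counts = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    total = {}
    for v in vectors:
        for t, w in v.items():
            total[t] = total.get(t, 0.0) + w
    return {t: w / len(vectors) for t, w in total.items()}

def kmeans(vectors, k, iterations=10):
    # Deterministic init (an assumption of this sketch): start from the first
    # vector, then add the vector least similar to the centroids chosen so far.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        )
    labels = None
    for _ in range(iterations):
        new_labels = [
            max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = centroid(members)
    return labels, centroids

def recommend(interest, profiles, k=2):
    names = list(profiles)
    docs = [tokenize(profiles[n]) for n in names]
    idf = build_idf(docs)
    vectors = [tfidf(d, idf) for d in docs]
    labels, centroids = kmeans(vectors, k)
    query = tfidf(tokenize(interest), idf)
    if not query:
        return []  # no overlap between the interest text and the corpus
    best = max(range(k), key=lambda j: cosine(query, centroids[j]))
    scored = [
        (cosine(query, v), n)
        for n, v, lab in zip(names, vectors, labels)
        if lab == best
    ]
    # Most similar first; ties broken by name.
    return [n for _, n in sorted(scored, key=lambda p: (-p[0], p[1]))]
```

With this toy data, recommend("ontology linked data", PROFILES) returns the two semantic web professors and recommend("robot navigation", PROFILES) returns the two robotics professors. A real pipeline would replace the TF-IDF step with one of the topic models the abstract compares and would need an evaluation of which combination gives the most accurate list.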

Original language: English (US)
Title of host publication: Proceedings - 12th IEEE International Conference on Semantic Computing, ICSC 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 6
ISBN (Electronic): 9781538644072
State: Published - Apr 9 2018
Event: 12th IEEE International Conference on Semantic Computing, ICSC 2018 - Laguna Hills, United States
Duration: Jan 31 2018 → Feb 2 2018



Keywords

  • Clustering
  • Hierarchical Dirichlet Process
  • Hierarchical clustering
  • Latent Dirichlet Allocation
  • Latent Semantic Analysis
  • Recommendation System
  • Topic modeling
  • k-means clustering

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Human-Computer Interaction
  • Information Systems and Management

