Predicting multi-document comprehension: Cohesion network analysis

Bogdan Nicula, Cecile A. Perret, Mihai Dascalu, Danielle S. McNamara

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Scopus citations


Theories of discourse comprehension assume that understanding is a process of making connections between new information (e.g., in a text) and prior knowledge, and that the quality of comprehension is a function of the coherence of the mental representation. When readers are exposed to multiple sources of information, they must make connections both within and between the texts. One challenge is how to represent this coherence and, in turn, how to predict readers’ levels of comprehension. In this study, we represent coherence using Cohesion Network Analysis (CNA), in which we model a global cohesion graph that semantically links reference texts to different student verbal productions. Our aim is to create an automated model of comprehension prediction based on features extracted from the CNA graph. We examine the cohesion links between the four texts read by 146 students and their (a) self-explanations generated on target sentences and (b) responses to open-ended questions. We analyze the degree to which features derived from the cohesive links in the extended CNA graph are predictive of students’ comprehension scores (on a scale from 0 to 12) using either (a) students’ self-explanations, (b) responses to comprehension questions, or (c) both. We compared Linear Regression, Extra Trees Regressor, Support Vector Regression, and Multi-Layer Perceptron models. Our best model used Linear Regression, obtaining a mean absolute error of 1.29 when predicting comprehension scores using both sources of verbal responses (i.e., self-explanations and question answers).
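The idea of deriving features from cohesion links and scoring predictions by mean absolute error can be illustrated with a minimal sketch. It is not the authors' CNA implementation: the abstract does not list the actual features, and the bag-of-words cosine similarity used here is an assumed stand-in for the semantic models a real cohesion graph would rely on. All function and feature names (`cohesion_features`, `mean_link`, `max_link`, `texts_linked`) are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts (assumption: a crude stand-in for a semantic model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cohesion_features(reference_texts, responses):
    """Link every response to every reference text and summarize the link weights.

    The three summary features are hypothetical examples, not the paper's feature set.
    """
    ref_vecs = [bag_of_words(t) for t in reference_texts]
    resp_vecs = [bag_of_words(r) for r in responses]
    # One weighted edge per (response, reference text) pair.
    weights = [cosine(r, ref) for r in resp_vecs for ref in ref_vecs]
    if not weights:
        return {"mean_link": 0.0, "max_link": 0.0, "texts_linked": 0}
    # Number of reference texts that at least one response connects to.
    texts_linked = sum(
        any(cosine(r, ref) > 0 for r in resp_vecs) for ref in ref_vecs
    )
    return {
        "mean_link": sum(weights) / len(weights),
        "max_link": max(weights),
        "texts_linked": texts_linked,
    }


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

In a pipeline of this shape, one feature dictionary per student would be passed to a regressor, and the predictions would be compared against the 0 to 12 comprehension scores with `mean_absolute_error`.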

Original language: English (US)
Title of host publication: Artificial Intelligence in Education - 20th International Conference, AIED 2019, Proceedings
Editors: Seiji Isotani, Eva Millán, Amy Ogan, Bruce McLaren, Peter Hastings, Rose Luckin
Publisher: Springer Verlag
Number of pages: 12
ISBN (Print): 9783030232030
State: Published - 2019
Event: 20th International Conference on Artificial Intelligence in Education, AIED 2019 - Chicago, United States
Duration: Jun 25 2019 to Jun 29 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11625 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 20th International Conference on Artificial Intelligence in Education, AIED 2019
Country/Territory: United States


Keywords

  • Cohesion network analysis
  • Comprehension modeling
  • Machine learning
  • Multi-document comprehension and integration
  • Natural language processing

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

