Newness and givenness of information: Automated identification in written discourse

Philip M. McCarthy; David Dufty; Christian F. Hempelmann; Zhiqiang Cai; Danielle McNamara; Arthur C. Graesser

doi:10.4018/978-1-61350-447-5.ch014

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

The identification of new versus given information within a text has been frequently investigated by researchers of language and discourse. Despite theoretical advances, an accurate computational method for assessing the degree to which a text contains new versus given information has not previously been implemented. This study discusses a variety of computational new/given systems and analyzes four typical expository and narrative texts against a widely accepted theory of new/given proposed by Prince (1981). Findings suggest that a latent semantic analysis (LSA) based measure called span outperforms standard LSA in detecting both new and given information in text. Further, the span measure outperforms standard LSA for distinguishing low versus high cohesion versions of text. Results suggest that span may be a useful variable in a wide array of discourse analyses.

Original language	English (US)
Title of host publication	Cross-Disciplinary Advances in Applied Natural Language Processing
Subtitle of host publication	Issues and Approaches
Publisher	IGI Global
Pages	202-224
Number of pages	23
ISBN (Print)	9781613504475
DOIs	https://doi.org/10.4018/978-1-61350-447-5.ch014
State	Published - 2011
Externally published	Yes

ASJC Scopus subject areas

General Computer Science

Cite this

@inbook{031c6e5f53c5497f8818b9c70ba09ac2,
  title = "Newness and givenness of information: Automated identification in written discourse",
  abstract = "The identification of new versus given information within a text has been frequently investigated by researchers of language and discourse. Despite theoretical advances, an accurate computational method for assessing the degree to which a text contains new versus given information has not previously been implemented. This study discusses a variety of computational new/given systems and analyzes four typical expository and narrative texts against a widely accepted theory of new/given proposed by Prince (1981). Findings suggest that a latent semantic analysis (LSA) based measure called span outperforms standard LSA in detecting both new and given information in text. Further, the span measure outperforms standard LSA for distinguishing low versus high cohesion versions of text. Results suggest that span may be a useful variable in a wide array of discourse analyses.",
  author = "McCarthy, {Philip M.} and David Dufty and Hempelmann, {Christian F.} and Zhiqiang Cai and Danielle McNamara and Graesser, {Arthur C.}",
  year = "2011",
  doi = "10.4018/978-1-61350-447-5.ch014",
  language = "English (US)",
  isbn = "9781613504475",
  pages = "202--224",
  booktitle = "Cross-Disciplinary Advances in Applied Natural Language Processing",
  publisher = "IGI Global",
}

TY - CHAP
T1 - Newness and givenness of information
T2 - Automated identification in written discourse
AU - McCarthy, Philip M.
AU - Dufty, David
AU - Hempelmann, Christian F.
AU - Cai, Zhiqiang
AU - McNamara, Danielle
AU - Graesser, Arthur C.
PY - 2011
Y1 - 2011
N2 - The identification of new versus given information within a text has been frequently investigated by researchers of language and discourse. Despite theoretical advances, an accurate computational method for assessing the degree to which a text contains new versus given information has not previously been implemented. This study discusses a variety of computational new/given systems and analyzes four typical expository and narrative texts against a widely accepted theory of new/given proposed by Prince (1981). Findings suggest that a latent semantic analysis (LSA) based measure called span outperforms standard LSA in detecting both new and given information in text. Further, the span measure outperforms standard LSA for distinguishing low versus high cohesion versions of text. Results suggest that span may be a useful variable in a wide array of discourse analyses.
AB - The identification of new versus given information within a text has been frequently investigated by researchers of language and discourse. Despite theoretical advances, an accurate computational method for assessing the degree to which a text contains new versus given information has not previously been implemented. This study discusses a variety of computational new/given systems and analyzes four typical expository and narrative texts against a widely accepted theory of new/given proposed by Prince (1981). Findings suggest that a latent semantic analysis (LSA) based measure called span outperforms standard LSA in detecting both new and given information in text. Further, the span measure outperforms standard LSA for distinguishing low versus high cohesion versions of text. Results suggest that span may be a useful variable in a wide array of discourse analyses.
UR - http://www.scopus.com/inward/record.url?scp=84898586559&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84898586559&partnerID=8YFLogxK
U2 - 10.4018/978-1-61350-447-5.ch014
DO - 10.4018/978-1-61350-447-5.ch014
M3 - Chapter
AN - SCOPUS:84898586559
SN - 9781613504475
SP - 202
EP - 224
BT - Cross-Disciplinary Advances in Applied Natural Language Processing
PB - IGI Global
ER -
