TY - CHAP
T1 - Textual signatures
T2 - Identifying text-types using latent semantic analysis to measure the cohesion of text structures
AU - McCarthy, Philip M.
AU - Briner, Stephen W.
AU - Rus, Vasile
AU - McNamara, Danielle S.
PY - 2007
Y1 - 2007
N2 - Just as a sentence is far more than a mere concatenation of words, a text is far more than a mere concatenation of sentences. Texts contain pertinent information that co-refers across sentences and paragraphs [30]; texts contain relations between phrases, clauses, and sentences that are often causally linked [21], [51], [56]; and texts that depend on relating a series of chronological events contain temporal features that help the reader to build a coherent representation of the text [19], [55]. We refer to textual features such as these as cohesive elements, and they occur within paragraphs (locally), across paragraphs (globally), and in forms such as referential, causal, temporal, and structural [18], [22], [36]. But cohesive elements, and by consequence cohesion, do not simply feature in a text as dialogues tend to feature in narratives, or as cartoons tend to feature in newspapers. That is, cohesion is not present or absent in a binary or optional sense. Instead, cohesion in text exists on a continuum of presence, which is sometimes indicative of the text-type in question [12], [37], [41] and sometimes indicative of the audience for which the text was written [44], [47]. In this chapter, we discuss the nature and importance of cohesion; we demonstrate a computational tool that measures cohesion; and, most importantly, we demonstrate a novel approach to identifying text-types by incorporating contrasting rates of cohesion.
AB - Just as a sentence is far more than a mere concatenation of words, a text is far more than a mere concatenation of sentences. Texts contain pertinent information that co-refers across sentences and paragraphs [30]; texts contain relations between phrases, clauses, and sentences that are often causally linked [21], [51], [56]; and texts that depend on relating a series of chronological events contain temporal features that help the reader to build a coherent representation of the text [19], [55]. We refer to textual features such as these as cohesive elements, and they occur within paragraphs (locally), across paragraphs (globally), and in forms such as referential, causal, temporal, and structural [18], [22], [36]. But cohesive elements, and by consequence cohesion, do not simply feature in a text as dialogues tend to feature in narratives, or as cartoons tend to feature in newspapers. That is, cohesion is not present or absent in a binary or optional sense. Instead, cohesion in text exists on a continuum of presence, which is sometimes indicative of the text-type in question [12], [37], [41] and sometimes indicative of the audience for which the text was written [44], [47]. In this chapter, we discuss the nature and importance of cohesion; we demonstrate a computational tool that measures cohesion; and, most importantly, we demonstrate a novel approach to identifying text-types by incorporating contrasting rates of cohesion.
UR - http://www.scopus.com/inward/record.url?scp=84890215811&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84890215811&partnerID=8YFLogxK
U2 - 10.1007/978-1-84628-754-1_7
DO - 10.1007/978-1-84628-754-1_7
M3 - Chapter
AN - SCOPUS:84890215811
SN - 184628175X
SN - 9781846281754
SP - 107
EP - 122
BT - Natural Language Processing and Text Mining
PB - Springer London
ER -