The user-language paraphrase corpus

Philip M. McCarthy; Danielle S. McNamara

doi:10.4018/978-1-61350-447-5.ch006

8 Scopus citations

Abstract

The corpus in this challenge comprises 1,998 target-sentence/student response text-pairs, or protocols. The protocols have been evaluated by extensively trained human raters; however, unlike established paraphrase corpora that evaluate paraphrases as either true or false, the User-Language Paraphrase Corpus evaluates protocols along 10 dimensions of paraphrase characteristics on a six-point scale. Along with the protocols, the database comprising the challenge includes 10 computational indices that have been used to assess these protocols. The challenge posed for researchers is to describe and assess their own approach (computational or statistical) to evaluating, characterizing, and/or categorizing any, some, or all of the paraphrase dimensions in this corpus. The purpose of establishing such evaluations of user-language paraphrases is to enable intelligent tutoring systems (ITSs) to provide users with accurate assessment and, subsequently, facilitative feedback, such that the assessment would be comparable to that of one or more trained human raters. Thus, these evaluations will help to develop the field of natural language assessment and understanding (Rus, McCarthy, McNamara, & Graesser, 2008 [a]).

Original language	English (US)
Title of host publication	Cross-Disciplinary Advances in Applied Natural Language Processing
Subtitle of host publication	Issues and Approaches
Publisher	IGI Global
Pages	73-89
Number of pages	17
ISBN (Print)	9781613504475
DOIs	https://doi.org/10.4018/978-1-61350-447-5.ch006
State	Published - 2011
Externally published	Yes

ASJC Scopus subject areas

General Computer Science

Cite this

@inbook{9b0fd088fb0e4c6aaa8972171e3174d8,
title = "The user-language paraphrase corpus",
abstract = "The corpus in this challenge comprises 1,998 target-sentence/student response text-pairs, or protocols. The protocols have been evaluated by extensively trained human raters; however, unlike established paraphrase corpora that evaluate paraphrases as either true or false, the User-Language Paraphrase Corpus evaluates protocols along 10 dimensions of paraphrase characteristics on a six-point scale. Along with the protocols, the database comprising the challenge includes 10 computational indices that have been used to assess these protocols. The challenge posed for researchers is to describe and assess their own approach (computational or statistical) to evaluating, characterizing, and/or categorizing any, some, or all of the paraphrase dimensions in this corpus. The purpose of establishing such evaluations of user-language paraphrases is to enable intelligent tutoring systems (ITSs) to provide users with accurate assessment and, subsequently, facilitative feedback, such that the assessment would be comparable to that of one or more trained human raters. Thus, these evaluations will help to develop the field of natural language assessment and understanding (Rus, McCarthy, McNamara, \& Graesser, 2008 [a]).",
author = "McCarthy, {Philip M.} and McNamara, {Danielle S.}",
year = "2011",
doi = "10.4018/978-1-61350-447-5.ch006",
language = "English (US)",
isbn = "9781613504475",
pages = "73--89",
booktitle = "Cross-Disciplinary Advances in Applied Natural Language Processing",
publisher = "IGI Global",
}

TY - CHAP
T1 - The user-language paraphrase corpus
AU - McCarthy, Philip M.
AU - McNamara, Danielle S.
PY - 2011
Y1 - 2011
N2 - The corpus in this challenge comprises 1,998 target-sentence/student response text-pairs, or protocols. The protocols have been evaluated by extensively trained human raters; however, unlike established paraphrase corpora that evaluate paraphrases as either true or false, the User-Language Paraphrase Corpus evaluates protocols along 10 dimensions of paraphrase characteristics on a six-point scale. Along with the protocols, the database comprising the challenge includes 10 computational indices that have been used to assess these protocols. The challenge posed for researchers is to describe and assess their own approach (computational or statistical) to evaluating, characterizing, and/or categorizing any, some, or all of the paraphrase dimensions in this corpus. The purpose of establishing such evaluations of user-language paraphrases is to enable intelligent tutoring systems (ITSs) to provide users with accurate assessment and, subsequently, facilitative feedback, such that the assessment would be comparable to that of one or more trained human raters. Thus, these evaluations will help to develop the field of natural language assessment and understanding (Rus, McCarthy, McNamara, & Graesser, 2008 [a]).
UR - http://www.scopus.com/inward/record.url?scp=84898573641&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84898573641&partnerID=8YFLogxK
U2 - 10.4018/978-1-61350-447-5.ch006
DO - 10.4018/978-1-61350-447-5.ch006
M3 - Chapter
AN - SCOPUS:84898573641
SN - 9781613504475
SP - 73
EP - 89
BT - Cross-Disciplinary Advances in Applied Natural Language Processing
PB - IGI Global
ER -
