Automated paraphrase quality assessment using language models and transfer learning

Bogdan Nicula; Mihai Dascalu; Natalie N. Newton; Ellen Orcutt; Danielle S. McNamara

doi:10.3390/computers10120166

Psychology

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Learning to paraphrase supports both writing ability and reading comprehension, particularly for less skilled learners. As such, educational tools that integrate automated evaluations of paraphrases can be used to provide timely feedback to enhance learner paraphrasing skills more efficiently and effectively. Paraphrase identification is a popular NLP classification task that involves establishing whether two sentences share a similar meaning. Paraphrase quality assessment is a slightly more complex task, in which pairs of sentences are evaluated in-depth across multiple dimensions. In this study, we focus on four dimensions: lexical, syntactical, semantic, and overall quality. Our study introduces and evaluates various machine learning models using handcrafted features combined with Extra Trees, Siamese neural networks using BiLSTM RNNs, and pretrained BERT-based models, together with transfer learning from a larger general paraphrase corpus, to estimate the quality of paraphrases across the four dimensions. Two datasets are considered for the tasks involving paraphrase quality: ULPC (User Language Paraphrase Corpus) containing 1998 paraphrases and a smaller dataset with 115 paraphrases based on children’s inputs. The paraphrase identification dataset used for the transfer learning task is the MSRP dataset (Microsoft Research Paraphrase Corpus) containing 5801 paraphrases. On the ULPC dataset, our BERT model improves upon the previous baseline by at least 0.1 in F1-score across the four dimensions. When using fine-tuning from ULPC for the children dataset, both the BERT and Siamese neural network models improve upon their original scores by at least 0.11 F1-score. The results of these experiments suggest that transfer learning using generic paraphrase identification datasets can be successful, while at the same time obtaining comparable results in fewer epochs.

Original language	English (US)
Article number	166
Journal	Computers
Volume	10
Issue number	12
DOIs	https://doi.org/10.3390/computers10120166
State	Published - Dec 2021

Keywords

Language models
Natural language processing
Paraphrase quality assessment
Recurrent neural networks
Transfer learning

ASJC Scopus subject areas

Human-Computer Interaction
Computer Networks and Communications

Cite this

@article{ed5b0ff14cd7499dbfbae0aff46b258f,

title = "Automated paraphrase quality assessment using language models and transfer learning",

abstract = "Learning to paraphrase supports both writing ability and reading comprehension, particularly for less skilled learners. As such, educational tools that integrate automated evaluations of paraphrases can be used to provide timely feedback to enhance learner paraphrasing skills more efficiently and effectively. Paraphrase identification is a popular NLP classification task that involves establishing whether two sentences share a similar meaning. Paraphrase quality assessment is a slightly more complex task, in which pairs of sentences are evaluated in-depth across multiple dimensions. In this study, we focus on four dimensions: lexical, syntactical, semantic, and overall quality. Our study introduces and evaluates various machine learning models using handcrafted features combined with Extra Trees, Siamese neural networks using BiLSTM RNNs, and pretrained BERT-based models, together with transfer learning from a larger general paraphrase corpus, to estimate the quality of paraphrases across the four dimensions. Two datasets are considered for the tasks involving paraphrase quality: ULPC (User Language Paraphrase Corpus) containing 1998 paraphrases and a smaller dataset with 115 paraphrases based on children{\textquoteright}s inputs. The paraphrase identification dataset used for the transfer learning task is the MSRP dataset (Microsoft Research Paraphrase Corpus) containing 5801 paraphrases. On the ULPC dataset, our BERT model improves upon the previous baseline by at least 0.1 in F1-score across the four dimensions. When using fine-tuning from ULPC for the children dataset, both the BERT and Siamese neural network models improve upon their original scores by at least 0.11 F1-score. The results of these experiments suggest that transfer learning using generic paraphrase identification datasets can be successful, while at the same time obtaining comparable results in fewer epochs.",

keywords = "Language models, Natural language processing, Paraphrase quality assessment, Recurrent neural networks, Transfer learning",

author = "Bogdan Nicula and Mihai Dascalu and Newton, {Natalie N.} and Ellen Orcutt and McNamara, {Danielle S.}",

note = "Funding Information: The work was funded by a grant of the Romanian National Authority for Scientific Research and Innovation, CNCS–UEFISCDI, project number TE 70 PN-III-P1-1.1-TE-2019-2209, ATES–“Automated Text Evaluation and Simplification” and by the Institute of Education Sciences (R305A190050) and the Office of Naval Research (N00014-17-1-2300 and N00014-20-1-2623). The opinions expressed are those of the authors and do not represent views of the IES or ONR. Publisher Copyright: {\textcopyright} 2021 by the authors. Licensee MDPI, Basel, Switzerland.",

year = "2021",

month = dec,

doi = "10.3390/computers10120166",

language = "English (US)",

volume = "10",

journal = "Computers",

issn = "2073-431X",

publisher = "MDPI AG",

number = "12",

}

TY - JOUR

T1 - Automated paraphrase quality assessment using language models and transfer learning

AU - Nicula, Bogdan

AU - Dascalu, Mihai

AU - Newton, Natalie N.

AU - Orcutt, Ellen

AU - McNamara, Danielle S.

N1 - Funding Information: The work was funded by a grant of the Romanian National Authority for Scientific Research and Innovation, CNCS–UEFISCDI, project number TE 70 PN-III-P1-1.1-TE-2019-2209, ATES–“Automated Text Evaluation and Simplification” and by the Institute of Education Sciences (R305A190050) and the Office of Naval Research (N00014-17-1-2300 and N00014-20-1-2623). The opinions expressed are those of the authors and do not represent views of the IES or ONR. Publisher Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland.

PY - 2021/12

Y1 - 2021/12

AB - Learning to paraphrase supports both writing ability and reading comprehension, particularly for less skilled learners. As such, educational tools that integrate automated evaluations of paraphrases can be used to provide timely feedback to enhance learner paraphrasing skills more efficiently and effectively. Paraphrase identification is a popular NLP classification task that involves establishing whether two sentences share a similar meaning. Paraphrase quality assessment is a slightly more complex task, in which pairs of sentences are evaluated in-depth across multiple dimensions. In this study, we focus on four dimensions: lexical, syntactical, semantic, and overall quality. Our study introduces and evaluates various machine learning models using handcrafted features combined with Extra Trees, Siamese neural networks using BiLSTM RNNs, and pretrained BERT-based models, together with transfer learning from a larger general paraphrase corpus, to estimate the quality of paraphrases across the four dimensions. Two datasets are considered for the tasks involving paraphrase quality: ULPC (User Language Paraphrase Corpus) containing 1998 paraphrases and a smaller dataset with 115 paraphrases based on children’s inputs. The paraphrase identification dataset used for the transfer learning task is the MSRP dataset (Microsoft Research Paraphrase Corpus) containing 5801 paraphrases. On the ULPC dataset, our BERT model improves upon the previous baseline by at least 0.1 in F1-score across the four dimensions. When using fine-tuning from ULPC for the children dataset, both the BERT and Siamese neural network models improve upon their original scores by at least 0.11 F1-score. The results of these experiments suggest that transfer learning using generic paraphrase identification datasets can be successful, while at the same time obtaining comparable results in fewer epochs.

KW - Language models

KW - Natural language processing

KW - Paraphrase quality assessment

KW - Recurrent neural networks

KW - Transfer learning

UR - http://www.scopus.com/inward/record.url?scp=85121844841&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85121844841&partnerID=8YFLogxK

U2 - 10.3390/computers10120166

DO - 10.3390/computers10120166

M3 - Article

AN - SCOPUS:85121844841

SN - 2073-431X

VL - 10

JO - Computers

JF - Computers

IS - 12

M1 - 166

ER -
