TY - GEN
T1 - Multilingual Age of Exposure
AU - Botarleanu, Robert Mihai
AU - Dascalu, Mihai
AU - Watanabe, Micah
AU - McNamara, Danielle S.
AU - Crossley, Scott Andrew
N1 - Funding Information:
Acknowledgements. This research was supported by a grant of the Romanian National Authority for Scientific Research and Innovation, CNCS – UEFISCDI, project number TE 70 PN-III-P1-1.1-TE-2019-2209, ATES – “Automated Text Evaluation and Simplification”, the Institute of Education Sciences (R305A180144 and R305A180261), and the Office of Naval Research (N00014-17-1-2300; N00014-20-1-2623). The opinions expressed are those of the authors and do not represent views of the IES or ONR.
Publisher Copyright:
© 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - The ability to objectively quantify the complexity of a text can be a useful indicator of how likely learners of a given level are to comprehend it. Words are the basic building blocks of a text, so before more complex models of text difficulty are created, it is worth noting that a text’s overall difficulty is greatly influenced by the complexity of its underlying words. One approach is to measure a word’s Age of Acquisition (AoA), an estimate of the average age at which a speaker of a language understands the semantics of a specific word. Age of Exposure (AoE) statistically models the process of word learning and, in turn, provides an estimate of a given word’s AoA. In this paper, we expand on the AoE model by training regression models that learn and generalize AoA word lists across multiple languages, including English, German, French, and Spanish. Our approach allows for the estimation of AoA scores for words that are not found in the original lists, covering up to the majority of the target language’s vocabulary. Our method can be uniformly applied across multiple languages through the use of parallel corpora and helps bridge the gap in the size of AoA word lists available for non-English languages. This work is particularly important for efforts toward extending AI to languages with fewer resources and benchmarked corpora.
AB - The ability to objectively quantify the complexity of a text can be a useful indicator of how likely learners of a given level are to comprehend it. Words are the basic building blocks of a text, so before more complex models of text difficulty are created, it is worth noting that a text’s overall difficulty is greatly influenced by the complexity of its underlying words. One approach is to measure a word’s Age of Acquisition (AoA), an estimate of the average age at which a speaker of a language understands the semantics of a specific word. Age of Exposure (AoE) statistically models the process of word learning and, in turn, provides an estimate of a given word’s AoA. In this paper, we expand on the AoE model by training regression models that learn and generalize AoA word lists across multiple languages, including English, German, French, and Spanish. Our approach allows for the estimation of AoA scores for words that are not found in the original lists, covering up to the majority of the target language’s vocabulary. Our method can be uniformly applied across multiple languages through the use of parallel corpora and helps bridge the gap in the size of AoA word lists available for non-English languages. This work is particularly important for efforts toward extending AI to languages with fewer resources and benchmarked corpora.
KW - Age of acquisition
KW - Age of exposure
KW - Multilingual
KW - Natural language processing
UR - http://www.scopus.com/inward/record.url?scp=85124766147&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85124766147&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-78292-4_7
DO - 10.1007/978-3-030-78292-4_7
M3 - Conference contribution
AN - SCOPUS:85124766147
SN - 9783030782917
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 77
EP - 87
BT - Artificial Intelligence in Education - 22nd International Conference, AIED 2021, Proceedings
A2 - Roll, Ido
A2 - McNamara, Danielle
A2 - Sosnovsky, Sergey
A2 - Luckin, Rose
A2 - Dimitrova, Vania
PB - Springer Science and Business Media Deutschland GmbH
T2 - 22nd International Conference on Artificial Intelligence in Education, AIED 2021
Y2 - 14 June 2021 through 18 June 2021
ER -