Automated Scoring of Speaking and Writing: Starting to Hit its Stride

Jones, Daniel Marc; Cheng, Liying; Tweedie, M. Gregory

https://doi.org/10.21432/cjlt28241

Aluthman, E. S. (2016). The effect of using automated essay evaluation on ESL undergraduate students’ writing skill. International Journal of English Linguistics, 6(5), 54-67. https://doi.org/10.5539/ijel.v6n5p54

Attali, Y. (2011). Automated subscores for TOEFL iBT® independent essays. (ED525308). ETS Research Report Series, 2011(2), i-16. https://doi.org/10.1002/j.2333-8504.2011.tb02275.x

Attali, Y., Lewis, W., & Steier, M. (2012). Scoring with the computer: Alternative procedures for improving the reliability of holistic essay scoring. Language Testing, 30(1), 125-141. https://doi.org/10.1177/0265532212452396

Bejar, I. I., VanWinkle, W., Madnani, N., Lewis, W., & Steier, M. (2013). Length of textual response as a construct-irrelevant response strategy: The case of shell language. ETS Research Report Series, 2013(1), i-39. https://doi.org/10.1002/j.2333-8504.2013.tb02314.x

Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3(2), 77-101.

Bridgeman, B., Powers, D., Stone, E., & Mollaun, P. (2012a). TOEFL iBT speaking test scores as indicators of oral communicative language proficiency. Language Testing, 29(1), 91-108. https://doi.org/10.1177/0265532211411078

Bridgeman, B., Trapani, C., & Attali, Y. (2012b). Comparison of human and machine scoring of essays: Differences by gender, ethnicity, and country. Applied Measurement in Education, 25(1), 27-40. https://doi.org/10.1080/08957347.2012.635502

Burstein, J., LaFlair, G. T., Kunnan, A. J., & von Davier, A. A. (2021). A theoretical assessment ecosystem for a digital-first assessment—The Duolingo English test. http://duolingo-papers.s3.amazonaws.com/other/det-assessment-ecosystem.pdf

Cahill, A., & Evanini, K. (2020). Natural language processing for speaking and writing. In D. Yan, A. A. Rupp, & P. W. Foltz (Eds.), Handbook of automated scoring: Theory into practice (pp. 69-92). CRC Press, Taylor & Francis Group. https://doi.org/10.1201/9781351264808

Chapelle, C. A., Cotos, E., & Lee, J. (2015). Validity arguments for diagnostic assessment using automated writing evaluation. Language Testing, 32(3), 385-405. https://doi.org/10.1177/0265532214565386

Cheng, J., Chen, X., & Metallinou, A. (2015). Deep neural network acoustic models for spoken assessment applications. Speech Communication, 73, 14-27. https://doi.org/10.1016/j.specom.2015.07.006

D’Mello, S. (2020). Multimodal analytics for automated assessment. In D. Yan, A. A. Rupp, & P. W. Foltz (Eds.), Handbook of automated scoring: Theory into practice (pp. 93-111). CRC Press, Taylor & Francis Group. https://doi.org/10.1201/9781351264808

d’Orville, H. (2020). COVID-19 causes unprecedented educational disruption: Is there a road towards a new normal? Prospects, 49, 11-15. https://doi.org/10.1007/s11125-020-09475-0

DiCerbo, K., Lai, E., & Ventura, M. (2020). Assessment design with automated scoring in mind. In D. Yan, A. A. Rupp, & P. W. Foltz (Eds.), Handbook of automated scoring: Theory into practice (pp. 29-47). CRC Press, Taylor & Francis Group. https://doi.org/10.1201/9781351264808

Douglas, D. (2013). Technology and language testing. In C. A. Chapelle (Ed.), The encyclopedia of applied linguistics (pp. 1-7). Wiley-Blackwell. https://doi.org/10.1002/9781405198431.wbeal1182

Foltz, P. W., Yan, D., & Rupp, A. A. (2020). The past, present, and future of automated scoring for complex tasks. In D. Yan, A. A. Rupp, & P. W. Foltz (Eds.), Handbook of automated scoring: Theory into practice (pp. 1-11). CRC Press, Taylor & Francis Group. https://doi.org/10.1201/9781351264808

Fu, J., Chiba, Y., Nose, T., & Ito, A. (2020). Automatic assessment of English proficiency for Japanese learners without reference sentences based on deep neural network acoustic models. Speech Communication, 116, 86-97. https://doi.org/10.1016/j.specom.2019.12.002

Golkova, D., & Hubackova, S. (2014). Productive skills in second language learning. Procedia-Social and Behavioral Sciences, 143, 477-481. https://doi.org/10.1016/j.sbspro.2014.07.520

Gu, L., Davis, L., Tao, J., & Zechner, K. (2021). Using spoken language technology for generating feedback to prepare for the TOEFL iBT® test: A user perception study. Assessment in Education: Principles, Policy & Practice, 28(1), 58-76. https://doi.org/10.1080/0969594X.2020.1735995

Higgins, D., Xi, X., Zechner, K., & Williamson, D. (2011). A three-stage approach to the automated scoring of spontaneous spoken responses. Computer Speech & Language, 25(2), 282-306. https://doi.org/10.1016/j.csl.2010.06.001

Hussein, M. A., Hassan, H., & Nassef, M. (2019). Automated language essay scoring systems: A literature review. PeerJ Computer Science, 5, e208. https://doi.org/10.7717/peerj-cs.208

Kaushik, V., & Drolet, J. (2018). Settlement and integration needs of skilled immigrants in Canada. Social Sciences, 7(5), 76. https://doi.org/10.3390/socsci7050076

Latifi, S., & Gierl, M. (2021). Automated scoring of junior and senior high essays using Coh-Metrix features: Implications for large-scale language testing. Language Testing, 38(1), 62-85. https://doi.org/10.1177/0265532220929918

Litman, D., Strik, H., & Lim, G. S. (2018). Speech technologies and the assessment of second language speaking: Approaches, challenges, and opportunities. Language Assessment Quarterly, 15(3), 294-309. https://doi.org/10.1080/15434303.2018.1472265

Loewen, S., Crowther, D., Isbell, D. R., Kim, K. M., Maloney, J., Miller, Z. F., & Rawal, H. (2019). Mobile-assisted language learning: A Duolingo case study. ReCALL, 31(3), 293-311. https://doi.org/10.1017/S0958344019000065

McNamara, T. (2005). 21st century shibboleth: Language tests, identity and intergroup conflict. Language Policy, 4(4), 351-370. https://doi.org/10.1007/s10993-005-2886-0

Powers, D. E., Escoffery, D. S., & Duchnowski, M. P. (2015). Validating automated essay scoring: A (modest) refinement of the “gold standard.” Applied Measurement in Education, 28(2), 130-142. https://doi.org/10.1080/08957347.2014.1002920

Ricker-Pedley, K., Hines, S., & Connolley, C. (2020). Operational human scoring at scale. In D. Yan, A. A. Rupp, & P. W. Foltz (Eds.), Handbook of automated scoring: Theory into practice (pp. 171-193). CRC Press, Taylor & Francis Group. https://doi.org/10.1201/9781351264808

Rupp, A., Foltz, P., & Yan, D. (2020). Theory into practice: Reflections on the handbook. In D. Yan, A. A. Rupp, & P. W. Foltz (Eds.), Handbook of automated scoring: Theory into practice (pp. 475-487). CRC Press, Taylor & Francis Group. https://doi.org/10.1201/9781351264808

Sackett, P. R., Schmitt, N., Ellingson, J. E., & Kabin, M. B. (2001). High-stakes testing in employment, credentialing, and higher education: Prospects in a post-affirmative-action world. American Psychologist, 56(4), 302. https://doi.org/10.1037/0003-066X.56.4.302

Schmidgall, J. E., & Powers, D. E. (2017). Technology and high-stakes language testing. In C. A. Chapelle & S. Sauro (Eds.), The handbook of technology and second language teaching and learning (pp. 317-331). Wiley Blackwell. https://doi.org/10.1002/9781118914069.ch21

Schneider, C., & Boyer, M. (2020). Design and implementation for automated scoring systems. In D. Yan, A. A. Rupp, & P. W. Foltz (Eds.), Handbook of automated scoring: Theory into practice (pp. 217-239). CRC Press, Taylor & Francis Group. https://doi.org/10.1201/9781351264808

Settles, B., LaFlair, G. T., & Hagiwara, M. (2020). Machine learning–driven language assessment. Transactions of the Association for Computational Linguistics, 8, 247-263. https://doi.org/10.1162/tacl_a_00310

Shermis, M. D., & Burstein, J. (2013). Handbook of automated essay evaluation: Current applications and new directions. Routledge Academic.

Shin, J., & Gierl, M. J. (2021). More efficient processes for creating automated essay scoring frameworks: A demonstration of two algorithms. Language Testing, 38(2), 247-272. https://doi.org/10.1177/0265532220937830

Shohamy, E. (2013). The discourse of language testing as a tool for shaping national, global, and transnational identities. Language and Intercultural Communication, 13(2), 225-236. https://doi.org/10.1080/14708477.2013.770868

Voogt, J., & Knezek, G. (2021). Teaching and learning with technology during the COVID-19 pandemic: Highlighting the need for micro-meso-macro alignments. Canadian Journal of Learning and Technology, 47(4). https://doi.org/10.21432/cjlt28150

Wang, Y. (2021). Detecting pronunciation errors in spoken English tests based on multifeature fusion algorithm. Complexity, 2021, 1-11. https://doi.org/10.1155/2021/6623885

Wang, Z., & von Davier, A. A. (2014). Monitoring of scoring using the e‐rater® automated scoring system and human raters on a writing test. ETS Research Report Series, 2014(1), 1-21. https://doi.org/10.1002/ets2.12005

Wang, Z., Zechner, K., & Sun, Y. (2018). Monitoring the performance of human and automated scores for spoken responses. Language Testing, 35(1), 101-120. https://doi.org/10.1177/0265532216679451

Williamson, D. M., Xi, X., & Breyer, F. J. (2012). A framework for evaluation and use of automated scoring. Educational Measurement: Issues and Practice, 31(1), 2-13. https://doi.org/10.1111/j.1745-3992.2011.00223.x

Wind, S. A., Wolfe, E. W., Engelhard Jr, G., Foltz, P., & Rosenstein, M. (2018). The influence of rater effects in training sets on the psychometric quality of automated scoring for writing assessments. International Journal of Testing, 18(1), 27-49. https://doi.org/10.1080/15305058.2017.1361426

Wood, S. (2020). Public perception and communication around automated essay scoring. In D. Yan, A. A. Rupp, & P. W. Foltz (Eds.), Handbook of automated scoring: Theory into practice (pp. 133-150). CRC Press, Taylor & Francis Group. https://doi.org/10.1201/9781351264808

Xi, X., Higgins, D., Zechner, K., & Williamson, D. (2012). A comparison of two scoring methods for an automated speech scoring system. Language Testing, 29(3), 371-394. https://doi.org/10.1177/0265532211425673

Yan, D., & Bridgeman, B. (2020). Validation of automated scoring systems. In D. Yan, A. A. Rupp, & P. W. Foltz (Eds.), Handbook of automated scoring: Theory into practice (pp. 297-318). CRC Press, Taylor & Francis Group. https://doi.org/10.1201/9781351264808

Yoon, S. Y., & Zechner, K. (2017). Combining human and automated scores for the improved assessment of non-native speech. Speech Communication, 93, 43-52. https://doi.org/10.1016/j.specom.2017.08.001

Zechner, K., Chen, L., Davis, L., Evanini, K., Lee, C. M., Leong, C. W., Wang, X., & Yoon, S. Y. (2015). Automated scoring of speaking tasks in the Test of English-for-Teaching (TEFT™). ETS Research Report Series, 2015(2), 1-17. https://doi.org/10.1002/ets2.12080

Zechner, K., Yoon, S. Y., Bhat, S., & Leong, C. W. (2017). Comparative evaluation of automated scoring of syntactic competence of non-native speakers. Computers in Human Behavior, 76, 672-682. https://doi.org/10.1016/j.chb.2017.01.060

Zhang, M., Breyer, F. J., & Lorenz, F. (2013). Investigating the suitability of implementing the E‐Rater® scoring engine in a large-scale English language testing program. ETS Research Report Series, 2013(2), i-60. https://doi.org/10.1002/j.2333-8504.2013.tb02343.x
