Abstracts
Abstract
English occupies a central position in scholarly publishing, but using a lingua franca for scholarly publishing has consequences for scholars, science, and society. For instance, non-Anglophone researchers may need more time to read and write in English and may face more manuscript revisions and rejections, potentially leading to a lower volume of research output, which could negatively affect career advancement. To what extent can machine translation (MT) tools (e.g., Google Translate) help to support a more multilingual scholarly publishing ecosystem? To find out, we undertook a scoping review of the literature to investigate how MT tools are being used for multilingual scholarly publishing. Following a multilingual search in nine bibliographic databases, 875 papers were retrieved and screened, and 39 were included for closer investigation. Analysis reveals that MT tools are being actively developed, tested, applied, and evaluated in the context of scholarly publishing. However, at present, these tools are not displacing English from its central position; currently, the main use of MT tools is to reduce the burden of publishing in English for scholars with limited English proficiency. This suggests that technology alone cannot create or sustain a multilingual scholarly publishing ecosystem. Hence, meaningful policies, in addition to improved MT tools and language resources, are needed to create a more linguistically diverse and equitable scholarly publishing landscape.
Keywords:
- machine translation (MT)
- scholarly publishing
- multilingualism
- scoping review
- linguistic diversity
- equity
- policy
Résumé
L'anglais occupe une position centrale dans la publication savante, mais l'utilisation d'une lingua franca pour la publication savante a des conséquences pour les chercheurs, la science et la société. Par exemple, les chercheurs non anglophones peuvent mettre plus de temps à lire et à écrire en anglais et peuvent faire face à davantage de révisions et de rejets de manuscrits, ce qui peut potentiellement entraîner un volume de production de recherche plus faible, pouvant nuire à l'avancement de leur carrière. Dans quelle mesure les outils de traduction automatique (TA) (par exemple, Google Translate) peuvent-ils aider à soutenir un écosystème de publication savante plus multilingue ? Pour le savoir, nous avons entrepris une revue exploratoire de la littérature pour enquêter sur la manière dont les outils de TA sont utilisés pour la publication savante multilingue. Suite à une recherche multilingue dans neuf bases de données bibliographiques, 875 articles ont été récupérés et examinés, et 39 ont été inclus pour une enquête plus approfondie. L'analyse révèle que les outils de TA sont activement développés, testés, appliqués et évalués dans le contexte de la publication savante. Cependant, à l'heure actuelle, ces outils ne déplacent pas l'anglais de sa position centrale ; l'utilisation principale des outils de TA est actuellement de réduire la charge de la publication en anglais pour les chercheurs ayant une maîtrise limitée de l'anglais. Cela suggère que la technologie seule ne peut pas créer ou maintenir un écosystème de publication savante multilingue. Par conséquent, des politiques significatives, en plus d'outils de TA améliorés et de ressources linguistiques, sont nécessaires pour créer un paysage de publication savante plus diversifié et équitable sur le plan linguistique.
Mots-clés :
- traduction automatique
- communication savante
- outils de traduction
- multilinguisme
- diversité linguistique
- équité
- politique
- revue de littérature