Qualifying with Different Types of Quizzes in an Online EFL Course: Influences on Perceived Learning and Academic Achievement

This quasi-experimental study explored how different online exam types differentiate learners’ academic achievement and perceived learning. The participants comprised 95 undergraduate students enrolled in an English course at a Turkish university in three groups, each taking a different type of quiz: with multiple-choice, open-ended, and mixed type questions. The results indicated that the academic achievement of the students in multiple-choice and open-ended groups increased and that quiz results improved the most for the multiple-choice group relative to the other groups. The study found a moderate level of significant relationship between cognitive and affective perceived learning and multiple-choice quiz scores. In addition, the study found a weak level of significant relationship between cognitive and affective perceived learning and mixed-design quiz scores, and between cognitive learning and the academic achievement scores of the mixed-design group. Semi-structured online interviews undertaken to further explain the quantitative data displayed positive influences of the different types of quizzes in terms of study behaviors and satisfaction. The findings of this study are expected to shed light for practitioners aiming to use different online assessment types.


Introduction
In recent years, technological and pedagogical improvements have made online learning more attractive, and a great number of students prefer to study English as a Foreign Language (EFL) in courses delivered online. Moreover, methods used in computer-assisted language learning have proven to be effective in delivering EFL courses (Ebadi & Rahimi, 2018; Yang, 2017) and in helping teachers monitor learner progress through online formative assessments (Alharbi & Meccawy, 2020).
Recent studies highlight the benefits of online assessment tools, such as improving student motivation, enhancing active learning, and deterring cheating as long as the questions are not too easy (Rinaldi et al., 2017; Schneider et al., 2018). The use of online exams in different forms of online assessment tools, such as fill-in-the-blanks, multiple-choice, true-false, cloze test, word-order, match the columns, and table-verbs, has led to the discovery of a new world of teaching effectiveness and learning approaches (Yadollahi & Rahimi, 2011). Some scholars have highlighted the advantages of various online formative assessment tools such as Google Forms, Blackboard, Plickers, Socrative, and Kahoot! (Alharbi & Meccawy, 2020; Fageeh, 2015; Jazil et al., 2020). These are perceived as positive tools that enhance achievement in different ways, offering to improve learners' responses (Elbasyouny, 2021). For example, Fageeh (2015) reported that online testing via Blackboard provides opportunities for multiple practices, influencing achievement, automated scoring and instant feedback. Another study found that students perceived their learning as effective via an online grammar assessment included in Google Forms, a supportive tool that provides immediate feedback after completing the exam (Jazil et al., 2020). The perceived usefulness of online testing can also enhance students' performance. In fact, a high correlation has been reported between students' performance, perceived learning, and satisfaction with online learning (Gray & DiLoreto, 2016). Thus, applying different assessment approaches may result in different learning outcomes such as academic achievements or perceived learning.

Assessment tools in online learning
Online assessment tools can be used for formative assessments (quizzes) or summative assessments (exams). A series of studies have been carried out to examine the learning outcomes of various types of online assessments. Sek et al. (2012) find that the most preferred assessment format is multiple-choice questions, followed by true/false questions and single choice questions. Kılıç and Çetin (2018) report the most preferred exam type to be multiple-choice tests because students believe they can succeed better with them. In contrast, Ogange et al. (2018) document that students perceive the various types of formative online assessments as nonsignificant. Accordingly, understanding the effects of different quiz types on students' perceived performance and success has important implications for instructors' decisions regarding online assessment.
The research to date has tended to focus on the effect of online exams on students' performance, motivation, study style, and exam anxiety (Pan et al., 2019; Vayre & Vonthron, 2019) rather than on their perceived learning, which allows students to evaluate themselves.

Perceived Learning in Online Learning
Perceived learning is an indicator of the effectiveness of online learning environments (Barbera et al., 2013) and is considered as an evaluation of learning experience (Caspi & Blau, 2011). Researchers define perceived learning in cognitive and socio-emotional dimensions. While the cognitive dimension is the sense of achieving new knowledge, the socio-emotional dimension includes the students' degree of involvement, experiences, and feelings in the learning process (Caspi & Blau, 2011).
Previous studies have reported that perceived learning has a significant and positive relationship with online course flexibility and student-student interaction (Marks et al., 2005); student-instructor interaction (Kang & Im, 2013); cognitive presence, social presence, and teaching presence (Arbaugh, 2008; Rockinson-Szapkiw et al., 2016); and learning content and course design (Barbera et al., 2013). Additionally, Paechter et al. (2010) report that students' expectations regarding subject knowledge have predictive power for their perceived learning in online learning settings. Furthermore, Artino (2008) reports a significant correlation between perceived learning and satisfaction in online learning settings.
Perceived learning is also considered to be a significant predictor of students' course grades in online learning settings (Rockinson-Szapkiw et al., 2016). Having different learning approaches gives students options to study with various tools in different time periods with varying goals and expectations about their learning. In line with this expectation, they engage actively in online quizzes by paying more attention to the classes (Dobbins & Denton, 2017). In this sense, previous studies have addressed positive perceptions about online quizzes in terms of learning outcomes, technology acceptance, and perceived usefulness (Raes & Depaepe, 2020). Accordingly, different types of online exams can be thought to differentiate the perceived learning levels, and the relationship between online quizzes and students' perceived learning is considered to be a variable worth examining. Within this framework, this study aims to reveal the relationship between perceived learning and the academic achievement of students studying using different types of online exams in an online EFL course.

Aim of the Study
One prominent area that uses online assessment is EFL classes. These are commonly required courses in the first year of all universities in Turkey, targeting basic skills such as reading, writing, speaking, listening, and grammar and vocabulary. Given the importance of the use of quizzes in online EFL classes, this study aims to determine how online exam types differentiate learners' perceived learning and academic achievement. As the nature of the topics is appropriate for the preparation of different kinds of tests for the same outcomes, we focused on an EFL course. The motivation for the study was the idea that an online EFL course supported by different quiz types would differentiate students' academic achievements and perceived learning. Focusing on the relationship between academic achievement and perceived learning, this research seeks answers to the following research questions:
1. Is there a significant difference between the academic achievement (quizzes, EFL test) scores of the students who study with different quiz types?
2. Is there a significant difference between the perceived learning scores of the students who study with different quiz types?
3. What is the relationship between students' academic achievement (with regard to question types in quizzes, EFL test) and their perceived learning scores?

Method
This quasi-experimental study with three randomized study groups, using a pre- and post-test design, was carried out in a university-level EFL course in Turkey. The same instructor taught the same instructional package to all groups during the fall semester of the 2020-2021 academic year. An academic achievement test including all the targeted teaching modules was applied as a pre-test.
The students were introduced to the question types of the quizzes they would have following every module: Group A: multiple-choice questions; Group B: mixed-design questions (fill-in-the-blanks, true-false, matching); and Group C: open-ended questions. The study lasted for 16 weeks with a total of 18 online quizzes, 6 for each group. The academic achievement test was applied as a post-test followed by an online interview with volunteering students. The perceived learning scale was used to determine the perceived learning scores. Figure 1 shows the procedure followed during the study.

Study Procedure
Other than the established methodology of Presentation, Practice, and Production in teaching EFL, the online course design was outlined to fit with the quizzes. Each group had one synchronous lesson of at least 90 minutes on Adobe Connect Web Conferencing System every week. The quizzes were conducted as out-of-class activities at the end of each module and reviewed at the beginning of the following lesson, as shown in Figure 2.

Course Design with Quizzes
In part 1 of the course design, the quiz questions were answered, and students discussed the questions, contextual clues, and correct answers. In part 2, the students formed quiz-type sample questions and shared them with each other to get familiar with the quiz question type(s). Sample materials were presented to students to study for the next quiz, shaping their learning approach fitting with the question type. In part 3, the course topic was delivered via an appropriate teaching strategy by providing examples according to the type of the quiz questions. In part 4, a question-answer session was followed by a summary of the lesson and information about the next quiz (duration, number of questions, etc.).

Data Collection Tools
The quantitative data were gathered using the perceived learning scale, academic achievement test, and online quizzes.

Perceived Learning Scale
The original Cognitive, Affective and Psychomotor (CAP) Perceived Learning Scale designed by Rovai et al. (2009) for both face-to-face and online learning includes a total of 9 items, 3 items for each of the cognitive learning, affective learning, and psychomotor learning subscales. Since this study did not include psychomotor skills as learning goals, we were only concerned with cognitive and affective learning, each represented by 1 item. The item, or question, related to cognitive perceived learning was the following: "When you evaluate on a scale of 0 to 9, how much do you think you learned in this course? (0: meaning I think I learned nothing; 9: meaning I think I learned a lot)" as adapted by Çelik (2020) from Richmond et al. (1987) with a correlation coefficient of .806. Adapting this item with the help of four field experts, we tested the affective perceived learning with the item, "When you evaluate on a scale of 0 to 9, how much do you think your attitude towards the course changed? (0: meaning I think my attitude didn't change at all; 9: meaning I think my attitude changed a lot)." The Pearson correlation used to assess the test-retest reliability of the affective perceived learning item was r(74) = .802. The two ten-point Likert items, ranging from 0 to 9, had an internal consistency reliability (Cronbach's alpha) of .933.
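As a rough illustration of how an internal-consistency figure of this kind is obtained, Cronbach's alpha for k items can be computed from the individual item variances and the variance of the summed score. The sketch below is not the authors' analysis script: the function name `cronbach_alpha` is hypothetical, and the ratings are made-up values used only to show the calculation.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns.

    `items` is a list of k lists; each inner list holds one item's scores
    for the same respondents, in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Summed score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical 0-9 ratings from five respondents (NOT the study's data).
cognitive = [7, 5, 8, 3, 6]
affective = [6, 5, 9, 2, 6]
alpha = cronbach_alpha([cognitive, affective])
print(round(alpha, 3))
```

With only two items, alpha depends entirely on how closely the two ratings move together, so a high value mainly indicates that respondents rated cognitive and affective learning similarly.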

Academic Achievement Test
We assessed the effectiveness of the interventions on students' academic achievement scores using a standardized academic achievement test of 20 mixed-type questions, each worth 5 points, in four parts selected from the course book end-of-term tests that serve as the framework for the course. The Smart Choice course book (Wilson & Healy, 2016) by Oxford University Press, which is widely used in various K-12 and tertiary level schools, includes tests containing questions targeting the outcomes of the foreign language curriculum and is also the main instrument of various online EFL studies (Jakob & Afdaliah, 2019; Wongpornprateep & Boonmoh, 2019). The instructor and one field expert reviewed the test items in terms of content validity. Google Forms was used to implement the tests.

Online Quizzes
The online quizzes developed by the researchers, one of whom was also the course instructor, covered the relevant module of that week. While the content of the questions was the same, the form of the questions in the quizzes differed for each group. Figure 3 outlines the forms of questions in the quizzes.

Some Examples of the Same Questions of Different Question Types
Each quiz included 20 questions, targeting the same vocabulary and grammatical content with different question types. In all groups, some questions also included visuals for clarification and guidance. All the quiz questions were similar to the question contents of the academic achievement test. Two field experts checked whether each question in the quizzes aimed at the same objectives.

Interviews
Online interviews were carried out after the post-test with a total of 18 volunteering students, 6 from each quiz group, selected on the basis of their perceived learning scores (2 low, 2 medium, and 2 high), to further explain the data from the scales. The participants were asked questions about the effects of the quizzes on their learning, study behavior, attitude towards the lesson, academic achievement, and the factors they perceived as contributing to their learning.

Participants
Ninety-five freshman students (F = 64, M = 31, mean age = 19) enrolled in the English 1 course participated in this study. None of the participants completed preparatory English class as this is not required for the vocational school students. However, they all had the same A1 level of basic English classes. The participants were from the departments of Banking and Insurance, Finance, Public Relations, and Accounting, and they were randomly assigned to one of three groups: open-ended (n = 32), multiple-choice (n = 33), and mixed-design (n = 30).

Data Analysis
An analysis of variance (ANOVA) with the Tukey HSD post-hoc test and the Kruskal-Wallis H test were used to determine significant differences among the students' scores in pre-tests, post-tests, and quizzes. Split Plot ANOVA was used to show the change in academic achievement between the pre-tests and post-tests. A t-test was used to examine the difference between pre-tests and post-tests within groups.
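The choice between the parametric and non-parametric between-group test, as narrated in the Results, appears to follow Levene's test: equal variances lead to a one-way ANOVA, unequal variances to the Kruskal-Wallis H test. The sketch below encodes that reading as a small decision rule. The function name `choose_group_test` is hypothetical, the rule is inferred from the text rather than stated by the authors, and the test used for the quiz mean scores is not named explicitly in the text. The p-values are the Levene results reported in this paper.

```python
def choose_group_test(levene_p, alpha=0.05):
    """Pick the between-group test from Levene's p-value (inferred rule)."""
    if levene_p > alpha:
        # Variances treated as equal: parametric comparison.
        return "one-way ANOVA"
    # Variances treated as unequal: rank-based comparison.
    return "Kruskal-Wallis H"


# Levene p-values as reported in the Results section of this paper.
reported_levene_p = {
    "pre-test": 0.850,
    "post-test": 0.042,
    "quiz mean scores": 0.277,
    "affective learning": 0.003,
    "cognitive learning": 0.006,
}

for measure, p in reported_levene_p.items():
    print(f"{measure}: {choose_group_test(p)}")
```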
The data obtained from the interviews were analyzed descriptively and students' opinions were presented referring to the perceived learning scores of the groups. For example, when quoting from the open-ended quiz group, students with perceived learning total score of <9 were considered as "low," between 9 and 15 as "medium," and >15 as "high"; and these were coded as OEL-1, OEM-1, and OEH-1, respectively.
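The labelling scheme for interviewees can be sketched as a small helper. This is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the cut-offs given above for the open-ended group are assumed to apply to all three groups, and the `MC` and `MIX` prefixes are taken from the participant codes quoted in the Results.

```python
# Group prefixes: "OE" is given in the text; "MC" and "MIX" are taken from
# the participant codes quoted in the Results.
GROUP_PREFIX = {
    "multiple-choice": "MC",
    "mixed-design": "MIX",
    "open-ended": "OE",
}


def perceived_learning_level(total_score):
    """Map a total perceived learning score (two 0-9 items) to a level."""
    if total_score < 9:
        return "low"
    if total_score <= 15:
        return "medium"
    return "high"


def interview_code(group, total_score, student_number):
    """Build a participant code such as 'OEH-5' (hypothetical helper)."""
    level_initial = perceived_learning_level(total_score)[0].upper()
    return f"{GROUP_PREFIX[group]}{level_initial}-{student_number}"


print(interview_code("open-ended", 16, 5))      # OEH-5
print(interview_code("multiple-choice", 4, 1))  # MCL-1
```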

Results
The quantitative data were presented with regard to the research questions, and the interview data were used to explain the factors in the intervention process.

Academic Achievement Scores of the Groups
Is there a significant difference in the pre-test results of the students in different quiz types?
The normal distribution of the data for the pre-test was confirmed with the homogeneity of the variance test. Levene's test showed that the variances for quiz types were equal, F(2,92) = 0.163; p = 0.850; p>0.05. Since the data met the normality assumptions, we performed a one-way ANOVA test to compare the pre-test scores of the groups, as shown in Table 1.

Is there a significant difference in the post-test results of the students in different quiz types?
The normal distribution of the data for the post-test was confirmed by the homogeneity of the variance test. Levene's test showed that the variances for post-test were not equal, F(2,92) = 3.281; p = 0.042; p<0.05. Therefore, a non-parametric test, the Kruskal-Wallis H test, was carried out and the difference between the post-test scores of the groups is shown in Table 2. Table 2 indicates that there was not a statistically significant difference between groups as χ²(2) = 2.755, p = 0.252, p>0.05. However, the change in the academic achievement scores is shown in Figure 4.
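The reported p-value can be checked by hand. With three groups, the Kruskal-Wallis H statistic is referred to a chi-square distribution with 2 degrees of freedom, whose upper-tail probability has the closed form exp(-x/2). The short sketch below applies that formula to the reported statistic; the function name `chi2_sf_df2` is hypothetical.

```python
import math


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2)


h_statistic = 2.755  # Kruskal-Wallis H reported for the post-test
p_value = chi2_sf_df2(h_statistic)
print(round(p_value, 3))  # 0.252
```

The result agrees with the reported p = 0.252, which is above the .05 threshold.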

Figure 4
Change of Pre-tests and Post-tests for Academic Achievement

Figure 4 reveals that the post-test scores increased significantly from x̄ = 43.18 to x̄ = 55.90 in multiple-choice groups; from x̄ = 48.83 to x̄ = 52.66 in mixed-design groups; and from x̄ = 55.15 to x̄ = 62.81 in open-ended groups. The change was higher in the multiple-choice group compared to the others, surpassing the average post-test score of the mixed-design group, which had a higher academic achievement score in the pre-test.
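The size of each group's change follows directly from the reported means. The sketch below subtracts each pre-test mean from the matching post-test mean; the variable names are illustrative, and the figures are the means stated above.

```python
# (pre-test mean, post-test mean) as reported for Figure 4.
reported_means = {
    "multiple-choice": (43.18, 55.90),
    "mixed-design": (48.83, 52.66),
    "open-ended": (55.15, 62.81),
}

# Mean gain per group, rounded to two decimals to match the reported precision.
gains = {
    group: round(post - pre, 2)
    for group, (pre, post) in reported_means.items()
}

for group, gain in sorted(gains.items(), key=lambda item: item[1], reverse=True):
    print(f"{group}: +{gain:.2f}")
# multiple-choice: +12.72
# open-ended: +7.66
# mixed-design: +3.83
```

The multiple-choice group gained the most (12.72 points), while the open-ended group still ended with the highest post-test mean (62.81).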
Is there a significant difference in the quiz scores of the students in different quiz types?
The homogeneity of the variance test showed that the data were normally distributed and Levene's test showed that the variances for quiz mean scores were equal, F(2,92) = 1.301; p = 0.277; p>0.05. The difference of the groups in the quiz mean scores is shown in Table 3.
Is there a significant difference in the academic achievement scores of the students within groups?
The differences in the academic achievement scores of the multiple-choice, mixed-design and open-ended groups were all confirmed with the paired samples t-test as shown in Table 4, Table 5 and Table 6, respectively.
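For readers unfamiliar with the paired samples t-test, the statistic is the mean of the per-student differences divided by its standard error. The sketch below shows that calculation with the standard library only. The function name `paired_t_statistic` is hypothetical and the scores are made-up values, not the study's data.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(pre, post):
    """Return (t, degrees of freedom) for paired pre/post scores."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    # Per-student change from pre-test to post-test.
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    # Standard error of the mean difference (undefined if all diffs are equal).
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical scores for six students (NOT the study's data).
pre_scores = [40, 45, 50, 35, 55, 60]
post_scores = [55, 50, 65, 45, 60, 70]
t_value, df = paired_t_statistic(pre_scores, post_scores)
print(round(t_value, 2), df)
```

The resulting t value would then be compared against the t distribution with n - 1 degrees of freedom to obtain a p-value.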

Perceived Learning Scores of the Groups
Is there a significant difference in the perceived learning scores of the students in different quiz types?
We examined the perceived learning scores of the groups under affective learning and cognitive learning dimensions. We confirmed the normal distribution of the data for affective learning and cognitive learning using the homogeneity of variance test. Levene's test showed that the variances were not equal for affective learning, F(2,92) = 6.210; p = 0.003; p<0.05, or for cognitive learning, F(2,92) = 5.345; p = 0.006; p<0.05. Table 7 shows the Kruskal-Wallis H test results for the difference between the affective and cognitive perceived learning scores of the groups. The descriptive results revealed the mean scores of:

Relationships between Academic Achievement and Perceived Learning Scores of the Groups
Is there a significant relationship between students' learning performance and their perceived learning scores?
We used a Pearson correlation coefficient to determine the relationship between affective learning, cognitive learning, averages of quiz scores, and academic achievement scores of multiple-choice, mixed-design, and open-ended quiz groups, as shown in Table 8, Table 9 and Table 10, respectively. For the multiple-choice group, the results revealed a very strong positive correlation between perceived affective learning and cognitive learning, a moderate positive correlation between perceived affective learning and average quiz scores, and a weak positive correlation between affective learning and academic achievement scores. In addition, there was a moderate positive correlation between perceived cognitive learning and the average quiz scores, a weak positive correlation between cognitive learning and academic achievement, but a strong positive correlation between average quiz scores and academic achievement, which means increases in quiz averages were correlated with increases in academic achievement.
Students' perspectives about the process also provided clues to explain the positive effect of quizzes on their perceived learning. In this regard, students with high perceived learning scores stated that quizzes had a positive effect on their learning, while students with low perceived learning scores stated that quizzes had no effect on their learning. For example, MCH-5 stated, "The exams are very efficient as they are loaded right after finishing the subject and I understand the subjects better," while MCL-1 stated, "The quizzes are not very effective on my learning as they are easy to answer and I know I will pass the test very easily."
For the mixed-design group, we found a very strong positive correlation between affective learning and cognitive learning, and a weak positive correlation between affective learning and average quiz scores. In addition, we found a weak positive correlation between affective learning and academic achievement scores, and a weak positive correlation between cognitive learning and average quiz scores. We found a weak positive correlation between cognitive learning and academic achievement and a strong positive correlation between quiz averages and academic achievement, which means increases in quiz means were correlated with increases in academic achievement at a high level.
Students' perspectives on the factors explaining the effect of the interventions on the research variables were generally in line with the quantitative data. While students with low perceived learning scores stated that the quizzes had little impact on their learning, students with medium and high perceived learning scores expressed the positive effects of quizzes. In this sense, MIXL-1 stated, "Quizzes are good work but not beneficial for my learning," while MIXM-4 stated, "Quizzes allow me to repeat the topics I have learned," and MIXH-6 stated, "The exams help me improve what I learned in the lesson." Overall, students reported that quizzes had positive effects on their learning.
For the open-ended group, we found a strong positive correlation between affective learning and cognitive learning, but a very weak positive correlation between affective learning and average quiz scores and between affective learning and academic achievement scores, and a weak positive relationship between cognitive learning and quiz scores and between cognitive learning and academic achievement scores. However, we found a moderate positive correlation between average quiz and academic achievement scores, which indicates that increases in quiz average scores were correlated with increases in academic achievement at a moderate level.
The perspectives of the open-ended quiz group's students help explain the relationship between perceived learning scores and academic achievement, though at a low level. Unlike the other two test groups, all of the volunteering interview students in this group reported positive effects of quizzes on their perceived learning. For example, OEL-2 stated, "I studied for quizzes, and they helped me learn by making it easier to learn," and, similarly, OEM-4 claimed, "I think quizzes were difficult for me but having to study affected my learning." OEH-5 stated, "Quizzes certainly have an effect on my learning. I realized that I understood and improved my English skills more with them," and OEH-6 stated, "I think I improved what I learned in the lessons better with quizzes." The overall interviews revealed that the students with high perceived learning scores in all groups reported that quizzes helped them learn by providing opportunities to review and practice the topics they learned. Students with low perceived learning scores in multiple-choice and mixed groups generally stated that quizzes had no effect on their learning.
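The verbal labels used in this section (very weak, weak, moderate, strong, very strong) correspond to bands of the absolute value of the Pearson coefficient. The text does not state the cut-offs that were applied, so the sketch below assumes one commonly used convention with bands of width .20. Both the thresholds and the function name `correlation_strength` are assumptions made for illustration.

```python
def correlation_strength(r):
    """Label a Pearson r using ASSUMED cut-offs (not stated in the paper)."""
    size = abs(r)
    if size > 1:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    if size < 0.20:
        return "very weak"
    if size < 0.40:
        return "weak"
    if size < 0.60:
        return "moderate"
    if size < 0.80:
        return "strong"
    return "very strong"


# Hypothetical coefficients, chosen only to show each band.
for r in (0.12, 0.31, 0.47, 0.68, 0.85):
    print(r, correlation_strength(r))
```

Under a different convention the same coefficient could receive a different label, so the labels are best read alongside the coefficients in Tables 8 to 10.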

Discussion
Various studies have reported the positive effects of using several assessments instead of a single final exam, such as improved student learning and retention (Rezaei, 2015), student engagement, and feedback opportunities (Holmes, 2015). Day et al. (2018) indicate that assessment leads to more effective study behavior promoting student academic achievement, but that the type of continuous assessment does not influence academic achievement; that is, students' performances do not differ depending on whether assessment is through a written assignment, a partial exam, or homework assignments. However, Brown and Wang (2013) claim that the types of exams used for assessment lead students to use different learning approaches in the process of preparing for the exam. Different from the former study and similar to the latter, our study found that students' quiz scores improved the most in the multiple-choice group, followed by the mixed-design group and the open-ended group. The fact that the correct option in multiple-choice exams is among the choices, which act as clues, makes it easier for students to recall the correct answer, but the answers in open-ended exams are required to be written in students' own sentences. The mixed-design group includes both types of questions, which may result in the great discrepancy between the quiz mean scores. The average quiz scores of the mixed-design group were lower than those of the multiple-choice group but higher than those of the open-ended group.
However, the open-ended quiz group was the most successful in the post achievement test with all question types. The comparison of the descriptive results of the pre- and post-test academic achievement supports the role of the question types. Given that students in the multiple-choice and mixed-design groups, to some extent, choose among predetermined options or statements, they may have felt all the questions would be uncomplicated. This may have resulted in superficial studying or none, as they assumed they would pass the quiz or the exam readily. In contrast, the open-ended quiz group was confronted with questions offering limited or no hints other than contextual clues or pictures. This required students to use all their academic knowledge, learning strategies, and skills. Indeed, students may use an in-depth learning approach to understand the subject when asked questions requiring answers based on interpretation. Because test items leading to remembrance or guesswork require less mental effort, the academic achievement scores of the students in the other two groups may have fallen behind those of the open-ended group, who studied with deeper learning strategies. Notwithstanding the seeming disadvantage in the quizzes, the open-ended group acquired higher perceived learning and achievement scores and expressed firm views on the benefits of this quiz type. Confronted with challenging questions, the open-ended group might have been compelled to study for sentence structures or different expressions and to consider that the only option to pass the quiz or the exam was to study hard, leading to higher learning.
Regarding the perceived learning, this study did not find any statistically significant differences between groups in total perceived learning scores. However, the open-ended group had the highest total perceived learning scores, which may be explained by the fact that the open-ended group used deeper learning strategies to study. The mixed-design group fell behind the multiple-choice group, despite facing fill-in-the-blanks type questions as well as true/false and match-type questions. This might result from their sense of having learned only superficially with the less familiar matching-type questions.
In-depth analysis of the multiple-choice group revealed a moderate level of relationship between affective learning scores and average quiz scores, and between cognitive learning scores and average quiz scores. There was a weak level of relationship between affective learning scores and academic achievement scores, and between cognitive learning scores and academic achievement scores. This variation may be the result of students' perceiving higher learning in quizzes with options facilitating ease-of-decision but their low performance in the academic achievement test with open-ended, short answer, fill-in-the-blanks as well as multiple-choice questions, which they are more familiar with.
Previous studies have reported higher cognitive perceived learning in online courses as a result of increased student satisfaction (Baturay, 2011) and higher achievement (Rockinson-Szapkiw et al., 2016). Consistent with the literature, the students in the multiple-choice quiz group achieved higher scores in the quizzes and referred to the positive impacts of quizzes, implying their satisfaction with their higher scores, which might have led them to believe they acquired higher learning. The differences between the multiple-choice and mixed-design groups may be explained by the fact that the mixed-design group faced true/false questions that required less thinking and matching questions that were simpler and easy to guess. The students frequently face multiple-choice tests or open-ended questions in their academic lives, but they seldom face the mixed-design exams, which may have adversely affected their learning approach and academic achievement.
Finally, the analysis of the open-ended group showed a very weak positive relationship between affective learning scores and average quiz scores, and between affective learning scores and academic achievement scores. There was a weak relationship between cognitive learning scores and average quiz scores, and between cognitive learning scores and academic achievement scores. The discrepancy from the other two groups may result from students feeling that they learned more while studying for questions that provide no hints, yet making more mistakes when answering questions without any options or clues within the limited time given. As is seen in previous studies, enhancing interactions that influence learners' perceived learning and satisfaction relates strongly to learner-content interaction (Alqurashi, 2019; Baber, 2020; Baturay, 2011; Lin et al., 2017). In this framework, all the students interviewed in this group confirmed the positive role of the open-ended type of questions in guiding them to use deeper learning and studying strategies while interacting more with the content of the quizzes.
Overall, the types of the questions given to different groups might have changed the study behaviors. Thus, students' study behaviors may have influenced the expectations and satisfactions that were indirectly related to the perceived learning. In addition, the observed increase in the academic achievement of the open-ended group could be attributed to their study behavior.
Some researchers argue that the determinants of perceived learning and satisfaction outcomes of students in online learning are course structure, instructor knowledge, and facilitation of learning process by feedback (Baber, 2020; Cole et al., 2021). Others report the variables that principally influence student satisfaction and perceived learning in online courses to be the course design, interaction, and the learning content (Barbera et al., 2013; Cui, 2021). In accordance with the previous studies, the course structure in our study enabled the instructor to give feedback on the quizzes and learning processes by answering the quiz questions and allowing students to create similar questions in the first lesson following the related quiz. This allowed higher interaction, which may have positively affected students' perceived learning.

Conclusion
This study set out the relationships between different quiz types, academic achievement, and perceived learning. The participants were most successful in the multiple-choice type of questions and least successful in the open-ended questions. Conversely, those who were exposed to open-ended quizzes were most successful in the achievement test, revealing the effect of this type of question in improving study behaviors and deepening learning strategies for mixed-design exams. The students in the open-ended quiz group displayed the highest affective and cognitive learning scores, implying the impact of dealing with questions that require deeper learning strategies. Finally, the current study confirmed the positive relationship between the overall perceived learning scores and academic achievement scores, that is, the higher the perceived learning score, the higher the academic achievement score. The question types in this study shaped students' study behaviors and also affected their expectations and satisfactions.

Limitations and Implications
This study is not exempt from limitations. The sample size in the groups was small and the instructional package was specific to the English course. A larger sample size and content would enhance the sensitivity analysis. The study was carried out through the most frequently used online quiz types. Further studies could examine the types of quizzes created with other assessment types. We hope the results of the study are helpful to online instructors who desire to make more effective use of various types of quizzes in online EFL courses.