Heuristics to Support the Evaluation of Optimal Experience in Educational Games for Learning Japanese as a Second Language

The evaluation of Computer-Assisted Language Learning (CALL) tools can be carried out from several perspectives, but the literature still offers no proposal for evaluating CALL games from the perspective of an “optimal experience”. Therefore, this paper proposes a set of 32 heuristics for the evaluation of educational games for Japanese Language Learning (JCALL). The heuristics were then applied in the evaluation of JCALL L2 (i.e., as a second language) games in order to verify their usefulness. Findings show that the heuristics provide good support in the evaluation of language learning educational games and can inform the redesign of these games in order to improve the player's optimal experience. Finally, results also indicate the feasibility of using these heuristics to evaluate CALL tools other than games.


Introduction
Software for Computer-Assisted Language Learning (CALL) is used in the study of foreign languages, providing important support to students even without the presence of language teachers. These educational technologies offer flexible study schedules and allow learning to be carried out individually and at each learner's pace (Miangah and Nezarat, 2012; Sung et al., 2015). In addition, the activities and contents of CALL tools can present interactive elements and adapt to the learning style of each student (Torat, 2000). Given the technological advances in mobile devices (e.g., smartphones and tablets) since the early 2010s, the study of foreign languages using Mobile-Assisted Language Learning (MALL) has gained considerable attention, despite the m-Learning challenges already reported by several authors (e.g., Fernandes et al., 2012; Mohammad et al., 2012; Traxler et al., 2015).
In this educational context, games designed for teaching are relevant educational technologies. They are developed with the objective of teaching a certain subject, reinforcing the development of skills, expanding concepts, or helping in the teaching or revision of content through a "simple" game (Petri and von Wangenheim, 2016). Educational games, designed with a combination of game design principles and learning theories (Ibrahim et al., 2011), provide a fun and safe environment where learners feel comfortable risking mistakes in exercises while having fun playing (Chinnery, 2006). Furthermore, they are capable of adapting the individual learning and playing experience to the needs, preferences, goals and abilities of each learner (Kickmeier-Rust et al., 2011). The adoption of educational games in the teaching-learning process aims to deliver educational content in a way that contributes to greater motivation and engagement in studies.
Educational games are commonly evaluated from several perspectives (Marciano et al., 2014), including interface (e.g., accessibility and usability), gameplay (e.g., rules and mechanics), contents (e.g., history and pedagogical contents), multimedia (e.g., images, sound and video) and learning potential, the last of which focuses on the pedagogical issues that can effectively assist language learning. However, researchers commonly overlook the potential of CALL games to provide the optimal experience (Csikszentmihalyi, 1993; Marques and Miranda, 2022b), which could improve the player's focus and satisfaction while learning the second language through the game.
Given the above, the objective of the present work is to present heuristics for the evaluation of JCALL games. The heuristics are conceptually based on two pillars: the dimensions of the Flow Theory (Csikszentmihalyi, 1990, 1993, 2000) and the components of Hubbard's framework (Hubbard, 1988, 2006, 2011). The premise of this work is that evaluating JCALL educational games from the flow perspective allows the games to be assessed in terms of the optimal experience they provide, and of whether they are capable of inducing optimal levels of concentration and satisfaction, thereby contributing to playful and effective learning of foreign languages. The proposed heuristics are also evaluated in the context for which they were conceived, that is, aiming to support the evaluation of

Related work
Related work includes studies that propose heuristics, conceptually based on flow dimensions, to: (i) evaluate different perspectives of games with specifically educational purposes; or (ii) evaluate CALL tools. Rêgo and Medeiros (2015) presented a set of 16 heuristics for the evaluation of m-Learning games in terms of usability and game experience, based on the Heuristic Evaluation for Playability (HEP) (Desurvire et al., 2004), on GameFlow (heuristics for the evaluation of flow in entertainment games) (Sweetser and Wyeth, 2005), and on Criteria for Designing Educational Computer Games (Whitton, 2009). The heuristics were evaluated with five games, selected from academic projects by the authors themselves. Although the heuristics are based on Flow Theory, they involve only three of the nine flow dimensions. Also, the study did not present how the heuristics were instantiated in questions, nor did it detail the evaluation of the games according to each heuristic. Mohamed et al. (2010) present heuristics for evaluating aspects of computer games for educational purposes. The heuristics evaluate playability, usability, interface, content, multimedia and educational/pedagogical aspects. Although some heuristics are related to the dimensions of Flow Theory (e.g., "clear goals and learning objectives", "clear and understandable content structure", and "challenge offered is in line with user standards"), the heuristics do not cover all flow dimensions. Also, the applicability of the proposed heuristics was not demonstrated. Ishaq et al. (2021) proposed heuristics for evaluating serious games for language learning, considering interface, gameplay, feedback, content (e.g., multimedia elements, and alignment of questions in exercises), teaching effectiveness, learnability, satisfaction, and cultural contexts (i.e., figures with cultural contexts, to make learners more engaged in learning the language). As in the study of Mohamed et al. (2010), some heuristics resemble flow dimensions (e.g., "the game provides instantaneous progress feedback", and "the game has clear game goals, teaching objectives and structure"), but this set of heuristics seems to disregard other flow dimensions.
Moreno Fuentes et al. (2018) proposed a model for evaluating websites that teach English as a second language. The model considers aspects such as usability, ergonomics, and linguistic and pedagogical points of view. The model was instantiated in a checklist, which contains items related to some flow dimensions, such as adequacy to the level (adaptation to the student's abilities), personalized guidance (ability of the site to adapt according to the difficulties and preferences of the student), and sense of control (i.e., allowing the learner to customize the site according to their preferences). However, the checklist items do not seem to cover all nine flow dimensions. Also, despite mentioning validation with specialists, the study did not demonstrate the applicability of the instrument with websites for teaching foreign languages. Zaibon and Shiratuddin (2010) proposed heuristics to evaluate educational games for mobile devices. The heuristics are divided into four components: usability (related to interface and game controls), mobility (player's ease of immersion into the game world, and the ability to play the game anytime and anywhere), game play (how consistently and logically the game environment is presented, and how meaningful and not boring it is to the player), and learning content. The heuristics were instantiated in items of a questionnaire, applied to the evaluation of the game called 1M'sia, which was designed to teach cultural elements and to promote values such as unity, humility and acceptance among all ethnic groups in Malaysia. Some heuristics relate to flow dimensions (e.g., "the game provides clear goals or supports player-created goals", "player is in control", and "challenge, strategy and pace are in balance"), but the heuristics do not involve all flow dimensions and were employed in the evaluation of just one game.
The related works demonstrate the interest of different researchers in proposing heuristics to evaluate educational games and CALL tools. However, we understand that these works have limitations: they do not cover all flow dimensions and, therefore, may yield a compromised evaluation of the optimal experience. The present work thus proposes and employs heuristics for evaluating educational games for teaching Japanese as a second language, based on the Theory of Flow and its nine dimensions, and also on a framework for evaluating CALL tools, as shown below.

The heuristics
This section presents the set of heuristics developed for the evaluation of JCALL games. First, the theoretical foundation on which the heuristics were based is introduced. Then, the heuristics are presented and, finally, the methodology used to elaborate them is described.
In his search for what makes a satisfying experience, Csikszentmihalyi (1993) interviewed thousands of people with different profiles who practiced different activities without the intention of receiving any monetary reward. In the synthesis of his studies, Csikszentmihalyi classified a satisfactory experience as the "optimal experience", or "flow". He describes flow as a psychological state of optimal experience, resulting from an activity in which the person feels he is being challenged to his limits, but continues to perform the activity for the pure satisfaction of conducting it. During the activity, the person develops skills to deal with the task at hand, and the challenges presented to him keep growing at the pace of those skills. When people come out of the flow state and reflect on the experience, they feel that they have grown in knowledge and skills in order to deal with the challenges faced, and that they had a great experience while growing. That feeling motivates them to do the task again, hoping to return to the optimal experience.
Activities structured to induce flow offer a system of gradually increasing challenges, capable of accommodating each individual's continuous and deepening satisfaction as their skills grow (Csikszentmihalyi and Csikszentmihalyi, 2006; Nakamura et al., 2002). They also have clear and achievable goals (Csikszentmihalyi, 1988) and allow the person to gain control (i.e., during the activity, the person feels empowered enough to deal with the situation at hand and any other events arising from it) (Csikszentmihalyi, 2020). In addition, they provide clear feedback, and facilitate concentration and engagement while making the practice of the activity as distinct as possible from everyday reality (Csikszentmihalyi, 2013). Games, in general, should present the features described above, which, in turn, highlights the importance of evaluating JCALL games from the perspective of flow and its dimensions. Csikszentmihalyi (1990) describes flow theory in terms of nine dimensions, briefly described in Table 1.
In addition to the Flow Theory, the present work is also based on the framework for quantitative and qualitative language learning software evaluation proposed by Hubbard (1988, 2006, 2011). Hubbard initially proposed a framework for evaluating CALL software designed to be used as support material in language teaching courses, known at the time as courseware (Hubbard, 1988). Later on, Hubbard proposed the use of the framework for evaluating CALL tools in general (Hubbard, 2006, 2011). This framework has also been adopted in the literature for other related purposes, such as analysing English language learning mobile applications (Kim and Kwon, 2012). The framework is based on six core components, briefly described in Table 2.

The set of heuristics
In total, 32 heuristics were developed to support the evaluation of JCALL educational games. The heuristics, detailed in Table 3, are grouped by the nine flow dimensions. For each heuristic, the table lists the Hubbard framework component(s) it is associated with, as well as the description of the heuristic itself. Finally, it cites references from the CALL and MALL literature that defend and/or discuss concepts related to the heuristic in question.
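To illustrate the structure just described, the following minimal sketch represents a heuristic as a record holding its number, flow dimension, associated Hubbard component(s), and description. The class `Heuristic`, its field names, and the helpers `group_by_dimension` and `heuristics_for_component` are hypothetical names introduced only for illustration; the four sample entries are abridged from Table 3 and are not the full set.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Heuristic:
    """One row of the heuristic set (hypothetical representation)."""
    number: int
    flow_dimension: str
    hubbard_components: Tuple[str, ...]
    description: str


# Abridged sample taken from Table 3; descriptions are shortened.
SAMPLE_HEURISTICS = [
    Heuristic(1, "Challenge-skill balance", ("Student fit",),
              "Present adequate levels of new and revised content per phase."),
    Heuristic(2, "Challenge-skill balance", ("Operational description",),
              "Provide new experiences when redoing game exercises."),
    Heuristic(8, "Unambiguous feedback",
              ("Operational description", "Teacher fit"),
              "Explain errors and give tips for memorizing the right answer."),
    Heuristic(17, "Feeling of control", ("Technical preview",),
              "Target an operating system and hardware available to students."),
]


def group_by_dimension(heuristics: List[Heuristic]) -> Dict[str, List[int]]:
    """Group heuristic numbers under their flow dimension."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for h in heuristics:
        groups[h.flow_dimension].append(h.number)
    return dict(groups)


def heuristics_for_component(heuristics: List[Heuristic],
                             component: str) -> List[int]:
    """List the heuristic numbers associated with one Hubbard component."""
    return [h.number for h in heuristics if component in h.hubbard_components]


if __name__ == "__main__":
    print(group_by_dimension(SAMPLE_HEURISTICS))
    print(heuristics_for_component(SAMPLE_HEURISTICS, "Teacher fit"))
```

Keeping the components as a tuple reflects that a single heuristic may be associated with more than one component, as heuristic 8 is.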

Development methodology
In the first stage, a theoretical-conceptual analysis was carried out of the principal works in the literature related to the Flow Theory (Csikszentmihalyi, 1990, 1993, 2000), since this theory constitutes an important theoretical reference of the present work. Next, Hubbard's framework (Hubbard, 1988, 2006, 2011) was studied, so that the heuristics could also be based on the components described in this other important basis of the present work. The six components of the framework were analysed against the nine dimensions of the Flow Theory, in order to trace a relationship between the theory and the framework, so that the proposed heuristics would have both references as a basis and contribute to the evaluation of CALL games from the perspective of both gaming experience and learning potential.
In the second stage, candidate heuristics began to be elaborated for the context of educational games aimed at language learning. To give these candidate heuristics a stronger basis, it was decided to also revisit works in the literature on CALL and MALL (e.g., Ciampa, 2014; Kacetl and Klímová, 2019; Kukulska-Hulme and Traxler, 2013; Godwin-Jones, 2014; Traxler, 2009).
In the third stage, an initial set of heuristics was defined and subsequently reviewed among the researchers. Based on the discussions held by the authors, the heuristics were refined until no new changes were necessary. At the end of this process, the final set of heuristics had been formalized, as described in the previous subsection. Figure 1 presents a diagram illustrating the heuristic development process. The next step was to apply the formalized heuristics in practice; this step is described below.

Evaluation
In order to verify the usefulness of the heuristics developed in the present work, this section will present the evaluation carried out with different JCALL educational games.

Materials and procedures
For the evaluation, studies were first selected from the literature, focusing on papers that present educational JCALL games available for free download via the Internet (e.g., in commercial game stores or web repositories), so that the games could be inspected by running them on the researcher's computer or device.
The main focus was to select JCALL games, since the heuristics were originally developed for this context (Marques and Miranda, 2022a). However, in order to verify whether the proposed heuristics could also be used to evaluate other types of JCALL software, a non-game JCALL tool was also selected. We also sought to use tools with different characteristics. Thus, four JCALL games were selected, that is, Sumo Sensei (Marques et al., 2015b), Karuta Kanji (Marques et al., 2015a), Karuchā Ships Invaders (Marciano et al., 2013b) and Katakana Star Samurai (Marciano et al., 2015), as well as the non-game Kanji JLPT N5 tool (Haristiani and Firmansyah, 2016).
The process of evaluating the games, and also the non-game tool, involved a specialist with the appropriate profile to participate in this study, that is, with varied knowledge, skills and competences, as described below. Since the evaluation involved JCALL games, it was a basic condition that the specialist be proficient in the Japanese language. From the descriptions found in the papers on the tools to be tested in this study, it was found that the games and the non-game tool exercise different scripts of the Japanese language, that is, hiragana (in Karuchā Ships Invaders) and katakana (in Katakana Star Samurai), in addition to kanji and vocabulary from JLPT N5 (in Sumo Sensei and in Kanji JLPT N5), and kanji and vocabulary from JLPT N4 (in Karuta Kanji). Thus, the evaluation had to rely on a specialist with Japanese proficiency capable of understanding all these subjects. Also, given the relationship between the proposed heuristics and design issues, a specialist with knowledge of Human-Computer Interaction was sought. Finally, given that the heuristics were designed to evaluate CALL games, it was also desired that the specialist have previous experience with the development of CALL games.

| Dimension name | Dimension summary |
|---|---|
| Challenge-skill balance | During flow, the person feels that the challenge proposed by the activity at hand is manageable with the skills he already has, that is, the proposed challenge is in balance with his skills. |
| Merging of action and awareness | During flow, the person becomes so involved in the activity that it becomes spontaneous, almost automatic. The concentration required to conduct the activity is perceived as effortless. |
| Clear goals | The challenges proposed by the activity at hand are perceived as achievable and intuitive, so that the person in flow knows how to reach the goal without too much difficulty. |
| Unambiguous feedback | During the entire execution of the activity, the person in flow knows how he is doing in achieving the objectives, and has the perception that the actions he performs are contributing to dealing with the challenge. |
| Concentration on the task at hand | The person is fully focused on the task at hand, so that concerns and anxieties unrelated to task goals are temporarily inhibited, and all focus is directed at what is relevant to the task at hand. |
| Feeling of control | The person feels that, with the skills he has and develops throughout the activity, the margin of error is as close to zero as possible. This dimension is also related to the feeling that the decisions taken during the task are relevant to the results achieved. |
| Loss of self-consciousness | All concerns about the way the person presents himself to other people are temporarily inhibited, so that the person may expand his concept of self through the activity. |
| Time distortion | During flow, there is the feeling that time does not pass in the way it normally does, and the passage of time becomes irrelevant to the rhythms dictated by the activity. |
| Autotelic experience | Feeling that carrying out the activity at hand is a reward in itself, and that personal skill growth during the activity is more important than success or failure in the activity. |

Table 2. The six components of Hubbard's framework for evaluating CALL tools.

| Component name | Component summary |
|---|---|
| Technical preview | Make sure the software will run on equipment accessible to students, and that teaching materials are accessible to learners. |
| Operational description | Seek to understand how the software works from a user's point of view, in order to evaluate the flow of lessons, approaches adopted for teaching and reviewing content, and how activities are presented (e.g., screen layout, user input, feedback, exercise response time, and help options). |
| Teacher fit | Infer and evaluate the teaching approach adopted by the software. It is important to observe whether the approach used by the tool is compatible with the approach adopted by teachers in the classroom, whether the tool exercises content in a contextualized way in real-life scenarios, and whether it offers succinct explanations of why answers were wrong or correct. |
| Student fit | Observe how well the software fits the student's study interests and preferences, and how well the taught content fits the learner's needs, as well as motivates and encourages him to think on his own. |
| Implementation schemes | Reflect on how the software can be integrated into the classroom as teaching material for a course or regular curriculum, including reflecting on how long this process takes. Issues to consider include accessibility, preparatory activities (e.g., whether a class is required to learn the content before using the tool), and various teacher control variables (e.g., classroom management, monitoring student performance, and monitoring taught content). |
| Appropriateness judgements | Evaluate software suitability based on quality, suitability for the teacher, and also suitability for the student, in addition to cost-benefit considerations. |

Table 3. The 32 heuristics for evaluating educational games aimed at learning Japanese as a second language.

| Flow Theory dimension | Hubbard's Framework component(s) | # | Heuristic | References |
|---|---|---|---|---|
| Challenge-skill balance | Student fit | 1 | In each game phase, present adequate levels of introduced learning content (e.g., vocabulary, expressions, phrases, and grammar) and also appropriate levels of revised content, so that the learner does not feel overwhelmed with new teaching content to memorize. | (Ciampa, 2014; Godwin-Jones, 2014) |
| Challenge-skill balance | Operational description | 2 | Provide new experiences when redoing game exercises (e.g., random events based on luck), in order to provide a new challenge to the learner while he reviews what he has previously studied. | (Macedonia, 2005; Xu et al., 2020) |
| Challenge-skill balance | Student fit | 3 | Create believable human-like behavior for non-gaming opponents, by adapting behavior according to the student's cognitive ability, and making mistakes similar to the learner's, in order to facilitate flow experiences. In the case of online student-to-student exercises, use complex algorithms to match learners of similar language proficiency levels. | (Ang and Zaphiris, 2008; Kacetl and Klímová, 2019; Kukulska-Hulme and Traxler, 2013; Macedonia, 2005; Godwin-Jones, 2014; Traxler, 2009; Xu et al., 2020) |
| Clear goals | Operational description | 4 | Present achievable objectives (in relation to the time spent to learn and exercise content in the second language). | (Kukulska-Hulme and Traxler, 2013; Traxler, 2009) |
| Clear goals | Student fit | 5 | Present attainable goals (in relation to difficulty), considering the student's level of proficiency in the language. | (Kukulska-Hulme and Traxler, 2013; Traxler, 2009) |
| Clear goals | Student fit | 6 | Present real contexts of use of the content taught, so that it is clear to the student that he is learning content involving real-world problems that are relevant and interesting to him. | (Butler et al., 2014; Kukulska-Hulme and Traxler, 2013; Macedonia, 2005; Godwin-Jones, 2014; Traxler, 2009) |
| Clear goals | Student fit | 7 | Present error feedback to the student in a positive way, so that he continues to believe that learning the second language is an achievable goal. | (Butler et al., 2014; Ciampa, 2014) |
| Unambiguous Feedback | Operational description, Teacher fit | 8 | When the student gets questions wrong, offer feedback that not only shows that the answer was wrong but also presents tips for memorizing the correct answer, and explanations (preferably bringing real contexts) that help in understanding the error and promote reasoning about it. | (Sykes, 2018; Xu et al., 2020) |
| Unambiguous Feedback | Student fit | 9 | During teaching, offer tips that help the student to remember the translation of terms in the second language. | (Xu et al., 2020) |
| Unambiguous Feedback | Student fit | 10 | Encourage the student to compose their own associations between foreign language words and their translations. It is also recommended to allow students to share the associations they create with each other, promoting cooperation in studies. | (Ciampa, 2014; Kacetl and Klímová, 2019; Godwin-Jones, 2014; Xu et al., 2020) |
| Unambiguous Feedback | Operational description | 11 | Avoid too much text. It is recommended to use other media for explanation, such as pictures, animations, and audio. | (Ciampa, 2014; Xu et al., 2020) |
| Merging of action and awareness | Operational description | 12 | Present simple game mechanics and objectives, in a way that allows playing the game to be spontaneous and automatic, while the educational content related to the player's tasks is consciously processed and reflected on. | (Macedonia, 2005; Godwin-Jones, 2014) |
| Merging of action and awareness | Operational description | 13 | Avoid an excess of commands available in menus and game actions available during the game, in order to facilitate the automation of game actions. | (Godwin-Jones, 2014) |
| Focus on the task at hand | Operational description | 14 | Present audiovisual elements in an attractive way, and contextualized with cultural elements of the second language. | (Butler et al., 2014; Ciampa, 2014; Godwin-Jones, 2014) |
| Focus on the task at hand | Operational description | 15 | Game elements cannot distract the player from his primary goal in using the tool, which is to learn the second language. | (Butler et al., 2014; Kacetl and Klímová, 2019) |
| Focus on the task at hand | Operational description | 16 | Present an engaging narrative with cultural elements from the country of origin of the second language. | (Ang and Zaphiris, 2008; Godwin-Jones, 2014; Sykes, 2018) |
| Feeling of control | Technical preview | 17 | Design the game for an operating system and hardware available to students. | (Kukulska-Hulme and Traxler, 2013; Traxler, 2009; Xu et al., 2020) |
| Feeling of control | Teacher fit | 18 | Allow the teacher to suggest new content lists for their students to exercise, and/or base content lists on didactic materials to be used in the classroom. | (Kacetl and Klímová, 2019) |
| Feeling of control | Student fit | 19 | Allow the student to customize the duration of exercise routines and content exercised in routines, so that the student can study at his own pace. | (Ciampa, 2014; Sykes, 2018; Traxler, 2009; Xu et al., 2020) |
| Feeling of control | Implementation schemes | 20 | Allow the teacher to have access to student performance data, in order to have control over which content his students are having more difficulty with. | |
| | | | Introduce cultural elements into the game environment so that the player feels "drawn" into places where the language he wants to learn is spoken. | (Kukulska-Hulme and Traxler, 2013) |
| | Teacher fit | 26 | Present exercise routines of moderate duration, in order to allow the game to be used in a few moments of class, but without taking up all of class time. | (Ciampa, 2014; Macedonia, 2005; Traxler, 2009) |
| | Operational description | 27 | Avoid showing time spent on exercises, unless voluntarily presented to the student (e.g., in an activity log). | (Godwin-Jones, 2014) |
| | Student fit | 28 | Do not force a fixed schedule for studying the language, so as not to impose a "mandatory schedule" of studies. | (Ciampa, 2014; Kukulska-Hulme and Traxler, 2013) |
| Autotelic experience | Operational description | 29 | Design the game with varied approaches to teaching and reviewing content in order to review content in a nonrepetitive manner. | |
| Autotelic experience | Operational description | 30 | Provide surprise elements (e.g., random situations involving luck in games) and unseen elements for the student (e.g., new dialogues of non-player characters, also known as NPCs), in order to make each game session unique. | (Butler et al., 2014; Xu et al., 2020) |
| Autotelic experience | Operational description | 31 | Propose game mechanics that train the student's knowledge playfully, so that the learner does not feel that he is in a learning environment but in a game environment, where learning occurs in the most unconscious way possible. | (Macedonia, 2005; Godwin-Jones, 2014; Xu et al., 2020) |
| Autotelic experience | Student fit | 32 | Offer breaks during language learning, offering game exercises where the player does not practice language training, but relaxes and saves energy for the next training, allowing for more extensive training sessions. | (Godwin-Jones, 2014) |
For the evaluation of each game, and also of the non-game tool, questions were first generated from the set of heuristics, with each heuristic being instantiated in one evaluation question. The specialist instantiated the heuristics in questions specific to the evaluation scenario of each game in this study. It should be noted that, in a heuristic-based evaluation, not all heuristics need to be used, since their applicability depends on the game being evaluated. However, the entire set of heuristics needs to be considered when formulating the questions. For the tools tested in this study, it was observed that two heuristics related to the autotelic experience (i.e., heuristics #30 and #31) did not apply to the evaluation of the non-game tool, and these were disregarded for that tool.
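The instantiation step can be sketched as follows, assuming that each applicable heuristic yields exactly one question per tool and that heuristics judged not applicable are skipped. The function `instantiate_questions`, the question template, and the abridged heuristic wordings are hypothetical and do not reproduce the questions actually used in the study; only the exclusion of heuristics #30 and #31 for the non-game Kanji JLPT N5 tool comes from the text.

```python
from typing import Dict, Set

# Abridged wordings of heuristics 29 to 32 (shortened for illustration).
HEURISTICS: Dict[int, str] = {
    29: "use varied approaches to teaching and reviewing content",
    30: "provide surprise elements and unseen elements for the student",
    31: "train the student's knowledge playfully through its mechanics",
    32: "offer breaks during language learning",
}

# Heuristics disregarded per tool; only this entry is stated in the text.
NOT_APPLICABLE: Dict[str, Set[int]] = {
    "Kanji JLPT N5": {30, 31},
}


def instantiate_questions(tool: str,
                          heuristics: Dict[int, str],
                          not_applicable: Dict[str, Set[int]]) -> Dict[int, str]:
    """Turn every applicable heuristic into one question for the given tool.

    The whole set is considered; only heuristics explicitly marked as not
    applicable to this tool are left out.
    """
    skipped = not_applicable.get(tool, set())
    return {
        number: f"[{tool}] Heuristic #{number}: does the tool {text}?"
        for number, text in sorted(heuristics.items())
        if number not in skipped
    }


if __name__ == "__main__":
    for tool in ("Sumo Sensei", "Kanji JLPT N5"):
        for question in instantiate_questions(tool, HEURISTICS, NOT_APPLICABLE).values():
            print(question)
```

Recording the exclusions per tool, instead of deleting heuristics from the set, keeps the full set visible whenever questions are formulated for a new tool.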
After instantiating the heuristics in questions, the latest stable versions of the games were installed. Because the selected games were designed for mobile devices and PCs, running on Android and Windows respectively, it was necessary to configure hardware able to fully execute the tests on both platforms. Katakana Star Samurai offers versions for Android and Windows, so the evaluation of the game alternated between both platforms. Also, it was observed that Sumo Sensei and Karuta Kanji offer two game modes involving online competition between two players, so two tablets were needed in order to fully evaluate the games. The tablets used were similar in model and operating system. Android app tests were conducted on two 7-inch Samsung Galaxy Tab 3 tablets running the Android 4.1 Jelly Bean operating system. Tests of Windows games were conducted on a computer with a 9th Gen Intel Core i7 processor running at 1.60 GHz, with 8 GB of RAM and the Windows 10 operating system, version 22H2.
After installing the games, pilot testing of the software began, in order to observe the content of the games, plan how to go through their content, and check for any difficulties before conducting the evaluations. For the tools that divide the content into lessons (i.e., Karuchā Ships Invaders, Katakana Star Samurai, and Kanji JLPT N5), it was decided that the assessment would run through all the lessons. Analogously, for games that divide the content not into lessons but into categories (i.e., Sumo Sensei and Karuta Kanji), it was decided that the evaluation would go through all categories. Games that offer a choice of difficulty level (i.e., Karuchā Ships Invaders and Katakana Star Samurai) were tested on medium and hard difficulty. The difficulty of the tested games does not change the revised content, only the reaction time available to answer. Since Katakana Star Samurai features both Windows and Android versions, the medium level was played on the Android version, while the hard level was played on the Windows version. Karuta Kanji offers a training mode of infinite duration; however, new words are added for the player to learn only up to level 33, so it was decided that the evaluation would advance up to level 34.
After the pilot test, the actual game evaluation step began. The specialist inspected the games based on the formulated questions, then synthesized the findings according to the answers to those questions. Next, the problems found in the games were described, considering the results of this analysis. Subsequently, the degree of severity of each problem was defined so that, finally, the main improvements could be proposed based on the problems detected and their degree of severity. The degree of severity attributed to the problems was decided based on a classification adapted from the scale proposed by Nielsen, which was originally designed to classify the severity of usability problems (Nielsen, 1992). The severity scale adapted for the context of this study, presented in Table 4, has a severity rating from zero to four, with a higher rating indicating a greater severity of the problem encountered. Figure 2 summarizes the procedural steps performed during the evaluation of the JCALL software, as detailed above.
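As a rough illustration of how the severity ratings could drive the ordering of proposed improvements, the sketch below encodes the zero-to-four scale and sorts detected problems from most to least severe. The names `Severity`, `Problem`, and `prioritize` are hypothetical, the member names paraphrase the Table 4 descriptions, and the example problems are invented placeholders, not findings from the study.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Severity(IntEnum):
    """Severity scale from zero to four; a higher value is more severe."""
    NOT_A_PROBLEM = 0
    COSMETIC = 1
    MINOR = 2
    MAJOR = 3
    CATASTROPHIC = 4


@dataclass
class Problem:
    heuristic: int       # number of the heuristic whose question exposed it
    description: str
    severity: Severity


def prioritize(problems: List[Problem]) -> List[Problem]:
    """Drop non-problems and order the rest by decreasing severity.

    Ties are broken by heuristic number so the ordering is deterministic.
    """
    actionable = [p for p in problems if p.severity > Severity.NOT_A_PROBLEM]
    return sorted(actionable, key=lambda p: (-p.severity, p.heuristic))


if __name__ == "__main__":
    # Invented placeholder problems, for illustration only.
    found = [
        Problem(11, "Long text-only explanations", Severity.MINOR),
        Problem(8, "Error feedback gives no explanation", Severity.MAJOR),
        Problem(14, "Icon slightly misaligned", Severity.COSMETIC),
        Problem(17, "Runs on the target devices", Severity.NOT_A_PROBLEM),
    ]
    for p in prioritize(found):
        print(f"{p.severity.value} ({p.severity.name}) heuristic #{p.heuristic}: {p.description}")
```

An integer-backed enumeration is used so that ratings compare and sort numerically while still carrying a readable label.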

Results
The results achieved with the evaluation of games and software aimed specifically at learning Japanese as a second language are presented below. Figure 3 illustrates a screen of each tested tool.

Sumo Sensei
Sumo Sensei (Marques et al., 2015b) is an educational game designed to support the study of kanji (part of the Japanese alphabet) used in the most basic level of the Japanese Language Proficiency Test (JLPT), that is, the N5. The game, available exclusively for Android, is set in the context of a sumo fight, and offers three game modes: individual training (called "teppo"), casual competitive (where the player selects competitors to play online matches), and competition (where the player competes online against other players for positions in a global ranking, and can win titles). The focus of the game is on online competition between players, where players control sumo wrestlers and must push opponents out of the ring by hitting the correct translation of a combination of kanji that compose the test's vocabulary. Version 0.1.13-beta for Android was tested on two tablets with the configurations specified in Section 4.1. To test all the game modes, a total of around five hours were dedicated, with test sessions of approximately 1.2 hours on different days.
Table 5 presents the questions formulated from the heuristics for the evaluation of Sumo Sensei. Then, the educational game was inspected in order to answer these questions, also presented in this table. Finally, a degree of severity in dealing with observed problems was proposed, as specified in Table 4.

Table 4. Severity scale adapted from Nielsen (1992).
Rating  Description
0  I don't agree that this is a problem at all.
1  Cosmetic problem only. Does not need to be fixed unless extra time is available on the project.
2  Minor problem. Fixing this should be given low priority.
3  Major problem. Important to fix, so it should be given high priority.
4  Catastrophic problem. It is imperative to fix this as soon as possible.

Table 5. Questions and answers of the Sumo Sensei evaluation.
    The game presents a gameplay that uses touch controls in a simple and intuitive way (i.e., the player touches the option he considers correct, and, if it is correct, it automatically pushes the opponent one step closer to the outside of the tatami).
Focus on the task at hand
14. Are audiovisual elements contextualized with cultural elements of Japan?
    Audiovisual elements are contextualized in sumo, which is a sport of Japanese origin.
15. Can game elements distract the student from the task of learning?
    No.
16. Does it present an engaging narrative with cultural elements from Japan?
    The game does not feature a story.
    All activities in the game involve practicing the Japanese language without interruptions.
Sumo Sensei stands out for allowing competition between players, with matchmaking algorithms to pair players of similar skill levels.All game modes are set in the context of sumo fights, which contributes to the player's greater immersion in Japanese culture.Finally, matches between players have game items capable of providing random events in matches, which contributes to providing new game experiences with each training session.
Among improvements for the game, a better division of the revised content is suggested, so that the student does not feel overloaded with too much vocabulary to learn in a single game session. The division should provide a gradually increasing difficulty of content and, preferably, should follow a teaching order indicated by teachers of Japanese for foreigners, or by books and workbooks for JLPT N5 training. Allowing players to customize the number of terms they review in each training session is also preferable. It is also recommended to bring examples of real contexts of use for the learned vocabulary. In addition, it is recommended to offer breaks during exercise routines, for example, through the introduction of non-teaching minigames, or articles and curiosities about Japanese culture or sumo. In this way, the student rests during kanji revision routines and, possibly, would be able to continue training sessions for longer periods of time.
Finally, a greater variety of Sumo Sensei exercises is suggested, in order to provide new game experiences during training.Offering a greater variety of in-game items during player-to-player matches is a good start, given that the game only features four items, and more items make for more varied random events in matches.However, other types of exercises would be recommended (e.g., exercises that make the student fill a sentence or word with correct kanji, set in sumo practice).

Karuta Kanji
Karuta Kanji (Marques et al., 2015a) is another JCALL game designed to support the study of the kanji presented in the second most basic level of the proficiency test (JLPT N4). The game, available exclusively for Android, is set in a popular Japanese card game called Karuta, which involves being able to accurately and quickly determine which card, out of a series of cards, correctly extends a Japanese poem recited before each turn, and picking up the card before your opponent (Bull, 1996). In Karuta Kanji, poems are replaced by words written in kanji, and the player must quickly select the correct translation among the offered cards. Three game modes are offered: individual training ("renshuu"), casual competitive (where the player selects competitors to play online matches), and competition (where the player competes online against other players for positions in a global ranking, and can win titles). The focus of the game is on online competition between players, where players compete in agility and accuracy, choosing cards that translate kanji exercised in JLPT N4. Version 0.1.13-beta for Android was tested on two tablets with the configurations specified in Section 4.1. A total of around six hours were dedicated to testing all game modes, with daily sessions lasting around one hour.
The evaluation was conducted analogously to the evaluation of Sumo Sensei. Table 6 presents the elaborated questions, which are based on the heuristics for the evaluation of the Karuta Kanji game, the answers to the posed questions, and the degree of severity in dealing with problems encountered (cf. Table 4). Karuta Kanji deserves to be highlighted for the gradually incremental challenge of individual training, for setting the game in a Japanese sport, for online competition between players, with matchmaking algorithms capable of pairing learners of similar skill levels, and for the ingenious randomization of a variety of items during competition between players, in order to provide new gameplay experiences.
For Karuta Kanji, improvements similar to those suggested for Sumo Sensei (see Section 4.2.1) are recommended.Karuta Kanji handles some of the suggested improvements for Sumo Sensei: the division of revised content is better, given that each game level introduces only four new words to the learner; and there is a greater variety of items to be used in online modes.However, as with Sumo Sensei, it is suggested to bring examples of real context of use for the learned vocabulary, breaks during exercise routines, and more varied exercises.Also, exclusively for Karuta Kanji, it is recommended to use time limits during training sessions, in order to make the time spent on exercises more manageable.

Karuchā Ships Invaders
Karuchā Ships Invaders (Marciano et al., 2013b) is a Missile Command-style game, focusing on the learning of hiragana, which is part of the Japanese alphabet, simpler than kanji, and usually taught before kanji in Japanese as a second language courses. The game, available for the Windows operating system, involves reading the hiragana of the spaceships and typing the character reading in romaji (i.e., phonetic transcription of the Japanese language into the Latin alphabet) before the ships "collide" on the scenery terrain. In the game's story, the characters Alex and Ana prepare the player to welcome their Japanese friends, through the teaching of hiragana and elements of Japanese culture (e.g., sushi, shitake mushrooms, and sake). Version 0.3.37-stable for Windows was tested, on a computer with the specifications described in Section 4.1. To test all 30 game levels, on medium and hard difficulty modes, a total of around five hours were dedicated, with sessions of 1.2 hours per day.
Table 7 presents the questions prepared for the evaluation of the Karuchā Ships Invaders game, based on the heuristics presented, the answers to the questions, and the degree of severity in dealing with problems encountered (cf. Table 4).
Karuchā Ships Invaders stands out for its engaging and playful gameplay, based on the popular game Missile Command, and involving typing exercises in the style of Pokémon Typing Adventure and The Typing of the Dead. Also, learning cultural elements alongside the hiragana brings variety to the taught content. Finally, the content is divided into levels, allowing for a gradual study of hiragana. Phases are short and simple, which suits brief review sessions.
Among the main recommendations for the game, a greater variety of game modes, more unique exercises, and/or the introduction of random game events are suggested, in order to provide different game experiences to learners. One suggestion would be to introduce levels in which there are rows of ships with a single hiragana gradually falling, with movement similar to the aliens in the popular game Space Invaders. The game mechanics would remain unchanged and only the movement of the enemies would change, which makes the proposal possibly viable. Similar to Sumo Sensei and Karuta Kanji (see Sections 4.2.1 and 4.2.2), it is recommended to introduce some breaks during hiragana training, which could be done by presenting some curiosities about Japanese culture, for example. The game's mascots could be used for these moments, which could also present contexts of use, in words and phrases, for the revised hiragana and cultural elements.
Although it is not specified whether the game's lessons follow a learning order proposed by books or handouts, learning the 46 hiragana is relatively easy (Kuhara-Kojima et al., 1996) and, therefore, the order in which each character is presented possibly does not interfere with learning. The tool was even evaluated in a classroom context (Marciano et al., 2016). However, it was observed that some terms taught from Japanese culture are not very present in cultures outside Japan (e.g., "uchikake", which is a type of kimono; "tabi", which are Japanese socks that separate the big toe from the other toes; and "warashi", which are spiritual beings present in Japanese folklore) and could therefore be considered complex terms for students who are learning the most basic alphabet of the language, as observed in the classroom assessment. A re-evaluation of the level of difficulty of the terms used in the game is recommended.

Table 6. Questions and answers of the Karuta Kanji evaluation.
    Practiced content is divided into categories, and new vocabulary for the player is gradually introduced in groups of four words throughout the levels of the game.
2. Does it introduce new content to gameplay when the student is revisiting game levels?
    Competitive modes offer nine in-game items to be used during a match, randomly offered during a duel. The use of these items brings randomness to the matches, providing new experiences.
3. Does it use algorithms to balance matches against AI and against players?
    In player versus player matches, players can duel against opponents of similar ranks in the game's global ranking, calculated from the number of wins and losses of players. The training mode does not offer matches against AI, but it presents a challenge with a gradual increase in difficulty, and allows the player to train the kanji that he misses most often.
Clear goals
4. Is the time to learn and practice kanji affordable?
    Competitions between players have a fixed time of 90 seconds. However, training mode ends only when the player loses four life points, which can take a while.
5. Are the exercises adapted according to the student's main difficulties, in order to offer training with achievable objectives at the student's level of proficiency?
    In training mode, players can choose to exercise the kanji most often mistaken in previous exercises, in order to work on the major difficulties. In competition mode, players are matched against opponents according to the number of wins and losses, possibly pairing players of similar proficiency levels.
6. Does it teach the content in real contexts of use?
    Kanji are exercised by composing vocabulary; however, phrases and vocabulary application contexts are not taught.
7. Is error feedback presented in a positive way?
    The end of practice message is either neutral (e.g., "Shall we try again?") or in a lightly laid-back tone (e.g., "kanji doesn't equal chicken soup" and "this game is too easy for you").
Unambiguous feedback
8. Does it present error feedback that helps understanding the error?
    At the end of the exercises, it presents a list of misspelled words with correct translations and the number of mistakes made in each word.
9. Does it offer tips to help memorize kanji?
    It does not offer tips for memorizing the terms studied.
10. Does it encourage the player to compose their own associations between kanji and their translations?
    It does not explicitly encourage the creation of memorization hints.
11. Does it avoid too much text?
    During exercises, it presents animations and sound effects to show errors and successes. After exercises, errors are reviewed in a short and straightforward list.
Merging of action and awareness
12. Are game mechanics and objectives simple?
    Game mechanics are simple (hit the correct word translation to win) and game objectives are intuitive and based on the Japanese karuta card game.
13. Does it avoid excessive game commands and actions?
    The alternative selection mechanic is done through touch controls, in a simple and intuitive way (i.e., the player taps the cards to select the option he considers correct).
Focus on the task at hand
14. Are audiovisual elements contextualized with cultural elements of Japan?
    Audiovisual elements are contextualized in karuta, which is a game of Japanese origin.
15. Can game elements distract the student from the task of learning?
    No.
16. Does it present an engaging narrative with cultural elements from Japan?
    The game does not feature a story.
    Yes, based on karuta card game.
24. Is the student's self-esteem impaired at some point?
    The lightly laid-back tone of feedback after exercises can be interpreted as detrimental to self-esteem by some students (e.g., feedback "this game is too easy" can be misinterpreted by students who have difficulties in the game).
Time distortion
25. Are time-limited challenges proposed in moderate doses?
    Individual training mode has no time limit, while online matches have a time limit of 90 seconds.
26. Does it present exercise routines of moderate duration in order to be adopted in the classroom?
    Individual training mode is infinite, and its application in the classroom can take too long.
    All activities in the game involve practicing the Japanese language without interruptions.

Table 7. Questions and answers of the Karuchā Ships Invaders evaluation.
    Content is divided into 30 lessons, with each lesson involving review of just one family of hiragana (which are denominated "gyo"), and some cultural elements. Learning content for each lesson can be consulted at any time with a help menu.
2. Does it introduce new content to gameplay when the student is revisiting game levels?
    No unprecedented aspects of gameplay are introduced over the course of reviewed exercises.
3. Does it use algorithms to balance matches against AI and against players?
    The training does not offer matches against AI, and does not adapt the difficulty according to the student's performance.
Clear goals
4. Is the time to learn and practice hiragana affordable?
    All lessons have a fixed time of 60 seconds.
5. Are the exercises adapted according to the student's main difficulties, in order to offer training with achievable objectives at the student's level of proficiency?
    Lessons of gradually incremental difficulty are introduced; however, the exercises do not adapt according to the difficulties of the learners.
6. Does it teach the content in real contexts of use?
    Review hiragana without composing words, plus some words from Japanese culture. Phrases and application contexts of hiragana are not presented.
7. Is error feedback presented in a positive way?
    Failure in lessons is responded to with "Level failed! Try again!" and the game mascot with a sad expression.
Unambiguous feedback
8. Does it present error feedback that helps understanding the error?
    At the end of the exercises, it presents a list of wrong hiragana with the respective correct translations in romaji.
9. Does it offer tips to help memorize hiragana?
    It does not offer tips for memorizing the content studied.
10. Does it encourage the player to compose their own associations between hiragana and their translations?
    It does not explicitly encourage the creation of memorization hints.
11. Does it avoid too much text?
    Features animations and sound effects to indicate successes and failures.
Merging of action and awareness
12. Are game mechanics and objectives simple?
    Game mechanics are simple and based on the popular Missile Command game, with typing exercises.
13. Does it avoid excessive game commands and actions?
    The game is done through typing exercises, in a simple and direct way (type the hiragana translation on the screen to prevent the ships from colliding).
Focus on the task at hand
14. Are audiovisual elements contextualized with cultural elements of Japan?
    The game's mascots are Japanese, and elements that are part of Japanese culture are occasionally exercised.
15. Can game elements distract the student from the task of learning?
    No.
16. Does it present an engaging narrative with cultural elements from Japan?
    The game presents an initial story that explains the objective of the game and features Japanese mascots, but it does not develop the narrative in other moments.
    All activities in the game involve practicing the Japanese language without interruptions.

Katakana Star Samurai
Katakana Star Samurai (Marciano et al., 2015), available for Windows and Android, is a game in the style of Missile Command (in terms of gameplay) and Asteroids (in regard to the movement pattern of obstacles), aimed at studying katakana, which is part of the Japanese alphabet, and commonly used to write words of foreign origin that were adopted into the Japanese language (i.e., words that came from languages other than Japanese) and onomatopoeia or slang (Carson, 1992; Samimy, 1994). The gameplay of Katakana Star Samurai consists of protecting a spaceship from invading ships, by selecting the translation of the katakana present on the invading ships. Version 0.1.8-beta for Windows and version 0.1.2-beta for Android were tested on the hardware described in Section 4.1. To test all 25 game levels, in medium and hard difficulty modes, about five hours were dedicated, with daily sessions of one hour.
Table 8 presents the questions prepared for the evaluation of the Katakana Star Samurai game, based on the heuristics presented, the answers to the questions, and the degree of severity in dealing with problems encountered (cf. Table 4).
Katakana Star Samurai stands out for its playful game mechanics, based on the Missile Command game. It is also possible to train katakana based on romaji or hiragana, allowing learners to review hiragana alongside katakana, or learn katakana without ever studying hiragana. The content is also divided into game phases, allowing a gradual study of katakana. Although the game's lessons are not based on an order followed by books or handouts, in katakana the relationship between written symbols and pronounced syllables is quite simple, which makes learning katakana, as well as hiragana, relatively easy (Sakamoto, 1976). Therefore, the order in which each character is presented possibly does not interfere with learning.
Among the main recommendations, it is suggested to correct a bug that often occurs during matches: the response options can change abruptly during a game level, which can confuse the learner and make it impossible for him to destroy some enemy ships. Also, as with Karuchā Ships Invaders (see Section 4.2.3), it is important to introduce more game modes, more unique exercises, and/or random game events. A viable solution would be the introduction of levels with enemy ships featuring some different movement or game mechanics, such as a ship that, when the katakana translation is correct, offers powers to the player's ship, such as invincibility or the ability to explode all on-screen ships. It is also recommended to introduce activities not intended to exercise katakana during some breaks, for example, introducing enemies that the player must just shoot with correct timing, without exercising katakana. Introducing breaks during katakana training allows the learner to rest the mind and thus continue exercise sessions for longer periods of time.

Kanji JLPT N5
Kanji JLPT N5 (Haristiani and Firmansyah, 2016) is an Android mobile app aimed at teaching basic level JLPT kanji to Indonesian students. The app is equipped with some features such as Indonesian translation (Indonesian kanji meaning and vocabulary), vocabulary examples and the assessment of kanji knowledge through quiz. The app offers two modes: a study mode, where kanji are introduced in lessons of increasing difficulty; and a quiz mode, for reviewing the kanji taught. Version 1.02 for Android was tested on a tablet described in Section 4.1. To test all the game lessons, in medium and hard difficulty modes, a total of about five hours were dedicated, with sessions of 1.2 hours per day.
Table 9 presents the questions prepared for the evaluation of Kanji JLPT N5, based on the heuristics presented, the answers to the questions, and the degree of severity in dealing with problems encountered (cf. Table 4). It is important to point out that, unlike the previously presented case studies, Kanji JLPT N5 is not a game; therefore, the formulated questions had to be adapted for a non-game context.
Kanji JLPT N5 is a relatively simple software tool for learning kanji. Kanji learning sessions are brief and straightforward, and the exercises involve only quizzes. It is not specified whether the exercises keep a history of the learner's activities, and the tool does not adapt according to the student's difficulties. The kanji are taught with examples of vocabulary use, but the quizzes do not exercise them in vocabulary. The availability of more unique exercises, customization of time limits and number of questions in exercises, brief breaks during kanji training, feedback that better explains the mistakes made by the learner after exercise sessions, availability of kanji memorization tips (the use of mnemonics, such as the ones in the book Kanji Pict-o-Graphix (Rowley, 1992), is recommended), and the use of audiovisual elements set in Japanese culture are suggested improvements for the tool.

Table 8. Questions and answers of the Katakana Star Samurai evaluation.
    The interaction is done through touch controls, or click with the mouse in the case of the computer version, in a simple and direct way (i.e., select the correct katakana translation to prevent the ships from colliding).
Focus on the task at hand
14. Are audiovisual elements contextualized with cultural elements of Japan?
    No.
15. Can game elements distract the student from the task of learning?
    No.
16. Does it present an engaging narrative with cultural elements from Japan?
    The game does not present a story to the player.

Discussion
Evaluating Japanese CALL educational games from the combined perspective of a framework for evaluating tools aimed at language teaching and the dimensions of the flow experience helps to detect problems and issues related to language learning and the optimal gaming experience. It thus contributes to assessing how well JCALL games induce students to states of high satisfaction and concentration, while providing effective foreign language learning. This paper brings a set of 32 heuristics that cover all nine dimensions of flow, in contrast to the heuristics proposed in the literature, described in Section 2, which explore only some of these dimensions. Although the central focus of this study was to analyse the applicability of the heuristics in the context of JCALL games, the evaluation of a non-game JCALL tool showed that the proposed heuristics seem to be generic enough for the evaluation of non-game tools.
We believe the heuristics presented in the current work can contribute in different ways to the work of designers, developers, and researchers of educational technologies aimed at language teaching.Designers and developers can base themselves on heuristics to raise questions and decisions to be taken in the design of CALL tools that contribute to a playful and effective learning of a second language.As for researchers, it is possible that the heuristics will inspire them to develop new instruments for evaluating foreign language learning games.
When conducting evaluations, it is necessary to instantiate the heuristics in questions related to the context of the tool to be analysed. The evaluation carried out in the present work allowed us to observe that some heuristics can be instantiated in questions that are generic enough for the evaluation of any CALL tool. For example, heuristic #1, referring to the balanced presentation of new content or content to be revised, was instantiated in a question applicable to different analysed games (i.e., "Is content taught divided into levels/phases of gradually incremental difficulty, and with appropriate levels of introduced or revised content?"). However, some heuristics must be instantiated in specific questions related to the context of each particular tool. For example, heuristic #30, referring to introducing surprise elements involving luck during gambling, would hardly apply in a context other than games.
Game evaluations exemplified how the proposed heuristics can be applied to identify problems and issues related to teaching and enjoying CALL games. It is important to highlight that, in the evaluations carried out involving JCALL games (Sections 4.2.1 to 4.2.4), all 32 heuristics were instantiated in questions. Note, however, that it is not necessary to elaborate a question for each heuristic: the choice of which heuristics should be considered in the evaluation depends on the characteristics of the CALL tool to be analysed. For example, in the evaluation of the Kanji JLPT N5 tool (Section 4.2.5), heuristic #16, referring to the presentation of an engaging narrative, was disregarded since the software is not a game and, therefore, does not present a narrative. Heuristics #30 and #31 were also disregarded since they would hardly apply to it either. All other heuristics proved useful in the evaluation of this non-game CALL tool, which suggests that the defined heuristics are generic enough to be used in evaluations of other types of non-game CALL tools.
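The selection step described above can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the function name `applicable_heuristics` and the constant `GAME_SPECIFIC` are hypothetical, and treating heuristics #16, #30, and #31 as inapplicable to every non-game tool generalizes from the single non-game case evaluated here (Kanji JLPT N5). In practice, the evaluator decides which heuristics to instantiate for each tool.

```python
from typing import List

TOTAL_HEURISTICS = 32

# Heuristics disregarded in the Kanji JLPT N5 evaluation. Assuming the same
# set applies to other non-game tools is an illustrative simplification.
GAME_SPECIFIC = frozenset({16, 30, 31})


def applicable_heuristics(is_game: bool) -> List[int]:
    """Return the heuristic numbers to instantiate as evaluation questions."""
    all_ids = range(1, TOTAL_HEURISTICS + 1)
    if is_game:
        return list(all_ids)
    return [h for h in all_ids if h not in GAME_SPECIFIC]


game_set = applicable_heuristics(is_game=True)
non_game_set = applicable_heuristics(is_game=False)
print(len(game_set), len(non_game_set))
```

Under these assumptions, a game is evaluated against all 32 heuristics, while a non-game tool is evaluated against the remaining 29.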
During the software evaluation, several improvements were suggested for the educational games and for the CALL tool, based on the analysis carried out from the perspective of the formalized heuristics.The feasibility of these suggestions should be discussed by the development team for these software products.However, the proposed improvements seem to have the potential to contribute to games providing a better gaming experience and better second language learning.

Limitations
As limitations, we can point out that the heuristics formalized in this paper do not represent an exhaustive list. The evaluation is also difficult to reproduce, since the questions generated by the specialist may reflect his own perception of the game or of the heuristic in question, and some answers may be subjective. Another limitation concerns the participation of only one specialist in the evaluation of the Japanese CALL tools, given the difficulty in finding other professionals with a similar profile to conduct the evaluation carried out in the present work, that is, fluency in the Japanese language, knowledge of Human-Computer Interaction, and previous experience with CALL game development.
Finally, it is worth mentioning that several JCALL educational games could not be tested in the present work because they were unavailable; the articles are available, but the software described in them is not.

Conclusion
This paper presented heuristics conceived to support the evaluation of the optimal experience in JCALL games. The 32 heuristics developed are based on Flow Theory and on a framework for evaluating tools aimed at language learning. The heuristics were applied in an evaluation, in order to verify their usefulness in practical cases with different JCALL educational games. Additionally, another JCALL tool, which is not a game, was also evaluated in order to observe whether the defined heuristics could also be adopted with other types of CALL tools.
Results from the application of heuristics include suggestions for improvements to the evaluated CALL tools, exemplifying how heuristics can, in fact, contribute to the identification of aspects capable of promoting better learning and a better flow experience, besides helping observe advantages in using certain educational technologies for Education.
We understand that these heuristics should equip designers and developers of CALL educational games to elaborate questions pertinent to the evaluation of pedagogical and playful aspects, allowing them to evaluate how the designed games help in the effective learning of a second language, while providing an optimal experience capable of engaging and satisfying students. The evaluation presented in Section 4 exemplifies how the proposed heuristics can be applied in the evaluation of CALL educational games, and how they support the development of questions that are relevant to the domain and that guide the evaluation.
For future work, we suggest that the presented set of heuristics be used in the evaluation of the optimal experience of other CALL tools described in the literature, in order to verify the usefulness of this proposal in other contexts, such as with other second languages (e.g., English, Spanish, French, Italian or Mandarin) or even with other types of CALL educational software that are not considered games.

... undermine the student's self-esteem during learning and encourage him to keep revising even if he makes a ...
... limits must be proposed in moderate doses.

Table 1. The nine dimensions of Flow Theory.
Table 5. Questions and answers of the Sumo Sensei evaluation.
Table 6. Questions and answers of the Karuta Kanji evaluation.
Table 7. Questions and answers of the Karuchā Ships Invaders evaluation.
Table 8. Questions and answers of the Katakana Star Samurai evaluation.
Table 9. Questions and answers of the Kanji JLPT N5 evaluation.