Basic study of kanji recognition patterns in non-kanji-background Japanese learners based on eye-movement analysis 1

Learning kanji is one of the greatest challenges that non-kanji-background Japanese learners face. Kanji have a complicated structure and are fundamentally different from the characters of the Latin alphabet. Recognising them can thus be difficult for learners from places whose writing systems do not include kanji. Studies on the fixation and eye movements observed in kanji recognition by such learners are therefore important to clarify the kanji-recognition and -learning process. The present study attempts to shed light on this subject by using an eye-tracker to study such eye movements. The analysis of the data obtained from the fixation and eye-movement experiments shows that students exhibit different kanji recognition patterns depending on how long they have been studying kanji: while students who have been studying kanji for a short time tend to look repeatedly at different parts of the kanji to grasp its shape and structure, those with more experience exhibit a pattern more similar to that of native Japanese speakers.


Introduction
The missionaries who tried to propagate Christianity in Japan in the 16th century are said to have bemoaned, 'Kanji are characters created by the devil to prevent the spread of Christianity.' In fact, learning kanji is considered one of the most difficult aspects of the Japanese language learning process for learners of Japanese with non-kanji backgrounds 2 (e.g. Paxton & Svetenant, 2014; Janković, 2002; Kurihara, 2019). As a Japanese teacher, I have witnessed first-hand how hard learning kanji can be for non-kanji-background learners. Likewise, I have often encountered forms that felt strange in the kanji written by such students. This suggests that the forms of kanji recognised by native speakers 3 of Japanese, who grew up in an environment in which they came into contact with kanji on a daily basis and began learning kanji at the start of compulsory education, and those recognised by native speakers of languages with writing systems completely different from kanji are not the same. Why this is so is a very curious question. Compared to the alphabet, kanji have not only more complex shapes, but also a fundamentally different structure, which makes it difficult for beginners, especially those from countries where kanji are not used, to recognise them (Tollini, 1992). Basic research on how kanji pattern recognition occurs thus seems necessary for kanji teaching (Suzuki & Itō, 1999).
Additionally, in recent years, the digitisation of learning tools in educational settings has made great strides, and various applications have been developed for learning kanji. Which traditional kanji-learning methods are used (e.g. repeated writing, creating stories to memorise them) (Heisig, 1977) can vary depending on the abilities and preferences of each learner, but most of these learning methods are based on looking at 'static', already written characters. However, many non-kanji-background learners (especially beginners) have trouble recognising character forms because they do not recognise kanji as a set of several components. While writing order is extremely effective for learning kanji correctly, as it encourages a grasp of these components and a proper understanding of the origins of a single written character, it is not realistic to ask teachers to repeat the writing order over and over again. In contrast, digital learning materials enable learners to grasp the component elements of a kanji through 'movement', such as animations of the stroke order. Furthermore, learners can watch them as many times as they like, enabling them to observe the kanji in greater detail at their own pace.
2 'Kanji-background speaker' refers to speakers from the People's Republic of China (China), the Republic of China (Taiwan), Chinese communities in Southeast Asia, and Japan who use Chinese characters in their daily lives.
3 This article uses this term in a very broad sense to mean a person who grew up in a particular language and is a 'proficient user of the language' (Rampton, 1990).
The kanji learning method will thus presumably shift from the conventional method of learning by writing to one based on learning by seeing (Hayakawa et al., 2019). Hence, the question of how learners 'see' kanji will be of great importance.
This research attempts to clarify how non-kanji-background learners of Japanese perceive and recognise kanji by studying their eye movements from the viewpoint of character recognition.

Japanese writing system
The writing system for the Japanese language is extremely complex. It includes two types of phonetic characters, called hiragana and katakana, as well as the phonetic and ideographic characters of kanji. Although learning kanji is an inevitable part of learning Japanese, as noted, it is also regarded as one of the most difficult processes for learners. The following factors, amongst others, have been cited as reasons why learning kanji is difficult for non-kanji-background learners of Japanese: (1) their complex structure (Kanō, 1988; Kaiho, 1990; Yadamsuren et al., 2006); (2) the huge number of characters (Iori, 2016; Iori & Hayakawa, 2017; Yadamsuren et al., 2006); (3) the existence of many similar kanji (Janković, 2002; Kanō, 2001); and (4) the multiple readings (Kanō, 2001; Iori, 2016; Iori & Hayakawa, 2017; Janković, 2002; Lu et al., 2004; etc.).
Compared to kana characters and the alphabet, kanji have extremely complicated shapes. It thus takes time to get used to simply identifying them, particularly for non-kanji-background learners who are unfamiliar with kanji themselves (Kanō, 2017). Such learners have difficulty correctly grasping the shape of kanji characters (Yamato and Tamaoka, 2017), which makes character recognition itself a heavy burden (Hayakawa et al., 2019).
Furthermore, kanji characters contain an extremely large amount of information to process. In addition to information about meaning, reading, and usage, learners must also acquire information about the character shapes (Kanō, 1997, 2001; Yamato, 2019). Processing these multiple pieces of information at the same time imposes a memory burden on the learner (Kanō, 2017), which, in turn, becomes a major barrier to learning kanji (Taniguchi, 2016; Watanabe, 2015). A comparison of kanji learning by children whose native language is Japanese and by learners learning Japanese as a foreign language is instructive. The former often know vocabulary in the spoken language and can thus learn kanji by adding information about their shape and memorising them. In contrast, learners of Japanese as a foreign language must start by understanding the kanji information system, making the burden for them quite heavy (Yamato, 2019). For these learners, the difficulty of recognising kanji shapes in particular has often been highlighted (e.g. Kanō, 1988; Tollini, 1992; Shimizu, 1993; Hagiwara, 2017, amongst many others), and various proposals have been made in this regard (Nakamura, 2019). For instance, in a can-do statements survey of Japanese language learners, Kanō (2014) found that such learners had difficulty accurately recognising the shapes of kanji characters and memorising and reproducing them (i.e. writing them down). Likewise, the difficulty that early learners encountering kanji for the first time have in recognising kanji glyphs has been found to be related to the complexity of their structure rather than just the number of strokes (Kanō, 1998). Taniguchi (2017) finds that the visual complexity of kanji shapes has the strongest effect on the reproduction of unknown kanji. She further reports that even kanji with low visual complexity are difficult to reproduce when they are non-linear, and that symmetry strongly influences the degree of completion of highly visually complex kanji, such that asymmetric kanji are harder to reproduce.
In an experiment on the relationship between learners' perception of the 'complexity' of kanji and their reproducibility, Kanō (1988) finds that non-kanji-background learners have difficulty distinguishing between long and short lines, as well as tome (stop) and harai (sweep).

Previous research on kanji character recognition and kanji recognition patterns
As noted above, of the three elements of a kanji, namely, form, sound, and meaning, the form is thought to be particularly difficult for beginner non-kanji-background learners of Japanese to learn (Shimizu, 1993). Kaiser (1997) classifies approaches to study books for non-kanji-background learners into seven categories. All focus on the part that helps learners memorise the character shape, underscoring the importance placed on that aspect (Kanō, 2001). Offering examples of kanji exercises devised by Tollini (1992) to help students recognise the shapes and understand the structures of 5 types and 20 sorts of kanji, Kanō (2001, p. 50) argues, 'The fact that all these exercises are required suggests that kanji shape recognition can be quite a burden for non-kanji-background learners.' Kaiho and Haththotuwa Gamage (2001) find that it is difficult for learners with a non-kanji background to grasp the shape of kanji due to the difficulty in learning the distinctive features (i.e. perceiving multiple shapes as different, such as 大 ('big'), 犬 ('dog'), and 太 ('fat')) and discriminative features (i.e. recognising them as belonging to a single group, such as 大) of kanji. They emphasise two ways to help learners overcome this problem: (1) familiarising them with the shapes of kanji; and (2) identifying the basic elemental units of kanji shapes that are effective for early learners and using them as the basis for perceptual learning. In any case, the kanji's form is considered important. Nonetheless, they state that while radicals can function as basic element units for native Japanese speakers who are familiar with kanji and learners with extensive learning experience, they are not necessarily effective for beginners.
Kanji shape recognition is thus extremely important, especially for early-stage learners of Japanese with non-kanji backgrounds, and various studies have been conducted on this topic. Examining which constituent elements students in the second half of a beginner course find difficult to recognise when seeking to recognise kanji character shapes, Maehara and Fujishiro (2007) identify some kanji shapes that are difficult for learners to learn. Ikeda (2010) examines how novice Japanese learners with non-kanji backgrounds perceive kanji shapes by having them group kanji with similar shapes. She reports that such learners classify not only the radicals, but also the constituent elements other than the radicals. Hayakawa et al. (2019) break down kanji into smaller units with the aim of reducing the memory burden for students learning kanji that differ in shape and structure for each character.
Japanese learners' ability to recognise kanji shape patterns has been emphasised in the fields of cognitive psychology and Japanese language education (Suzuki and Itō, 1999), and various studies have been conducted focusing on the memory representation of kanji (Itō and Wada, 1997, 1999a, 1999b, 2004; Suzuki and Itō, 1999; Fukuda et al., 1993; Fukuda et al., 1995; Matsubara et al., 1994). Fukuda et al. (1993), Matsubara et al. (1994), and Fukuda et al. (1995) used eye movements to study kanji recognition patterns, with highly suggestive results. Non-kanji-background subjects with little or no learning experience tend to focus their line of sight on a graphically eye-catching point, and their eye movements are slower than those of native Japanese speakers. This is because native Japanese speakers already have a schema for kanji; they can thus perceive the kanji's overall shape from the start (Fukuda et al., 1995). Therefore, depending on how the learner recognises kanji (i.e. does he/she focus on a part or recognise the whole?), a teaching method based on the perspective of a native Japanese speaker may not be suitable and another teaching method based on the learner's viewpoint may be more appropriate.
Since the above studies used either subjects with no learning experience at all or learners at a certain level (intermediate), no differences were found based on the subjects' levels. Furthermore, many of the recent studies on kanji conducted from the perspective of eye movements focus on the reading of sentences, rather than the recognition of individual characters, to explore learners' sentence comprehension (Yanagisawa et al., 2009; Li & Yoshinari, 2013; Matsumi et al., 2017, amongst others). In contrast, the present study focuses on kanji recognition patterns in terms of learners' kanji learning experience.

Hierarchical structure in character recognition: sensation, perception and recognition
In terms of character recognition, the kanji learning process is assumed to be more or less as follows. When a person sees kanji for the first time, even if he or she knows that they are a kind of character, they simply look like figures without meaning. According to Fukuda's (1978, 1987) hierarchical structure model of character recognition (see Figure 1), this is the simplest level of 'sensation' in the process of character-information processing insofar as the existence of figures is recognised. When a person can grasp the kanji's constituent components, he or she is at the 'perception' level. At this level, whether the person has to grasp only a single component of the kanji or all of them, he or she has not yet grasped the kanji as a character. The highest level of 'perception' is achieved when all the constituent components of the kanji are grasped and the positional relationships are correctly established, that is, when the 'shape' is correctly perceived. A person reaches the next stage, 'cognition', when he or she is able to relate reading and meaning to the presented kanji, in addition to its shape. A learner who has achieved this level will be able to 'recognise' kanji as characters by comparing them with kanji as concepts stored in his or her memory.
It is interesting to check learners' progress against this hierarchical structure when teaching kanji and evaluating learners' learning outcomes. If we could follow and collate eye movements indicating how learners see kanji, it would be persuasive in terms of teaching and interesting for kanji learners. (Fukuda, 1987, p. 20) 4

3. Method

Purpose, hypothesis and aim of the experiment
The purpose of this study is to shed light on the kanji recognition patterns of learners of Japanese as a foreign language by analysing their eye movements. As noted, kanji are one of the most difficult things for Japanese learners to learn, as not only is their structure more complicated than that of an alphabet, it is also fundamentally different.
Therefore, research on eye movements, specifically, on fixation and jumps in the line of sight during learners' kanji recognition, seems useful to clarify an important part of the information-processing and learning processes involved in kanji character recognition.
In the present study, it was hypothesised that Japanese learners show different fixation patterns in kanji recognition depending on the length of their kanji-learning experience, that is, their level in the aforementioned hierarchical structure. To verify this hypothesis, an experiment was conducted on non-kanji-background Japanese language learners of different levels. A sample was assembled consisting of three or four university students of Japanese from each course (first to fourth year). The experiment was also conducted with one advanced learner and two native Japanese speakers to compare the results.

Experiment method
An eye-tracker manufactured by Tobii (Tobii-T60) 5 was used for the experiment. Specifically, twenty students who had been learning Japanese for different lengths of time, from students just starting to learn kanji to students with some accumulated knowledge of kanji, were selected as subjects. 6 The kanji characters used in the experiment were selected in consideration of the number of strokes and the shapes of the kanji characters, their complexity, their structure, and whether or not the students had already learnt them, based on Fukuda et al. (1993), Matsubara et al. (1994), and Hayakawa et al. (2019).
4 A: A region in which the meaning and content of the pattern are recognized by checking them against already acquired concepts; B: A region in which a feature of the pattern is perceived; C: A region in which an action to reconstruct the sampled features is seen; D: The so-called 'feature sampling' region, in which a portion of an element of the pattern is perceived; E: A region in which the existence of a pattern can be perceived in the luminance; F: A region in which the observed pattern has collapsed completely, resulting in the sensation of luminance (Fukuda, 1987, p. 20).
5 In collaboration with TRANSMEDIA CATALONIA at the Universitat Autònoma de Barcelona.
First, subjects were asked to read an explanation of the experiment and respond to a short questionnaire on their kanji-learning experience (how long had they been learning kanji; how many kanji did they know; did they find learning kanji difficult; did they use any kanji-learning apps).
The experiment was conducted individually. Subjects were seated 60 cm away from the computer screen and instructed to look freely at the kanji displayed on the screen for 4 seconds. 7 Subsequently, three similar kanji were displayed on the screen and the subjects were asked to indicate which kanji they had seen before. This task was added to ensure that the subjects would observe each kanji displayed on the screen carefully.
The recommended distance when using an eye-movement measuring device is 60 cm. Each kanji was displayed on the screen in 400-point UD Digi Kyokashotai, a commonly used font in Japanese language textbooks.
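The viewing geometry described above can be turned into an approximate visual angle. The following is a minimal sketch, not part of the original procedure; it assumes that the 400-point character is rendered at its nominal physical size (72 points per inch), which depends on the display and its scaling settings, and the function name `visual_angle_deg` is introduced here only for illustration.

```python
import math

POINTS_PER_INCH = 72   # typographic points per inch
CM_PER_INCH = 2.54


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (in degrees) subtended by an object of a given size."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


# Assumption: the 400-point em square appears at its nominal physical size.
glyph_cm = 400 / POINTS_PER_INCH * CM_PER_INCH
angle = visual_angle_deg(glyph_cm, 60)
print(f"{glyph_cm:.1f} cm at 60 cm subtends about {angle:.1f} degrees")
```

Under these assumptions, a single kanji of roughly 14 cm would subtend about 13 degrees, so it would be plausible for a subject to need several fixations to inspect all of its components.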
Amongst the eye movements measured by the eye-tracker, special attention was given to fixation. Fixation is 'the most common eye movement, and [it] can be used to make inferences about cognitive processes and attention' (Tobii Pro, 2022). The most commonly used measures are time to first fixation, fixation points, fixation duration, fixation count, and path between fixations. The range of fixation points is also a valid analytical measure. These fixation-related measures reveal what the subject initially focused on and how much attention he or she paid to a particular image.
Within the results of eye-movement measurements taken by an eye-tracker, the circles are fixation points. The numbers inside the circles indicate the order of the eye movements. Fixation duration is represented by the diameter of each circle, with a larger circle size indicating a longer fixation time. The present study focused on the range of movement of the fixation points and the fixation time, amongst other valid measures.
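The fixation measures listed above can be computed from a list of fixation records. The sketch below is only illustrative: the `Fixation` fields and the `fixation_measures` function are hypothetical names rather than the export format of the eye-tracker used here, the sample values are invented, and the 'range' is taken to be the horizontal and vertical extent of the fixation points.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Fixation:
    x: float            # horizontal gaze position (pixels)
    y: float            # vertical gaze position (pixels)
    start_ms: float     # onset time relative to stimulus onset
    duration_ms: float  # how long the gaze stayed at this point


def fixation_measures(fixations):
    """Summarise the fixations recorded for one subject on one kanji."""
    if not fixations:
        return None
    ordered = sorted(fixations, key=lambda f: f.start_ms)
    xs = [f.x for f in ordered]
    ys = [f.y for f in ordered]
    total = sum(f.duration_ms for f in ordered)
    # Path between fixations: summed straight-line distance between
    # consecutive fixation points.
    path = sum(hypot(b.x - a.x, b.y - a.y)
               for a, b in zip(ordered, ordered[1:]))
    return {
        "time_to_first_fixation_ms": ordered[0].start_ms,
        "fixation_count": len(ordered),
        "total_fixation_ms": total,
        "mean_fixation_ms": total / len(ordered),
        "range_x_px": max(xs) - min(xs),
        "range_y_px": max(ys) - min(ys),
        "path_length_px": path,
    }


# Hypothetical data, invented for illustration only.
scattered = [Fixation(300, 200, 180, 250), Fixation(520, 260, 470, 300),
             Fixation(340, 480, 820, 280), Fixation(560, 500, 1150, 320)]
concentrated = [Fixation(430, 350, 200, 1400), Fixation(445, 362, 1650, 1900)]

for label, trial in (("scattered", scattered), ("concentrated", concentrated)):
    print(label, fixation_measures(trial))
```

In this invented example, the 'scattered' trial yields more fixations, shorter mean durations, and a wider range than the 'concentrated' trial, which is the kind of contrast the measures are meant to capture.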

Kanji with simple structures
日 ('sun/day'), 中 ('middle') and 大 ('big'), all of which had already been learnt by all subjects and have fewer than ten strokes, were chosen as the kanji with simple structures.
Although there was a slight difference depending on the student's level, a set of relatively long fixation times was confirmed for all subjects with regard to these kanji, and their line of sight was concentrated in a certain area. This suggests that none of the subjects had to move his or her line of sight much to grasp each kanji, both because they had already learnt them and because of their simple structure (Figure 2).
However, subjects with less experience learning kanji moved their line of sight more often than subjects with more experience learning kanji, even within the same area. In contrast, subjects with more learning experience tended to fixate within a certain area for a longer period of time without moving their eyes much, showing a pattern very similar to that of native Japanese speakers.
The students' line of sight was generally observed to be concentrated on a specific part from amongst those making up the kanji. For example, for kanji composed of upper and lower parts, the line of sight tended to concentrate on the upper part rather than the lower part, and the fixation time tended to be longer on the upper part. This is most likely because the upper part of the kanji has a more complicated structure, as in 雪 or 熱. In the case of unlearnt kanji, the line of sight also moved to the bottom of simpler structures, although from second-year students onwards, such movements were less common. This suggests that with previously learnt characters, students focus their line of sight on the complex structure and try to figure out which kanji it is.
Regarding 思, subjects with less learning experience moved their line of sight to both the upper and lower parts of the kanji, although they focused more on the lower part (心). This tendency was more common amongst first-year students. Similarly, the line of sight was observed to be concentrated on the intersection of multiple lines for the kanji composed of right and left parts. Although the kanji 階 is strictly composed of left, upper-right, and lower-right parts, the line of sight tended to focus on the radical or upper right (比), which showed a pattern similar to that of native speakers as their experience with learning kanji increased. The distribution of fixation points was very natural, considering that of the three components, 白 and 日 had already been learnt, while 阝 had not been.
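The comparison between the upper and lower parts of a kanji described above can be expressed as the share of fixation time falling in each region. The following is a minimal sketch under simplifying assumptions: the kanji is split at the midline of its bounding box (real components rarely divide exactly there), screen coordinates are used so that y grows downwards, and the function name and sample values are hypothetical.

```python
def upper_dwell_share(fixations, top, bottom):
    """Share of fixation time in the upper half of a kanji's bounding box.

    fixations: iterable of (y_px, duration_ms) pairs.
    top, bottom: y coordinates of the bounding box (screen coordinates,
    so y grows downwards and top < bottom).
    """
    midline = (top + bottom) / 2
    inside = [(y, d) for y, d in fixations if top <= y <= bottom]
    total = sum(d for _, d in inside)
    if total == 0:
        return None  # no fixation landed on the kanji
    upper = sum(d for y, d in inside if y < midline)
    return upper / total


# Hypothetical fixations on a kanji displayed between y = 100 and y = 400.
trial = [(120, 600), (150, 300), (320, 100)]
print(upper_dwell_share(trial, top=100, bottom=400))
```

A share well above 0.5 would correspond to the tendency reported for kanji such as 雪 or 熱, whereas a share below 0.5 would correspond to the pattern reported for 思 amongst less experienced subjects.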

Kanji with radicals such as kamae (wrap) or tare (top-left)
A similar tendency was confirmed for kanji with radicals such as 囗, 門, 辶 and 疒. The line of sight was also found to be concentrated on the part that characterised each kanji, which is consistent with Fukuda et al. (1995).
Additionally, different eye-movement patterns were observed for these types of kanji depending on the length of the subjects' kanji-learning experience. The fixation points of learners with little experience learning kanji were scattered; they did not concentrate on a given area. Furthermore, they confirmed the kanji's shape by looking at each component in order to identify which kanji it was. This tendency was especially clear amongst second- and third-year students. In contrast, the line of sight of learners with more experience learning kanji was concentrated in a certain area; they likely grasped and recognised the shape of the kanji as a whole without the need to confirm each of its components. This pattern is very similar to that of native Japanese speakers and advanced learners of Japanese.

Kanji with many strokes
As for kanji characters with complex structures (i.e. more than twenty strokes) such as 離 ('separate') or 鱗 ('scale'), learners with less experience learning kanji tried to grasp their shape by looking at the components of each character one by one and trying to figure out which character it was.
For instance, relatively novice learners who had not yet learnt the kanji 離 tried to grasp its shape by looking at all of its constituent elements in order to determine which kanji it was.
Learners with less experience learning kanji moved their line of sight to examine the individual components in more detail, as if they were tracing the memory of a kanji they had seen before, since they had already learnt it. With unlearnt kanji, the learners looked at the complexities that characterised the kanji in detail, although their lines of sight moved somewhat less frequently than those of the learners who had already learnt it. Presumably, they were somewhat unsure where to look when faced with complex kanji that they had never seen before.
In contrast, learners with more learning experience focused their line of sight on the more complex structural parts of the kanji (the parts that characterised it). Since the fixation time was longer than that of less experienced kanji learners, it can be concluded that they grasped the character shape of the kanji as a whole without the need to check each of its constituent elements in detail. This recognition pattern is very similar to that of native Japanese speakers and advanced learners (Figure 6).
Regarding 鱗 ('scale'), an untaught kanji for all the subjects except the native speakers, the subjects moved their line of sight relatively more frequently than they did with other kanji, again, except for the native speakers. Due to the large number of strokes and the fact that some of the components, such as 魚 ('fish'), 米 ('rice'), and 夕 ('evening'), were already known to the subjects, their eye movements suggested that they were trying to identify each component in detail and grasp the character's shape.

Conclusions
Through a detailed analysis of the data obtained through the experiment on subjects' fixation and the movement of their line of sight, it was possible to verify the hypothesis that non-kanji-background learners of the Japanese language show different fixation patterns for kanji recognition depending on the length of their kanji-learning experience. Specifically, beginners who have just started learning kanji did not focus their line of sight on a certain area. Instead, they repeatedly fixated on and moved their eyes across various parts of the kanji displayed on the screen to grasp its shape and structure.
These eye movements suggest that learners with little kanji-learning experience have not yet reached the level of recognising kanji as characters at first glance, and instead seek to identify which kanji they are seeing by looking at various parts. For learners who have just started learning kanji, most of the kanji displayed in the experiment had not yet been learnt. Some of these beginners' eye movements thus suggested that they were unsure where to look, especially with kanji involving many strokes. Learners at the elementary to elementary-intermediate level had already learnt most of the kanji used in the experiment. In some cases, their eyes moved over each of the kanji's components in detail, as if they were recalling a kanji from memory.
In contrast, the line of sight of learners with more experience learning kanji stayed within the limited area of the displayed kanji and was not dispersed over a wide area like that of the beginner learners. This indicates that the learners can perceive the shape of the entire character in a relatively short period of time without moving their line of sight over a wide area and recognise it by comparing it with characters they have already learnt. This fixation point movement pattern in kanji recognition is similar to that of native speakers of Japanese, which suggests that the more experience one has in learning kanji, the more proficient one becomes in kanji recognition, i.e. the more similar one's kanji-recognition pattern becomes to that of a native speaker.
Under Fukuda's (1978) aforementioned hierarchical structure model of character recognition, the way novice or beginner learners see kanji is characterised by their perception of them as figures rather than characters. At the beginner-intermediate level, both a tendency to perceive them as figures composing a character and an effective fixation point distribution were observed. At the intermediate level and beyond, kanji are seen in a way more similar to typical character recognition, and feature extraction of kanji characters occurs within a narrow area. Furthermore, the more learning experience a person has, the more the way he or she sees kanji reflects the features of typical character recognition. In the case of native speakers, it reflects the features of character recognition by a person who is familiar with the character.
These results point to the effectiveness of analysing kanji learning patterns and contribute to the expectations for digital technology applications.
Many attempts have been made to break down kanji characters into sub-components to ease the burden for learners, yet the more the components are subdivided, the greater this burden will be. One possible solution might thus be for learners to learn a minimum number of components involving a small number of strokes so that they can gradually get used to recognising a kanji as 'a whole' by practising combining them. Learners' ability to write kanji correctly may remain somewhat limited even with the help of digital applications. However, these days, when there are more opportunities to read kanji than to write them, it may be important for them to first develop the ability to recognise which kanji is which simply by looking at it.
Digital learning materials such as kanji learning applications may be useful in this regard. Applications that can combine kanji in a way that makes users aware of the spatial arrangement of each of a kanji's components are thought to be effective for acquiring the character shapes (Hayakawa et al., 2019). In recent years, kanji have been seen far more often than they have been written. Nonetheless, whether one is 'writing' kanji on computers, smartphones or other devices, the ability to quickly identify the right one will become increasingly important.
Digital learning materials can also encourage self-study by learners. As the classroom time available to study kanji is limited, kanji learning is often left to self-study once learners have been introduced to the basics. Applications such as stroke-order animations and finger tracing in which the user receives higher scores the more accurately he or she writes a kanji are useful for learning the finer details of kanji. Although writing order is extremely effective for learning kanji correctly, as it encourages an awareness of the kanji's component elements and proper understanding of the origins of a single written character, a learning method that enables recognition of entire kanji at a glance will also be effective. Introducing such methods, including digital technology, from an early stage of learning can reduce the time and quantitative burden of kanji learning.
Although this study is basic research and further data analysis is needed, the characteristics of the kanji recognition patterns identified in it may offer clues as to how to develop such a method. Thus, experimental research on how kanji learners look and move their line of sight during kanji recognition, such as this project, will shed light on the information processing involved in kanji character recognition and the related learning processes. As teaching materials become increasingly digitised and learning kanji by 'seeing' becomes more mainstream, this will offer significant insight into how to effectively teach kanji.

Figure 2. Movement of line of sight when grasping the kanji 日 (top: from beginners to semi-beginners; bottom: advanced learners and native speaker of Japanese)

Figure 3. Movement of line of sight when grasping the kanji 熱 (top: from beginners to semi-beginners; bottom: advanced learners and native speaker of Japanese)

Figure 4. Movement of line of sight when grasping the kanji 階 (top: from beginners to semi-beginners; bottom: advanced learners and native speaker)

Figure 5. Movement of line of sight when grasping the kanji 図 (top: from beginners to semi-beginners; bottom: advanced learners and native speaker)

Figure 6. Movement of the line of sight when grasping the kanji 離 (top: from beginners to semi-beginners; bottom: advanced learners and native speaker)