Language Learning & Technology
Vol. 3, No. 2, January 2000, pp. 58-76


Batia Laufer
University of Haifa

Monica Hill
University of Hong Kong


The study investigates the relationship between what is looked up about new words when different kinds of information are available and how well these words are remembered. The dictionary information was incorporated into a CALL programme which comprised a text, highlighted low-frequency words, and access to different lexical information about these words (explanation in English, translation into L1, sound, root, and "extra" information).

The subjects were English as a Foreign Language (EFL) university learners in Hong Kong and Israel. The target words examined for incidental learning were 12 low frequency words. Pre-tests showed that they were unfamiliar to most subjects. The subjects were asked to read the text on the screen and understand it so that they could take a comprehension test after reading it. Unknown words could be looked up in the CALL dictionary built into the programme. During the task, log files registered every selection of dictionary information. After task completion, subjects were unexpectedly tested on meaning recall of the target words.

Recall data were analysed (ANOVAs, repeated measures, and correlations) to establish possible connections between retention and lookup behaviour (type of information selected, and number of lookups for each word). Results suggest that different people have different lookup preferences and that the use of multiple dictionary information seems to reinforce retention. The teaching implication is, therefore, to provide a variety of lookup options catering to different lookup preferences in paper or CALL dictionaries when assigning tasks that involve reading comprehension and understanding of unfamiliar words.


Attention to the form of input has occupied much of recent SLA research (e.g., Fotos, 1993; Robinson, 1995; Schmidt, 1990, 1993). There is growing research evidence that L2 learning, particularly adult L2 learning, is impossible without attention to input in the sense of "noticing" it. Though attended learning is discussed in the literature mainly in relation to syntax, incidental vocabulary learning is no exception to the attention requirement. This may sound paradoxical at first, since incidental learning is sometimes mistakenly assumed to be unattended learning. This is not the position taken in this paper.

Incidental vocabulary is learnt as a by-product of another activity, such as reading or communication, without the learner's conscious decision, or intention, to learn the words. For example, during a reading activity, words are looked up in a dictionary in order to understand the text and to perform a comprehension task. Subsequently, some of these words are remembered even though the main task was not a vocabulary task, nor was it the reader's intention to learn the words in the text. Learning was thus incidental, that is, unintentional and as a by-product of another activity. However, it was not unattended. It is indeed highly debatable whether words which are not noticed in the input can be learnt.



It is also questionable whether noticing alone will result in acquisition. While this is not impossible, in most cases, additional elaboration strategies will be necessary on the part of the learner before a memory trace for the noticed word is created. In the case of reading, these strategies include attempts to infer the meaning of a word from context, consulting a dictionary and selecting the meaning that best fits the context, and relating the form and the meaning of the word to other words the learner knows. Cognitive psychologists and language acquisition scholars working within the framework of cognitive psychology believe that retention of information is determined by the way in which this information is processed (Craik & Lockhart, 1972; Ellis, 1994; Haastrup, 1989; Mondria & Wit-de-Boer, 1991; Schouten van Parreren, 1989; Watanabe, 1997). "The more a learner pays attention to a word's morphophonological, orthographic, prosodic, semantic, and pragmatic features and to intraword and interword relations, the more likely it is that the new lexical information will be retained" (Hulstijn, in press). Such close attention to word features, which is often associated with performing a vocabulary task, has also been referred to as deep processing, elaboration, or cognitive effort--terms which defy a simple definition. Retention of words attended to will happen regardless of whether vocabulary learning is intentional or incidental. Hence, effective incidental vocabulary learning is a conscious learning process. Yet deep processing during the first encounter will, in all likelihood, not induce long-term retention. There is evidence to show that repeated exposures to a new word in language input reinforce learning, though it is unclear how many repetitions are necessary for this. Salling (1959) suggests that the number is 5; Kachroo (1962) and Crothers and Suppes (1967) suggest it is 7; and Saragi, Nation, and Meister (1978) suggest 16 (surveyed in Nation, 1990). 1

If learning is dependent upon attention and quality of information processing, then effective teaching should include tasks which direct the learner's attention to the words targeted for instruction and require elaboration of the words. One such task is the use of dictionaries to look up the target words in the course of a reading assignment. It has been found that the use of dictionaries or glosses can indeed contribute to small increments in vocabulary learning (Chun & Plass, 1996, 1997; Hulstijn, Hollander, & Greidanus, 1996; Knight, 1994; Luppescu & Day, 1993; Lyman-Hager & Davis, 1996; Lyman-Hager, Davis, Burnett, & Chennault, 1993; Mondria, 1993).

The advent of electronic dictionaries inspired research into how these dictionaries are used and into their usefulness as on-line helping tools and as contributors to incidental vocabulary learning. Leffa (1992) investigated the efficiency of an electronic dictionary and a conventional dictionary in a translation task and found that the computer dictionary enabled the students to "understand 38% more of the passage, using 50% less time" (p. 63). Knight's (1994) study compared the effect of CALL dictionary lookup with guessing words from context and found that students who used a dictionary learnt more words and achieved higher reading comprehension scores. Research into the influence of task and learner variables showed that words which were deemed relevant for the task were looked up more frequently than those which were considered irrelevant. In addition, words whose meanings could be guessed from context were less likely to be looked up than those whose meanings could not easily be inferred (Hulstijn, 1993).

The above research, revealing as it may be, has two major limitations: uncertainty about which words were looked up and limited dictionary information provided for learners. Some studies report that electronic or paper dictionaries were available to the class. This, in itself, however, does not necessarily mean that learners looked up the words the researcher assumed would be looked up. If a study does not provide log files which record what learners are doing during the reading task, there is no evidence that they indeed are looking up unknown words, rather than guessing or ignoring them. Nor do we have the information about the number of times they return to a specific word during the reading task. Lack of certainty about the use of dictionaries is even more evident in the case of paper dictionaries. Other studies, which overcome this limitation by tracking learners' lookup behaviour in log files, research only one type of dictionary selected by the researcher--monolingual, or bilingual. The learner is thus denied the opportunity to select the dictionary s/he would feel most comfortable with in real life. And yet different people, when given the choice, consult different types of dictionary information (Laufer & Kimmel, 1997). Some prefer translations, some explanation in L2, others a mixture, and some access different information for different words. Moreover, the studies cannot reveal whether the learners read the entire entry or a part of it, or which part (assuming the dictionary used was monolingual). Because of these shortcomings, we may be missing important information about what words are looked up, how many times they are looked up and whether the dictionary information that has been provided would be the kind of information the learner would have selected in real life.



More rigorous research on electronic dictionaries would require research tasks and a computer programme that would overcome the above limitations. First, as for tasks, the researcher has to make sure the tasks cannot be carried out without the knowledge of the words targeted for investigation. Relevance of the words to the task will increase the chance of dictionary consultation (Hulstijn, 1993). Second, as we cannot be sure whether the learner will look up the target words even when they are relevant to the task, log files should track learners' lookup behaviour to show whether indeed they looked up the target words or additional words, and how many times they selected each word. Third, the programme should provide options for selecting different types of dictionary information for each word. If, for example, the learner is interested in a quick L2-L1 translation, the option should be available. If, on the other hand, s/he is interested in examples of usage, or in grammatical information, or in a definition, each type of information should be accessible via another lookup option. Log files would record which of these options were selected for which words.

Studies that satisfy the above requirements--eliciting target word lookup, providing access to any and all types of dictionary information, tracking students' lookup behaviour--can provide insight into the effect of dictionary use on incidental vocabulary learning. Specifically, we can investigate the relationship between retention of looked up words and the type of dictionary information selected, and between retention and the number of times a word was looked up. In addition, providing different dictionary information and tracking people's selection of information is a more rigorous method of investigating dictionary preferences than questionnaires that have often been used in dictionary use studies. Several recent studies which have included multimedia annotations have shed more light on learner lookup behaviour (Aust, Kelley, & Roby, 1993; Chun & Plass, 1996; Davis & Lyman-Hager, 1997; Lomicka, 1998; Lyman-Hager et al., 1993; Roby, 1991, 1999).

Taking advantage of the tracking capabilities of computer mediated language learning, Roby (1991, 1999) examined the reading comprehension level of American tertiary students of Spanish. Computer and paper modes of presentation were used to compare the effect of reading with and without glosses. Roby focused on reading time, number of lookups, and comprehension and found that there was no significant difference in comprehension. However, subjects who had access to a gloss read the passage in significantly less time than those in the dictionary alone treatments, and those who used an electronic dictionary looked up significantly more words than those who used a paper dictionary. In a similar study comparing an online dictionary aid and conventional paper dictionary, Aust, Kelley, & Roby (1993) noted that those using the electronic reference had more than twice as many dictionary lookups as those in conventional mode, but again there was no significant difference in comprehension.

Incidental vocabulary learning through an L2 reading comprehension task was the focus of a study by Chun and Plass (1996). Three types of multimedia annotations were tested: text, text plus image, and text plus video. One hundred sixty American students of German were introduced to Cyberbuch, a hypermedia application for reading German texts containing a variety of annotations such as those just mentioned. Their results showed incidental learning of 25% accuracy in production tests and 77% in recognition tests with minimal loss between immediate and unexpected delayed recall. In their 1998 study, Plass, Chun, Mayer, and Leutner presented another group of English speaking tertiary students of German with a 762 word text annotated with verbal (L1 translation) and/or visual (picture or video clip) annotations. A higher level of lexical recall was found among the students who had selected both visual and verbal annotations, and those who were able to select their preferred mode of annotation showed better comprehension.

While the previous studies focused on reading comprehension and vocabulary acquisition, Lomicka (1998) analysed the think aloud protocols of 12 native English speakers during a computerised reading task in French to find out whether glossing aids comprehension. Subjects, who were asked to read a text in French, were divided into three groups: A, text only (no glosses); B, text plus "traditional glosses" (L1 translation and L2 definitions); and C, access to B's glosses plus pronunciation, images, references, and questions. A tracker recorded the amount and type of glosses, and the length of time that each was consulted. The data suggested that computerised reading with all glosses "may promote a deeper level of text comprehension" (Lomicka, 1998, p. 41); however, the main obstacle to comprehension was vocabulary.




Before reporting our particular study, we will describe the computer programme called Words in Your Ear, which attempts to meet the research requirements outlined in the background section.2 The programme consists of four parts: (a) a pre-test of the words targeted for investigation, (b) a text where these words appear highlighted, (c) dictionary information for each word in the form of five options (meaning in English, translation into L1, word pronunciation, root and "extra" information), and (d) log files where every mouse click selecting from these options is recorded.

The pre-test displays the words on screen asking the student to write the word's meaning next to it if the word is familiar to him/her (Figure 1).

Figure 1. Pre-test screen

The next screen displays the text "Meeting Mania" in which the 12 target words are highlighted in red (as opposed to the black script of the text). The right side of the screen provides the lookup options for each word, as in Figure 2.

Figure 2. Text of "Meeting Mania" with lookup options



Unfamiliar words can be looked up by clicking on them with the mouse and choosing any of several options on the right side of the screen. Learners can return to any word any time while reading the text for further information. The following information about each word is provided under separate options:

    1. Hear word (pronunciation in the form of a digitized voice recording);
    2. English meaning (the definition in English and contextualised examples taken from the Longman Active Study Dictionary of English [Summers, 1984]);
    3. L1 meaning (in Chinese or Hebrew using translations taken from Segal & Dagut's English-Hebrew Dictionary [1986] and Longman's Active Study English-Chinese Dictionary [Li, 1995]);
    4. "Extra" information (other forms of the word, phonemic transcription, details of levels of formality, prepositions which follow the item, related meanings, and other semantic and syntactic details); and
    5. Root (taken from Chambers Twentieth Century Dictionary [Geddie, 1968])

Figure 3 shows the screen print with the word burgeoning selected and its English meaning, while Figure 4 shows the screen print with the word insidious selected and its Hebrew translation.

Figure 3. Example of the English meaning of burgeoning

The student can also opt to hear the text being read aloud. The line along the bottom of the screen is a timer which prompts the students when the maximum time of ten minutes is exhausted.

Figure 4. Example of the Hebrew translation of insidious



Whenever a student selects information by clicking on it, the programme registers the click in a log file. Figure 5 shows an example of a student log at the end of the task.

Figure 5. Example of a student log

The right part of the file has recorded the learners' responses to the words during the pre-test and the left part reveals the lookup behaviour. A zero means that the information was not selected, 1 or 2 means that the relevant option was clicked on once or twice and times selected indicates the number of times the word was returned to. The student may have selected several options when first reading the text, then made a further selection during a subsequent reading.
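The log layout described above can be pictured as one record per target word. The following is a minimal sketch only: the field names, the option identifiers, and the `WordLog` class are hypothetical and are not taken from the programme, whose internal log format is not documented here.

```python
from dataclasses import dataclass, field

# Hypothetical identifiers for the five lookup options listed earlier.
OPTIONS = ("hear_word", "english_meaning", "l1_meaning", "extra", "root")


@dataclass
class WordLog:
    """Illustrative lookup record for one target word."""
    word: str
    pretest_answer: str = ""  # what the learner typed in the pre-test, if anything
    clicks: dict = field(default_factory=lambda: {o: 0 for o in OPTIONS})
    times_selected: int = 0   # how often the word itself was returned to

    def select_word(self):
        """Register that the learner clicked on the word in the text."""
        self.times_selected += 1

    def click(self, option):
        """Register one click on a lookup option for this word."""
        if option not in self.clicks:
            raise ValueError(f"unknown option: {option}")
        self.clicks[option] += 1

    def options_used(self):
        """Return the options clicked at least once, in menu order."""
        return [o for o in OPTIONS if self.clicks[o] > 0]


# A learner looks a word up on a first reading, then returns to it later.
log = WordLog("insidious", pretest_answer="adjective from inside")
log.select_word()
log.click("english_meaning")
log.click("hear_word")
log.select_word()
log.click("english_meaning")
```

In this sketch a zero in `clicks` corresponds to an option that was never selected, while `times_selected` counts returns to the word independently of how many options were opened on each visit.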

The results screen (see Figure 5) shows that in the pre-test the learner has perceived the meaning of insidious to be the "adjective from inside" and noted assert as meaning "claim." She has selected the English meanings of 8 of the 12 words, checked the Chinese translation of 5 and listened to the pronunciation of 6 words. She has chosen to look up the extra information on 2 words, pervasive and congregate, and has checked the roots of 4 words. She also opted to hear the text being read aloud.

Using the above programme as our research tool, we set out to investigate the relationship between L2 learners' dictionary lookup patterns and their retention of the looked up words.


Research Questions

Our specific research questions were as follows:

  1. What percentage of words are remembered after being looked up in an electronic dictionary during a reading task?
  2. Are different lookup preferences associated with different levels of retention?
  3. Is there a relationship between the number of lookups and retention?


Initially, 97 subjects participated in the study, but only 72 were left for data analysis, as will be explained in the section on pre-test. Of the 72 subjects, 32 were EFL students from the University of Haifa, Israel, and 40 were first year ESL students from the University of Hong Kong. The Israeli students were non-English majors taking a course in English for Academic Purposes. They had had eight years of English in high school prior to their university studies. Their score on the English section of the psychometric university entrance exam was 1-1.5 standard deviations (SD) above the mean. Since this section tested reading comprehension only, the students' level is 1-1.5 SD above the mean on the reading section of TOEFL (National Institute for Testing and Evaluation, personal communication, December 1998). The Hong Kong students were from the Social Sciences and Arts faculties. Two students were English majors; however, their English grades were not above the group average, and their performance was no better or worse than that of their peers. The mean proficiency level of the Hong Kong students was about 570 on TOEFL. All were taking English for Academic Purposes and all had had at least seven years of English in secondary school in Hong Kong. Even if the two groups were not equivalent on all language skills, they could read the text without any difficulty except for the target words. Furthermore, the focus of the study was within-subject differences in retention as a function of lookup patterns. All subjects reported that they were already familiar with a computer environment and knew how to use a mouse.




The text in the programme is a short extract (120 words) from an academic text of fairly general interest, "Meeting Mania" (Bergman, 1994). Before the experiment, it was piloted with 30 students who were not involved in the present study and who were asked to highlight all words of whose meanings they were unsure. Twelve words were most frequently marked as being unfamiliar:

  assert endeavour mania rampant
  burgeoning insidious pervasive scrutiny
  congregate malpractice profusion ubiquitous

These words were therefore selected as the target words to be investigated in the experiment on incidental learning. Even though the templates of the programme allow for linking any number of words to glosses, in this particular text only the above 12 words were glossed and could therefore be looked up. Other words in the text were of high frequency and did not present any problems in the pilot.


The experimental procedure consisted of three stages: pre-test, tutorial, and vocabulary retention test.

Pre-test. The students logged in and the first screen was displayed with the 12 target words. They were asked on the screen if they knew the meanings of any of these words (see Figure 1). This pre-test allowed us to find out whether or not some of the target words were familiar. Words which were reported as known and those for which incorrect meanings were given were also noted. Students who were familiar with more than one of the target words were later eliminated from the sample. Those who knew one word were later not credited with "learning" that word. The final number of subjects whose data were analysed was therefore 72.

Tutorial. Having completed the pre-test stage, the learners received the second screen which displayed the text "Meeting Mania" with the 12 highlighted target words (see Figure 2). Subjects were instructed to read the text and understand it for comprehension questions that would be given at a later stage. They were also told that, in the course of reading, they could look up information about the highlighted words by clicking on them with the mouse and then choose the option(s) that would best clarify the meaning of the word in the text. Students were encouraged to do so since the words were relevant to text comprehension. They were not told to learn the words, nor were they notified of a vocabulary test that would follow. The fact that the words were highlighted on the screen made them salient to the learner, similar to a marginal gloss in a text. However, enhanced input in itself does not result in a decision to learn it. Since the task was specified as text comprehension and not vocabulary learning, we believe that retention of the target words was indeed incidental. 3

As mentioned before, the log recorded every mouse click and showed, in the results screen, which words were known in the pre-test (if any), which words were selected in the tutorial, which dictionary information was looked up, and the number of times each word was selected (see Figure 5).

The students were warned that the tutorial session would last no more than 10 minutes and that the text would disappear from the screen after this. They were asked to notify the researcher or the assistant if they finished reading the text earlier. In such cases, they moved away from the monitor to another part of the room. Most Israeli students spent between 5 and 6 minutes on the text; most of the Hong Kong students spent close to the 10 minutes allocated.

Retention Test. On completion of the reading task, subjects were given an unexpected vocabulary post-test (on paper) in which the 12 target items were listed. The subjects were asked to write the meaning of the words in L1 or L2. When they handed in the sheets, they were given a comprehension exercise with six questions on the text, as announced before the tutorial session.



Data Analysis

For each subject, the following data were collected from the pre-test stage: the number of already familiar target words (i.e., with correct explanations provided) and the number of words mistakenly perceived as known (i.e., with incorrect explanations). As mentioned before, only subjects who did not know the target words were selected for tutorial and post-test analyses. The log files showed that all the target words were looked up by all subjects. From the post-test, we calculated the number of correct responses, that is, the number of words each student remembered after the reading task. From the tutorial section and the post-test, we collected data on the lookup options and the relation between each option and word retention. In other words, for each lookup option, we checked how many of the looked up words were later retained. Specifically, we calculated how many words were looked up in L1 only and of those, how many were remembered on the post-test; how many were looked up in English only and of those, how many were remembered on the post-test; how many were looked up in both languages and of those, how many were remembered; and finally, how many were looked up for meaning (in any or both languages) and also for additional information (sound, root, extra information) and of those, how many were remembered on the post-test. We also noted the total number of lookups for each student, including repeated selections of words.

Students were categorised by their preferred lookup behaviour as "lookup types": those who predominantly (in 75% of cases) selected Chinese or Hebrew translations were categorised as L1 type; those who preferred the English meanings were L2; those who looked up L1 and L2 in equal proportions were L1/L2; and those who selected word meaning (in L1, and/or L2) together with additional information were grouped as other. In sum, for each of the 72 students we obtained the following information: the number of new words retained after the tutorial, the number of words selected in each lookup option, the number of words remembered in each lookup option, the total number of times words were looked up, and classification of the students by preferred lookup pattern.
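The categorisation into lookup types can be sketched as a simple rule over a learner's per-word lookup labels. This is an illustration under stated assumptions, not the procedure actually used: the per-word labels ("L1", "L2", "L1+L2", "other"), the function name `lookup_type`, the application of the 75% threshold to the L2 and other types as well as to the L1 type, and the fallback to the mixed L1/L2 type are all assumptions introduced here.

```python
from collections import Counter


def lookup_type(word_labels, threshold=0.75):
    """Assign a learner to a lookup type from per-word lookup labels.

    word_labels holds one label per looked-up word:
    "L1" (translation only), "L2" (English meaning only),
    "L1+L2" (both languages), or "other" (meaning plus extra information).
    """
    counts = Counter(word_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no lookups recorded for this learner")
    # Assumption: a type is 'predominant' when it covers at least
    # `threshold` of the learner's looked-up words.
    for label in ("L1", "L2", "other"):
        if counts[label] / total >= threshold:
            return label
    # Assumption: learners with no predominant pattern are treated as
    # alternating between the two languages.
    return "L1/L2"


# Hypothetical learners, each with 12 looked-up words.
mostly_l1 = ["L1"] * 9 + ["L2"] * 3
mixed = ["L1"] * 6 + ["L2"] * 6
print(lookup_type(mostly_l1))  # L1
print(lookup_type(mixed))      # L1/L2
```

A stricter or looser reading of "predominantly" would only change the `threshold` argument; the boundary cases are where the assumptions above matter most.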

The scoring procedure was straightforward. A correct answer received one point, an incorrect answer zero points. An answer on the post-test was considered correct if the learner provided the meaning given in the CALL programme. When the subjects opted for explanations in English rather than L1 translations, a correct answer did not necessarily require providing the exact words used in the glosses as long as it was semantically accurate. For example, the meaning given for profusion might be "large quantity," "big amount," or "a lot." If the answer was semantically accurate but contained a minor spelling mistake which did not distort its meaning, it was considered correct. We expected only meanings given in the CALL programme. (The target words were not homonyms.)


Our first research question was, Are words remembered after being looked up in an electronic dictionary during a reading task, and if so, how many?

Table 1 presents the means of retained words. IL stands for the Israeli group of subjects and HK for the Hong Kong group.

Table 1. Mean of Total Learning Scores on Post-Test (out of 12)

Table 1 indicates that the answer to our first question is affirmative. The Hebrew L1 subjects recalled an average of 4 out of the 12 target words while the Chinese L1 subjects recalled an average of 7. The maximum recall of the Hebrew L1 subjects was 10, or 83% while the maximum number of items recalled by one Chinese L1 subject was 12 words, that is, 100%.



The second research question was, Are different lookup preferences associated with different levels of retention?

This question was answered in two ways. As mentioned earlier, we collected information on the percentage of words correctly retained in each of the four lookup options. Table 2 shows how many words were looked up in each option, and how many words were retained in each option. The numbers of words looked up and retained are given in raw scores. The raw scores of the Hong Kong students' lookups are out of 480 since the maximum number of words that could be looked up in each option in the entire group is 480 (12 words x 40 subjects). The raw scores of the Israeli students' lookups are out of 384 since the maximum number of words that could be looked up in each option is 384 (12 words x 32 subjects). The number of words retained in each option was also converted into a percentage (of the number of words looked up in the respective option).
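The arithmetic behind the raw scores and percentages can be sketched as follows. The group sizes and the number of target words come from the text; the function names and the counts passed to `retained_percent` are purely illustrative.

```python
TARGET_WORDS = 12
GROUP_SIZES = {"HK": 40, "IL": 32}  # subjects per group, as reported


def max_lookups(group):
    """Largest possible number of looked-up words in one option for a group."""
    return TARGET_WORDS * GROUP_SIZES[group]


def retained_percent(looked_up, retained):
    """Words retained as a percentage of the words looked up in an option."""
    if looked_up == 0:
        return 0.0  # nothing looked up in this option, so nothing to retain
    return 100.0 * retained / looked_up


print(max_lookups("HK"))  # 480
print(max_lookups("IL"))  # 384
# Illustrative counts only: 95 words retained out of 120 looked up.
print(round(retained_percent(120, 95)))  # 79
```

Note that the percentage is taken over the words looked up in the option, not over the group maximum of 480 or 384.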

Table 2. Words Looked Up and Retained in Each Option

[Four lookup-option column groups, each with "Looked up" and "Retained %"; surviving option label: L1/L2+other info. Surviving cells: 53 / 24; 90 / 26; 120 / 95; 157 / 97. ANOVA rows: IL between groups, IL within groups; HK between groups, HK within groups.]


Analysis of Variance comparing mean retention scores in the four lookup conditions showed that there was no significant difference between the lookup options among the Israeli learners (F(3,54)=0.51, p>0.05). In the Chinese group, the difference between the retention scores was significant (F(3,80)=3.6, p<.05). A Tukey Kramer multiple comparisons test (Table 2a) shows a significant difference between "L1" and the three other groups.

Table 2a. Tukey Kramer Test of Differences Across Lookup Options

[Lookup option means, in column order: 42%; 79%; 67%; 62% (L1/L2+other info).]
* p<0.05

Table 2 shows that the most frequent lookup strategy of the Israeli learners was L1 translation. Yet their highest retention score (45%) was associated with selecting both L1 and L2 during the tutorial. The Hong Kong learners, unlike their Israeli peers, used L1 translations least frequently. Even though the mean retention score in the L1 condition was 42% (which is higher than the Israeli 38%), this was the lowest retention score of the Chinese learners. The most frequent lookup procedure which resulted in correct retention of a word was selecting the English meaning. The other two, which involved selecting English with L1, or one of the languages plus additional information, yielded lower scores, but not significantly different from the L2-oriented lookups.



The second way of checking the relationship between lookup options and retention was by classifying learners by their preferred lookup patterns. Table 3 shows the distribution of different lookup types of students in numbers and in percentages out of the total number of students in each country.

Table 3. Distribution of Lookup Types of Learners

[Surviving column labels: L1 type; L2 type. Rows: IL (n = 32); HK (n = 40).]

Like Table 2, Table 3 shows that the Israeli and the Chinese students exhibit very different dictionary behaviour. The predominant lookup type among the Israeli subjects was L1 (72%) and few alternated between L1 and L2 depending on the looked up word (16%). Even fewer (6%) of the Israeli subjects relied solely on the English meanings of the target items or opted to look at the additional information provided by the programme. The Hong Kong subjects, on the other hand, tended to select more information about the target items and were predominantly of the other lookup type (38%). The next largest grouping (32.5%) selected the English meanings of the words, and fewer (17%) chose to alternate between L1 and L2. Only 12.5% chose Chinese lookups. Table 4 shows the mean number of words and percentage (out of the 12 target words) retained by different "lookup types" of the learner.

Table 4. Word Retention Mean Scores of Different Lookup Types

[Surviving column labels: L1 type; L2 type. ANOVA rows: IL between groups, within groups; HK between groups, within groups.]


Analysis of Variance reveals no significant difference among the lookup types in the Israeli group (F(3,28)=0.59, p=0.63>0.05) and a significant difference in the Hong Kong group (F(3,36)=3.3, p=0.03<0.05). A Tukey Kramer multiple comparisons test shows significant differences between L1 and L2 (p<.05) and between L1 and other (p<.05).
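The degrees of freedom reported here follow from the design: with four lookup types the between-groups value is 4 - 1 = 3, and the within-groups value is the number of subjects minus four (32 - 4 = 28 for the Israeli group, 40 - 4 = 36 for the Hong Kong group). A minimal sketch of the one-way F computation is given below; the function name and the toy scores are made up for illustration and are not the study's data.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Toy data: three groups of three scores each (not from the study).
f_value, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(df_b, df_w)  # 2 6
```

For this toy data the F ratio works out to 13 (up to floating-point rounding). The sketch stops at the F ratio; obtaining a p value additionally requires the F distribution, which a statistics package would supply.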

Table 4a. Tukey Kramer Test of Differences Across Lookup Types

[Lookup option means, in column order: 37%; 69%; 60%; 66% (L1/L2 + other info).]
* p<0.05



Tables 2 and 4 point to similar results in retention as a function of lookup patterns. Even though there is no significant difference in retention among the Israeli groups, the highest retention scores (45% in Table 2 and 42% in Table 4) are associated with consulting both languages for the new words. The next best result seems to be achieved when words are looked up in L1. The category of Hong Kong learners which attained the highest mean retention score of 69% was the L2 lookup type. The next highest scoring group with a mean of 66% was the other type, followed by those who alternated between L1 and L2 (60%). The group which retained the lowest mean number of target words (37%) was the L1 type who selected only the Chinese meanings.

The third research question was, Is there a relationship between the number of lookups and retention?

Table 5 presents the means of number of times selected, that is, the mean number of "clicks" on the target words. Table 6 shows the correlations which were calculated by relating each student's number of selections (clicks) and his/her total retention score.

Table 5. Mean Number of Selections

Column: times selected
Rows: IL (n = 32), HK (n = 40)

Table 6. Spearman Correlations Between Number of Selections and Retention Scores











Since the number of looked up words was 12, the figures in Table 5 show that on average each word was looked up more than once. The log files showed that each individual word was looked up between 1 and 3.2 times. It is clear that the Hong Kong learners looked up the words almost twice as often as the Israeli learners. However, as Table 6 shows, the relationship between the number of selections and retention is weak. Apparently, people with a larger number of lookups do not necessarily remember more words (Chun & Plass, 1996). In the case of the Israeli learners, the correlations are low, and in the case of the Hong Kong learners, low and non-significant.
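The correlations discussed above relate each learner's number of selections to his or her retention score. A Spearman coefficient is a Pearson correlation computed on ranks, with tied values sharing the mean of their ranks. The sketch below illustrates this under stated assumptions: the helper names are hypothetical, and the click counts and scores are invented for illustration and are not the study's data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-learner data, NOT the study's data.
clicks = [12, 15, 20, 24, 30, 36]   # total selections per learner
retained = [4, 7, 5, 8, 6, 9]       # words recalled out of 12
print(round(spearman(clicks, retained), 2))
```

A rank-based coefficient suits click counts, which are unlikely to be normally distributed and may contain a few very heavy users.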


Incidental Vocabulary Acquisition and CALL Dictionary Information

The use of a dictionary has been shown to have a positive effect on incidental vocabulary learning (see "Background" section). Our results support this claim (see Table 1). Yet some studies show that L2 readers often decide not to use the dictionary when meeting unfamiliar words in a text (Bogaards, 1998; Hulstijn, 1993). One of the reasons often reported by students is the time involved in flicking through the dictionary pages and the subsequent disruption of the flow of reading. An electronic dictionary may provide a good solution to this problem. Its ease and speed of use may encourage the learner to look up unfamiliar words. This, in turn, will not only contribute to more fluent reading, but will also increase the chance of acquiring the looked up words. Since our study was not designed to investigate the differences between paper and electronic dictionaries, the above advantage is still speculative. Yet we were encouraged to believe in it on the basis of a survey which was carried out among the Hong Kong learners after the experiment, in which they evaluated the programme as a vocabulary learning tool: 97% of the Hong Kong subjects commented favourably on it and recommended further development. A similarly enthusiastic response was found with participants in studies by Roby (1991) and Lomicka (1998). If a pedagogical tool is popular with the students, the chances are it will also be beneficial for learning. A counter argument could be leveled at electronic dictionaries, claiming that the ease of use will result in shallow processing of the looked up word and will therefore be detrimental to retention. Our results do not support this position. Any attempt to explain why this is so would be only speculative. We believe that the favourable attitude of the learner and the variety of lookup options resulted in careful attention to the lexical information provided by the glosses. Further research comparing the programme with paper dictionaries could corroborate this speculation.



How do our results compare with other studies where dictionaries were used for unknown words? Mondria (1993) found that after looking up new words in a dictionary, learners remembered 15% of them on a post-test. Knight's (1994) subjects recalled 20% of the tested words. According to Hulstijn, Hollander, and Greidanus (1996), 25% of the looked up words were remembered when the word appeared in the text once. Chun and Plass (1996) report 25% accuracy in production and 77% on recognition tests. The students in our study outperformed the subjects in the above studies. The Israeli group remembered 33.3% of the words, the Hong Kong group, 62%. There are several differences in the design of the studies. Our programme provided the learner with the choice of language of explanation, with the choice of information beyond mere word meaning, and with the choice of access to multiple items of information. We would like to postulate that learners' ability to select the type of information they consider most appropriate for the task and feel most comfortable with may well contribute to retaining more looked up words than in the other studies. Another contributing factor to our relatively good retention results may have been the combination of the different lookup possibilities with the specific task--text comprehension. In Mondria's (1993) study, 14 sentences with 14 target words were presented to the learner. The students were asked to guess the words and then verify their meaning in a dictionary. The length of "text" and number of target words are similar to our study, but the task is different. Mondria's task requires comprehending isolated sentences; our task requires comprehending a coherent text.

Comparing our study to Knight (1994) and Hulstijn et al. (1996), we can see that the global tasks assigned to students in the studies were similar: read the text to perform a comprehension exercise or test. However, our study differs in two additional factors which may have affected the results: the quantity of text presented to the learner and pointing out the unknown words to be looked up. Knight's students were tested after reading a 250-word text with 14 target words which were not highlighted on the screen. Hulstijn et al. used a text which was 1,306 words long with 16 target words that were not presented in any way to the learner during the reading session. They had a dictionary and could look up any word they wanted. These differences, a shorter text in our study and highlighting the target words, meant that the words were more salient in our input than in the input of the other two studies. This input enhancement which, in our study, can be attributed to the CALL technology, may partly be responsible for the better results than in other studies.

We cannot, at this point, be certain why the Hong Kong learners did so much better than the Israeli learners. Based on our knowledge of the students in the experiment, we assume it is because the Hong Kong learners are more diligent and more intrigued by the programme than the Israeli students. (As mentioned earlier, the Israeli students spent about half of the time allocated to them, while the Hong Kong learners took up almost the entire 10 minutes.) As the programme was originally designed for Hong Kong university students, those students may have felt that there was a certain "ownership" involved and that they should perhaps pay more attention to the information provided. A further factor could be that the Hong Kong study was conducted by the teacher who was also the researcher, whereas in Israel, a research assistant collected the data. The Hong Kong students also tend to be competitive, so they attended to the lexical information more seriously: this greater attention may have resulted in better learning.4 It is also possible that the Hong Kong learners are more word or vocabulary oriented, and that they have been trained to do more bottom-up processing. Conversely, it is possible that the Israeli learners approached the task from a top-down perspective in that their goal was comprehension of the text as a whole rather than knowing the meaning of each word in the text.

Variability in Dictionary Lookup Patterns

Even though a variety of dictionary information was available, most students opted for definitions, translations, or both. This reflects the findings of Davis and Lyman-Hager (1997) and Lomicka (1998). Though it was originally anticipated that each option would provide a rich resource of research data, the total number of lookups for "extra information" and "root" formed only a small percentage (<5%) of the total selections. Therefore, these data do not merit discussion here.



Table 3 shows clearly that different people have different lookup patterns. But it also reveals that the groups of learners in each country behave differently. The individual and group differences may have to do with individual learning styles, specific features of the learners' mother tongue, or transfer of training. The Chinese L1 students in our study tended to look up more information about the words than the Hebrew L1 students. As mentioned above, this could reflect the study patterns of Chinese learners who are often diligent. It is also evident that Chinese learners prefer to look up English meanings of unfamiliar English words rather than L1 translations. Israeli learners, on the other hand, have a noticeable preference for L1 translation. A similar diversity of language lookup choice was found in Lomicka's (1998) study. Subjects who had access only to French and English glosses consulted the L2 definitions more frequently than L1. Those in the all glosses group, on the other hand, consulted L1 more frequently than L2. Aust, Kelley, and Roby (1993) noted that bilingual dictionary users consulted 25% more definitions than monolingual dictionary users, but they add that their subjects' answers were written in L1 (English) and this may have affected their bilingual lookup preference. The language of lookup did not make any significant difference on comprehension. They also point out that

The notion that greater improvements in vocabulary ability will result from monolingual dictionary use seems reasonable because the cognitive tasks are more directly associated with understanding the foreign language than when the learner uses a bilingual dictionary and cycles from one language to another (p. 71).

In Davis and Lyman-Hager's (1997) study on multimedia glosses in French reading, however, the subjects tended to utilize almost exclusively word definitions provided in English (L1) ignoring the other forms of glosses available.

In the present study, we postulate that these differences in preferred lookup language may be related to transfer of training. Most, if not all, of the Hong Kong subjects would have attended English medium secondary schools where their teachers may have encouraged them to consult monolingual dictionaries in their senior school classes. In Hong Kong universities, the medium of instruction is English and many, though certainly not all, students are accustomed to using monolingual dictionaries. No instruction, however, is given on the usage of dictionaries in their English enhancement courses. In Israeli universities, the medium of instruction is Hebrew and no specific preference is given to any type of dictionary. The available English-Hebrew dictionaries are quite good and most learners choose to use them for reading English texts. The preference of the Israeli learners cannot be attributed to inferior dictionary use skills since they have been trained to use bilingual, monolingual, and bilingualized (English-English-Hebrew) dictionaries in high school. Yet most of them prefer a bilingual dictionary (Laufer & Kimmel, 1997). Lyman-Hager and Davis (1996) suggest that accessing word meanings in the native language is a key factor in comprehension.

The records of lookup behaviour in log files showed that the Chinese learners check the phonological dimension of words, while the Israelis do not. Particularly, the Chinese learners check the pronunciation of the words which look as if they would not conform to standard English pronunciation rules, such as burgeoning. This difference between the groups may be related to the different types of orthography of the two languages (Chinese and Hebrew). It is arguable that the Israeli subjects, whose L1 is written in an alphabetic orthography, are used to "sounding out" words, so that they are already good at decoding the sounds of English and may find the pronunciation of the words superfluous. Lomicka (1998) also noted that in her study of English L1 subjects reading a French text, only 2 out of 12 subjects frequently consulted the pronunciation gloss and this "did not seem to directly affect comprehension" (p. 48). Chinese subjects, however, tend to focus on the written form of words, as with Chinese characters, and they appear to benefit from hearing the pronunciation of unfamiliar words as a means of retention (Hill, 1994, 1996).

There is evidence to suggest that the average Hong Kong Chinese ESL student recognizes lexical items by orthographic (written form) rather than phonological (sound) principles (Hsia, Chung, & Wong, 1995). Hsia et al. point out that young Cantonese speaking children in Hong Kong are not taught Cantonese sound analysis. Thus, visual memorization becomes the most mastered strategy. Students have reported visual memorization as being a dominant cognitive strategy in learning English words. However, this programme provides freely accessible pronunciation of unfamiliar words and so it is possible that the addition of auditory information has helped the Chinese learners to build referential connections between the written form and the meanings of the words. The bimodal (visual plus auditory) presentation of words may have enhanced the storage in short term memory, especially in proficient (tertiary level) L2 learners of English (Mayer & Sims, 1994). Another interesting speculation is that the Chinese preference for the pronunciation option could somehow be related to the fact that Chinese dictionaries are arranged according to the phonetic radical and so Chinese learners look up words in a dictionary by sound. In the programme evaluation survey conducted after the experiment, students noted that it was useful to be able to hear the pronunciation of the words, although some commented that they had not checked the sound as they did not anticipate using these words orally.



Few learners (in Hong Kong and in Israel) paid attention to additional information provided in dictionaries, perhaps perceiving this as time consuming, and even fewer were interested in the roots of words. In spite of the group characteristics discussed above, each group of learners includes different "lookup types," which demonstrates that different people, irrespective of what country they come from, have different dictionary consultation strategies.

Lookup Patterns and Incidental Vocabulary Learning

The interpretation of the results would have been simple if we had found a uniform relationship between lookup patterns and retention of looked up words. This is not the case in our study. The Hong Kong results show that L1 lookups yielded the worst results. In the Israeli group, however, resorting to L1 was the second best strategy in absolute terms, and it was not significantly different from any of the other lookups. We cannot ascribe the overall better results of the Hong Kong learners to their preference for using English definitions of the new words. If the use of L2 were a superior strategy, it should have produced better results than the use of L1 with the Israeli learners, too, which was not the case. Besides, the use of L1 is not necessarily an inferior learning strategy, as it has long been established that L2-L1 pairs can be retained well (for a review, see Nation, 1982). Why then would the same lookup strategies work differently for each group? We do not know for sure (and the experiment was not designed to find this out). Yet, we could reasonably assume that since these groups of learners are accustomed to using dictionaries in different ways, as discussed in the previous section, good retention is the result of resorting to the lookup strategy with which learners feel most comfortable.

In spite of the differences between the two groups, our results show that the use of L1 together with L2 leads to good retention (best scores in both groups, and scores which are not significantly different from the best scores in the Hong Kong group). If selection of both L1 and L2 information means that the new word has been attended to more carefully (or noticed, elaborated, processed better, cf. "Introduction") than the word which was looked up in one of the languages, then our results are in line with the claim that retention is determined by the way in which new words are processed, whether the learning is intentional or incidental. Ellis (1994) states that words may easily be forgotten after the first encounter; however, "explicit, deep, elaborative processing concerning semantic and conceptual/imaginal representations prevents this" (p. 52). The beneficial effect of L1+L2 lookup may lie in the richness of semantic encoding; it may lie in the prolonged attention that multiple items of information require, or it may lie in both.

It is not clear at this stage why the number of selections did not correlate well with retention. According to Pimsleur's (1967) recommended memory schedule, which proceeds in intervals that grow by a factor of five (5 seconds after the first exposure, then 25 seconds, etc.), words that were looked up several times should have been remembered better than words that did not undergo immediate rehearsal.
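Taking the two intervals quoted above at face value (5 seconds, then 25 seconds), each gap in the schedule is five times the previous one. The sketch below generates such a schedule and lists the rehearsals that would fit inside a 10-minute session like the one used here. The function names, and the extension of the pattern beyond the two quoted values, are assumptions made for illustration.

```python
from itertools import accumulate


def pimsleur_intervals(n_reviews, first_seconds=5, factor=5):
    """Gaps between successive rehearsals, in seconds (hypothetical helper)."""
    return [first_seconds * factor ** k for k in range(n_reviews)]


def rehearsal_times(n_reviews, session_seconds):
    """Elapsed times (seconds) of the rehearsals that fit inside one session."""
    elapsed = accumulate(pimsleur_intervals(n_reviews))
    return [t for t in elapsed if t <= session_seconds]


print(pimsleur_intervals(4))                    # [5, 25, 125, 625]
print(rehearsal_times(4, session_seconds=600))  # [5, 30, 155]
```

Under these assumptions only about three scheduled rehearsals fall within a 10-minute reading session, so the schedule itself leaves little room for a large number of useful repetitions.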

It is notable that the correlation was better and significant in the Israeli group. As stated earlier, most Israeli learners finished the tutorial within 5-6 minutes (out of the 10 minutes assigned for it), that is, were not very attentive. The significant (albeit not high) correlation in the Israeli group may suggest that learners who attended to the words more than once learned them better. The Chinese learners, who were attentive at first reading, may have done their main learning then (most of them spent closer to the 10 minutes allocated). Apparently, when more time is spent on a word, additional selections do not add significantly.

Teaching Implications

Following the results of the study, we would like to make some recommendations for reinforcing vocabulary learning through reading in classrooms that have no access to CALL programmes. Teachers could assign the task of looking up specific words in paper dictionaries. These would be words that teachers, on the basis of their experience, know are unfamiliar to their students. Learners should also be encouraged to access different kinds of information found in their dictionary. They frequently check only the meaning, without looking any further for the examples of usage or other lexical specifications. And yet it is accessing a multiplicity of information that is likely to enhance retention. A further point worth considering is the use of "bilingualised" dictionaries, which contain the monolingual information about a word and its translation into the learner's mother tongue (see, for example, Oxford Student's Dictionary for Hebrew Speakers [Ruse, Reif, & Levy, 1978]), as such dictionaries cater for a variety of lookup preferences: for definitions and examples in English, translations into L1, or both.




The aim of this article is twofold: to suggest a CALL methodology suitable for investigating vocabulary acquisition, and to investigate a possible relationship between lookup patterns and retention of looked up words. The novelty of the methodology lies in offering learners several options in selecting lexical information about words. Researchwise, this means that incidental vocabulary learning can be observed under optimal lookup conditions since learners can select the lookup strategy which may be most compatible with their learning style, whether it involves a particular lookup pattern, or a combination of patterns.

The results of our study suggest that such conditions, which cater to a variety of lookup preferences, may be more favourable for incidental vocabulary learning than dictionary use conditions reviewed in the introductory section. The variety of lookup preferences emerged with the comparison of learners in each country, and more so with the comparison of learners in the two countries. We have not demonstrated conclusively that a particular lookup behaviour yields the best results. Nevertheless, the results suggest that multiplicity of lexical information tends to be associated with better retention. The number of times the word is looked up during a learning session bears almost no relation to its retention. We postulated, albeit cautiously, that what matters is greater attention during the lookup rather than the number of lookups.

In addition to being a research tool, Words in your Ear and similar programmes written in the future can fulfill important pedagogical functions. First, the programme directs the learner's attention to unfamiliar words during reading, which in turn can contribute to incidental acquisition of these words. Second, it provides an on-line "optimal dictionary." This dictionary is quick and easy to use and does not interrupt the flow of reading. It is also compatible with individual lookup preferences since it allows the user to select the kind of lexical information with which s/he feels most comfortable. Furthermore, the various lookup options combine into multiple lexical information for those learners interested in it, or encouraged by the teacher to use it. Third, the programme includes special vocabulary exercises which consist of a variety of input and output oriented tasks designed to reinforce the retention of the looked up words. Preliminary exercises (not included in our study) are already in the programme. Most of the Hong Kong learners who experimented with the text, the dictionary, and the preliminary exercises seemed to be involved with the programme and found it a great novelty.

Finally, we would like to make some suggestions for further research. Our study could be replicated with larger samples in the attempt to find an unequivocal relationship between lookup pattern and retention. Moreover, "unlearning" of wrongly perceived meanings of words can be studied by comparing incorrect meanings given in the pre-test with post-test meanings to see whether or not students had unlearned the original erroneous meanings and learned the correct ones. An analysis of the log may shed some light on the conditions in which students can be helped to dispel wrong meanings and assimilate the correct ones. The reinforcement exercises could be investigated to ascertain whether particular types of follow-up exercises can better aid retention. An experiment similar to ours could be conducted with paper dictionaries: bilingual, concise monolingual, and detailed monolingual. Each student would be provided with the three and would report on his/her lookup behaviour for each word. And finally, our study tested retention immediately after the reading and lookup task. Yet, learning in real life requires retention of information long after the task performance. The relationship between lookup behaviour, reinforcement exercises, and long term retention of vocabulary may well be the most important follow-up to our research.


  1. Further consolidation of the knowledge of the word may involve associative learning such as semantic or imagery technique and rehearsal of the word in isolation, in an L2-L1 pair, in phrase or sentence context. Yet these activities are carried out by the learner with the specific intention of committing the word to memory and therefore belong to the realm of intentional learning. For an extensive discussion of incidental and intentional vocabulary learning, see Hulstijn (in press).

  2. This programme was developed as a part of the second author's Ph.D. thesis. Following the tutorial section described here, there is a selection of exercises which was not used for this particular experiment.

  3. One can argue that it is never possible to be sure whether learning is incidental, that is, devoid of a decision to commit the information in question to memory. Some people may be carrying out a task other than, or additional to, the task assigned by the experimenter, such as memorizing words when they are asked to read or to look for grammatical patterns. Yet when experiments are carried out, we assume that the majority of subjects follow our instructions.

  4. The study was not designed to specifically compare the two groups of students. It mainly focused on within subject differences and intra-group differences as a function of different lookup behaviour. Therefore, the comments explaining the differences between the Hong Kong and the Israeli students are based on what we observed during the experiment and what we know about our students.


Funding for the programme development was kindly provided by the Simon K Y Lee fund for language research which also covered some of the expenses of the programmer, Bill Wedlock. Assistance with Chinese translations was gratefully received from Lara Lam, Dora Pao, and Cynthia Lee. Assistance with the Israeli data collection was gratefully received from research students Shelley Birnhack and Natalie Mishin. The authors are also grateful to the reviewers for their helpful suggestions.


Professor Batia Laufer teaches and supervises research in the Department of English Language and Literature at the University of Haifa. Her main research interest is second language acquisition, particularly vocabulary acquisition, vocabulary testing, applied lexicography, and cross linguistic influence. She has published extensively on various aspects of vocabulary in second language.


Dr. Monica Hill teaches undergraduate and postgraduate students in the English Centre of the University of Hong Kong. Her main research interests are vocabulary acquisition by Chinese learners and computer assisted language learning. She is currently working on vocabulary assessment of incoming tertiary learners and is expanding a Website for academic vocabulary development.



Aust, R., Kelley, M. J., & Roby, W. B. (1993). The use of hyper-reference and conventional dictionaries. Educational Technology Research & Development, 41, 63-73.

Bergman, A. B. (1994, June 2). Meeting mania. New England Journal of Medicine, 330(2), 1622-1623. Massachusetts Medical Society.

Bogaards, P. (1998). Which words are looked up by foreign language learners? In B.T.S. Atkins & K. Varantola (Eds.), Studies of dictionary use by language learners and translators (pp. 151-157). Tübingen: Niemeyer.

Chun, D. M., & Plass, J. L. (1996). Effects of multimedia annotations on vocabulary acquisition. The Modern Language Journal, 80, 183-198.



Chun, D. M., & Plass, J. L. (1997, July). Research on text comprehension in multimedia environments. Language Learning and Technology, 1(1), 60-81. Retrieved July 31, 1999 from the World Wide Web:

Craik, F.I.M., & Lockhart, R.S. (1972). Levels of processing: a framework for memory research. Journal of Verbal Learning and Verbal Behaviour, 11, 671-684.

Davis, J.N., & Lyman-Hager, M. (1997). Computers and L2 reading: Student performance, student attitudes. Foreign Language Annals, 30(1), 58-72.

Ellis, N.C. (1994). Consciousness in second language learning: Psychological perspectives on the role of conscious processes in vocabulary acquisition. AILA Review, 11, 37-56.

Fotos, S. (1993). Consciousness-raising and noticing through focus on form: Task performance versus formal instruction. Applied Linguistics, 14, 385-407.

Geddie, W. (1968). Chambers twentieth century dictionary. Edinburgh: Chambers.

Haastrup, K. (1989). The learner as word processor. AILA Review 6, 34-46.

Hill, M.M. (1994). A word in your ear: Vocabulary acquisition by Chinese tertiary students. In N. Bird, P. Falvey, A.B.M. Tsui, D.M. Alison, & A. McNeill, (Eds.), Language and Learning, (pp. 179-190). Hong Kong: Government Printer.

Hill, M.M. (1996). What's in a word? Enhancing English vocabulary development by increasing learner awareness. In P. Storey, V. Berry, D. Bunton, & P. Hoare (Eds.), Issues in Language in Education, (pp. 179-190). Hong Kong: Hong Kong Institute of Education.

Hsia, S., Chung, P. K., & Wong, D. (1995). ESL learners' word organisation strategies: A case of Chinese learners of English words in Hong Kong. Language in Education 9(2), 81-102.

Hulstijn, J. H. (1993). When do foreign-language readers look up the meaning of unfamiliar words? The influence of task and learner variables. The Modern Language Journal, 77(ii), 139-147.

Hulstijn, J.H. (in press). Intentional and incidental second language vocabulary learning: a reappraisal of elaboration, rehearsal and automaticity. In P. Robinson (Ed.), Cognition and Second Language Instruction. Cambridge University Press.

Hulstijn, J.H., Hollander, M., & Greidanus, T. (1996). Incidental vocabulary learning by advanced foreign language students: the influence of marginal glosses, dictionary use, and reoccurrence of unknown words. The Modern Language Journal, 80, 327-339.

Knight, S. (1994). Dictionary use while reading: The effects on comprehension and vocabulary acquisition for students of different verbal abilities. The Modern Language Journal, 78(iii), 285-298.

Laufer, B., & Kimmel, M. (1997). Bilingualised dictionaries: How learners really use them. System 25, 361-369.

Leffa, V. (1992). Making foreign language texts comprehensible for beginners: An experiment with an electronic glossary. System 21 (1), 63-73.

Li, S-G. (1995). An active study English-Chinese dictionary. Hong Kong: Longmans.

Lomicka, L. (1998). "To gloss or not to gloss": An investigation of reading comprehension online. Language Learning & Technology, 1(2), 41-50. Retrieved from the World Wide Web July 31, 1999:



Luppescu, S., & Day, R. (1993). Reading, dictionaries, and vocabulary learning. Language Learning 43(2), 263-287.

Lyman-Hager, M., & Davis, J. N. (1996). The case for computer-mediated reading: Une Vie de Boy. The French Review, 69(5), 775-790.

Lyman-Hager, M., Davis, J. N., Burnett, J., & Chennault, R. (1993). Une Vie de Boy: Interactive reading in French. In F. L. Borchardt & E. M. T. Johnson (Eds.), Proceedings of the CALICO 1993 Annual Symposium on "Assessment" (pp. 93-97). Durham, NC: Duke University.

Mayer, R. E., & Sims, V. K. (1994). For whom is a picture worth a thousand words? Extensions of a dual-coding theory of multimedia learning. Journal of Educational Psychology, 86, 389-401.

Mondria, J-A. (1993). The effects of different types of context and different types of learning activity on the retention of foreign language words. Paper presented at the 10th AILA World Congress of Applied Linguistics, Amsterdam.

Mondria, J-A., & Wit-de-Boer, M. (1991). The effects of contextual richness on the guessability and the retention of words in a foreign language. Applied Linguistics 12, 249-267.

Nation, I.S.P. (1982). Beginning to learn foreign vocabulary: A review of the research. RELC Journal, 13, 14-36.

Nation, I.S.P. (1990). Teaching and Learning Vocabulary. New York: Newbury House.

Pimsleur, P. (1967). A memory schedule. The Modern Language Journal 51, 73-75.

Plass, J. L., Chun, D. M., Mayer, R. E., & Leutner, D. (1998). Supporting visual and verbal learning preferences in a second language multimedia learning environment. Journal of Educational Psychology, 90 (1), 25-36.

Robinson, P. (1995). Attention, memory and the 'noticing' hypothesis. Language Learning 45, 285-331.

Roby, W. B. (1991). Glosses and dictionaries in paper and computer formats as adjunct aids to the reading of Spanish texts by university students. Unpublished doctoral dissertation, University of Kansas.

Roby, W. B. (1999, January). What's in a gloss? Language Learning and Technology, 2(2), 94-101. Retrieved from the World Wide Web July 31, 1999:

Ruse, C., Reif, J.A., & Levy, Y. (1978). Oxford student's dictionary for Hebrew speakers. Jerusalem: Kernerman Publishing LTD, Lonnie Kahn & Co. LTD.

Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics 11, 129-158.

Schmidt, R. (1993). Awareness and second language acquisition. Annual Review of Applied Linguistics 13, 206-226.

Schouten-van Parreren, C. (1989). Vocabulary learning through reading: Which conditions should be met when presenting words in texts? AILA Review 6, 66-74.



Segal, M., & Dagut, M.B. (1986). English Hebrew dictionary. Tel Aviv, Israel: Publishing House Kiryat-Sefer LTD.

Summers, D. (1984). Active study dictionary. London: Longman.

Watanabe, Y. (1997). Input, intake and retention: Effects of increased processing on incidental learning of foreign language vocabulary. Studies in Second Language Acquisition 19, 287-307.


