Language Learning & Technology
Vol.12, No.1, February 2008, pp. 109-114

Tom Cobb
Université du Québec à Montreal

I was glad to receive a response from Jeff McQuillan and Stephen Krashen to my piece "Computing the Demands of Vocabulary Acquisition from Reading" (Language Learning & Technology, October, 2007), because drafting a reply forces me to be even clearer about what I am saying. I was initially surprised to see that the lead responder was an expert in first language (L1) reading rather than second language (L2) reading, but on second thought the participation of McQuillan (The Literacy Crisis, 1998) makes sense.

Position Review

I argued that building an adequate functional L2 lexicon for reading from reading alone (Krashen’s longstanding position) cannot be done by the majority of learners in the normal time frame of instructed L2 learning. An example of such a time frame would be the year or two of ESL preparation granted to foreign students on arrival in a North American university. A minimal functional lexicon is 3,000 word families, which provides about 90% known-word coverage of average texts. But lexicon building from reading alone will stall shortly after 2,000 families. This happens for the demonstrable reason that 3,000-level words (and other less frequent words) do not appear often enough in the amount of reading of natural texts that such learners are likely to accomplish. Research has shown that a word needs to appear at least six times for learning to take place.

As proof I offered three samples of natural text at what I proposed was the outer limit of such an amount, namely any of the journalism, academic, or literary sub-corpora of the Brown corpus. Each of these amounts to about 175,000 words, of which 10% are words beyond the 2,000-most-frequent level (minus proper nouns). Through elementary corpus analysis, I showed that a learner who managed to read any one of these collections would meet no more than half of the third thousand word families six times apiece. A similar analysis of the collected works of a major author (300,000+ words) and another of an entire set of graded readers (375,000+ words) pointed to the same conclusion: reading these texts in their entirety cannot provide enough repeated exposures to enough 3,000-level vocabulary to support the acquisition of a minimal functional lexicon.
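The counting step behind this kind of corpus analysis can be sketched in a few lines of Python. This is a minimal illustration and not the procedure used in the original paper: the function name `families_met_often_enough`, the toy text, and the toy word list are all hypothetical, and words are matched as exact lowercase tokens rather than grouped into true word families.

```python
import re
from collections import Counter

def families_met_often_enough(text, target_words, min_encounters=6):
    """Return the target words that occur at least min_encounters times.

    Simplification (assumption): each target word is matched as an exact
    lowercase token. A real word-family analysis would also count
    inflected and derived forms under the same headword.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    return {w for w in target_words if counts[w] >= min_encounters}

# Toy data, invented purely for illustration.
sample_text = "the harvest was late " * 6 + "the drought ruined the harvest " * 2
third_thousand_sample = {"harvest", "drought", "bargain"}

met = families_met_often_enough(sample_text, third_thousand_sample)
share = len(met) / len(third_thousand_sample)
print(sorted(met), f"{share:.0%} of the target words met six times or more")
```

Run over a full sub-corpus with a real third-thousand list, the same count gives the proportion of families that a reader of that collection would meet six or more times.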

Critics’ Response

I would have expected a critique of this analysis to focus on the assumption that most of the words in a text need to be known for reading to proceed successfully, given the abiding belief that learners can easily expand their vocabularies by guessing new word meanings from context, as was assumed but never shown in many classic accounts of the reading process. So I was surprised that the critics’ actual problem was with the claim that L2 learners would have trouble reading 175,000 words of fairly difficult natural text in a year or two. Doing the math, McQuillan and Krashen propose that even "reading relatively slowly at a speed of 100 words-per-minute," (p. 106) L2 learners should be able to read 175,000 words in 1,750 minutes or 29.2 hours, which, spread over two years, "amounts to only 2.4 minutes of free reading per day" (p. 106). Such readers would make light work of any of the Brown sub-corpora, or indeed all three of them.
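McQuillan and Krashen's arithmetic is easy to reproduce. The sketch below assumes a two-year period of 730 days; the function name `reading_time` is hypothetical. It shows only what the multiplication yields, not whether a rate of 100 wpm is realistic for text of this difficulty.

```python
def reading_time(total_words, words_per_minute, days):
    """Return (total minutes, total hours, minutes per day) for a reading load."""
    minutes = total_words / words_per_minute
    hours = minutes / 60
    minutes_per_day = minutes / days
    return minutes, hours, minutes_per_day

# 175,000 words at 100 wpm, spread over two years (assumed to be 2 * 365 days).
minutes, hours, per_day = reading_time(175_000, 100, 2 * 365)
print(f"{minutes:.0f} minutes, {hours:.1f} hours, {per_day:.1f} minutes per day")
```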

As a teacher and coordinator of many L2 reading courses and programs, I wondered if we were talking about the same world. In my experience, even strong ESL readers find small amounts of unsimplified text fairly hard going, particularly if the text type is expository rather than narrative. I myself after decades of working in and around French cannot get through Le Monde before the next edition is on the doorstep. Indeed, expository academic or journalistic text has always been the stuff of the intensive reading course, wherein a good deal of scaffolding is provided and no great volume of text actually gets read. Were my critics and I talking about the same kind of learner and the same kind of reading?

The Reading Rate Research

To support their position, McQuillan and Krashen cite a half dozen reading rate studies, conveniently gathered in the literature review of a recent study by Fraser (2007). In this research and Fraser’s own study, L2 reading rates of 100 words-per-minute (wpm) and even somewhat higher appear to be the norm. But a little digging below the numbers raises questions about their applicability to the matter at hand.

The first thing to note about the Fraser (2007) study is that while my critics use its results to establish how much L2 readers can read, the researcher herself interprets the data to show how slow and arduous L2 reading is even for experienced readers, and how it remains so for long periods, even for those living and studying in the L2 culture.

The second point is the nature of the participant groups in Fraser’s study, neither of which greatly resembles a group of learners who are at the point of taking on the third thousand words as a prelude to academic reading. One consists of students who have specialized in English language and literature (English majors in a Chinese university), while the other consists of learners well into their studies in a Canadian university; some of them had lived in Canada for as long as 12 years (2007, p. 378).

Third and most important is the nature of the material Fraser’s subjects read at the cited rates of 135 and 140 wpm. Fraser reports that in terms of grade equivalent level, the two experimental texts were found to be suitable for use with Grade 9 or 10 high school students. Her analysis using Vocabprofile confirms their non-university character; the frequency profiling revealed large proportions of first 2,000 level lexis: 83% in one text and 86.8% in the other (p. 394). These proportions of basic lexis are substantially higher than those consistently found in more typical university-level texts, as represented by, say, the academic section of the Brown corpus. Table 1 shows randomly chosen profiles from segments of this corpus. The mean coverage of the first 2,000 words in the Brown texts is only 78.53%, with a very small standard deviation (3.01%). The differences between the Brown mean and the coverage figures for Fraser's two experimental texts (roughly 5 and 8 percentage points) may seem minor, but from the L2 reader’s perspective, the added load of 5% more non-basic vocabulary means one more ‘hard’ word per two lines of text. An added 8% means one more in almost every line.

Table 1. Lexical Frequency Profiles across Disciplines (coverage percentages)

Columns: Brown Segment | No. of words | 1K + 2K | 1K + 2K + AWL

Rows (labels only; cell values are missing from this version): Social Psychology; Medicine (anatomy)

Notes: (1) Table from Cobb & Horst, 2004. (2) Segments from the Brown corpus are described on the Brown University website, accessible from the Compleat Lexical Tutor. (3) AWL stands for Academic Word List.
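The step from coverage percentages to "hard words per line" can be made explicit. The sketch below assumes a line of about 10 words, which is the line length implied by equating 5% with one word per two lines; the function name and the labels "text A" and "text B" are hypothetical.

```python
WORDS_PER_LINE = 10  # assumption: implied by "5% = one hard word per two lines"

def extra_hard_words_per_line(easy_coverage, baseline_coverage,
                              words_per_line=WORDS_PER_LINE):
    """Extra non-basic words per line when first-2,000 coverage drops
    from easy_coverage to baseline_coverage (both given in percent)."""
    gap_in_points = easy_coverage - baseline_coverage
    return gap_in_points / 100 * words_per_line

brown_mean = 78.53  # mean first-2,000 coverage of the Brown academic segments
fraser_texts = {"text A": 83.0, "text B": 86.8}  # labels are hypothetical

for label, coverage in fraser_texts.items():
    extra = extra_hard_words_per_line(coverage, brown_mean)
    print(f"{label}: gap {coverage - brown_mean:.2f} points, "
          f"about {extra:.2f} extra hard words per line")
```

With these figures, the first gap works out to a little under half a word per line and the second to a little over four-fifths of a word per line.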

To summarize, the experimental materials were easy texts for these learners. As we will see, the other reading rate studies from Fraser (2007) that McQuillan and Krashen cite are similarly inapplicable to the question under discussion. Either the readers were too advanced to be considered typical classroom ESL learners, or the reading materials were much simpler than those specified in my original paper, or both. But before examining these studies more closely, it may be useful to remind ourselves of the kind of learners and texts this discussion is about. At issue are the many ESL and EFL learners worldwide who are trying to move beyond a basic 2,000-word lexicon by reading texts that contain significant amounts of post-2,000 lexis. Such learners and texts are not hypothetical but typical. In the case of learners, Laufer’s (2000) review of seven vocabulary size studies showed that university entry level ESL/EFL learners in three Asian, two Middle Eastern, and three European countries were working with an average of 2,100 known word families (SD 977). In the case of texts, those bearing post-basic vocabulary are common if not the norm in academic and professional contexts; as mentioned, the Brown academic corpus bears an average of 10% (SD 3%), that is, one post-2,000 word, on average, in every line.

As I have already shown, Fraser’s (2007) reading rates are simply not applicable to the learners and texts in question. Nor are the four other main sources of rate evidence reviewed in her study and cited by McQuillan and Krashen. First, Nassaji and Geva’s (1999) study of 60 Farsi-speakers hardly pertains to ESL learners at the beginning of their academic studies. They were in fact "graduate students at a major Canadian university… who had been living in Canada for 3 to 6 years" (p. 246). Their mean comprehension scores on the Nelson-Denny Reading Test, a measure designed for L1 readers, were over 50% (p. 251). The several Taguchi studies cited (e.g., Taguchi, Takayasu-Maass, & Gorsuch, 2004) all involve repeated readings of simplified texts from the Heinemann New Wave series of graded readers, with an upper limit of 2,200 word families (Hill, 1997). Hirai’s (1999) study involves an experimental reading task that measures not reading rate but rather "rauding" rate, following Carver’s (1990) notion that reading rate is best measured using texts that the readers find easy to read. Similarly, Haynes and Carr’s (1990) study used a text chosen specifically because of its familiar topic and the fact that it "contained numerous 'lexical familiarizations'… definitions, examples, stipulations, synonyms, paraphrases, illustrations, etc., which the author had provided to clarify the meaning of new terms introduced in the text" (p. 396). It goes without saying that not all authentic academic texts would be so lexically familiarized.

Thus the L2 reading rate research cited is not applicable to academic reading. Even if it were, simple multiplication of reading rate times hours and days would only tell us what learners might be able to read in principle. This would still have to be confirmed in studies of what some particular learners had read in fact. Therefore I returned to some of the research literature where I thought I remembered specific amounts of L2 reading having been documented. Was any of it (a) of the type we are talking about and (b) as much as McQuillan and Krashen claimed should be possible?

The Amount-of-Reading Research

Not all extensive reading studies produce such useful specifics as numbers of words or pages read, but there are some. The largest amounts of L2 reading on record seem to hail from Japanese contexts. One is Rosszell’s (2007) doctoral study, in which university learners read an average of 40 pages per week. At 300 words per page, that is 12,000 words per week, which over 40 weeks per year amounts to almost half a million words in a year. This is on the order of the rate McQuillan and Krashen hypothesize. However, the texts that these learners were reading were graded readers from Oxford’s Bookworm series, levels 4 to 6 (6 is highest). As mentioned in my original paper, this type of text does not include adequate inputs from the third thousand level and indeed makes no claim to.

An even larger amount of reading was recorded by Beniko Mason (2004). Her study involved 18-year-old English majors at a junior college in Japan reading fully 1,000 pages, or 250,000 words, per semester (although not all participants were able to meet these targets). Extrapolating this reading to four terms or two years, the amount of reading would indeed seem to be about 1 million words. This is the kind of figure McQuillan and Krashen are talking about, and it is equivalent to the size of the entire Brown corpus. But in fact, it was not the Brown corpus or anything resembling it in lexical composition that the learners were reading. Again, all this reading is of graded or simplified texts – much of it at the very elementary 600-word level.
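The two volume estimates just discussed rest on simple multiplication, reproduced here as a check. The function name `words_read` is hypothetical; the inputs are the figures quoted from Rosszell (2007) and Mason (2004).

```python
def words_read(pages_per_week, words_per_page, weeks):
    """Total words read, given a weekly page count and a page length."""
    return pages_per_week * words_per_page * weeks

# Rosszell (2007): 40 pages per week, 300 words per page, 40 weeks per year.
rosszell_per_year = words_read(40, 300, 40)

# Mason (2004): 250,000 words per semester, extrapolated to four terms.
mason_two_years = 250_000 * 4

print(f"Rosszell, one year: {rosszell_per_year:,} words")
print(f"Mason, two years:   {mason_two_years:,} words")
```

Both totals describe graded or simplified text, so they measure volume of reading and not exposure to third-thousand vocabulary.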

Certainly, there is nothing wrong with reading simplified texts! But learners reading large amounts of such text do not make the point that McQuillan and Krashen wish to make.

In fact, reading simplified texts is a very good thing for language learners to do, for many reasons including but not limited to increasing vocabulary (Horst, 2005; Pigada & Schmitt, 2006). Indeed, the second part of my LLT piece described ways of using technology to expand the library of graded materials that are accessible to ESL teachers and learners. Text computing can help us expand the range (to include more expository material) and vocabulary level (to provide a smooth rise up to a vocabulary size of 3,000+ word families) of available graded materials. At present, there is no smooth rise. Rather, "there exists a wide gap between the highest level of graded readers and the vocabulary demands of academic text and unsimplified novels" (Nation, forthcoming, p. 1). Even the best graded reader series, e.g., Oxford’s Bookworm series, make no claims beyond 2,500 words. The Longman Bridge Series (1945) was a systematic grading of materials up to 8,000 words, but it is long out of print. The new Penguin/Longman Active Reading series may claim successor status to Bridge with its 3,000 word-family target, but none of the studies I located had used this series. In other words, the large amounts of reading reported in some of the published research consist of reading of a type that by definition cannot be the route to an adequate functional reading lexicon.

But are there no studies involving the reading of unsimplified texts? Given the number of learners worldwide who are trying to improve their reading ability for advanced study through English, it is surprising how little research addresses this common objective. An exception is work by Parry, who looked at academic ESL learners reading large amounts of unsimplified texts in credit courses at U.S. universities. A preliminary goal of these studies was to estimate how many words the learners had managed to read. In a case study of two learners, one read as few as 7,500 words of an assigned anthropology text over a complete term, while another read as many as 72,000 words from the same textbook (Parry, 1997).

At first glance, the second reader’s rate lines up nicely with McQuillan and Krashen’s estimate. If 72,000 words of academic text can be read in a term, and two years is four terms, and the learner is taking four such courses at a time, then he is reading over one million (72,000 x 4 x 4 = 1,152,000) words of natural academic text in two years. This is the size of the whole Brown corpus and then some, and its lexical composition is probably similar. So can we conclude that some ESL learners can read at the rate McQuillan and Krashen propose, and presumably experience the vocabulary growth that goes with it? Not exactly.

At the start of the experiment, Parry asked her two readers to write down all the words they thought were new or difficult while reading, along with the page number. Then at the end of the academic session the readers were asked to provide the meanings of words they had noted, first out of context and then in the page context where the word was first noticed. Out of context, the 72,000-word reader could remember having seen only 5% of the words he had originally noted, and with the help of the context could provide correct or partly correct meanings (in his L1) for only 28%. The 7,500-word reader, on the other hand, could remember seeing 29% of her words and could give correct or partly correct meanings for 63%. In other words, the fast reader was reading a rather large amount but with very little vocabulary growth, while the slow reader was not getting her reading done but was learning some of the new words in the small amounts she did manage to complete.

These and a number of similar Parry studies are small, but they nonetheless ring true for anyone who has taught academic ESL reading in North America or elsewhere. Learners in such courses typically struggle to get through a few pages, or else read quickly with low comprehension and almost no vocabulary growth.
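The extrapolation from Parry's case study, and the contrast between volume and retention, can be laid side by side. In the sketch below, the four-terms and four-courses multipliers come from the extrapolation above; applying the same multipliers to the slower reader is a hypothetical extension added only for comparison, and the dictionary keys are illustrative names.

```python
TERMS_IN_TWO_YEARS = 4  # two years taken as four terms
COURSES_PER_TERM = 4    # four such courses at a time

def two_year_total(words_per_course_term):
    """Extrapolate one course-term of reading to two years of full study."""
    return words_per_course_term * TERMS_IN_TWO_YEARS * COURSES_PER_TERM

# Figures reported from Parry (1997); key names are illustrative.
readers = {
    "faster reader": {"words": 72_000, "recalled_seeing": 0.05,
                      "meaning_in_context": 0.28},
    "slower reader": {"words": 7_500, "recalled_seeing": 0.29,
                      "meaning_in_context": 0.63},
}

for name, r in readers.items():
    print(f"{name}: {two_year_total(r['words']):,} words over two years, "
          f"{r['recalled_seeing']:.0%} of noted words recalled out of context, "
          f"{r['meaning_in_context']:.0%} given a meaning in context")
```

Read together, the figures show the trade-off described above: the reader with the larger projected volume retained the smaller share of the words he had noted.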

Conclusion: Not Reading Alone

Parry’s case studies suggest that most academic ESL learners cannot read their way to an adequate functional second lexicon. The scale of her research is not big enough to be conclusive, but there is no evidence I know of to contradict it. Also, Parry’s findings are congruent with some other well-established evidence. Replicated research shows that reading becomes arduous and comprehension suffers when unknown word densities exceed 5% (e.g., Laufer, 1989). Nation (2006) sets the ideal criterion as low as 2%. But academic readers with knowledge of about 2,000 words are reading texts bearing at least 10% unknown items, and as their ESL teachers can attest, their pain is all too real.
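The density criteria mentioned here can be checked mechanically for any text and any list of known words. The sketch below is illustrative only: the function name `unknown_density`, the sample sentence, and the known-word set are hypothetical, and tokens are compared as exact lowercase forms rather than as word families.

```python
import re

LAUFER_THRESHOLD = 0.05  # 5% unknown words (e.g., Laufer, 1989)
NATION_THRESHOLD = 0.02  # 2% unknown words (Nation, 2006)

def unknown_density(text, known_words):
    """Proportion of running words in text that are not in known_words.

    Simplification (assumption): exact lowercase token matching, with no
    grouping of inflected or derived forms into word families.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    unknown = sum(1 for token in tokens if token not in known_words)
    return unknown / len(tokens)

# Toy example, invented for illustration: ten running words, one unknown.
sentence = "The cat sat on the mat near the old aqueduct"
known = {"the", "cat", "sat", "on", "mat", "near", "old"}

density = unknown_density(sentence, known)
print(f"unknown density: {density:.0%}")
print("exceeds 5% criterion:", density > LAUFER_THRESHOLD)
print("exceeds 2% criterion:", density > NATION_THRESHOLD)
```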

If McQuillan and Krashen have relevant counter-evidence to any of this, I welcome it. Then the discussion can proceed on a different basis. Until then, the adequacy of free reading is an idea with high credibility in the time frame of L1 acquisition, and some credibility in an extended time frame of L2 acquisition under conditions of exceptional motivation. But carried into the typical time frame of instructed L2 acquisition, it is an idea that grossly misrepresents the problems faced by L2 readers who need to read to learn in their second languages. For these learners, an adequate second lexicon will not happen by itself; it will be provisioned through well-designed instruction including but not limited to reading.


About the Author

Tom Cobb began his career in English literature but soon moved toward language study and instruction. He has taught and coordinated reading and writing courses at a wide range of levels in the United Kingdom, Saudi Arabia, Oman, Hong Kong, Australia, New Zealand, Japan, Mexico – and, of course, Canada. He currently trains TESL trainees in the uses of computing in language learning at a French-language university in Montreal.


References

Carver, R. (1990). Reading rate: A review of research and theory. San Diego, CA: Academic Press.

Cobb, T. & Horst, M. (2004). Is there room for an AWL in French? In P. Bogaards & B. Laufer (Eds.), Vocabulary in a second language: Selection, acquisition, and testing (pp. 15-38). Amsterdam: John Benjamins.

Fraser, C. (2007). Reading rate in L1 Mandarin Chinese and L2 English across five reading tasks. The Modern Language Journal, 91(3), 372-394.

Haynes, M., & Carr, T.H. (1990). Writing system background and second language reading: A component skills analysis of English reading by native speakers of Chinese. In T.H. Carr & B.A. Levy (Eds.), Reading and its development: Component skills approaches (pp. 375-418). San Diego, CA: Academic Press.

Hill, D. (1997). Graded (basal) readers -- Choosing the best. The Language Teacher Online, 21(5). Retrieved January 12, 2008, from

Hirai, A. (1999). The relationship between listening and reading rates of Japanese EFL learners. The Modern Language Journal, 83(3), 367-384.

Horst, M. (2005). Learning L2 vocabulary through extensive reading: A measurement study. Canadian Modern Language Review, 61(3), 355-382.

Laufer, B. (1989). What percentage of text-lexis is essential for comprehension? In C. Lauren & M. Nordman (Eds.), Special language: From humans thinking to thinking machines (pp. 316-323). Clevedon, UK: Multilingual Matters.

Laufer, B. (2000). Task effect on instructed vocabulary learning: The hypothesis of 'involvement.' Selected Papers from AILA ’99 Tokyo (pp. 47-62). Tokyo: Waseda University Press.

Mason, B. (2004). The effect of adding supplementary writing to an extensive reading program. International Journal of Foreign Language Teaching, 1(1), 2-16.

McQuillan, J. (1998). Literacy crisis: False claims, real solutions. Portsmouth, NH: Heinemann.

Nagy, W.E., & Anderson, R.C. (1984). How many words are there in printed school English? Reading Research Quarterly, 19(3), 304-330.

Nassaji, H., & Geva, E. (1999). The contribution of phonological and orthographic processing skills to adult ESL reading: Evidence from native speakers of Farsi. Applied Psycholinguistics, 20(2), 241-267.

Nation, P. (forthcoming). New roles for L2 vocabulary? In L. Wei & V. Cook (Eds.), Language teaching and learning. Contemporary Applied Linguistics Series. London: Continuum International.

Nation, P. (2006). How large a vocabulary is needed for reading and listening? In M. Horst and T. Cobb (Eds.), [Special Issue on Second Language Vocabulary Acquisition]. Canadian Modern Language Review, 63(1), 59-81.

Parry, K. (1997). Vocabulary and comprehension: Two portraits. In J. Coady & T. Huckin (Eds.), Second language vocabulary acquisition (pp. 55-68). Cambridge, UK: Cambridge University Press.

Pigada, M., & Schmitt, N. (2006). Vocabulary acquisition from extensive reading: A case study. Reading in a Foreign Language, 18(1), 1-28.

Rosszell, R. (2007). Extensive reading and intensive vocabulary study in a Japanese university. Unpublished doctoral dissertation, Temple University, Japan.

Taguchi, E., Takayasu-Maass, M., & Gorsuch, G. (2004). Developing reading fluency in EFL: How assisted repeated reading and extensive reading affect fluency development. Reading in a Foreign Language, 16(2), 70-96.