Monday, April 9, 2012

Bye Bye AWL, Hello COCA academic lexis...

Just got an update from Mark Davies on new developments at COCA via the user list server.  Some profound changes in the way we view academic English are in the air.

In fact, the approach Davies & Co. have taken in defining 'academic lexis' makes a good deal more sense than the decades-old approach embodied by Coxhead's research into the AWL, which Tom Cobb pointed out recently is basically an artifact of the GSL.  Billuroğlu and yours truly, in fact, picked up on this back in 2001 when we embarked on the 'Bare Naked Lexis' project.  However, Davies' new approach is much more dynamic than what has gone before, and it is not tied to any preconceived notions of what academic vocabulary is, or to the arbitrary categorization of words into families, as in Nation's work on 'familizing' the first 20,000 word families in the British National Corpus.

The new interface for academic English looks great, as it will let you input entire articles (no 1,000-word limit) and it will produce customized word lists based on the analysis.  Pretty amazing. How long will this remain free is my question... I hope forever!

Here is a copy of the message I just got, FYI.


If you are interested in academic English – for teaching or learning – there are two new, free corpus-based resources that might be of interest to you. These are based on the 110 million words of academic texts in the Corpus of Contemporary American English [COCA] (85 million words in academic journals and 25 million words in more academically-oriented magazine articles).

1. The new site contains free COCA-based academic wordlists. There are important differences between these lists and the Academic Word List created by Coxhead (2000), and we believe that these new lists provide better coverage of academic English and that they have a format that better enhances learning and teaching. The three sets of word lists, which have been created in conjunction with Prof. Dee Gardner of BYU, are:

-- Word families (SAMPLE): The top 1000 word families of academic English (with nearly 3000 words total). Unlike the traditional Academic Word List, ours contain separate entries for different parts of speech, so you know, for example, whether abstract is used more as a noun, verb, or adjective. The words are also color-coded to let you know whether the word is a "general" academic word, or whether it is a more "technical" one that occurs in just a few sub-genres. And most importantly, the entries are listed in order of frequency, to help you focus more on words that you will actually see in the real world -- rather than just having a mass of unorganized words in each word family.

-- General “core” academic English (SAMPLE): The top 3500 words (lemmas) in COCA Academic (listed individually, rather than by word family)

-- Technical / sub-genre lists (SAMPLE): The top 1000 words in each of the nine academic sub-genres (Business, Law, Medicine, Science, Humanities, etc)

2. We have created a new interface for just academic English. It has the same features as the general WordAndPhrase site, but all of the data is based strictly on the 110 million words of academic English in COCA.

-- Frequency listing: Browse through these lists (including word families) to see detailed information (all on one screen, with extensive links between resources): definition, frequency by academic sub-genre (e.g. Medicine, Business, Humanities), synonyms, and collocates and concordance lines (based just on academic English).

-- Input texts: As with the general interface, you can input an entire text (such as a journal article, or an academic paper that you have written) and it will give you detailed information about the words and phrases in the text. You can download word lists based on your text, and you can click on phrases in your text to see related phrases from COCA.

(By the way, if you previously had trouble accessing WordAndPhrase with your account, please try again; we’ve fixed a few bugs.)

I hope that these two new corpus-based resources on academic English will be of interest and value to you for teaching, learning, and research.


Mark Davies
Brigham Young University