Cambridge English Language Teaching  
Cambridge International Corpus

The Cambridge International Corpus (CIC) is a very large collection of English texts, stored in a computerised database, which can be searched to see how English is used. It has been built up by Cambridge University Press over the last ten years to help in writing books for learners of English. The English in the CIC comes from newspapers, best-selling novels, non-fiction books on a wide range of topics, websites, magazines, junk mail, TV and radio programmes, recordings of people's everyday conversations and many other sources. 

The CIC also includes the Cambridge Learner Corpus (CLC), a large collection of writing by learners of English.

The Cambridge International Corpus includes the following corpora:

Cambridge and Nottingham Corpus of Discourse in English (CANCODE)
Recordings of spoken English across the UK
5 million words
Cambridge and Nottingham Spoken Business English (CANBEC)
Unique recordings of business language in commercial companies
1 million words
Cambridge Cornell Corpus of Spoken North American English
Recordings of spoken English across North America
0.5 million words
Cambridge Corpus of Business English
Business reports and documents from the UK and US
175 million words
Cambridge Corpus of Legal English
Law-related books and articles from the UK and US
20 million words
Cambridge Corpus of Financial English
Books and articles relating to economics and finance from the UK and US
55 million words
Cambridge Corpus of Academic English
Text from academic books and journals from the UK and US
30 million words
 
Cambridge Learner Corpus
Exam scripts written by students taking Cambridge ESOL exams
35 million words
