Contents
1 Introduction: Statistics Meets Corpus Linguistics 1
2 Vocabulary: Frequency, Dispersion and Diversity 38
3 Semantics and Discourse: Collocations, Keywords and Reliability of Manual Coding 66
4 Lexico-grammar: From Simple Counts to Complex Models 102
5 Register Variation: Correlation, Clusters and Factors 139
6 Sociolinguistics and Stylistics: Individual and Social Variation 183
7 Change over Time: Working with Diachronic Data 219
8 Bringing Everything Together: Ten Principles of Statistical Thinking, Meta-analysis and Effect Sizes 257
Final Remarks 283
References 285
Index 294