Introductory Econometrics

Chapter 2: Correlation

This chapter begins our study of data that contain more than one variable.
We will see how the correlation coefficient and the scatter plot can be used
to describe bivariate data.

You will learn the meaning and usefulness of the correlation coefficient, but, just as important, you will learn that there are times when it is a poor summary and should not be used. There is no such thing as a perfect summary measure of data. In addition, we emphasize that correlation indicates only the strength of linear association between two variables and should never be used, by itself, to infer causation. It is tempting to suppose that a high correlation implies some kind of causal connection, but the correlation alone does not justify that conclusion.
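To make the "linear association" point concrete, here is a minimal sketch in Python (not one of the chapter's Excel workbooks). The helper name pearson_r and the small data set are illustrative assumptions. It computes the correlation coefficient for an exactly linear relationship and for an exactly quadratic one over the same x values.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient of two equal-length sequences (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Illustrative (made-up) data: x values symmetric around zero
x = [-3, -2, -1, 0, 1, 2, 3]
y_linear = [2 * v + 1 for v in x]   # y depends on x exactly, along a line
y_curved = [v ** 2 for v in x]      # y depends on x exactly, along a parabola

r_linear = pearson_r(x, y_linear)
r_curved = pearson_r(x, y_curved)

print("r for the linear relationship:   ", round(r_linear, 4))
print("r for the quadratic relationship:", round(r_curved, 4))
```

In both cases y is completely determined by x, yet only the linear case produces a correlation of 1. For the symmetric parabola the coefficient is 0, which is one example of a situation where the correlation coefficient is a poor summary of the data.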

Although much of this material may be familiar to students of statistics, we conclude the chapter with a discussion of ecological correlation, a topic often omitted from introductory statistics courses. We show that the correlation coefficient computed from individual-level data may differ markedly from the one computed from grouped data. In economics, this is called the aggregation problem, and it merits attention.
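The gap between individual-level and grouped correlations can be sketched with a small made-up data set. The following Python example is an illustration only: the three groups, their values, and the helper pearson_r are hypothetical and are not taken from the chapter's data. It compares the correlation across all individuals with the correlation across the group averages.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient of two equal-length sequences (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: three groups of four individuals, each an (x, y) pair.
# Within each group, x and y are unrelated; only the group averages line up.
groups = {
    "A": [(1, 0), (1, 4), (3, 0), (3, 4)],
    "B": [(4, 2), (4, 6), (6, 2), (6, 6)],
    "C": [(7, 4), (7, 8), (9, 4), (9, 8)],
}

# Individual-level correlation: pool every observation
all_points = [p for members in groups.values() for p in members]
r_individual = pearson_r([p[0] for p in all_points], [p[1] for p in all_points])

# Ecological (grouped) correlation: one averaged point per group
mean_x = [sum(p[0] for p in m) / len(m) for m in groups.values()]
mean_y = [sum(p[1] for p in m) / len(m) for m in groups.values()]
r_grouped = pearson_r(mean_x, mean_y)

print("individual-level r:", round(r_individual, 3))
print("grouped r:         ", round(r_grouped, 3))
```

With these numbers the grouped correlation is 1, because the three group averages fall exactly on a line, while the individual-level correlation is roughly 0.59. Averaging removes the within-group scatter, so in this example the grouped figure overstates how tightly the two variables move together for individuals.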

Excel Workbooks