### Chapter 10: **Review of Inferential Statistics**

The goal of statistical inference is to use sample data to estimate a *parameter*
(a numerical characteristic of the population) or determine whether to believe a claim
that has been made about the population. We never actually observe the parameter
we are interested in; instead we use an estimate of the parameter based on
data from a sample. The sample estimate is almost always different from the
claimed value of the parameter. There are then two possibilities: the difference
(between the estimate and the claim) may be real or it may be due to chance.
Thus, the fundamental question of statistical inference becomes, Is the difference
real or due to chance?

To answer the fundamental question, we require a model for the *data generation
process*, or DGP. The DGP describes how each observation in the data set
was produced. It usually contains a description of the chance process at work.
Given a DGP and certain parameter values, we can calculate the probability
of observing particular ranges of outcomes.
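As a minimal sketch of this idea, suppose the DGP is a sequence of independent coin flips and the parameter is the probability of heads. The function name `prob_heads_between` and the numbers used (100 flips, a fair coin, the range 40 to 60) are our own illustrative assumptions, not values taken from the chapter.

```python
from math import comb


def prob_heads_between(n_flips, p_heads, low, high):
    """Probability that the number of heads in n_flips independent flips
    falls in the range low..high (inclusive), assuming each flip lands
    heads with probability p_heads."""
    return sum(
        comb(n_flips, k) * p_heads**k * (1 - p_heads) ** (n_flips - k)
        for k in range(low, high + 1)
    )


# Hypothetical example: 100 flips of a fair coin (p_heads = 0.5).
p_40_to_60 = prob_heads_between(100, 0.5, 40, 60)
print(f"P(40 <= heads <= 60) = {p_40_to_60:.4f}")
```

Changing `p_heads` changes the probability attached to the same range of outcomes, which is why a claim about the parameter can be checked against what the sample actually shows.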

In this chapter, we try to clarify these complicated issues by reviewing basic
concepts of inference from introductory statistics. Our approach is somewhat
unusual in that we downplay the mathematical formalism and instead emphasize
the logic of statistical inference. We borrow the extremely useful metaphor
of a box model from Freedman, Pisani, and Purves (1998). The box model is
a way of concretely representing a random variable. In this chapter, we will
distinguish between two basic types of box models – we call them coin-flip
and polling box models. Though these models differ in important respects,
it turns out that we can answer the fundamental question of inference in the
same way with both models.
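The following sketch shows one way to represent a box of tickets in code. It rests on assumptions of ours that the later sections may refine: the coin-flip box is taken to hold one 0 ticket and one 1 ticket drawn with replacement, and the polling box is taken to hold one ticket per member of a finite population, sampled without replacement. The box contents (520 ones and 480 zeros) and the helper names are hypothetical.

```python
import random


def draw_with_replacement(box, n_draws, rng):
    """Draw n_draws tickets, returning each ticket to the box before the next draw."""
    return [rng.choice(box) for _ in range(n_draws)]


def draw_without_replacement(box, n_draws, rng):
    """Draw n_draws distinct tickets from the box, as in a simple random sample."""
    return rng.sample(box, n_draws)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Assumed coin-flip box: one ticket for tails (0), one for heads (1).
coin_box = [0, 1]
flips = draw_with_replacement(coin_box, 10, rng)

# Assumed polling box: a hypothetical population of 1,000 voters,
# 520 of whom (ticket value 1) favor a candidate.
population_box = [1] * 520 + [0] * 480
poll = draw_without_replacement(population_box, 100, rng)

print("Fraction of heads in 10 flips:", sum(flips) / len(flips))
print("Fraction in favor in a poll of 100:", sum(poll) / len(poll))
```

In both cases the sample fraction is a random variable: rerunning with a different seed gives a different value, even though the contents of the box have not changed.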

In subsequent chapters, we will develop additional box models that are designed
to handle the more complicated situations arising when one examines data from
observational studies. We will, however, be able to use the basic strategy
outlined in this chapter to answer the question of whether the difference
is real or due to chance.

The next section introduces the box model as a metaphor for handling chance
processes. Sections 10.3 and 10.4 introduce the two fundamental box models
and demonstrate how they work. We then present a review of hypothesis testing
and follow up with the concept of a consistent estimator. Finally, we explain
the algebra of expectations – a set of rules that are useful for computing
the expected value and standard deviation of random variables.

We will call on the box model metaphor over and over again throughout the rest of this book. We will almost always employ Monte Carlo analysis to demonstrate properties of the various box models. On occasion, we will make use of results from the algebra of expectations to provide an alternative, more rigorous derivation of these properties.
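To illustrate how the two approaches complement each other, here is a minimal sketch that compares a Monte Carlo approximation with the standard result that the average of n independent draws with replacement has expected value equal to the box average and standard error equal to the box SD divided by the square root of n. The box contents, the number of draws, and the number of repetitions are hypothetical choices of ours, and the code is written in Python rather than the Excel workbooks that accompany this chapter.

```python
import random
from math import sqrt
from statistics import mean, pstdev

box = [1, 2, 3, 4, 5, 6]  # hypothetical box of tickets
n_draws = 25              # draws per sample, with replacement
n_reps = 10_000           # number of Monte Carlo repetitions

# Analytical values for the sample average of n_draws independent draws.
expected_avg = mean(box)
se_avg = pstdev(box) / sqrt(n_draws)  # pstdev treats the box as the whole population

# Monte Carlo: repeat the sampling many times and record each sample average.
rng = random.Random(12345)  # fixed seed so the sketch is reproducible
sample_averages = [
    sum(rng.choice(box) for _ in range(n_draws)) / n_draws
    for _ in range(n_reps)
]

print(f"Analytical:  EV = {expected_avg:.3f}, SE = {se_avg:.3f}")
print(f"Monte Carlo: mean = {mean(sample_averages):.3f}, "
      f"SD = {pstdev(sample_averages):.3f}")
```

The Monte Carlo figures only approximate the analytical ones, and the approximation generally tightens as `n_reps` grows.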

Although the experienced statistics student may wish to skip this review chapter,
we recommend a quick perusal of the material if only to ensure that the box
model metaphor makes sense. Of course, every student can benefit from a detailed
review to sharpen the crucial skills and concepts learned in an introductory
statistics course.

**Excel Workbooks**

AlgebraofExpectations.xls

BoxModel.xls

Consistency.xls

PresidentialHeights.xls