Chapter 23: Bootstrap
Throughout this book, we have used Monte Carlo simulations to demonstrate statistical properties of estimators. We have simulated data generation processes on the computer and then directly examined the results.
This chapter explains how computer-intensive simulation techniques can be applied to a single sample to estimate a statistic’s sampling distribution. These increasingly popular procedures are known as bootstrap methods. They can be used to corroborate results based on standard theory or provide answers when conventional methods are known to fail.
When you “pull yourself up by your bootstraps,” you succeed – on your own – despite limited resources. This idiom is derived from The Surprising Adventures of Baron Munchausen by Rudolph Erich Raspe. The baron tells a series of tall tales about his travels, including various impossible feats and daring escapes. Bradley Efron chose “the bootstrap” to describe a particular resampling scheme he was working on because “the use of the term bootstrap derives from the phrase to pull oneself up by one’s own bootstrap . . . (The Baron had fallen to the bottom of a deep lake. Just when it looked like all was lost, he thought to pick himself up by his own bootstraps.)” [Efron and Tibshirani (1993), p. 5].
In statistics and econometrics, bootstrapping has come to mean resampling repeatedly and randomly from an original sample and using each bootstrapped sample to compute a statistic. The resulting empirical distribution of the statistic is then examined and interpreted as an approximation to the true sampling distribution.
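The resampling idea can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used later in the chapter: it assumes the most common scheme, in which each bootstrapped sample is drawn with replacement and has the same size as the original sample. The function name bootstrap_statistic, the data values, and the number of repetitions are all hypothetical.

```python
import random
import statistics

def bootstrap_statistic(sample, statistic, n_resamples=1000, seed=0):
    """Compute `statistic` on each of `n_resamples` bootstrapped samples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    estimates = []
    for _ in range(n_resamples):
        # Assumption: draw n observations from the original sample WITH replacement.
        resample = rng.choices(sample, k=n)
        estimates.append(statistic(resample))
    return estimates

# Hypothetical original sample, for illustration only.
original_sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]

# The empirical distribution of the bootstrapped means approximates
# the sampling distribution of the sample mean.
boot_means = bootstrap_statistic(original_sample, statistics.mean)
boot_se = statistics.stdev(boot_means)

print(f"sample mean: {statistics.mean(original_sample):.3f}")
print(f"bootstrap SE of the mean: {boot_se:.3f}")
```

The spread of boot_means serves as an estimate of the standard error of the sample mean. Any other statistic (a median, a slope, a ratio) could be passed in place of statistics.mean.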
The tie between the bootstrap and Monte Carlo simulation of a statistic is clear: both are based on repeated sampling followed by direct examination of the results. A big difference between the methods, however, is that bootstrapping treats the original sample as the population from which to resample, whereas Monte Carlo simulation is based on setting up a data generation process (with known values of the parameters). Where Monte Carlo is used to test-drive estimators, bootstrap methods can be used to estimate the variability of a statistic and the shape of its sampling distribution.
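The contrast between the two methods can be made concrete with a small Python sketch. It assumes a hypothetical free-throw shooter whose true probability of making a shot (P_MAKE) is known to the simulation, and it assumes the bootstrap draws with replacement from a single observed sample. All names and parameter values are illustrative.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed so the sketch is reproducible
P_MAKE = 0.7              # hypothetical true parameter, known only to the DGP
N_SHOTS = 100             # hypothetical number of free throws per sample
N_REPS = 2000             # hypothetical number of repetitions

def draw_from_dgp(n):
    """Simulate n free throws (1 = made, 0 = missed) from the known DGP."""
    return [1 if rng.random() < P_MAKE else 0 for _ in range(n)]

# Monte Carlo: every repetition draws a fresh sample from the DGP.
mc_pcts = [statistics.mean(draw_from_dgp(N_SHOTS)) for _ in range(N_REPS)]

# Bootstrap: one observed sample stands in for the population, and every
# repetition resamples from it (with replacement, by assumption).
observed = draw_from_dgp(N_SHOTS)
boot_pcts = [statistics.mean(rng.choices(observed, k=N_SHOTS))
             for _ in range(N_REPS)]

print(f"Monte Carlo SD of proportion made: {statistics.stdev(mc_pcts):.4f}")
print(f"Bootstrap SD of proportion made:   {statistics.stdev(boot_pcts):.4f}")
```

The Monte Carlo results are centered on the true parameter because the simulation knows it. The bootstrap results are centered on the proportion made in the one observed sample, which is all the bootstrap has to work with.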
There are many types of bootstrapping because there are many ways to resample, and there are a variety of ways to use the bootstrapped samples. The next section introduces the bootstrap by returning to the free-throw shooting example used to explain Monte Carlo simulation. We then apply the bootstrap with regression analysis, using data presented by Ronald Fisher. Section 23.4 demonstrates how the Bootstrap Excel add-in can be used on your own data to obtain bootstrapped SEs. We conclude our introduction to bootstrapping by exploring how the bootstrap can be applied to get a measure of the variability of the R² statistic.