2021-10-26

Here we describe the current version of the **Bootstrap** plugin.

- Choose **bootstrap** from the **Plugins** menu in CODAP.
- Prepare your dataset for bootstrapping:
  - Make a *measure* (a new attribute with a formula) that represents the quantity you're estimating.
  - Drag it left in the table (or up in the case card) so that it's at a higher level in the hierarchy.
- If there is anything wrong with the way you have prepared the table, you'll get a message with help.
- To specify the dataset, *drag any attribute into the **bootstrap** plugin*.
- Adjust the number and then click the buttons to create as many "bootstraps" as you wish, up to 1000. (Usually, 400 is plenty.)
- A table of measures from the bootstrapping appears. Analyze these to see how much your measure might vary in a wider population.

You can try all this yourself in this sample document.

The point of bootstrapping is to create a *sampling distribution* of some *measure*.
In our example, we want to find the mean height of 13-year-olds.
We can compute the mean height of our *sample* (that's 161.02 cm)
but that's just our sample.
What's a reasonable interval where we expect the true mean of the whole population to fall?

To figure that out, we'll sample our 59 cases *with replacement*;
that is, some kids may get picked more than once and others not at all.
Why do this odd sampling?
The idea is that the only thing we know about the distribution of heights is our sample's distribution.
So we will sample from that distribution over and over, computing the mean height every time.
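
Outside CODAP, the resampling idea above can be sketched in a few lines of Python. This is only an illustration of sampling with replacement, not the plugin's own code. The function name `bootstrap_means` and the short `heights` list are hypothetical; the list is made-up data, not the 59-case sample described here.

```python
import random
from statistics import mean

def bootstrap_means(sample, n_bootstraps=400, seed=None):
    """Return one mean per bootstrap resample of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    # Each resample draws n values WITH replacement, so any given value
    # may show up several times or not at all.
    return [mean(rng.choices(sample, k=n)) for _ in range(n_bootstraps)]

# Hypothetical heights in cm, invented for illustration only.
heights = [150.5, 158.0, 161.2, 163.4, 155.9, 170.1, 166.3, 159.8]
means = bootstrap_means(heights, n_bootstraps=100, seed=1)
```

Each entry in `means` plays the role of one row in the plugin's measures table.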

When we've sampled a lot of times, we plot the mean heights from the *measures* table:

That graph shows the *sampling distribution* of these bootstrapped means.
With that, we look to see what range of mean heights the middle 90% contains.
We took 100 bootstrap samples and then used movable values to separate off the top five and the bottom five cases.
The remaining range (159 to 162.56 cm) is a reasonable interval;
we can be pretty confident that the true mean of the entire population lies within it.
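
The "cut off the tails" step can also be written as a small function. The sketch below is one way to do it, assuming we simply drop an equal number of cases from each end of the sorted measures; the name `percentile_interval` and the stand-in values are hypothetical, and the plugin itself leaves this step to you and CODAP's movable values.

```python
def percentile_interval(values, coverage=0.90):
    """Return (low, high) enclosing the middle `coverage` share of `values`."""
    ordered = sorted(values)
    n = len(ordered)
    # Cases to drop from each tail, e.g. 5 of 100 for 90% coverage.
    cut = round(n * (1 - coverage) / 2)
    if 2 * cut >= n:
        raise ValueError("not enough values for this coverage")
    return ordered[cut], ordered[n - cut - 1]

# Stand-in values 1..100 (placeholders, not real bootstrap means).
stand_in = list(range(1, 101))
print(percentile_interval(stand_in, 0.90))  # (6, 95)
```

With 100 values and 90% coverage, five cases fall outside the interval on each side, matching the procedure described above.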

Need to be more confident? Enclose more of the data, like 95%. Of course, to do that, you will need to make your interval wider.

This is very important, and bears highlighting:

**You must create a measure of the quantity you're investigating.
For the Bootstrap plugin to work, it needs a column with a formula, and it has to be over on the left.**

Beyond that, you can use this technique on any quantity at all.
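
To illustrate "any quantity at all," here is a hedged Python sketch in which the measure is passed in as a function, loosely analogous to the formula column the plugin requires. The name `bootstrap_measure` and the `heights` data are hypothetical and invented for this example.

```python
import random
from statistics import median

def bootstrap_measure(sample, measure, n_bootstraps=400, seed=None):
    """Apply `measure` to each of `n_bootstraps` resamples of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement, then compute the chosen measure each time.
    return [measure(rng.choices(sample, k=n)) for _ in range(n_bootstraps)]

# Hypothetical heights in cm, invented for illustration only.
heights = [150.5, 158.0, 161.2, 163.4, 155.9, 170.1, 166.3, 159.8]

# The measure can be any function of a sample: here, the median and the range.
medians = bootstrap_measure(heights, median, n_bootstraps=200, seed=2)
ranges = bootstrap_measure(heights, lambda xs: max(xs) - min(xs),
                           n_bootstraps=200, seed=2)
```

Swapping in a different function changes the quantity being bootstrapped without changing the resampling itself.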

Designed and written by Tim Erickson, Senior Scientist, Epistemological Engineering. Thanks to Bill Finzer and the whole CODAP team at Concord. Visit codap.xyz to see what else might be coming this way.