Testimate Guide

Emitting results

You’ve learned how to do a statistical test, but that doesn’t necessarily help you understand what the test means, or how the many numbers in the testimate display behave.

That’s where the very bottom part of testimate comes in.

Let’s look at the confidence interval for the mean of pulse in our data. The live demo below shows a test of mean pulse against a hypothesized value of 70. We’re going to focus on the confidence interval, which covers the range [70.35, 72.36].

  • Down at the bottom of testimate, press the emit these results button.

A new table appears, called tests and estimates. It contains most of the numbers that appear in the display. In particular, way over on the right, you can find the confidence level (95%) and the minimum and maximum of the CI. (If you like, use choosy to hide the other attributes so that these are easier to get to!)

The table lets us remember our results so we can compare if we change how we configure the test. We’re going to change the confidence level and see what that does to the CI.

  • In the blue “configuration” strip, change the confidence level to 96%. The CI changes.
  • Press emit these results again. You get a new case in the new tests and estimates table.
  • These are now data in a CODAP table! Make a graph with conf on the horizontal axis and CImin on the vertical.
  • Add additional data with confidence levels ranging from, oh, say 75% to 99%. The graph updates. See if you can understand why it goes the direction it does.
  • Now drag CImax over the graph and drop it on the plus-sign above the vertical axis (so you’re plotting both CImin and CImax on the same axis). You will see the confidence-interval “trumpet” that shows how the CI expands as you increase the confidence level.
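
If you want to see where the trumpet shape comes from, here is a minimal Python sketch. It is not testimate’s own calculation. It starts from the 95% interval quoted above, [70.35, 72.36], backs out a center and a standard error under a normal approximation (an assumption; a t-based interval will differ slightly), and recomputes the interval at other confidence levels. The name ci_at is made up for this illustration.

```python
from statistics import NormalDist

# The 95% interval quoted in the guide.
CI_MIN_95, CI_MAX_95 = 70.35, 72.36

center = (CI_MIN_95 + CI_MAX_95) / 2
half_width_95 = (CI_MAX_95 - CI_MIN_95) / 2

# Assumption: normal approximation, so half-width = z * standard error.
z95 = NormalDist().inv_cdf(0.975)
std_err = half_width_95 / z95


def ci_at(conf):
    """Return (CImin, CImax) for a confidence level given as a fraction."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    return center - z * std_err, center + z * std_err


for pct in (75, 80, 85, 90, 95, 96, 99):
    ci_min, ci_max = ci_at(pct / 100)
    print(f"{pct}%  CImin = {ci_min:.2f}  CImax = {ci_max:.2f}")
```

As the confidence level rises, CImin falls and CImax rises, because a wider interval is needed to be more confident of capturing the mean.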

This kind of activity opens up many interesting possibilities. We can create data out of statistical calculations, and use that data to explore how the tests and estimates behave. Here are a couple of examples:

Automating the process: Making the null true

In a fresh CODAP document, we made a table of 30 values of X using the formula randomNormal(). That is, we sampled from a standard normal distribution, which of course has a zero mean. Then we did a t-test to see if the mean was different from zero.

Since we used a random-number formula, testimate gives us additional controls at the bottom:

So we re-randomized 1000 times, with testimate recording the test results. You try it (first click the rerandomizing radio button):

What do you think the distribution of P-values looked like? We made the graph, used the ruler menu to add a movable value, set it to 0.05, and then used the ruler again to show counts:

It’s flat! And of course in this graph we can see that we got 44 type I errors, which is pretty darned close to the 50 we might expect.
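
The same experiment can be sketched outside CODAP. The Python below is only an illustration under stated assumptions, not how testimate computes anything: it draws 30 standard normal values 1000 times, runs a one-sample t-test against zero each time, and tallies the P-values. The helper t_two_sided_p is hypothetical; it gets the P-value by integrating the Student t density numerically (Simpson’s rule) so that no extra libraries are needed. The seed is arbitrary, so your count of Type I errors will not match the 44 above.

```python
import math
import random


def t_two_sided_p(t, df, steps=400):
    """Two-sided P-value for Student's t, via Simpson's rule on the density."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    area = pdf(0) + pdf(b)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3  # integral of the density from 0 to |t|
    return max(0.0, 1 - 2 * area)


def one_test(rng, n=30):
    """Sample n standard normal values and test whether the mean differs from 0."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    t = mean / (sd / math.sqrt(n))
    return t_two_sided_p(t, n - 1)


rng = random.Random(1)  # arbitrary seed, for repeatability only
p_values = [one_test(rng) for _ in range(1000)]

# Count P-values in each tenth of [0, 1]; a flat distribution gives about 100 each.
bins = [0] * 10
for p in p_values:
    bins[min(int(p * 10), 9)] += 1

type_one = sum(p < 0.05 for p in p_values)
print("counts per tenth:", bins)
print("P < 0.05 (Type I errors):", type_one)
```

Because the null hypothesis is true here, every P-value is equally likely, which is why about 5% of them land below 0.05.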

Or how about this: below you can see two graphs. On the left, the P-value as a function of the sample means. On the right, the P-value as a function of \(t\). Explain the similarities and differences between these graphs…

You can imagine endless explorations like this one.

Emitting one test per group

One more thing: if your dataset is already grouped, that is, you’ve dragged an attribute to the left in order to look separately at the data from each value of that attribute, testimate will perform the test separately on each group.

Once again we open the data from NHANES.

  • Drag age to the left to make 3 groups: 16, 17, and 18. (We have done this already in the live demo.)
  • Set up for a test (let’s test pulse ≠ 70):
    • Drag pulse to the left box if it isn’t already there.
    • Enter 70 into the configuration.
  • Click each subgroup at the bottom (as shown at right).
  • Click the emit results from 3 subgroups button.

Now there are three cases in the tests and estimates collection. The key attribute age is at the far right of the table, so you can use it in your graphs.

Notice that, while the P-value for the whole dataset is 0.008, the value for age = 16 is 0.68: 70 is a completely plausible value for the mean of pulse.

And if you plot the CImin and CImax for the three groups, you see this:

You can see how comparing test results from group to group might be useful.

And if you know how to use simmer, you can use it to create grouped data and use that to perform tests on data that you have designed yourself.

For example, you could create random normal data from multiple populations with slightly different means, and then test each mean, seeing how the P-value distribution changes as the underlying population mean changes.
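
Here is a rough Python sketch of that last idea, offered as an illustration rather than as anything simmer or testimate does. The population means (0, 0.2, 0.4, 0.6), the sample size of 30, and the 1000 runs per group are all assumptions chosen for the example. Instead of computing each P-value, it counts how often |t| exceeds 2.045, the two-sided 5% critical value of t with 29 degrees of freedom, which is the same as counting P-values below 0.05.

```python
import math
import random

N = 30          # values per sample (assumed, matching the earlier example)
T_CRIT = 2.045  # two-sided 5% critical value of t with N - 1 = 29 df
RUNS = 1000     # samples drawn per population


def t_statistic(xs):
    """One-sample t statistic for the null hypothesis that the mean is 0."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return mean / (sd / math.sqrt(n))


rng = random.Random(2)  # arbitrary seed, for repeatability only
reject_rate = {}
for mu in (0.0, 0.2, 0.4, 0.6):  # hypothetical population means
    rejections = sum(
        abs(t_statistic([rng.gauss(mu, 1) for _ in range(N)])) > T_CRIT
        for _ in range(RUNS)
    )
    reject_rate[mu] = rejections / RUNS
    print(f"population mean {mu}: P < 0.05 in {reject_rate[mu]:.1%} of tests")
```

With a population mean of zero, about 5% of tests reject; as the true mean moves away from zero, the P-values pile up near zero and the rejection rate climbs.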