Testimate Guide
Emitting results
You’ve learned how to do a statistical test, but that doesn’t necessarily help you understand what the test means, or how the many numbers in the testimate display behave.
That’s where the very bottom part of testimate comes in.
Let’s look at the confidence interval we get for the mean of pulse in our data. The live demo below shows a test of mean pulse with a hypothesized value of 70. We’re going to focus on the confidence interval, which covers the range [70.35, 72.36].
- Down at the bottom of testimate, press the emit these results button.
A new table appears, called tests and estimates. It contains most of the numbers that appear in the display. In particular, way over on the right, you can find the confidence level (95%) and the minimum and maximum of the CI. (If you like, use choosy to hide the other attributes so that these are easier to get to!)
The table lets us remember our results so we can compare if we change how we configure the test. We’re going to change the confidence level and see what that does to the CI.
- In the blue “configuration” strip, change the confidence level to 96%. The CI changes.
- Press emit these results again. You get a new case in the new tests and estimates table.
- These are now data in a CODAP table! Make a graph with conf on the horizontal axis and CImin on the vertical.
- Add additional data with confidence levels ranging from, oh, say 75% to 99%. The graph updates. See if you can understand why it goes the direction it does.
- Now drag CImax over the graph and drop it on the plus-sign above the vertical axis (so you’re plotting both CImin and CImax on the same axis). You will see the confidence-interval “trumpet” that shows how the CI expands as you increase the confidence level.
This kind of activity opens up many interesting possibilities. We can create data out of statistical calculations, and use that data to explore how the tests and estimates behave. Here are a couple of examples:
Automating the process: Making the null true
In a fresh CODAP document, we made a table of 30 values of X using the formula randomNormal(). That is, we sampled from a standard normal distribution, which of course has a zero mean. Then we did a t-test to see if the mean was different from zero.
Since we used a random-number formula, testimate gives us additional controls at the bottom:
So we re-randomized 1000 times, with testimate recording the test results. You try it (first click the rerandomizing radio button):
What do you think the distribution of P-values looked like? We made the graph, used the ruler menu to add a movable value, set it to 0.05, and then used the ruler again to show counts:
It’s flat! And of course in this graph we can see that we got 44 type I errors, which is pretty darned close to the 50 we might expect.
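The same experiment can be sketched outside CODAP. The Python below is an illustration under our own assumptions, not testimate’s code: it draws 30 standard-normal values 1000 times, runs a two-sided one-sample t-test against zero each time, and counts P-values below 0.05. The helper names (`t_pdf`, `two_sided_p`, `one_sample_p`) are hypothetical, and the P-value comes from integrating the t density numerically with Simpson’s rule so that only the standard library is needed.

```python
import math
import random
from statistics import fmean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=400):
    """Two-sided P-value: 1 minus twice the area under the density on [0, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1 - 2 * total * h / 3)


def one_sample_p(sample, mu0=0.0):
    """P-value for the test 'mean != mu0'."""
    n = len(sample)
    t = (fmean(sample) - mu0) / (stdev(sample) / math.sqrt(n))
    return two_sided_p(t, n - 1)


rng = random.Random(1)  # fixed seed so the run is repeatable
p_values = [one_sample_p([rng.gauss(0, 1) for _ in range(30)]) for _ in range(1000)]

# Count P-values in ten equal-width bins; a flat distribution fills them evenly.
bins = [0] * 10
for p in p_values:
    bins[min(int(p * 10), 9)] += 1

type_one_errors = sum(p < 0.05 for p in p_values)
print(bins)
print(type_one_errors)
```

Because the null hypothesis is true by construction, every P-value below 0.05 is a type I error, and the count should land somewhere near 50, though the exact number depends on the seed.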
Or how about this: below you can see two graphs. On the left, the P-value as a function of the sample means. On the right, the P-value as a function of \(t\). Explain the similarities and differences between these graphs…
You can imagine endless explorations like this one.
Emitting one test per group
One more thing: if your dataset is already grouped, that is, you’ve dragged an attribute to the left in order to look separately at the data from each value of that attribute, testimate will perform the test separately on each group.
Once again we open the data from NHANES.
- Drag age to the left to make 3 groups: 16, 17, and 18. (We have done this already in the live demo.)
- Set up for a test (Let’s test pulse ≠ 70)
  - Drag pulse to the left box if it isn’t already there.
  - Enter 70 into the configuration.
- Click each subgroup at the bottom (as shown at right).
- Click the emit results from 3 subgroups button.
Now there are three cases in the tests and estimates collection. The key attribute age is at the far right of the table, so you can use it in your graphs.
Notice that, while the P-value for the whole dataset is 0.008, the value for age = 16 is 0.68: 70 is a completely plausible value for the mean of pulse.
And if you plot the CImin and CImax for the three groups, you see this:
You can see how comparing test results from group to group might be useful.
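To make the one-result-per-group idea concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the pulse values are randomly generated stand-ins (not the NHANES data), the group means are invented, the interval uses a normal approximation, and the attribute names `N` and `mean` are our guesses. Only `age`, `CImin`, and `CImax` are names taken from this guide.

```python
import math
import random
from collections import defaultdict
from statistics import NormalDist, fmean, stdev

rng = random.Random(7)

# MADE-UP stand-in data, NOT NHANES: 100 cases per age with invented means.
cases = [
    {"age": age, "pulse": rng.gauss(mu, 11)}
    for age, mu in [(16, 70), (17, 72), (18, 73)]
    for _ in range(100)
]

# Group the pulse values by the key attribute, like dragging age to the left.
groups = defaultdict(list)
for case in cases:
    groups[case["age"]].append(case["pulse"])

# ASSUMPTION: 95% interval from a normal approximation.
z = NormalDist().inv_cdf(0.975)

# Emit one result row per group, keeping the key attribute alongside it.
emitted = []
for age, pulses in sorted(groups.items()):
    m = fmean(pulses)
    se = stdev(pulses) / math.sqrt(len(pulses))
    emitted.append(
        {"age": age, "N": len(pulses), "mean": m, "CImin": m - z * se, "CImax": m + z * se}
    )

for row in emitted:
    print(row)
```

Keeping `age` in every emitted row is what lets you put the group on one axis and the interval endpoints on the other.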
And if you know how to use simmer, you can use it to create grouped data and use that to perform tests on data that you have designed yourself.
For example, you could create random normal data from multiple populations with slightly different means, and then test each mean, seeing how the P-value distribution changes as the underlying population mean changes.
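Here is one way that last exploration might look in Python, as a rough sketch rather than anything simmer or testimate actually does. It assumes samples of 30 from normal populations with standard deviation 1 and a few invented means, tests each sample against zero, and records how often the test rejects at the 5% level. Rejecting at 5% is the same as P < 0.05, so instead of computing P-values it compares |t| with 2.045, the two-sided 5% critical value of t with 29 degrees of freedom. The names `rejects` and `rejection_rate` are hypothetical.

```python
import math
import random
from statistics import fmean, stdev

N = 30          # sample size; T_CRIT below is only valid for this N
T_CRIT = 2.045  # two-sided 5% critical value of t with N - 1 = 29 df


def rejects(sample, mu0=0.0):
    """True when the one-sample t-test of 'mean != mu0' gives P < 0.05."""
    se = stdev(sample) / math.sqrt(len(sample))
    return abs((fmean(sample) - mu0) / se) > T_CRIT


def rejection_rate(true_mean, runs=1000, seed=0):
    """Fraction of simulated samples whose test rejects the null."""
    rng = random.Random(seed)
    hits = sum(
        rejects([rng.gauss(true_mean, 1) for _ in range(N)]) for _ in range(runs)
    )
    return hits / runs


# INVENTED population means, chosen only to show the trend.
rates = {mu: rejection_rate(mu) for mu in (0.0, 0.2, 0.4, 0.6)}

for mu, rate in rates.items():
    print(f"true mean {mu:.1f}: rejected {rate:.1%} of the time")
```

When the true mean is zero the rejections are type I errors, so the rate sits near 5%. As the true mean moves away from zero, the P-values pile up near 0 and the rejection rate climbs.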