Scrambler Help

scrambler screenshot

2021-10-13

Here we describe the current version of the Scrambler plugin.

You can try all this yourself in this sample document.

(Want a task? That document is set up to compare 13-year-olds. Make it compare 10-year-olds!)

Background and an Example

The point of scrambling is to create a sampling distribution of some measure. For example, suppose that in your dataset it appears that 13-year-old boys are taller than 13-year-old girls. You want to assess whether it's plausible that the difference in means that you see could happen by chance.

To do that, you will make the "null hypothesis" real: you will break any association between Gender and Height by scrambling the values for one of those attributes. Then you look to see how different the boys and girls seem to be when the difference is just chance.
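If it helps to see the idea outside CODAP, here is a minimal Python sketch of a single scramble. It is not the plugin's own code. The function name scramble and the four cases are made up for illustration; only the attribute names Gender and Height come from the example above.

```python
import random

def scramble(cases, attribute, rng=None):
    """Return a copy of `cases` with the values of one attribute shuffled among the cases.

    Hypothetical helper for illustration; not part of the Scrambler plugin.
    """
    rng = rng or random.Random()
    values = [case[attribute] for case in cases]
    rng.shuffle(values)  # shuffles the copied list in place
    # Rebuild each case with its original attributes, except the scrambled one.
    return [{**case, attribute: value} for case, value in zip(cases, values)]

# Made-up heights in cm, NOT the data from the sample document.
cases = [
    {"Gender": "Male", "Height": 160.0},
    {"Gender": "Male", "Height": 164.0},
    {"Gender": "Female", "Height": 155.0},
    {"Gender": "Female", "Height": 157.0},
]

scrambled = scramble(cases, "Height", random.Random(1))
```

Every height is still in the dataset and every case keeps its Gender, but which height goes with which case is now purely chance.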

But one trial is not enough. Furthermore, you have to decide what, specifically, to look at to say that the boys are taller. In this situation, that means coming up with a number that represents how much taller the boys are.

This is very important, and bears highlighting:

You must create a measure of the effect you're seeing. It's not enough to say that boys are taller than girls; you have to say how much taller.

scrambler screen shot

In our example, we used the difference of means, called it dMeanHeights, and dragged it leftwards in the table. The CODAP formula looks like this:

mean(Height, Gender="Male") - mean(Height, Gender="Female")
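For comparison, the same measure can be sketched in Python. This is only an illustration of what the formula computes, assuming each case is a dictionary with Gender and Height keys; the function name d_mean_heights and the four cases are hypothetical.

```python
from statistics import mean

def d_mean_heights(cases):
    """Mean male height minus mean female height (hypothetical helper)."""
    males = [c["Height"] for c in cases if c["Gender"] == "Male"]
    females = [c["Height"] for c in cases if c["Gender"] == "Female"]
    return mean(males) - mean(females)

# Made-up heights in cm, NOT the data from the sample document.
cases = [
    {"Gender": "Male", "Height": 160.0},
    {"Gender": "Male", "Height": 164.0},
    {"Gender": "Female", "Height": 155.0},
    {"Gender": "Female", "Height": 157.0},
]

measure = d_mean_heights(cases)  # 162.0 - 156.0
```

A positive result means the boys are taller on average; a negative result means the girls are.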

We see how much taller boys are in the actual data (in our case, their mean height is 5.87 cm greater). Then we will see how much taller they are after the data have been scrambled. Because scrambling assigns the values at random, sometimes the difference will be positive and sometimes negative (the "girls" will be taller).

But is it plausible that 5.87 could appear by chance?

Repeat this process a few hundred times and see. In this case, no: a difference that extreme is possible (after all, a scramble could reproduce the real data), but it doesn't happen very often.
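The repetition can be sketched in Python as well. This is a rough stand-in for what collecting measures amounts to, not the plugin's implementation; the function names and the tiny dataset are made up, and 200 is just one choice of "a few hundred."

```python
import random
from statistics import mean

def d_mean_heights(genders, heights):
    """Mean male height minus mean female height (hypothetical helper)."""
    males = [h for g, h in zip(genders, heights) if g == "Male"]
    females = [h for g, h in zip(genders, heights) if g == "Female"]
    return mean(males) - mean(females)

def scrambled_measures(genders, heights, n_scrambles, rng):
    """Scramble the heights n_scrambles times, recording the measure each time."""
    shuffled = list(heights)  # work on a copy so the real data stay intact
    measures = []
    for _ in range(n_scrambles):
        rng.shuffle(shuffled)
        measures.append(d_mean_heights(genders, shuffled))
    return measures

# Made-up data in cm, NOT the data from the sample document.
genders = ["Male", "Male", "Female", "Female"]
heights = [160.0, 164.0, 155.0, 157.0]

observed = d_mean_heights(genders, heights)
measures = scrambled_measures(genders, heights, 200, random.Random(0))
```

The list measures plays the role of the "measures" table: one value of the measure per scramble, ready to graph.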

Analyzing your results

Make a graph of the measure from the "measures" table. You'll see the sampling distribution. The picture shows the results from 200 scrambles.

scrambler measures screen shot

You want to know what proportion of those measures are more extreme than your "test statistic" (which in our case is 5.87, the difference in mean heights).

Here's the trick:

Now you can see what percentage are on each side of the line.

Set the line to 5.87 (you might need to rescale) to see how unusual it is! (Chances are, very few of your measures, positive or negative, are that large.)
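If you would rather count than read percentages off the graph, the same comparison can be sketched in Python. The function names and the five measures here are hypothetical. Counting a value exactly on the line as "at or above" is an assumption of this sketch, not a statement about how CODAP counts.

```python
def percent_each_side(measures, line):
    """Percent of measures below the line, and percent at or above it."""
    n = len(measures)
    below = sum(1 for m in measures if m < line)
    return 100 * below / n, 100 * (n - below) / n

def proportion_as_extreme(measures, test_statistic):
    """Proportion of measures at least as far from zero as the test statistic.

    Counts both positive and negative extremes.
    """
    cutoff = abs(test_statistic)
    return sum(1 for m in measures if abs(m) >= cutoff) / len(measures)

# Made-up measures for illustration, NOT results from the sample document.
measures = [-6.0, -1.0, 0.0, 1.0, 6.0]

below, at_or_above = percent_each_side(measures, 5.0)
both_tails = proportion_as_extreme(measures, 5.0)
```

With your own results, pass the values from the "measures" table and use your test statistic (5.87 in our example) as the line.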

When Things Go Wrong

If CODAP and the scrambler get confused, pressing the "refresh" arrow can help (it's a circular, recycle-y arrow). That will basically restart the scrambler.


Designed and written by Tim Erickson, Senior Scientist, Epistemological Engineering. Thanks to Bill Finzer and the whole CODAP team at Concord. Visit codap.xyz to see what else might be coming this way.