This chapter will help you to:
The within-subjects analysis of variance (ANOVA) is the generalization of the paired-samples t-test. It is used when we have repeated measures of a dependent variable but with more than two levels of an independent variable. To account for the increased number of samples, we will need to take the analysis of variance approach. We’ll find, however, that there are some important changes from the between-subjects ANOVA.
Repeated measures refers to designs in which one or more dependent variables are measured multiple times from each participant.
Recall that a within-subjects design requires each participant to experience all levels of an independent variable. That means that each participant will have the same number of dependent variable measurements as there are levels of the independent variable.
One of the classic designs of this style is called the longitudinal design. In this approach, a researcher may be interested in how outcome variables may change as a factor of time. Some examples include cognitive development, political ideation, and language abilities.
Longitudinal designs are those in which outcome variables are assessed over time, often at regular intervals.
In a more experimental setting, a researcher might employ a “pre-post” design. These designs assess participants on the dependent variables before and after administering the levels of the independent variable. For example, a researcher who is interested in the effect of team building exercises might assess team cohesiveness before and after each of three different exercises (i.e., group problem solving, trust exercises, and shared recreation).
The key aspect of these designs is the ability to assess the change in the outcome variable related to the independent variable for each participant. Given that we are focusing on the differences within each individual, we’ll be able to hone our analysis and gain statistical power.
Just as with the one-way between-subjects ANOVA, the null hypothesis is that there is no effect of the IV causing our DV scores to vary; only sampling error is at play. Because we are focusing on the difference scores, we would expect the means of all of our comparisons to be equal.
\[ H_0: M_{A-B} = M_{A-C} = M_{A-D} = ... = M_{C-D} = \mu_D \]
We will utilize the same basic F-statistic for the within-subjects ANOVA \[ F = \frac{\textrm{Effect}}{\textrm{Error}} \]
However, we’ll need to define “Effect” and “Error” slightly differently.
To understand how the within-subjects ANOVA differs from the between-subjects ANOVA, let’s build an example. We’ll imagine that we ask a group of job-seekers to each try three different social cuing techniques across a series of interviews. In one interview, the job-seekers are asked to sit back in their chairs, looking relaxed. In another interview, they are asked to sit on the edge of their chairs, looking eager. In a third interview, we ask them to sit upright but fully in the chair for the control condition. After each interview, we ask the interviewer to rate the job-seeker on employability.
The employability ratings for each job-seeker in each interview are presented in Table 1.
Table 1
Job-Seeker Employability Ratings across Social Cuing Conditions
Job-Seeker | Control | Relaxed | Eager |
---|---|---|---|
A | 4 | 2 | 9 |
B | 6 | 3 | 8 |
C | 5 | 1 | 7 |
Before we add in our marginal means, we need to calculate the differences among the conditions for each participant. Table 2 contains the differences of ratings among all combinations of levels of the social cuing.
Table 2
Differences in Job-Seeker Employability Ratings across Social Cuing Conditions
Job-Seeker | Control - Relaxed | Control - Eager | Relaxed - Eager |
---|---|---|---|
A | 2 | -5 | -7 |
B | 3 | -2 | -5 |
C | 4 | -2 | -6 |
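To make the arithmetic behind Table 2 concrete, here is a short sketch in Python (purely a stand-in for the hand calculation; the numbers are the made-up ratings from Table 1):

```python
import numpy as np

# Ratings from Table 1: rows = job-seekers A, B, C; columns = Control, Relaxed, Eager
ratings = np.array([[4, 2, 9],
                    [6, 3, 8],
                    [5, 1, 7]])

control, relaxed, eager = ratings.T

# Difference scores from Table 2, one column per pairwise comparison
diffs = np.column_stack([control - relaxed,   # Control - Relaxed:  2,  3,  4
                         control - eager,     # Control - Eager:   -5, -2, -2
                         relaxed - eager])    # Relaxed - Eager:   -7, -5, -6
print(diffs)
```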
Now we can add in the marginal means to the bottom of the table. Table 3 contains the marginal means for each comparison of social cuing.
Table 3
Differences in Job-Seeker Employability with Marginal Means
Job-Seeker | Control - Relaxed | Control - Eager | Relaxed - Eager |
---|---|---|---|
A | 2 | -5 | -7 |
B | 3 | -2 | -5 |
C | 4 | -2 | -6 |
Marginal Means | 3 | -3 | -6 |
We can actually add another set of marginal means because we have an additional grouping factor in our design: job-seeker! However, we wouldn’t consider job-seeker to be an independent variable or predictor variable because we are not interested in the impact of individual job-seekers on employability ratings. It is, however, something that varies systematically in our study and thus can be utilized in the analysis. Table 4 contains the marginal means for job-seeker as well as for the social cuing conditions.
Table 4
Differences in Job-Seeker Employability with All Marginal Means
Job-Seeker | Control - Relaxed | Control - Eager | Relaxed - Eager | Marginal Means |
---|---|---|---|---|
A | 2 | -5 | -7 | -3.33 |
B | 3 | -2 | -5 | -1.33 |
C | 4 | -2 | -6 | -1.33 |
Marginal Means | 3 | -3 | -6 | GM = -2 |
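The marginal means in Table 4 can be checked the same way; this numpy sketch reproduces the column (comparison), row (job-seeker), and grand means:

```python
import numpy as np

# Difference scores from Table 4 (rows = job-seekers A, B, C;
# columns = Control-Relaxed, Control-Eager, Relaxed-Eager)
diffs = np.array([[2, -5, -7],
                  [3, -2, -5],
                  [4, -2, -6]])

column_means = diffs.mean(axis=0)   # marginal means per comparison: 3, -3, -6
row_means = diffs.mean(axis=1)      # marginal means per job-seeker: -3.33, -1.33, -1.33
grand_mean = diffs.mean()           # GM = -2
```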
Let’s discuss how we can use these various marginal means in the ANOVA approach.
Recall that the ANOVA approach accounts for the factors that are causing our individual scores to differ from the overall average (i.e., the grand mean). Table 4 emphasizes that we have two factors in a one-way within-subjects ANOVA: the independent variable and participant. Remember that the term “one-way” refers to the number of independent variables in our analysis, and participant is not an independent variable.
Let’s revisit our ANOVA fraction from the one-way between-subjects ANOVA lesson.
\[ F = \frac{\textrm{Between-Group Variance}}{\textrm{Within-Group Variance}}=\frac{\textrm{Variance}_{\textrm{Effect}}}{\textrm{Variance}_{\textrm{Error}}}=\frac{\textrm{Group Variance + Sample Variance}}{\textrm{Sample Variance}} \]
Essentially, we are looking to compare the differences in the scores due to our independent variable to the differences in our scores due to random error (i.e., sampling error). We can easily use the variance due to social cuing (i.e., the variability of marginal means for the columns) as the numerator.
Where do we put the variability due to participant? Nowhere. We are actually going to remove it from the denominator (i.e., \(\textrm{Variance}_{\textrm{Error}}\)). As such, our new ANOVA fraction for the within-subjects ANOVA becomes
\[ F = \frac{\textrm{Between-Group Variance}}{\textrm{Within-Group Variance}-\textrm{Between-Participant Variance}} \]
When we subtract from the denominator, we are making the denominator smaller and thus making the overall fraction larger. This has implications for statistical power.
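Using the made-up interview data from Table 1, the whole partition can be sketched in Python. The exact numbers are specific to this toy example; the point is the structure of the fraction: the within-group (error) sum of squares has the between-participant sum of squares removed before forming F.

```python
import numpy as np
from scipy.stats import f

# Raw ratings from Table 1: rows = job-seekers, columns = Control, Relaxed, Eager
X = np.array([[4.0, 2.0, 9.0],
              [6.0, 3.0, 8.0],
              [5.0, 1.0, 7.0]])
n, k = X.shape
gm = X.mean()

ss_conditions = n * np.sum((X.mean(axis=0) - gm) ** 2)   # effect of the IV (54)
ss_subjects   = k * np.sum((X.mean(axis=1) - gm) ** 2)   # between-participant variability
ss_total      = np.sum((X - gm) ** 2)
ss_error      = ss_total - ss_conditions - ss_subjects   # error with participant removed

df_cond, df_error = k - 1, (k - 1) * (n - 1)
F = (ss_conditions / df_cond) / (ss_error / df_error)
p = f.sf(F, df_cond, df_error)
print(F, p)   # F = 32.4, p ≈ .003 on these made-up data
```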
Statistical power refers to the probability that a test detects an effect if it is actually there. The within-subjects ANOVA is statistically more powerful than the between-subjects ANOVA for any given sample size because of the way it eliminates the between-subjects variability from the error term. Given the same number of participants, we will be better able to detect a difference across conditions using a within-subjects design than a between-subjects design.
Of course, there are other things to consider than statistical power. For example, our job-seekers’ interviewing ability using one social cuing technique may be influenced by the techniques used in previous interviews. These carry-over effects can be serious flaws in a study. There are ways to handle these issues appropriately, but they come at the cost of complicating the design.
Carry-over effects refer to the impact of previous trials on future trials.
If (and I DO MEAN IF) we have a statistically significant F-test for the within-subjects ANOVA, we will need post hoc analyses. This is because the ANOVA only tells us that there is a difference across our IV levels but does not reveal which levels are different from other levels.
The post hoc analysis of choice for the within-subjects ANOVA is the Bonferroni-corrected paired-samples t-test. The Bonferroni correction is a simple technique used to adjust for the increased Type I Error (i.e., false positive) rate incurred with multiple tests. You simply divide your \(\alpha\)-level by the number of pairwise comparisons made.
The Bonferroni correction adjusts for increased Type I Error by dividing \(\alpha\) by the number of tests.
In our example, we will have to perform three paired samples t-tests to compare all groups. The Bonferroni correction thus yields a new, adjusted \(\alpha\)-level of \(\frac{\alpha}{3} = \frac{.05}{3} = 0.0167\).
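As a quick arithmetic check, the adjustment generalizes to any number of IV levels k, since all pairwise comparisons yield k(k − 1)/2 tests:

```python
# Bonferroni adjustment for a within-subjects design with k levels
k = 3
m = k * (k - 1) // 2          # 3 pairwise t-tests for 3 conditions
alpha = 0.05
alpha_adjusted = alpha / m    # 0.05 / 3 ≈ 0.0167
print(m, round(alpha_adjusted, 4))
```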
We’ll find that we need to ask SPSS for this adjustment in a different way than we requested the Tukey HSD post hoc analysis.
We are introducing a new assumption for the within-subjects ANOVA because of the extra step of calculating difference scores. It is analogous, however, to the assumption of equal variance.
Sphericity. Sphericity refers to an equality of variance among the difference scores calculated in comparing each condition to each other condition.
Normality. As we are calculating the means of the difference scores, we’ll want to ensure that they are normally distributed.
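In this lesson we will check normality with histograms in SPSS, but a numerical alternative worth knowing is the Shapiro-Wilk test. This sketch applies it to simulated difference scores (the data and sample size here are hypothetical, not from the lesson’s example):

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(0)
# Simulated difference scores for one pairwise comparison
# (hypothetical sample of 30 participants)
diff_scores = rng.normal(loc=-3, scale=1.5, size=30)

stat, p = shapiro(diff_scores)
# A p-value above .05 gives no evidence against normality for this comparison
print(round(stat, 3), round(p, 3))
```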
Please download the “ProductivityTechnique.sav” data set from the “Data Set” folder in the “Files” section of Canvas.
Figure 1 shows a section from the Data View of SPSS.
Figure 1
Data View of ProductivityTechnique.sav
Notice the “wide” layout of the data set. That is, rather than having all of the dependent variable measurements in one column, they are spread out across separate variables corresponding to the levels of the IV. A look at the Variable View better elucidates those levels (see Figure 2).
Figure 2
Variable View of ProductivityTechnique.sav
This data set represents the proportion of daily tasks each individual was able to complete following each of three preparations: none (control), making a list, and getting sufficient sleep.
Our research question is thus “Does sleep or list-making affect daily productivity?”
This is a within-subjects design, which requires that the difference scores from comparing the levels of the IV are normally distributed. To check this assumption, we will need to calculate the difference scores for each comparison.
Go to the “Transform” menu, then click “Compute Variable…” as in Figure 3.
Figure 3
Compute Variable in Transform Menu
Before we start computing new variables, it may be helpful to display the variable names instead of the labels in the box on the left of the “Compute Variable” window. Right-click on any variable in the list, then select the “Display Variable Names” option as depicted in Figure 4.
Figure 4
Displaying Variable Names
Now that we can more easily see which variable is which, let’s create the first comparison variable by typing the name of the new variable into the “Target Variable” box. We will call it “Control_List” because we will be subtracting the values for the “list” condition from the “control” condition (See Figure 5). Drag the “PropTasksComplete” variable from the box on the left to the “Numeric Expression:” box in the top-right of the window (See Figure 5).
Figure 5
Computing a New Variable
To subtract a variable from the “PropTasksComplete” variable either press the minus sign key on your keyboard or click the minus sign button in the selection of buttons below the “Numeric Expression” box. Next, drag the “PropTasksComplete_List” variable from the box on the left to the “Numeric Expression” box. Your window should look like Figure 6.
Figure 6
Full Numeric Expression for Control_List
Click the “Paste” button to generate the syntax.
Repeat this procedure to create the “Control_Sleep” (see Figure 7) and “List_Sleep” (see Figure 8) variables. Be sure to click “Paste” each time.
Figure 7
Computing the Control_Sleep Variable
Figure 8
Computing the List_Sleep Variable
Navigate to your syntax editor window to verify that your syntax is the same as what is displayed in Figure 9. Select and run the syntax.
Figure 9
Compute Variable Syntax
Not much happens in the Output viewer, but a quick check of the Data View of the Data Editor window shows that we now have three new variables (see Figure 10).
Figure 10
New Variables in Data Editor
With our newly created difference variables ready, we can create a histogram for each variable to check for normality. These histograms will be produced through the chart builder, one at a time.
Go to the “Graph” menu and select “Chart Builder” (see Figure 11).
Figure 11
Chart Builder in the Graph Menu
In the Chart Builder window, select “Histogram” from the “Choose from:” list in the Gallery section. Then select the “Simple Histogram” (the first option) among the histogram variations in the box to the right of the “Choose from:” list (see Figure 12).
Figure 12
Selecting Histogram in the Chart Builder
Figure 13 shows how to set up the histogram for the “Control_List” variable with a normal curve. Drag the “Control_List” variable to the x-axis and then click the “Display normal curve” option in the “Element Properties” window.
Figure 13
Setting up the Control_List Histogram
Click “Paste” to generate the syntax for this histogram.
You will need to repeat this procedure for the other two difference variables.
Figure 14 displays the syntax for each of the three histograms for your reference. With the syntax ready for all the histograms, select all the syntax associated with the histograms (see Figure 14) and then press the “run” button.
Figure 14
Histogram Syntax
Let’s check these histograms by going to the output viewer. Figures 15, 16, and 17 are the histograms for Control_List, Control_Sleep, and List_Sleep.
Figure 15
Histogram for Control_List
Figure 16
Histogram for Control_Sleep
Figure 17
Histogram for List_Sleep
The bars in each may not look particularly bell-shaped, but this is in part due to the interval width of the bins (i.e., which values are counted for each bar) and the sample size. The normal curve gives us a better approximation of the population distribution for each variable. What we notice is that these normal curves are mostly symmetrical and that the peak is central on the x-axis. These features suggest that each variable is approximately normally distributed.
Now we’ll prepare the repeated measures general linear model for this one-way within-subjects design.
Navigate to the “Analyze” menu, go to “General Linear Model,” and then select “Repeated Measures…” (see Figure 18).
Figure 18
Repeated Measures GLM Menu
The “Repeated Measures Define Factor(s)” window (see Figure 19) will open. In this window, we tell SPSS what our independent and dependent variables are. Recall that we are wondering if the preparation technique impacts daily productivity.
Figure 19
Repeated Measures Define Factors Window
Let’s put “preparation” in the “within-subjects factor name” box (our IV) then click the “add” button (see Figure 20).
Figure 20
Adding Within-Subjects Factor
Next, I’ve decided to call the dependent variable “PropComplete” (see Figure 21). Click “Define” to move on to assigning variables.
Figure 21
Defining the Measure Name
Figure 22 shows the “Repeated Measures” window before assigning variables.
Figure 22
Unassigned Variables in Repeated Measures Window
Notice that we have three more variables in the list on the left than we have spaces in the “Within-Subjects Variables” list. This is because we created the difference variables to assess normality. It is important that you select the original variables for this procedure, not the difference scores. The procedure will calculate difference scores itself, so we wouldn’t want differences among differences.
Drag the first three variables from the list on the left to the “within-subjects variables” box on the right (see Figure 23).
Figure 23
Assigned Variables in Repeated Measures Window
Our variables are now available for creating a bar chart to compare means across our IV levels.
We’ll create a bar chart in the same manner as we had for the other GLM procedures (i.e., independent samples t-test, paired samples t-test, one-way between-subjects ANOVA). Click on the “Plots” button on the right side of the “Repeated Measures” window.
Drag “Preparation” from the “Factors” list on the left side of the window to the “Horizontal Axis” box on the right side of the window.
Click Add.
You now have the option to choose bar chart for the chart type and to include error bars.
Your final window should look like the one depicted in Figure 24.
Figure 24
Complete Profile Plots Window
Click “Continue” to return to the main “Repeated Measures” window.
We’ll want some more specific numbers to report alongside our bar chart, so we’ll click on the “EM Means” button to generate means and confidence intervals. It is in this window that we will also ask SPSS to apply a Bonferroni adjustment to our confidence intervals.
Drag the IV (i.e., “Preparation”) from the “Factor(s) and Factor Interactions” list to the “Display Means for” box.
Click on the “Compare main effects” option below the “Display Means for” box.
Select “Bonferroni” from the “Confidence interval adjustment” drop-down menu.
Figure 25 shows the completed setup for the EM Means window.
Figure 25
Completed setup for the EM Means Window
Once you have all the options set, click “Continue” to return to the main “Repeated Measures” window.
We have now set everything we need for the one-way within-subjects ANOVA via the repeated measures general linear model procedure.
We did not utilize the “options” window because our test for our second assumption of sphericity will automatically be included in the output.
Let’s generate the syntax by clicking the paste button at the bottom of the main “Repeated Measures” window.
Navigate to the syntax editor and verify that your syntax for the GLM looks like that in Figure 26.
Figure 26
Syntax for the Repeated Measures GLM
If you need to make any corrections, do so before selecting and running the syntax.
Navigate to the output viewer. We’ll walk through the important tables and figures in the output.
The first table to note is the “Within-Subjects Factors” table, which associates each of our original variables with a dummy code. This is important because the tables that depict marginal means will use these codes instead of the names of the levels of the IV.
Figure 27
Within-Subjects Factors Table
Figure 28 displays an annotated version of “Mauchly’s Test of Sphericity” table.
Figure 28
Annotated Mauchly’s Test of Sphericity Table
We will interpret this table in much the same way we interpreted the Levene’s Test of Equality of Variance table. Our assumption is that we have sphericity. To maintain that assumption, we need a “Sig.” value (i.e., p-value) greater than .05. If we have a “Sig.” value less than .05, we have to reject the assumption of sphericity.
Our table indicates a violation of the assumption of sphericity (Sig. < .05). This has implications for how we interpret the next table (“Test of Within-Subjects Effects”).
The “test of within-subjects effects” table is our ANOVA table that contains information about the reliability of the impact of the IV on the DV. As you will see in Figure 29, there are multiple values, depending on the adjustment made for sphericity.
Figure 29
Annotated Test of Within-Subjects Effects Table
The first row for our effect is for when our assumption of sphericity holds. Because we violated that assumption, we will need to adjust our degrees of freedom. There are several methods for doing so but the most common (because it is not too lenient or conservative for most violations) is the “Greenhouse-Geisser” correction. As such, we’ll need to find the information presented in the second row of the “Preparation” and “Error(Preparation)” sections of the table.
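For the curious, the Greenhouse-Geisser epsilon that drives this correction can be estimated from the covariance matrix of the conditions; the corrected degrees of freedom are simply the uncorrected ones multiplied by epsilon. This numpy sketch uses simulated data (not the lesson’s data set) just to show the computation:

```python
import numpy as np

def greenhouse_geisser_epsilon(data):
    """Estimate Greenhouse-Geisser epsilon from an n-participants-by-k-conditions array."""
    k = data.shape[1]
    S = np.cov(data, rowvar=False)   # k x k covariance matrix of the conditions
    # Double-center the covariance matrix, then apply the standard formula
    S_dc = S - S.mean(axis=0, keepdims=True) - S.mean(axis=1, keepdims=True) + S.mean()
    return np.trace(S_dc) ** 2 / ((k - 1) * np.sum(S_dc ** 2))

rng = np.random.default_rng(1)
sim = rng.normal(size=(30, 3))       # hypothetical 30 participants, 3 conditions
eps = greenhouse_geisser_epsilon(sim)
# Epsilon is bounded between 1/(k-1) (worst violation) and 1 (perfect sphericity)
print(round(eps, 3))
```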
Now that we know in which rows to look, we can find the important columns for interpretation. First, I want to point out that our degrees of freedom look a little strange because of the correction. That is, they are no longer whole numbers.
Let’s check the “Sig.” value for the effect. If it is less than .05, we can reject the null hypothesis and claim that preparation technique has a significant effect on productivity. In this example, we see that p < .001 so we will reject the null hypothesis.
Since we’ve rejected the null hypothesis and are claiming that “preparation” has some effect on “PropComplete,” we need to investigate our post hoc comparisons to learn about the pattern of differences.
Scroll to the “Pairwise Comparisons” table to determine which levels of “Preparation” lead to differences in productivity.
This “Pairwise Comparisons” table has the same format as it did for the one-way between-subjects ANOVA. The only difference is that the values were calculated using a different approach (Bonferroni vs. Tukey HSD). The table still has redundancy, so Figure 30 presents an annotated version that redacts the repeated information and highlights the relevant information.
Figure 30
Annotated Pairwise Comparison Table
To determine which levels of the IV lead to reliable changes in the DV (compared to other levels), we’ll need to check for “Sig.” values that are less than alpha (.05). As Figure 30 shows, each level of the IV led to a reliable change in the DV. We’ll look at the estimated marginal means to determine the pattern more easily.
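These Bonferroni-corrected follow-ups can also be reproduced outside SPSS with paired-samples t-tests. This sketch uses simulated stand-ins for the three preparation conditions (the names and values are hypothetical, not the lesson’s data set):

```python
import numpy as np
from itertools import combinations
from scipy.stats import ttest_rel

rng = np.random.default_rng(2)
n = 30

# Hypothetical stand-ins for the three preparation conditions (simulated
# independently here, so they are not truly paired; the mechanics are the point)
conditions = {
    "control": rng.normal(0.59, 0.05, n),
    "list":    rng.normal(0.68, 0.05, n),
    "sleep":   rng.normal(0.74, 0.05, n),
}

pairs = list(combinations(conditions, 2))
alpha_adjusted = 0.05 / len(pairs)   # Bonferroni: .05 / 3

pvals = []
for a, b in pairs:
    t, p = ttest_rel(conditions[a], conditions[b])
    pvals.append(p)
    print(f"{a} vs. {b}: t = {t:.2f}, p = {p:.4g}, reliable: {p < alpha_adjusted}")
```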
Figure 31 shows the estimated marginal means and confidence intervals for each level of “preparation.”
Figure 31
Estimated Marginal Means and Confidence Intervals
To help us recall what each dummy code represents, I’ve reproduced Figure 27 below.
Figure 27
Within-Subjects Factors Table
The data suggest that the control condition led to the smallest proportion of tasks complete and getting sufficient sleep led to the largest proportion of tasks complete.
The profile plot bar chart presents the same information as the marginal means table (see Figure 32).
Figure 32
Bar Chart of Preparation Technique and Proportion of Tasks Complete
Again, notice that the “sufficient sleep” (dummy code 3) condition led to the most productivity, then the “list making” (dummy code 2) condition, and lastly the control (dummy code 1) condition. Also note that the error bars do not overlap, suggesting reliable differences (just as we saw in the pairwise comparisons table).
We are going to present our results in the same pattern as we have been doing previously. Let’s start with styling the histogram.
We’re going to follow APA guidelines for styling the histogram this time because we’ve had a fair amount of practice with the bar chart. I’ll focus on the first histogram from our procedure (see the reproduction of Figure 15 below).
Figure 15
Histogram for Control_List
Here’s what we’ll need to do to get this figure APA-style ready.
Double click the figure in the output viewer to open the “Chart Editor” window.
To remove grid lines, click the “Hide Grid Lines” button in the toolbar (see Figure 33).
Figure 33
Hide Grid Lines Button in Toolbar
To remove the title, click the title once, then press the “DEL” key on your keyboard (for PC users). For some keyboards, you may have to press the “FN” (function) key while pressing the backspace key. If you are on a Mac, you should press the Apple key (⌘) and the backspace key.
Follow the same approach to delete the extra information from the side of the figure. Click once then press delete.
If SPSS does not actually delete what you try to delete, you can crop the image in the word processing software of your choice during the write up.
To update the x-axis title, you’ll need to click once on the title, then click again to enter editing mode. Remember that it is better to be clear in labeling, even if it seems like we are not being “concise.” I suggest “Difference in Proportion of Tasks Complete between Control and List Conditions.”
Figure 34 (including the figure number) shows the completed APA styling of the histogram.
Figure 34
APA Styled Histogram with Normal Curve
Note. Normal curve displayed.
Our formula for writing up results still applies: Test + Interpretation of Results + (Summary of Stats)
Test: “Mauchly’s test of sphericity…”
Interpretation: “… revealed a violation of sphericity”
Summary of Stats: “(p < .05)”
All together: “Mauchly’s test of sphericity revealed a violation of sphericity (p < .05)”
Test: “A one-way within-subjects ANOVA, implemented in a repeated measures general linear model,…”
Interpretation: “… suggested a significant impact of preparation technique on proportion of tasks complete”
Here is a reproduction of the test of within-subjects effects table to facilitate the write-up of the statistical summary.
Figure 29
Annotated Test of Within-Subjects Effects Table
Summary of Stats: “(F[1.365, 39.571] = 453.101, p < .001; degrees of freedom adjusted using the Greenhouse-Geisser correction)”
All Together: “A one-way within-subjects ANOVA, implemented in a repeated measures general linear model, suggested a significant impact of preparation technique on proportion of tasks complete (F[1.365, 39.571] = 453.101, p < .001; degrees of freedom adjusted using the Greenhouse-Geisser correction).”
Test: “Bonferroni adjusted pairwise comparisons…”
Interpretation: “… revealed reliable differences across preparation techniques”
Summary of Stats: “(ps < .001)”
All Together: “Bonferroni adjusted pairwise comparisons revealed reliable differences across preparation techniques (ps < .001).”
We don’t have a test to write up but we still need to describe the pattern of means. We’ll intersperse the statistical information with the interpretation.
Figure 31 is reproduced for convenience.
Figure 31
Estimated Marginal Means and Confidence Intervals
Interpretation (Summary of Stats): “The sufficient sleep condition yielded the greatest proportion of tasks completed (M = .735, 95% CI [.715,.755]), followed by the list-making condition (M = .681, 95% CI [.659,.702]), followed by the control condition (M = .586, 95% CI [.568,.603]).”
Mauchly’s test of sphericity revealed a violation of sphericity (p < .05). A one-way within-subjects ANOVA, implemented in a repeated measures general linear model, suggested a significant impact of preparation technique on proportion of tasks complete (F[1.365, 39.571] = 453.101, p < .001; degrees of freedom adjusted using the Greenhouse-Geisser correction). Bonferroni adjusted pairwise comparisons revealed reliable differences across preparation techniques (ps < .001). The sufficient sleep condition yielded the greatest proportion of tasks completed (M = .735, 95% CI [.715, .755]), followed by the list-making condition (M = .681, 95% CI [.659, .702]), followed by the control condition (M = .586, 95% CI [.568, .603]).
In this lesson, we’ve:
In the next lesson, we’re back to between-subjects designs but we are increasing the complexity. The factorial between-subjects ANOVA addresses multiple independent variables.