Factorial Within-Subjects Analysis of Variance (ANOVA)

This chapter will help you to:

Compare the factorial within-subjects ANOVA to the factorial between-subjects ANOVA
Review the assumptions of the within-subjects ANOVA
Review the order of analyses for the factorial design
Use the repeated measures GLM procedure to run a factorial within-subjects ANOVA
Interpret the results of the GLM procedure for a factorial within-subjects ANOVA
Present the results of a factorial within-subjects ANOVA using APA style guidelines.

Revisiting the Within-Subjects Design

The within-subjects, or repeated measures, design involves administering the various levels of the independent variables to the same sample of individuals. Because we can account for regularities stemming from the participants, we can remove that source of variation from our error term. This reduction in error results in increased statistical power.

Assumptions

We will need to check two assumptions, just as we did for the one-way within-subjects ANOVA.

Normality of DV within each combination of levels of IVs
Sphericity or equality of variance across difference scores.

A Factorial Within-Subjects Example

Let’s start off with a 3x2 within-subjects design example. Referring to our design as “3x2” indicates that we have two independent variables: the first has 3 levels and the second has 2 levels. The within-subjects name indicates that each participant will receive all levels of all IVs (and have a dependent variable score associated with each administration).

Here’s the setup. A sensation and perception psychologist is investigating the impact of lighting color (natural, green, and red) and intensity (bright and dim) on flavorfulness ratings of vegetables (100-point scale). The researcher invites 5 children to participate in all six conditions. The split-plot table is presented in Table 1.

Table 1

Split Plot Design of Example

		Color
		Green	Natural	Red
Intensity	Bright	P1 = 69 P2 = 69 P3 = 69 P4 = 73 P5 = 74	P1 = 84 P2 = 84 P3 = 84 P4 = 88 P5 = 89	P1 = 31 P2 = 39 P3 = 39 P4 = 35 P5 = 45
Intensity	Dim	P1 = 79 P2 = 79 P3 = 81 P4 = 80 P5 = 84	P1 = 68 P2 = 70 P3 = 71 P4 = 72 P5 = 71	P1 = 52 P2 = 62 P3 = 63 P4 = 56 P5 = 64

Our 3x2 design could be conceptualized as a 3x2x5 design because we are analyzing the variability in flavor scores due to our participants. We will be removing this source of variance rather than examining it for statistical significance. Therefore, we do not include participant in our description of the design.

Order of Analyses

With our data set laid out, we can think about the types and number of effects we will want to test in our within-subjects ANOVA. We will want to test for each main effect and for the interaction effect.

Determining Number of Effects

Include all Main Effects.

There will be one main effect for each independent variable.

In the Vegetable Flavor example, we have two IVs: color and intensity.

Find Unique Combinations of IV for Interaction Effects.

Eliminate any interaction effects that contain the same factors as another interaction.

Color x Intensity is the same as Intensity x Color, so there is only one interaction effect.
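If it helps to see the counting rule spelled out, here is a minimal Python sketch. Python is not part of the SPSS workflow in this chapter, and the function name `list_effects` is made up for illustration. It lists every main effect plus every unique interaction for a set of factors.

```python
from itertools import combinations

def list_effects(factors):
    """Return every main effect and every unique interaction among the factors."""
    effects = []
    for size in range(1, len(factors) + 1):
        # combinations() ignores order, so "color x intensity" and
        # "intensity x color" are only counted once
        for combo in combinations(factors, size):
            effects.append(" x ".join(combo))
    return effects

effects = list_effects(["color", "intensity"])
print(effects)
```

For our two IVs this gives two main effects and one interaction. With three IVs, the same rule would give three main effects, three two-way interactions, and one three-way interaction.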

The Factorial Within-Subjects ANOVA

When we’re done with checking our assumptions, we run the ANOVA. The ANOVA always precedes the post hoc analyses because it controls for the increased Type I error that accompanies multiple testing.

We’ll run through the SPSS procedure to produce the factorial within-subjects ANOVA using the repeated measures general linear model in the next section. For now, we’ll review the interpretation steps. Table 2 is the ANOVA table that results from the repeated measures GLM for our example.

Table 2

Within-Subjects ANOVA Table

Source	Sum of Squares	df	Mean Square	F	Sig.
Color	5368.067	1.064	5044.375	199.556	.000
Error(color)	107.600	4.257	25.278	–	–
Intensity	213.333	1	213.333	51.200	.002
Error(Intensity)	16.667	4	4.167	–	–
Color*Intensity	1786.067	1	1786.067	1448.162	.000
Error(Color*Intensity)	4.933	4	1.233	–	–

Note. Degrees of freedom for color and color*intensity were adjusted for a sphericity violation using the Greenhouse-Geisser correction.

You should notice a few things. First, as indicated by the table note, there was a violation of the assumption of sphericity and the degrees of freedom were adjusted using the Greenhouse-Geisser correction. Second, there are error terms associated with each effect. That is, we are required to calculate a separate denominator for each main effect and for the interaction effect.

The need for separate error terms stems from how we are grouping our sources of variance. Recall that the error term in our F-statistics represents variability in individual scores about the group means for which we cannot account. In a within-subjects design, we calculate how the participants’ scores vary from their own means and remove that variability from our error term. This leads to different error terms because a different set of scores is involved in each effect.
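To make the idea of one error term per effect concrete, here is a sketch in plain Python (not SPSS) that rebuilds the sums of squares and F-values by hand. The variable names are my own, and the cell labels follow the condition means reported in Table 3. Each error term is the participant-by-effect variability for that effect.

```python
from statistics import mean

colors = ["green", "natural", "red"]
levels = ["bright", "dim"]
# Ratings for P1..P5 in each condition; cell labels follow Table 3
flavor = {
    ("green", "bright"): [69, 69, 69, 73, 74],
    ("green", "dim"): [79, 79, 81, 80, 84],
    ("natural", "bright"): [84, 84, 84, 88, 89],
    ("natural", "dim"): [68, 70, 71, 72, 71],
    ("red", "bright"): [31, 39, 39, 35, 45],
    ("red", "dim"): [52, 62, 63, 56, 64],
}
n, a, b = 5, len(colors), len(levels)
people = range(n)

grand = mean(x for scores in flavor.values() for x in scores)
subj = [mean(flavor[c, i][s] for c in colors for i in levels) for s in people]
col_m = {c: mean(x for i in levels for x in flavor[c, i]) for c in colors}
int_m = {i: mean(x for c in colors for x in flavor[c, i]) for i in levels}
cell_m = {key: mean(scores) for key, scores in flavor.items()}
# Each participant's mean at each level of one IV, averaged over the other IV
s_col = {(s, c): mean(flavor[c, i][s] for i in levels) for s in people for c in colors}
s_int = {(s, i): mean(flavor[c, i][s] for c in colors) for s in people for i in levels}

# Effect sums of squares
ss_color = n * b * sum((col_m[c] - grand) ** 2 for c in colors)
ss_inten = n * a * sum((int_m[i] - grand) ** 2 for i in levels)
ss_inter = n * sum((cell_m[c, i] - col_m[c] - int_m[i] + grand) ** 2
                   for c in colors for i in levels)

# One error term per effect: the participant-by-effect variability
err_color = b * sum((s_col[s, c] - subj[s] - col_m[c] + grand) ** 2
                    for s in people for c in colors)
err_inten = a * sum((s_int[s, i] - subj[s] - int_m[i] + grand) ** 2
                    for s in people for i in levels)
err_inter = sum((flavor[c, i][s] - s_col[s, c] - s_int[s, i] - cell_m[c, i]
                 + subj[s] + col_m[c] + int_m[i] - grand) ** 2
                for s in people for c in colors for i in levels)

# F = MS effect / MS error, using the uncorrected degrees of freedom
f_color = (ss_color / (a - 1)) / (err_color / ((a - 1) * (n - 1)))
f_inten = (ss_inten / (b - 1)) / (err_inten / ((b - 1) * (n - 1)))
f_inter = (ss_inter / ((a - 1) * (b - 1))) / (err_inter / ((a - 1) * (b - 1) * (n - 1)))

print(round(f_color, 3), round(f_inten, 3), round(f_inter, 3))
```

The sums of squares and F-values should line up with Table 2. The Greenhouse-Geisser correction scales the numerator and denominator degrees of freedom by the same factor, so the F-values themselves do not change; only the degrees of freedom used to look up the Sig. value do.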

Interpreting Effects

Table 2 reveals a significant interaction effect, so we will focus our interpretation around that effect. Let’s look to the interaction plot to guide our interpretation. Figure 1 is a line chart of the effect of color and intensity on reported vegetable flavor.

Figure 1

Interaction Line Graph of Color and Intensity on Vegetable Flavor

Note. Error bars represent 95% CI.

This looks like a busy chart, but let’s look for patterns and violations of patterns to guide our interpretation. The first pattern I notice is the nice straight line for the dim lighting condition. That is, the flavor rating is highest for the green colored light, lowest for the red colored light, and in the middle for natural lighting. I’ll use that straight line as my reference as I examine what is happening in the bright lighting condition.

Recall that a significant interaction tells us that something in the relationship between one IV and the DV changes when we apply the levels of the other IV. As such, we should be looking for where our relationship (i.e., the straight line in the dim condition) changes.

The bright lighting condition does not show the same straight line; rather, the highest flavor rating is for natural light instead of green light. Importantly, however, notice that the flavor rating for the green colored light is lower than that of the green colored light in the dim condition, just as the red colored light leads to a lower rating than that of the red colored light in the dim condition. Stated another way, we would expect the natural colored light in the bright condition to behave in a similar way to that of the natural colored light in the dim condition (i.e., produce an average flavor rating), but it does not. This deviation from expectation is what is driving our interaction effect, and thus we should focus our write-up on that feature.

Reporting Post Hoc Analyses

To corroborate our interpretation of the interaction plot, we’ll want some statistics. Just as we did in the one-way within-subjects ANOVA, we will utilize the Bonferroni correction. We do have the option of performing simple effects tests as well. If we were to perform them, I would suggest that we narrow in on the interesting change happening in the natural light conditions. However, given that we perform the Bonferroni correction for multiple post hoc paired samples t-tests, it makes sense to jump to those corrections now. Table 3 provides the means and confidence intervals for flavor rating at the interaction of the IVs.

Table 3

Means and Bonferroni-adjusted Confidence Intervals

Color	Intensity	Mean	95% CI LL	95% CI UL
Green	Bright	70.800	67.708	73.892
Green	Dim	80.600	78.025	83.175
Natural	Bright	85.800	82.708	88.892
Natural	Dim	70.400	68.517	72.283
Red	Bright	37.800	31.324	44.276
Red	Dim	59.400	52.972	65.828

Note. Confidence intervals adjusted using Bonferroni correction.

With these means and confidence intervals, we can provide a succinct description of the interaction.

“The interaction plot in Figure 1 suggests that a trend of flavor ratings decreasing from green light, to natural light, to red light in the dim condition was different for the bright condition. Although the flavor ratings in green and red lights are reliably lower in the bright than dim condition (see Bonferroni-corrected 95% CI in Table 3), the trend reverses for the natural light condition. That is, ratings are reliably higher in the bright condition than the dim condition for natural light (see Table 3).”

Using SPSS: GLM for the Factorial Within-Subjects ANOVA

Luckily for us, setting up the factorial within-subjects ANOVA using the repeated measures general linear model is very similar to how we implemented the one-way within-subjects ANOVA. There are only a few differences due to the additional independent variable, which we’ll make sure to note along the way.

The Data Set

Visit the “Dataset” folder in the “Files” section of the Canvas site to download the “veggieFlavors.sav” SPSS file. When you open the file, click on the “Data View” tab in the data editor.

This file has the same “wide” style layout that we saw in the paired samples t-test and the one-way within-subjects design. Rather than finding a column with our DV and one column per IV, we find several columns of DV scores (See Figure 2).

Figure 2

Data View of veggieFlavor.sav
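If the difference between the “wide” layout and the one-column-per-variable (“long”) layout is hard to picture, here is a small Python sketch. It is only an illustration outside of SPSS. The column names are assumed from the variable names used later in this chapter, and the scores are the example ratings arranged to match the condition means in Table 3.

```python
# Wide layout: one row per participant, one column of DV scores per condition.
# Column names are assumed from the variable names used in this chapter.
columns = ["green_bright", "green_dim", "natural_bright",
           "natural_dim", "red_bright", "red_dim"]
wide = [
    [69, 79, 84, 68, 31, 52],  # P1
    [69, 79, 84, 70, 39, 62],  # P2
    [69, 81, 84, 71, 39, 63],  # P3
    [73, 80, 88, 72, 35, 56],  # P4
    [74, 84, 89, 71, 45, 64],  # P5
]

# Long layout: one row per rating, with a column for each IV and one for the DV
long_rows = []
for participant, row in enumerate(wide, start=1):
    for name, rating in zip(columns, row):
        color, intensity = name.split("_")
        long_rows.append((participant, color, intensity, rating))

print(len(wide), "wide rows become", len(long_rows), "long rows")
```

The repeated measures GLM expects the wide layout, so no conversion is needed for this lesson. The sketch only shows where the IV information is hiding: in the column names.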

Determining the DV

How do you know that these are DV and not IV columns? Here are a few clues.

First, the scale of measurement for an IV ought to be “nominal,” but these are all “scale.” It is true that SPSS files do not always have the correct scale of measurement, but the values themselves take on more than the few discrete values that would indicate dummy coding.

Second, the labels in the variable view (see Figure 3) have more detail about the values of each variable.

Figure 3

Variable View of veggieFlavor.sav

Each label states “Vegetable Flavor Ratings in…” This suggests that each variable contains “vegetable flavor ratings” under different conditions. As such, we might call our DV “flavor ratings.”

Determining the IVs

The various conditions under which the DV is assessed are the levels of the independent variables. How many IVs do we have and what shall we name them? Let’s list the conditions with generic IV names to get started (see Table 4).

Table 4

IV levels

IV 1	IV 2
Green	Bright
Natural	Bright
Red	Bright
Green	Dim
Natural	Dim
Red	Dim

With levels of “green, natural, or red,” IV 1 seems to represent colors of light. As such, we may call IV 1 “color.” IV 2 only lists two levels: “bright or dim.” Perhaps a good name for IV 2 is “intensity”.

The Research Question

It seems, thus, that our data set is the result of a study in which someone is trying to determine how the color and the intensity of light impact how flavorful one perceives vegetables to be. Importantly, each participant rated the flavorfulness of vegetables in each combination of IV levels.

Checking Assumptions

We’ll check for both normality of the DV within each combination of IV levels and for sphericity.

Normality

The assumption of normally distributed DV scores must hold for each of the groups formed by the combination of the levels of the IVs. For a within-subjects design, our DV scores are already divided by these combinations, so we need only perform our normality assessments on each of the variables.

For this lesson, I’m going to revisit the Q-Q plot. Recall that the Q-Q plot (Q-Q stands for “Quantile-Quantile”) compares the distribution of a variable to a perfectly normal distribution. The perfectly normal distribution appears as a black reference line and our data are the dots plotted around the line. The closer the dots are to the line, the more normally distributed the data are. If the middle of the dots bows above or below the reference line, our data are skewing toward the negative or positive, respectively. If the dots “snake” around the line, our data are kurtotic.
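For the curious, here is a rough Python sketch of what sits behind each dot on a Q-Q plot. The function name `qq_points` is hypothetical, and the plotting-position formula (Blom’s) is an assumption on my part, so treat the exact expected values as illustrative rather than as a reproduction of the SPSS chart.

```python
from statistics import NormalDist, mean, stdev

def qq_points(scores):
    """Pair each sorted score with the value expected under a normal distribution."""
    n = len(scores)
    normal = NormalDist(mean(scores), stdev(scores))
    points = []
    for rank, observed in enumerate(sorted(scores), start=1):
        # Blom's plotting position (assumed here for illustration)
        p = (rank - 0.375) / (n + 0.25)
        points.append((observed, normal.inv_cdf(p)))
    return points

# Ratings from one condition of the example (the cell with a mean of 70.8)
points = qq_points([69, 69, 69, 73, 74])
for observed, expected in points:
    print(observed, round(expected, 2))
```

Each pair is one dot: the observed score against the score we would expect at that rank if the data were perfectly normal. Dots fall on the reference line when the two values match.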

To produce Q-Q Plots, click on the “Analyze” menu, then hover over “Descriptive Statistics”, then click on “Q-Q Plots” (See Figure 4).

Figure 4

Navigating to Q-Q Plots in Menu Bar

The Q-Q plot window will open. You will need to move each variable from the box on the left to the box on the right (see Figure 5).

Figure 5

Q-Q Plots Window

Click “Paste” to generate the syntax. Navigate to the syntax editor window then select and run the syntax.

Figure 6 contains the Q-Q plots, arranged by IV levels.

Figure 6

Q-Q plots of Vegetable Flavor Ratings by Color and Intensity of Light

Panel layout: columns are color (natural, green, red); rows are intensity (dim, bright).

These Q-Q plots show some deviation from normal but not enough to raise concern. If you feel some concern, please check the skewness and kurtosis statistics from the “Frequencies” options in the “Descriptive Statistics” menu.

Sphericity

Checking for sphericity visually can be a bit of a daunting task. It involves calculating all the difference scores (15 in this case) then producing simple error bar charts to compare variance. Although I am a big fan of data visualization, I’m afraid the required work in SPSS makes that an untenable option. Instead, we’ll simply refer to Mauchly’s Test of Sphericity in our GLM output.
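To show where the 15 comes from, here is a short Python sketch (again, outside of SPSS) that forms every pairwise set of difference scores among the six conditions and computes each variance. The column names are assumed from the variable names used in this chapter, and the cell labels follow Table 3.

```python
from itertools import combinations
from statistics import variance

# Ratings for P1..P5 in each condition; cell labels follow Table 3
flavor = {
    "green_bright": [69, 69, 69, 73, 74],
    "green_dim": [79, 79, 81, 80, 84],
    "natural_bright": [84, 84, 84, 88, 89],
    "natural_dim": [68, 70, 71, 72, 71],
    "red_bright": [31, 39, 39, 35, 45],
    "red_dim": [52, 62, 63, 56, 64],
}

diff_variances = {}
for first, second in combinations(flavor, 2):
    # One difference score per participant for this pair of conditions
    diffs = [x - y for x, y in zip(flavor[first], flavor[second])]
    diff_variances[(first, second)] = variance(diffs)

print(len(diff_variances), "sets of difference scores")
for pair, var in diff_variances.items():
    print(pair, round(var, 2))
```

Sphericity holds when these variances are roughly equal. Eyeballing 15 of them with only 5 participants is not very informative, which is another reason to lean on Mauchly’s test.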

Setting up the GLM Procedure

We are ready to set up the GLM procedure. Navigate to the “Analyze” menu in the menu bar. Hover over “General Linear Model,” then select “Repeated Measures” from the menu. The “Repeated Measures Define Factor(s)” window should appear.

Defining Factors

We’ll want to reference the investigative work we did in the “Data Set” section above in which we determined the IVs and DV.

Recall that we decided on two IVs. The first was “color” and it had three levels. Let’s add that into the “Repeated Measures Define Factor(s)” window first. “Color” will go in the “Within-Subjects Factor Name” box and “3” will go in the “Number of Levels” box (see Figure 7). Click the “Add” button under the “Number of Levels” label to complete the factor definition.

Figure 7

Adding Color Factor

Next, add the “intensity” factor with 2 levels. Click “Add” to add the “intensity” factor to the model.

Lastly, we’ll want to define our DV by typing “flavor” into the “Measure Name” box. Click “Add” to add the DV to the model.

Figure 8 shows the completed “Repeated Measures Define Factor(s)” window.

Figure 8

Completed Define Factors Window

Click “Define” in the bottom of the window to move to the next window.

Assigning Variables

With the design of our model set, we’ll need to tell SPSS how our variables map on to the combination of IVs. Figure 9 is a portion of the “Repeated Measures” main window that contains the combination of IVs and DVs that SPSS is expecting to have corresponding variables.

Figure 9

Within-Subjects Variables Box

The important organizational key is in the parentheses above the box. It reads “(color,intensity)”. These are the IVs for which the dummy coded levels appear in the parentheses in the box below. For example, the first variable should correspond with color = 1 and intensity = 1. The second variable should correspond with color = 1 and intensity = 2.

Now, it is true that the values of dummy codes are arbitrary. As such, it does not matter which level you choose to be color = 1. What does matter, however, is that you are consistent. That means if you choose “green_bright” for the first variable, you are assigning a 1 to “green” and a 1 to “bright.” That means that for any variable for which the first number in the parentheses is a 1, you need to select a “green_” variable from the box on the left. Any variable that has a 1 in the second position needs to be assigned a "_bright" variable from the box on the left.

If we choose “green_bright” for “(1,1,flavor)”, we need to choose “green_dim” for “(1,2,flavor).” We will continue on this way until we have a variable from our dataset assigned to each of the within-subjects variables. Figure 10 has a partially completed “within-subjects variables” box.

Figure 10

Partially Completed Within-Subjects Variables Box

Try to fill in the rest of the variables before checking your approach with Figure 11.

Figure 11

Completed Within-Subjects Variable Box
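If you would like to double-check your assignment, the consistency rule can be sketched in a few lines of Python. This is only an illustration outside of SPSS. The variable names other than “green_bright” and “green_dim” are assumed from the “natural_” and “red_” prefixes, and the coding (green = 1, bright = 1) is the choice made above.

```python
from itertools import product

colors = ["green", "natural", "red"]  # color codes 1, 2, 3
intensities = ["bright", "dim"]       # intensity codes 1, 2

# Build the (color, intensity) slots in the order SPSS lists them:
# the last factor (intensity) changes fastest
slots = {}
for (c_code, color), (i_code, intensity) in product(
        enumerate(colors, start=1), enumerate(intensities, start=1)):
    slots[(c_code, i_code)] = f"{color}_{intensity}"

for codes, name in slots.items():
    print(codes, name)
```

Every slot with a 1 in the first position gets a “green_” variable, and every slot with a 1 in the second position gets a "_bright" variable, which is exactly the consistency we are after.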

Profile Plots

Now that SPSS knows how the data maps onto our model, we can ask for some profile plots to help guide our interpretation of our results.

Click the “Plots” button on the right side of the “Repeated Measures” window.

We’ll need to ask for both main effects plots and an interaction plot.

Main Effects

To create a main effect plot, move an IV from the “Factors” box on the left to the “Horizontal Axis” box on the right. Figure 12 shows the set up for the main effect of “color.”

Figure 12

Setting up Main Effect Profile Plot

Click the “Add” button to ensure that the plot gets made.

Do the same procedure for the main effect of “intensity.”

Interaction Effects

The process for making an interaction plot requires including both IVs. Let’s move “color” to the “Horizontal Axis” box and “intensity” to the “Separate Lines” box (see Figure 13). Click the “Add” button to complete the process.

Figure 13

Setting up Interaction Effect Profile Plot

Chart Type and Error Bars

Let’s ensure that our charts are helpful by choosing the “line chart” option for “Chart Type” and selecting “include error bars” under the “Error Bars” section. Figure 14 contains the completed “Profile Plots” setup.

Figure 14

Completed Profile Plot Window

Click “Continue” to return to the “Repeated Measures” window.

Estimated Marginal Means and Post Hoc Analyses

Our last step in setting up the repeated measures GLM is to ask SPSS for some statistics to help us interpret any significant effects. We’ll ask SPSS for marginal means and to provide some adjustment for post hoc tests.

Click on the “EM Means” button to open the “Repeated Measures: Estimated Marginal Means” window.

We’ll want to know the mean flavor rating across levels of color (for the main effect of color), across levels of intensity (for the main effect of intensity), and across the combination of levels of color and intensity (for the interaction effect of color * intensity). Simply drag all but the “(OVERALL)” factor from the “Factor(s) and Factor Interactions” box to the “Display Means for” box (see Figure 15).

Figure 15

Setting up Marginal Means Factors
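To see what these marginal means are, here is a small Python sketch (an illustration, not SPSS output) that averages the example ratings across the levels of the other IV. The cell labels follow Table 3.

```python
from statistics import mean

# Ratings for P1..P5 in each condition; cell labels follow Table 3
flavor = {
    ("green", "bright"): [69, 69, 69, 73, 74],
    ("green", "dim"): [79, 79, 81, 80, 84],
    ("natural", "bright"): [84, 84, 84, 88, 89],
    ("natural", "dim"): [68, 70, 71, 72, 71],
    ("red", "bright"): [31, 39, 39, 35, 45],
    ("red", "dim"): [52, 62, 63, 56, 64],
}

# Marginal mean for a color: average over both intensity levels
color_means = {
    color: mean(x for (c, _), scores in flavor.items() if c == color for x in scores)
    for color in ["green", "natural", "red"]
}
# Marginal mean for an intensity: average over all three colors
intensity_means = {
    level: mean(x for (_, i), scores in flavor.items() if i == level for x in scores)
    for level in ["bright", "dim"]
}

print(color_means)
print(intensity_means)
```

The means for the color * intensity combinations are simply the six condition means themselves.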

Should we have some significant main effects that need to be explored (that is, no significant interaction effect), we can ask SPSS to adjust the confidence intervals associated with the multiple comparisons. Click the “Compare main effects” option below the “Display Means for” box. Then select “Bonferroni” from the “Confidence interval adjustment” dropdown menu (see Figure 16).

Figure 16

Requesting Post Hoc Adjustments

Click the “Continue” button to return to the main “Repeated Measures” window.

Run GLM Syntax

We now have all of the options set for our repeated measures GLM. Navigate to the syntax editor to select and run the GLM syntax (see Figure 17 for complete syntax).

Figure 17

Syntax for Repeated Measures GLM

Interpreting the Output

Checking Variable Alignment

Before we dive into the model, let’s ensure that we have aligned our variables correctly with the factors by checking the “Within-Subjects Factors” table (see Figure 18).

Figure 18

Within-Subjects Factors

We expect to see the same colors in each row grouping under “color”. We have “green_” variables where color is 1, “natural_” where the color is 2, and “red_” where the color is 3. We also see that we have "_bright" each time there is a 1 in the “intensity” column and "_dim" each time there is a 2.

If your variables are not aligned properly, go back to the “Repeated Measures” window and reassign the variables.

Now we can check our assumption of sphericity.

Mauchly’s Test of Sphericity

Scroll to the “Mauchly’s Test of Sphericity” table. We will spot any violations of sphericity when the Sig. value for an effect is < .05. In our example, there is a violation for the “color” factor (see Figure 19).

Figure 19

Mauchly’s Test of Sphericity

With this violation, we’ll need to use the Greenhouse-Geisser corrected degrees of freedom when interpreting our test of within-subjects effects.
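Where do the corrected degrees of freedom come from? The Greenhouse-Geisser correction multiplies both degrees of freedom by an estimate called epsilon, which runs from 1 (no violation) down to 1/(k − 1) for a factor with k levels. Here is a hedged Python sketch of that calculation for the color factor. It is my own hand calculation rather than SPSS output, the variable names are made up, and the cell labels follow Table 3.

```python
from math import sqrt
from statistics import mean

colors = ["green", "natural", "red"]
# Ratings for P1..P5 in each condition; cell labels follow Table 3
flavor = {
    ("green", "bright"): [69, 69, 69, 73, 74],
    ("green", "dim"): [79, 79, 81, 80, 84],
    ("natural", "bright"): [84, 84, 84, 88, 89],
    ("natural", "dim"): [68, 70, 71, 72, 71],
    ("red", "bright"): [31, 39, 39, 35, 45],
    ("red", "dim"): [52, 62, 63, 56, 64],
}
n, k = 5, len(colors)

# Each participant's mean rating per color, averaged over intensity
by_color = [[mean(flavor[c, i][s] for i in ("bright", "dim")) for c in colors]
            for s in range(n)]

# Orthonormal contrasts among the three color levels
contrasts = [
    [1 / sqrt(2), -1 / sqrt(2), 0.0],
    [1 / sqrt(6), 1 / sqrt(6), -2 / sqrt(6)],
]
# One list of contrast scores (across participants) per contrast
scores = [[sum(w * x for w, x in zip(c, row)) for row in by_color] for c in contrasts]

def cov(u, v):
    """Sample covariance of two equal-length lists."""
    mu, mv = mean(u), mean(v)
    return sum((x - mu) * (y - mv) for x, y in zip(u, v)) / (len(u) - 1)

# Covariance matrix of the contrast scores
S = [[cov(u, v) for v in scores] for u in scores]
trace = sum(S[j][j] for j in range(k - 1))
epsilon = trace ** 2 / ((k - 1) * sum(s ** 2 for row in S for s in row))

df_effect = epsilon * (k - 1)
df_error = epsilon * (k - 1) * (n - 1)
print(round(epsilon, 3), round(df_effect, 3), round(df_error, 3))
```

Multiplying the uncorrected degrees of freedom (2 and 8) by epsilon should give values very close to the 1.064 and 4.257 reported for color.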

You may be wondering why, for three effects, we only have one effect with a Sig. value. Intensity does not have a sig. value because there is no actual check of sphericity happening for this factor. This is because intensity only has two levels. With only two levels, we can only compute one difference score. Sphericity is the assumption that the difference scores have equal variance. If we only have one set of difference scores, we cannot compare them to other difference scores in the “intensity” factor.

The reason why our interaction effect does not have a value listed for the chi-square or sig. column is because we have more comparisons than we do participants. Although this seems bad, as we’ll find soon, even when we take the most conservative correction, our results do not change.

Test of Within-Subjects Effects

When interpreting the ANOVA table presented as the test of within-subjects effects table, we’ll need to be sure to check the row that corresponds with the correct adjustment, when needed. That is, we ought to check the “Greenhouse-Geisser” rows for the effect of color and the interaction of color*intensity. Figure 20 is the “test of within-subjects effects” table from our output.

Figure 20

Test of Within-Subjects Effects

Just to reinforce the point about our robust effects, take note of how the Sig. values are nearly equal (although hard to see with SPSS rounding), regardless of the correction applied to the degrees of freedom. Although it doesn’t seem to matter now, we need to stay in the practice of reporting the adjusted degrees of freedom when sphericity is violated.

Let’s walk through the effects. The table indicates a significant main effect of color (F[1.064,4.257] = 199.556, p < .001), a significant main effect of intensity (F[1,4] = 51.200, p = .002), and a significant interaction effect of color * intensity (F[1,4] = 1448.162, p < .001).

To refresh your memory on where to find the relevant information, Figure 21 highlights the correct rows to read and includes annotations for the main effect of color.

Figure 21

Annotated Test of Within-Subjects Effects

Post Hoc Analyses

We’ve found three significant effects, but we need to focus on the interaction effect as it contains the most detail about how the two IVs simultaneously impact the DV. We’ll use the interaction plot to guide our interpretation, then follow up with the means and confidence intervals for the DV at the intersection of the IV levels.

Interaction Plot

Figure 22 is the interaction plot of color and intensity on flavor rating. Notice that SPSS uses the dummy codes provided in the “within-subjects factors” table from the beginning of the output for the GLM (see reproduced Figure 18 below).

Figure 22

Interaction Plot of Color and Intensity on Flavor Rating

Figure 18

Within-Subjects Factors

Given that we are working with the same data set as was used in the initial example, we can make quick work of reviewing the interpretation of this interaction effect.

The interaction seems to be driven, primarily, by the change in the order of means in the natural color conditions compared to the green or red color conditions. Whereas flavor ratings are higher in dim lighting for the red and green color conditions, the opposite is true for the natural color condition. That is, higher flavor ratings occur in the bright lighting than dim lighting.
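A quick way to check that reading is to subtract the dim mean from the bright mean within each color. Here is a minimal Python sketch using the condition means reported for this example; the variable names are my own.

```python
# Condition means for the example (see the table of means and confidence intervals)
means = {
    ("green", "bright"): 70.8, ("green", "dim"): 80.6,
    ("natural", "bright"): 85.8, ("natural", "dim"): 70.4,
    ("red", "bright"): 37.8, ("red", "dim"): 59.4,
}

# Bright minus dim within each color: a sign flip signals the reversal
diff = {color: means[color, "bright"] - means[color, "dim"]
        for color in ["green", "natural", "red"]}
reversed_colors = [color for color, d in diff.items() if d > 0]

for color, d in diff.items():
    print(color, round(d, 1))
print("Bright rated higher than dim for:", reversed_colors)
```

Green and red both show a negative difference (bright lower than dim), whereas natural shows a positive one. That flip in sign is the pattern we describe in the write-up.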

Means and Confidence Intervals

We’ll want to back that up with some numbers, so let’s bring in the means and confidence intervals (see Figure 23).

Figure 23

Means and Confidence Intervals for Flavor Ratings within levels of Color and Intensity

Again, SPSS uses dummy codes rather than the appropriate labels so we’ll need to insert those appropriately. Table 5 is an APA styled version of Figure 23.

Table 5

Means and Confidence Intervals for Flavor Ratings within Levels of Color and Intensity

Color	Intensity	Mean	95% CI LL	95% CI UL
Green	Bright	70.800	67.708	73.892
Green	Dim	80.600	78.025	83.175
Natural	Bright	85.800	82.708	88.892
Natural	Dim	70.400	68.517	72.283
Red	Bright	37.800	31.324	44.276
Red	Dim	59.400	52.972	65.828

Note. Confidence intervals adjusted using Bonferroni correction.
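If you want to see how a mean and its interval can be rebuilt by hand, here is a hedged Python sketch. It computes an ordinary t-based 95% confidence interval for a single condition mean, which appears to match the limits in the table for these data. I have not confirmed that this is exactly what SPSS does internally, and the function name and the hard-coded critical value are my own choices.

```python
from math import sqrt
from statistics import mean, stdev

T_CRIT = 2.776445  # two-tailed critical t for alpha = .05 with df = 4 (n = 5)

def mean_ci(scores):
    """Return the mean and the lower and upper 95% confidence limits."""
    m = mean(scores)
    half_width = T_CRIT * stdev(scores) / sqrt(len(scores))
    return m, m - half_width, m + half_width

# Ratings for the green, bright condition (the cell with a mean of 70.8)
m, lower, upper = mean_ci([69, 69, 69, 73, 74])
print(round(m, 3), round(lower, 3), round(upper, 3))
```

Two intervals that do not overlap (for example, natural bright versus natural dim) are what we are pointing to when we call a difference “reliable” in the write-up.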

We now have all that we need to complete our post hoc write up.

“The interaction plot in Figure 22 suggests that a trend of flavor ratings decreasing from green light, to natural light, to red light in the dim condition was different for the bright condition. Although the flavor ratings in green and red lights are reliably lower in the bright than dim condition (see Bonferroni-corrected 95% CI in Table 5), the trend reverses for the natural light condition. That is, ratings are reliably higher in the bright condition than the dim condition for natural light (see Table 5).”

Presenting the Results in APA Format

We’ve done a lot of the work along the way this time so we’ll just need to piece things together.

Interaction Plot

Let’s start with the interaction plot. From SPSS, you’ll want to:

Remove title from above figure.
Remove “Error bars: 95% CI” from below figure.
Remove grid lines.
Replace dummy codes with IV level label.
Update y-axis title to “Mean Vegetable Flavor Rating”.
Update x-axis title to “Color of Light”.
Move legend into chart area.

Here is a reproduction of Figure 1 for which I have completed the above steps.

Figure 1

Interaction Line Graph of Color and Intensity on Vegetable Flavor

Note. Error bars represent 95% CI.

ANOVA Write-Up

To report the outcome of the factorial within-subjects ANOVA, we need to remember the formula:

Test + Interpretation + Stat Summary

“A factorial within-subjects ANOVA was implemented through a repeated measures general linear model to determine if color of light and intensity of light impacted vegetable flavor ratings. The ANOVA revealed a significant main effect of color (F[1.064,4.257] = 199.556, p < .001), a significant main effect of intensity (F[1,4] = 51.200, p = .002), and a significant interaction effect of color * intensity (F[1,4] = 1448.162, p < .001).”
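If you find yourself typing many of these stat summaries, a tiny helper can keep the formatting consistent. The Python sketch below is purely a convenience of my own (the function names are hypothetical) and follows the bracket style used in this chapter. Because SPSS displays the smallest Sig. values as .000, any value below .001 is passed in as a stand-in and reported as p < .001.

```python
def format_p(p):
    """Drop the leading zero; report very small values as p < .001."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")

def format_f(df_effect, df_error, f_value, p):
    """Build a stat summary such as F[1,4] = 51.200, p = .002."""
    def fmt_df(df):
        # Whole-number df print without decimals; corrected df keep three
        return str(int(df)) if float(df).is_integer() else f"{df:.3f}"
    return f"F[{fmt_df(df_effect)},{fmt_df(df_error)}] = {f_value:.3f}, {format_p(p)}"

color_summary = format_f(1.064, 4.257, 199.556, 0.0001)  # stand-in for Sig. = .000
intensity_summary = format_f(1, 4, 51.2, 0.002)
print(color_summary)
print(intensity_summary)
```

The values themselves still need to come from the correct rows of the test of within-subjects effects table; the helper only handles the punctuation.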

Post Hoc Write-up

We’ll follow that same formula again, but this time we’ll refer to our tables instead of inserting the values directly into our statements.

Summary

In this lesson, we’ve:

Compared the factorial within-subjects ANOVA to the factorial between-subjects ANOVA
Reviewed the assumptions of the within-subjects ANOVA
Reviewed the order of analyses for the factorial design
Used the repeated measures GLM procedure to run a factorial within-subjects ANOVA
Interpreted the results of the GLM procedure for a factorial within-subjects ANOVA
Presented the results of a factorial within-subjects ANOVA using APA style guidelines.

In the next lesson, we’ll combine within- and between-subjects factors in a mixed factorial ANOVA.