This chapter will help you to:

  1. Explain the difference between the independent samples t-test and the one-way between-subjects ANOVA
  2. Describe the variance partitioning approach taken by ANOVA
  3. State the goals of the one-way between-subjects ANOVA
  4. State the null hypothesis for the one-way between-subjects ANOVA
  5. State the assumptions of the one-way between-subjects ANOVA
  6. Explain the need for post-hoc analyses
  7. Use SPSS to test the assumptions of the one-way between-subjects ANOVA
  8. Use SPSS to conduct a one-way between-subjects ANOVA with the General Linear Model procedure
  9. Interpret the output of the General Linear Model procedure
  10. Write up the results of the one-way between-subjects ANOVA in APA style

Generalizing Beyond T-Tests

The t-tests were a great step up from the z-test, but that approach, too, has some limitations. The analysis of variance (ANOVA) was developed to answer similar questions as the t-tests with an expanded scope. Recall that the t-test allowed us to test whether two samples came from the same population. What if we wanted to know about more than two samples?

One approach would be to perform as many t-tests as necessary to compare each mean to each other mean (e.g., \(\textrm{Mean}_A - \textrm{Mean}_B\), \(\textrm{Mean}_A-\textrm{Mean}_C\), \(\textrm{Mean}_B-\textrm{Mean}_C\)). Although simple enough, this introduces a real problem. Each time we perform a test, we run the risk of making a Type I error at the established \(\alpha\)-level. If we have \(\alpha\) set to 0.05 and we run three tests, the chance of making at least one Type I error accumulates to roughly 0.05 + 0.05 + 0.05 = 0.15. That’s not good. ANOVA solves this problem by testing all of the means in a single test.
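More precisely, if the three tests were independent, the chance of making at least one Type I error would be

\[ 1 - (1 - \alpha)^3 = 1 - (0.95)^3 \approx 0.14 \]

which is close to the simple sum above and nearly three times the error rate we intended to tolerate.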

There is another problem that the t-test cannot address. What if we wanted to know about more than one IV? ANOVA can handle multiple IVs and their interactions. We’ll deal with more than one IV in the next lesson. However, we’ll find that our approach in SPSS will be very similar to what we’ve done for the independent samples t-tests because the ANOVA also falls within the general linear model.

ANOVA is used when we want to know if more than 2 samples of continuous data come from different populations. The samples may be derived from multiple independent variables.

The Analysis of Variance Approach

For the t-test, we established the general statistical test form of

\[ t = \frac{\textrm{Effect}}{\textrm{Error}} \]

This remains the same for the ANOVA but we will need to work on how we define “effect” and “error.” Recall that, for the independent samples t-test, we judged the effect to be the difference between the two samples. Can we extend this to three or more samples? We can but we have to expand our definition of difference.

Effect of IV

Rather than sequentially comparing the sample means, we compare them all simultaneously. We do this “simultaneous difference comparison” all the time. When we calculate a variance or standard deviation, we are summarizing how different our sample values are from the sample mean. We even do this with sample means when we look at sampling distributions of sample means. The standard deviation of the sampling distribution of sample means is referred to as the standard error of the mean, and it is calculated by comparing all of the sample means to the mean of the means, better known as the grand mean.

Standard error of the mean is the standard deviation of sample means in a sampling distribution of sample means.

Grand mean is the mean of sample means.

The “effect” for an ANOVA is measured by calculating the variability of the sample means.

Sampling Error

Now that we have the numerator of our fraction set (i.e., the “effect”), we need the denominator (i.e., the “error”). The denominator in the t-test was the variability due to sampling error. The same is true for ANOVA; we want to summarize how different the scores may be due to sampling error. Let’s examine a visual of the analysis of variance approach to help us understand what best summarizes sampling error.

Figure 1 is a quasi-scatterplot of test scores. I’m calling it a “quasi-scatterplot” because it does not have a variable on the x-axis. In the figure, you’ll see the distribution of test scores on the y-axis.

Figure 1

Quasi-Scatterplot


One of the goals of statistical analyses is to understand why scores vary. In the “quasi-scatterplot” of Figure 1, we have no explanation as to why the scores are different. If we want to make a guess about what value we might randomly select from the population, the best (that is, least biased, not necessarily most accurate) guess is the mean of all of these scores. Figure 2 has the mean added as a solid black bar.

Figure 2

Quasi-Scatterplot with Mean

Note. Solid black bar represents the mean.


As you can see from Figure 2, some of our points are close to our “estimate” while others are far. The difference between our estimate and the actual data can be considered an error in prediction. We can reduce this error if we learn more about our data set.

Let’s add a predictor variable to our model. Let’s imagine that we had individuals use one of three study techniques: no studying, flash cards, or quizzing. Figure 3 includes this information by color coding the points.

Figure 3

Quasi-Scatterplot of Test Score and Study Technique

Note. Solid black bar represents the mean.


The differences in scores seem to be influenced, in part, by the study technique used. Perhaps a better (i.e., more accurate) guess would be to use the sample mean for the group in which you are interested. Figure 4 includes the sample means as dashed lines.

Figure 4

Quasi-Scatterplot of Test Score and Study Technique with Means

Note. Solid black bar represents the grand mean. Colored dashed bars represent group means.


We can now partition our estimation error into two sources of variance. We have the variability of the scores due to group membership (i.e., the effect plus sampling error) and the variability of the scores due to sampling alone. Stated differently, we are better able to predict what a score will be by knowing its group membership than by knowing nothing at all. However, we still are unsure why the scores within a given group differ.

The unexplainable variability is the error in our fraction. There are various reasons for that error, but we can assume that it has to do with things outside of our study. As such, we’ll chalk up that variability to differences in our samples. Just like in the independent samples t-test, we will combine the sampling error from all of the samples.

The Split-Plot Design

Another way to think about how ANOVA approaches comparing multiple groups is to think about farming. Imagine that you’ve got a little plot of land on which you want to grow some strawberries. You’ve gotten some advice from friends and DIY farming websites about what makes strawberries grow best. A friend tells you that sugar water as a fertilizer makes strawberries sweeter. The DIY website claims that MiracleGro is the fertilizer to use. Being a scientist, you decide to test each suggestion against a control of just water. Imagine that Table 1 represents your land and that the scores represent the sweetness rating (out of 10) for each plant in the given treatment group.

Table 1

Representation of Split Plot Design

Water Only   Sugar Water   MiracleGro
    4             1             7
    5             2             8
    6             3             9


We can already see how the scores are partitioned into the groups, but we’ll want to summarize each treatment group with a mean. We’ll add that to the bottom of the table. The bottom and sides of a table are referred to as the margins of the table. As such, we will call these the “marginal means.” Table 2 contains the marginal means.

Marginal Means are the means calculated for the levels of an independent variable.

Table 2

Table with Marginal Means

                 Water Only   Sugar Water   MiracleGro
                     4             1             7
                     5             2             8
                     6             3             9
Marginal mean        5             2             8


To measure the effect, we will want to compare how different all of the group average sweetness ratings are from the overall average sweetness (i.e., the grand mean). To measure the error, we will want to compare how different each score is from its own group’s mean.

Table 3 rewrites each value as a difference score. The individual plant sweetness scores are compared to the mean of their treatment group. The marginal means are compared to the grand mean, which here is the mean of all nine scores (equivalently, with equal group sizes, the mean of the three marginal means): \((5 + 2 + 8)/3 = 5\).

Table 3

Comparing Scores to Means

                 Water Only   Sugar Water   MiracleGro
                    -1            -1            -1
                     0             0             0
                     1             1             1
Margin               0            -3             3


Notice that the difference scores are larger for the marginal means than they are within each group. This may foreshadow the outcome. To get to a final number, we will want to add these scores together. We’ll have to square them first to avoid losing information (after all, -1 + 0 + 1 = 0). Table 4 contains the squared difference scores.

Table 4

Squared Difference Scores

                 Water Only   Sugar Water   MiracleGro
                     1             1             1
                     0             0             0
                     1             1             1
Margin               0             9             9


We further summarize by adding up these squared difference scores. We also want to maintain all of the information from the original data set, so to account for the “impact” of each individual score, we multiply the squared difference between each group’s marginal mean and the grand mean by the number of observations in that group. Table 5 contains the weighted squared difference scores.

Table 5

Weighted Squared Difference Scores

                 Water Only   Sugar Water   MiracleGro
                     1             1             1
                     0             0             0
                     1             1             1
Margin               0            27            27


We can then add up all of these difference scores into the sum of squares for the “effect” and the “error.”

\[ \textrm{sum of squares}_{\textrm{Treatment}}=0+27+27=54\] \[ \textrm{sum of squares}_{\textrm{Error}}=1+0+1+1+0+1+1+0+1=6\]
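Written in general form, with \(n_g\) observations in group \(g\), group means \(M_g\), individual scores \(x_{ig}\), and grand mean \(M_{\textrm{Grand}}\), these two quantities are

\[ \textrm{sum of squares}_{\textrm{Treatment}}=\sum_{g} n_g\,(M_g - M_{\textrm{Grand}})^2 \qquad \textrm{sum of squares}_{\textrm{Error}}=\sum_{g}\sum_{i}(x_{ig} - M_g)^2 \]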

The last bit we need to do is to turn these sums into variances (it is the analysis of variance, after all). We do this by dividing by the degrees of freedom rather than by the number of scores.

The degrees of freedom are equal to the number of observations minus the number of reference means.

For the effect of treatment, we have three marginal means that we are comparing to the grand mean (our reference mean). The degrees of freedom are thus 3 marginal means - 1 grand mean = 2.

For the error, we have 9 observations compared to three marginal means. We thus have 9 observations - 3 marginal means (our reference means) = 6.
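In general, for a one-way between-subjects ANOVA with \(k\) groups and \(N\) total observations:

\[ df_{\textrm{Treatment}} = k - 1 \qquad df_{\textrm{Error}} = N - k \]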

Our variances for effect and error are:

\[ \textrm{Variance}_{\textrm{Treatment}}=\frac{\textrm{Sum of Squares}}{df}=\frac{54}{2}=27\] \[ \textrm{Variance}_{\textrm{Error}}=\frac{\textrm{Sum of Squares}}{df}=\frac{6}{6}=1\]

The ANOVA approach is to compare the variability in group scores to the variability of individual scores in the groups. We can thus write our fraction as:

\[ \frac{\textrm{Between-Group Variance}}{\textrm{Within-Group Variance}}=\frac{\textrm{Variance}_{\textrm{Effect}}}{\textrm{Variance}_{\textrm{Error}}} \]

Our fraction turns out to be \(\frac{27}{1}=27\). That number doesn’t make a lot of sense until we compare it to the null hypothesis for an ANOVA (to be covered in the next section).
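If you would like to verify this hand computation, here is a minimal SPSS syntax sketch that enters the strawberry data and runs the same analysis through the GLM procedure. The variable names fertilizer and sweetness are made up for this example.

    * Hypothetical data entry for the split-plot example.
    DATA LIST FREE / fertilizer sweetness.
    BEGIN DATA
    1 4  1 5  1 6
    2 1  2 2  2 3
    3 7  3 8  3 9
    END DATA.
    VALUE LABELS fertilizer 1 'Water Only' 2 'Sugar Water' 3 'MiracleGro'.
    * One-way between-subjects ANOVA of sweetness by fertilizer.
    UNIANOVA sweetness BY fertilizer
      /DESIGN=fertilizer.

The resulting ANOVA table should show sums of squares of 54 (fertilizer) and 6 (error), degrees of freedom of 2 and 6, mean squares of 27 and 1, and F = 27, matching the hand calculation above.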

The One-Way Between-Subjects ANOVA

We are starting off with the simplest form of the ANOVA: the one-way between-subjects ANOVA. We outlined the “ANOVA” approach (i.e., dividing a distribution of scores into effect and error variances) in the previous section. We reviewed “between-subjects” designs (i.e., each participant receives only one level of the IV) when discussing the independent samples t-test. The new term to discuss is “one-way.” This refers to the number of independent variables/predictors in our design. As such, a “one-way between-subjects ANOVA” is a statistical analysis for determining the effect of one IV whose levels are administered to separate groups. A two-way ANOVA has two predictors, and a seven-way ANOVA has seven predictors.

Traditionally, we would reserve the ANOVA for designs with more than two groups; with only two groups, a t-test is appropriate. However, because the ANOVA is more general than the t-test, it can be used in the same ways as the t-test. Furthermore, because the general linear model is more general than the ANOVA, it can handle both the ANOVA and the t-test. It is for this reason that we will continue to utilize the GLM procedure in SPSS.

Relevant Research Questions

As the name suggests, we’d employ the one-way between-subjects ANOVA when we are interested in examining the effect of just one independent variable / predictor on a continuous dependent variable / outcome. The IV can have any number of levels, but there are practical constraints: the more levels you have, the more data you need to detect differences.

Here are a few examples:

  • Does the type of breakfast influence alertness levels?
  • Do different types of psychotherapy affect anxiety differently?
  • Does the study technique used lead to different test scores?

I’ve intentionally phrased these as yes/no questions rather than as “which” questions. This is because the ANOVA can only tell us if there is an effect of the IV; it does not tell us about specific patterns among the levels of the IV. That is, it can tell us that the means of the samples vary significantly (relative to sampling error), but it cannot tell us which of the means are reliably different from the others.

If we were studying the effect of type of breakfast on alertness levels, we may ask some individuals to eat a balanced breakfast, some to eat a high protein breakfast, and another group to have a high carb breakfast. The ANOVA could tell us that the type of breakfast does matter but would not reveal which is best or worst.

That may seem like the ANOVA is fairly pointless. It would be like telling the server at a restaurant that you are ready to order but not telling them what you want! No worries, the ANOVA is the first step in a two-step process.

Post Hoc Analyses

The ANOVA can tell us if there is an effect. We need post hoc (i.e., “after the event”) tests to determine the pattern of the effects.

For the one-way between-subjects ANOVA, our post hoc test is the Tukey Honestly Significant Difference (HSD) test. It is a modified version of the independent samples t-test we discussed in a previous lesson. The modification comes in the form of adjusting the sampling distribution based on the number of comparisons being performed.
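For reference, with equal group sizes \(n\), the Tukey HSD statistic for comparing two group means is

\[ q = \frac{M_i - M_j}{\sqrt{\textrm{MS}_{\textrm{Error}}/n}} \]

which is evaluated against the studentized range distribution rather than the t-distribution; the critical values of that distribution grow as the number of groups being compared increases.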

The order of procedures is important here:

  1. Perform the ANOVA
  2. Check for a significant effect
  3. Perform the post hoc test ONLY IF THE EFFECT IS SIGNIFICANT

SPSS will give us the output simultaneously, but we must adhere to this order. If we fail to reject the null hypothesis for the ANOVA but then claim that two of the means are reliably different, we are presenting conflicting information. Again, we can report significant post hoc results IF AND ONLY IF we have a significant effect in the ANOVA.

Assumptions

For the conclusions of an ANOVA to be valid, there are certain assumptions about the data that need to be met. These are the same as for the independent samples t-test.

  1. Normally distributed dependent variable for each group. Since we are calculating means for each sample, we need to verify this assumption for each sample.

  2. Equal variance of dependent variable for each group. This assumption is also known as homogeneity of variance. When variances are unequal, our estimate of sampling error requires adjustment.

We will discuss how to check these assumptions using SPSS in the sections that follow.

Null Hypothesis

In the general form, the null hypothesis for ANOVA is the same as the t-test. That is, we start with the assumption that all of the samples are derived from the same population. That means we should expect that all the sample means are equal and thus equal to the population mean. We could state this general form as:

\[ H_0: M_1 = M_2 = M_3 = ... = M_n = \mu_0 \]

The null hypothesis, applied to research, is that we have no effect of an IV / predictor variable. To update our null hypothesis to the more specific version for ANOVA, we need to think about how our scores would vary if there were no effect.

Although it is tempting to think that the variability would equal zero, we must still account for sampling error. That is, even without the influence of an IV, our sample means will vary randomly around the population mean. Our “Effect” term is therefore the influence of the IV PLUS random variance. If the IV has no influence, the numerator contains only random variance, which is exactly what the denominator measures, so under the null hypothesis we should expect our fraction (i.e., \(\frac{\textrm{Effect}}{\textrm{Error}}\)) to be approximately 1.

The ANOVA Fraction and the F-distribution

It is time to update our statistical test formula.

\[ F = \frac{\textrm{Effect}}{\textrm{Error}}=\frac{\textrm{Group Variance + Sample Variance}}{\textrm{Sample Variance}} \]

Notice that we’ve switched from calculating a t-value to an F-value. We have a new letter because we have a new sampling distribution. Whereas t-values could be positive or negative, F-values can only be positive because variances are squared values. This changes the shape of the distribution from normal to positively skewed, and the exact shape is determined by the two degrees of freedom (effect and error).
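If you ever want to locate an F-value on its distribution yourself, SPSS’s built-in distribution functions can help. The sketch below assumes a dataset with at least one case is open (COMPUTE needs an active file); the variable names f_crit and p_val are arbitrary. It finds the critical F and the p-value for the strawberry example, where F = 27 with 2 and 6 degrees of freedom.

    * Critical F at alpha = .05 with 2 and 6 degrees of freedom.
    COMPUTE f_crit = IDF.F(0.95, 2, 6).
    * Right-tail p-value for the observed F of 27 with 2 and 6 degrees of freedom.
    COMPUTE p_val = SIG.F(27, 2, 6).
    EXECUTE.

In practice you will rarely need this, because the ANOVA table reports the p-value (Sig.) directly.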

The ANOVA Table

You will find the degrees of freedom, F-values, p-values (SPSS calls this Sig.), and more for the different sources of variance in an Analysis of Variance (ANOVA) table. Table 6 is a prototypical ANOVA table.

Table 6

Prototypical ANOVA Table

Source       Sum of Squares   df   Mean Square        F   Sig.
Pet                  255.00    2         127.5   246.61   .000
Error                 15.00   27          .517
Total                270.00   29


The row that starts with “Pet” is the “effect variance.” It contains the information related to the statistical significance of the independent variable. The “Error” row is related to the error variance. The “Total” row combines the sums of squares and degrees of freedom of the effect and error rows.

Interpretation of the table happens in two steps.

  1. Find the row for the effect you are interested in. If you have a one-way ANOVA, there will be only one, labeled with your IV.
  2. Check if the Sig. value is less than \(\alpha\) (.05).

If the p-value is less than \(\alpha\), we will reject the null hypothesis and claim that the IV has an effect on the DV. We would then move on to interpreting the post hoc tests (Tukey HSD in this case).

The Tukey HSD Table

In SPSS, the Tukey HSD results are presented in a table titled “Multiple Comparisons” because it presents the results of comparing each group to each other group. There are a lot of redundancies in this table, so you’ll only need to check a few rows. Table 7 is an example from SPSS.

Table 7

Example Multiple Comparison Table



As with the ANOVA table, you’ll want to select the comparison of interest (e.g., Cat vs. Dog) then check if the “Sig.” value is less than \(\alpha\).

If the “Sig.” value is less than \(\alpha\), we will reject the null hypothesis that the two samples came from the same population. It is possible (and likely more common than not) to have a significant ANOVA and some of the comparisons not reach statistical significance. Remember, ANOVA tells us that there is some effect, but does not reveal the pattern.

The “Multiple Comparisons” table focuses on the difference of the groups so it does not report the means of each group. You can get the means, standard deviations, and 95% Confidence Intervals from the “Estimated Marginal Means” table.

Using SPSS: GLM for the One-Way Between-Subjects ANOVA

Although the additional benefits of ANOVA may suggest a lot of new things to learn in SPSS, we will make easy work of this because we have already tackled the Univariate GLM procedure with the independent samples t-test. The only additional aspect to cover will be the post hoc tests.

The Data Set

We’ll work with the PetExercise.sav file available in the “Datasets” folder on Canvas. Figure 6 displays the data view of the file.

Figure 6

Data View of PetExercise.sav



Notice the layout of the data set. We have two variables and 30 individuals. It seems that every person reported just one “Favorite Pet” and reported their “Weekly Exercise.” Let’s check the variable view for the expected values of “Favorite Pet” and what is being represented by “Weekly Exercise.” Figure 7 highlights this information in the variable view.

Figure 7

Variable View of PetExercise.sav



The label for “Weekly Exercise” indicates the unit is “hours.” As such, we will interpret this variable as the number of hours one exercises in a week. The possible values of “Favorite Pet” are “Fish, Cat, or Dog.”

The Research Question

Given our data set, we can infer the research question to be: “Does the type of pet one has affect how many hours of exercise one performs in a week?”

The one-way between-subjects ANOVA seems appropriate for this question and data set because we have an IV with more than two groups and a continuous DV. We’ll need to check our assumptions regarding normality and homogeneity of variance to be sure, however.

Checking Assumptions

Normality of DV for Groups

The assumption of normality needs to hold because we will be calculating means to determine if there is a reliable difference between the groups.

Step 1 Split File

To check normality for each group, we need to tell SPSS to calculate descriptive statistics separately for each group via the “Split File” command.

Click on “Data” in the menu bar, then click on “Split File”

Figure

Split File in Menu Bar



We’ll want SPSS to keep all of our output for the three groups together, so select “Compare groups” before dragging “FavoritePet” to the “Groups Based on” box. Once this window is set, click “Paste” to generate the syntax (see Figure 8).

Figure 8

Split File Window

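The pasted syntax should look something like the sketch below (the exact commands may vary slightly by SPSS version):

    SORT CASES BY FavoritePet.
    SPLIT FILE LAYERED BY FavoritePet.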


Step 2 Descriptive Statistics

Click on “Analyze,” then “Descriptive Statistics,” and choose “Frequencies.”

Figure 9

Frequencies in Menu Bar



In the Frequencies window (see Figure 10), drag our outcome variable (“WeeklyExercise”) to the “Variables” box. Click on the “Statistics” button to view the available descriptives to calculate.

Figure 10

Frequencies Window



In the “Frequencies: Statistics” window, select “Mean,” “Median,” and “Mode” from the “Central Tendency” options, and select “Skewness” and “Kurtosis” from the “Distribution” options (see Figure 11).

Figure 11

Statistics Available in Frequencies



Click “Continue” to return to the main “Frequencies” window. Click “Paste” to generate the syntax.
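The pasted frequencies syntax should look roughly like this sketch (keyword order and the standard-error keywords may differ by version):

    FREQUENCIES VARIABLES=WeeklyExercise
      /STATISTICS=MEAN MEDIAN MODE SKEWNESS SESKEW KURTOSIS SEKURT
      /ORDER=ANALYSIS.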

Step 3 Run the Syntax

Now we need to ensure our split file turns off so that we can correctly run our GLM in the next segment.

Go to your “Syntax Editor” and type “SPLIT FILE OFF.” at the end of the syntax, on its own line (see Figure 12).

Figure 12

Split File and Frequencies Syntax



Now you can select all of the syntax (by clicking and dragging from the top through the bottom line or pressing “CTRL + A” on the keyboard) then press the green “Play” button in the toolbar.

Step 4 Interpret the Output

We’ll verify the normality of “WeeklyExercise” by comparing the mean, median, and mode for each sample of “FavoritePet” (i.e., Fish, Cat, and Dog). Figure 13 contains the descriptive statistics.

Figure 13

Descriptive Statistics for Weekly Exercise by Favorite Pet



The mean and median for Weekly Exercise are very close for each group. The modes are a little lower but still acceptable.

Checking the skewness and kurtosis values, we find that each group has a skewness value between -3 and 3 and a kurtosis value between -8 and 8, so the assumption of normality appears reasonable for each group.

Equal Variance Across Groups

The other assumption we need to verify is that we have equal variance across groups. We’ll ask SPSS to perform Levene’s Test for Equality of Variances when we run the general linear model procedure, just as we did with the independent samples t-test.

Setting Up the General Linear Model

The steps will be very similar to those for the independent samples t-test. This is because, as stated above, the t-test is a form of the ANOVA.

Step 1 Select the Univariate GLM

We’ll need the Univariate GLM because we still only have one dependent variable (see Figure 12).

Figure 12

Available General Linear Models



Step 2 Assign Variables

Figure 13 is the “Univariate” window. The instructions for where to place the variables are below the figure.

Figure 13

Univariate Window



Dependent Variable. Drag WeeklyExercise here.

Fixed Factor(s). Drag FavoritePet here.

With our variables set, we’ll ask SPSS to produce a bar chart.

Step 3 Create Bar Chart

Click on the “Plots” button to open the “Univariate: Profile Plots” window.

Drag our IV (“FavoritePet”) to the “Horizontal Axis” box then click the “Add” button.

Now set the chart type to “Bar Chart” and check “Include Error Bars.” It should be the default option, but make sure the error bars are set to “Confidence Interval (95.0%).”

The completed window should look like Figure 14.

Figure 14

Completed Profile Plot Setup



When your chart options are set, click the “Continue” button at the bottom of the window to return to the “Univariate” window for the GLM procedure.

Step 4 Getting Means and Confidence Intervals

To get the group means and associated confidence intervals, click on the “EM Means” button on the right of the “Univariate” window. The “Univariate: Estimated Marginal Means” window will open (see Figure 15).

Figure 15

Estimated Marginal Means Window

Estimated Marginal Means Window


Drag “FavoritePet” to the “Display Means for:” box on the right side. Click “Continue” to return to the main “Univariate” window.

Step 5 Extra Options: Homogeneity Tests

Request Levene’s Test for the Equality of Variances by clicking on the “Options” button in the main “Univariate” window then choose the “Homogeneity tests” as depicted in Figure 16.

Figure 16

Requesting Homogeneity Tests



After turning on the “Homogeneity tests” option, click “Continue” to return to the main “Univariate” window.

Step 6 Post Hoc Test

Should we have a significant effect of our independent variable, we’ll want to know which groups differed significantly from the others. We’ll ask SPSS to produce Tukey’s HSD test for this purpose.

Click the “Post Hoc” button in the main “Univariate” window. This will open the “Post Hoc Multiple Comparisons for Observed Means” window.

Drag the IV to the “Post Hoc Tests for:” box and choose the “Tukey” test as depicted in Figure 17.

Figure 17

Selecting Tukey for a Post Hoc Test



Click the “Continue” button to return to the main “Univariate” window.

Step 7 Run the Model

With all of our variables, plots, and options set, we can now generate the syntax for our model by clicking the “Paste” button at the bottom of the main “Univariate” window.

Navigating to the syntax editor should reveal the complete syntax as found in Figure 18. Highlight the syntax starting at “UNIANOVA” through “/DESIGN=FavoritePet.” To run the model, click the green “Play” button at the top of the syntax editor.

Figure 18

Selecting the GLM Syntax

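For reference, the pasted GLM syntax should look roughly like the sketch below. The PLOT keywords for the bar chart and error bars appear in newer SPSS versions and may be named differently or absent in older ones; the remaining subcommands follow from the options we selected.

    UNIANOVA WeeklyExercise BY FavoritePet
      /METHOD=SSTYPE(3)
      /INTERCEPT=INCLUDE
      /POSTHOC=FavoritePet(TUKEY)
      /PLOT=PROFILE(FavoritePet) TYPE=BAR ERRORBAR=CI MEANREFERENCE=NO
      /EMMEANS=TABLES(FavoritePet)
      /PRINT=HOMOGENEITY
      /CRITERIA=ALPHA(.05)
      /DESIGN=FavoritePet.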


Interpreting the Output

Navigate to your Output Viewer (you can use the “Window” menu in the menu bar). We’ll finish checking our assumptions before interpreting the model.

Levene’s Test for Equality of Variances

The “Levene’s Test for Equality of Error Variances” table is presented in Figure 19.

Figure 19

Levene’s Test for Equality of Error Variances Table



Remember that we want to check the “Sig.” value in the top row.

If the Sig. value for Levene’s Test is > .05, the assumption of equal variance is OK.

Our assumption for homogeneity of variance holds as did our assumption for normality. Now we can interpret the model.

Test of Between-Subjects Effects

To determine if favorite pet had an effect on weekly exercise, we need to check the “Test of Between-Subjects Effects” table (See Figure 20).

Figure 20

Test of Between-Subjects Effects Table



I’ve highlighted the relevant row in the table for our effect. To reject the null hypothesis that all samples come from the same population, we need a Sig. value (i.e., p-value) less than alpha (.05). We can see that the Sig. value is less than .05. We therefore reject the null hypothesis and claim that those with different favorite pets are likely to report different amounts of weekly exercise.

Rejecting the null hypothesis for the ANOVA means that we have some difference among some of our groups, but we do not know which. We need to further investigate the pattern among the groups using the Tukey HSD post hoc test.

Tukey HSD

Scroll down to the “Post Hoc Tests” section and look for the “Multiple Comparisons” table (presented in Figure 21).

Figure 21

Tukey HSD Post Hoc Test Table



This table reports the difference in means for each pair of groups. I’m not sure why SPSS finds it necessary to report all combinations, but there are a lot of redundancies in this table. As such, you just need to check each combination once. Figure 22 eliminates the duplicates and highlights the important columns to facilitate reading.

Figure 22

Annotated Tukey HSD Table



As with the ANOVA table, the Sig. column indicates the p-value. We’ll use the same rule as before: if Sig. is less than .05, we will reject the null hypothesis that the two samples came from the same population. This table shows us that the amount of weekly exercise varied significantly across all of the favorite pet groups.

We can come to the same conclusion by checking the 95% confidence intervals. None of the 95% CIs for the differences include zero. That means we can reject the null hypothesis that the difference between groups is zero.

Bar Chart

Let’s check the bar chart we created to see if our interpretation of the Tukey HSD table matches. Figure 23 shows the bar chart before any APA styling has been applied.

Figure 23

Bar Chart Relating Favorite Pet to Weekly Exercise



The error bars do not overlap, so we can claim statistically significant differences. Just as we concluded from the Tukey HSD table, our bar chart indicates that those who report different favorite pets report different amounts of weekly exercise. It further illuminates the pattern among the means: those who reported “Dog” as their favorite reported the most exercise, and those who reported “Fish” reported the least.

We can find the table equivalent of this bar chart in the “Estimated Marginal Means” table (see Figure 24).

Figure 24

Estimated Marginal Means for Weekly Exercise by Favorite Pet



As with the Tukey HSD and the bar chart, we can determine statistically reliable differences because of non-overlapping confidence intervals. We’ll want to refer back to this table when writing up our results.

Presenting the Results in APA Format

We’ll write up our results in the same manner as with the t-tests but we’ll have to include the extra tests from the Tukey HSD post hoc analyses. Let’s start with the bar chart.

Styling the Bar Chart

As we’ve covered APA-styled figures previously, I’ll just present a finalized version of our bar chart for your reference in Figure 25.

Figure 25

Estimated Marginal Means of Weekly Exercise by Favorite Pet

APA Styled Bar Chart

Note. Error bars represent 95% CI


Notice that the title has been removed from the figure and placed above the figure. There is also a figure number above that. The grid lines have been removed. Lastly, there is a “Note” below the figure that explains what the error bars represent.

Writing Up the Statistical Results

Our statistical result will follow our formula:

  1. State the test used,
  2. Answer the research question, and
  3. Provide a statistical summary of the results.

We’ll do this for each test we’ve performed; once for the ANOVA and once for the Tukey HSD.

Writing Up an F-Test

Let’s state what test we’ve used to address the research question.


    “A one-way between-subjects ANOVA was implemented through the general linear model.”


Next, we can write the conclusion of our ANOVA by answering the research question.


    “The ANOVA suggested that the amount of weekly exercise differed across groups according to their favorite pet.”


As a reminder, the results of an F-test follow this format:

\[ F[df_{\textrm{effect}},df_{\textrm{error}}] = F\textrm{-value},\ p = p\textrm{-value}\]

We can find all of this information in the ANOVA table. Figure 20 is reproduced below to highlight the needed information.

Figure 20

Test of Between-Subjects Effects Table



Let’s update our F-test write-up


    \(F[2,27] = 186.03,\ p < .001\)


Although the reported p-value in the ANOVA table is “.000”, that is actually SPSS rounding to zero. The probability of obtaining this F-value can never be exactly zero because the F-distribution extends to \(\infty\) with a density that never quite reaches zero. As such, if SPSS reports “.000” in the “Sig.” column, you should report p < .001.

Writing Up the Tukey HSD Test

We’ll follow the same formula for writing up the results of the Tukey HSD as for the ANOVA (and the t-test, for that matter).

First, state the test used.


    “The Tukey HSD test was used for post hoc analyses.”


Next, report the conclusion.


    “Those who reported “dog” as their favorite pet exercised the most, followed by those who reported “cat,” and then those who reported “fish.””


Lastly, we’ll need to do some combining to make our statistical summary succinct. That is, we can combine the three comparisons because they have the same outcome. Figure 22 is reproduced below to aid in the write-up.

Figure 22

Annotated Tukey HSD Table



We can combine all of these like this:


    “All mean differences < -2.10, ps < .001”


Finally, we’ll want to include the information regarding the means and confidence intervals for each “favorite pet” group.

We’re going to use our typical set up.

\[ \textrm{M}_{\textrm{Group}} = \textrm{mean},\ 95\%\ \textrm{CI [Lower Limit, Upper Limit]} \]

We can find the necessary information in the Estimated Marginal Means table (Figure 24 is reproduced below for convenience).

Figure 24

Estimated Marginal Means for Weekly Exercise by Favorite Pet



When we are presenting multiple groups, we’ll separate their information with a semicolon (;).


    “\(\textrm{M}_{\textrm{Dog}}\) = 6.046, 95% CI [5.725,6.367]; \(\textrm{M}_{\textrm{Cat}}\) = 3.881, 95% CI [3.561,4.202]; \(\textrm{M}_{\textrm{Fish}}\) = 1.780, 95% CI [1.459,2.101]”


Let’s put it all together.


“A one-way between-subjects ANOVA was implemented through the general linear model. The ANOVA suggested that the amount of weekly exercise differed across groups according to their favorite pet (F[2,27] = 186.03, p < .001). The Tukey HSD test was used for post hoc analyses. Those who reported “dog” as their favorite pet exercised the most, followed by those who reported “cat,” and then those who reported “fish” (all mean differences < -2.10, ps < .001; \(\textrm{M}_{\textrm{Dog}}\) = 6.046, 95% CI [5.725,6.367]; \(\textrm{M}_{\textrm{Cat}}\) = 3.881, 95% CI [3.561,4.202]; \(\textrm{M}_{\textrm{Fish}}\) = 1.780, 95% CI [1.459,2.101]).”


Summary

In this lesson, we’ve:

  1. Enumerated the benefits of the ANOVA over the t-test,
  2. Described how the ANOVA approach is similar to the t-test,
  3. Described how the ANOVA approach is different from the t-test,
  4. Explained the components of variance,
  5. Stated the relevant research questions for one-way between-subjects ANOVA,
  6. Explored the assumptions of the one-way between-subjects ANOVA,
  7. Described the need for post hoc analyses,
  8. Set up and ran a general linear model in SPSS,
  9. Interpreted the results of the GLM, and
  10. Shared the results in APA format.

In the next lesson, we’ll examine the generalized version of the paired samples t-test: the within-subjects ANOVA.