The General Linear Model
Contents: Overview, Goals of General Linear Model, What Is A Statistical Model?, GLM for Common Analyses, GLM in jamovi
The general linear model (or GLM for short) is a popular approach to statistical inference because of its versatility. The term “general” suggests that it can be applied in most cases. We’ll discuss which cases are appropriate for the general linear model, how it implements the tests of significance, and the goals of the approach.
In this course, we will use the GLM approach in jamovi for each test of significance. This limits the number of procedures and menus to learn and makes it easier to include chart creation and assumption checking.
The main goal of GLM is to form population estimates of the strength and direction of relationships among predictor variables and outcome variables. The estimates are usually in the form of change. That is, they tell us how much the outcome variable changes when the values of the predictor variables change.
Of course, anytime we form an estimate for the population, we must account for sampling error. We want to know if our estimate is reliable (i.e., likely to be found in other samples of similar size) or just a fluke. We can form confidence intervals and perform null hypothesis significance testing on these estimates to help with this.
In addition to testing each estimate, we can assess the fit of the full model. We can summarize the extent to which our model accounts for the variability in the outcome variables. Although beyond the scope of this course, this feature of GLM allows us to compare different models so that we can choose the best-fitting model before moving to interpretation and application of the model.
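One common summary of model fit is the proportion of variability in the outcome that the model accounts for, usually written as R². The sketch below shows one way this could be computed for a single predictor. The data (hours studied and exam scores) and the helper `fit_line` are made up for illustration and are not part of the course materials.

```python
# Hypothetical data: hours studied (predictor) and exam score (outcome).
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line through x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = fit_line(hours, score)
predicted = [intercept + slope * h for h in hours]

# Total variability in the outcome, and the part the model fails to predict.
mean_score = sum(score) / len(score)
ss_total = sum((s - mean_score) ** 2 for s in score)
ss_error = sum((s - p) ** 2 for s, p in zip(score, predicted))

# Proportion of variability in the outcome accounted for by the model.
r_squared = 1 - ss_error / ss_total
print(f"slope = {slope:.2f}, R^2 = {r_squared:.3f}")
```

A value of R² near 1 means the predictions sit close to the observed scores. A value near 0 means the model does little better than guessing the mean for everyone.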
The general linear model is a way to state the direction and strength of linear relationships among variables. We call it a model because it is a guess about how the population values are related that is built from sample data. Just as an engineer might construct a small scale model to test hypotheses, so too does a statistician construct a model to test hypotheses about a larger population. There are many ways to state predicted relationships among variables, but perhaps the most popular model is the linear model, which follows a specific form.
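$$y = b_0 + b_1 x + e$$
This is the equation for a line, where y is the outcome variable, x is the predictor variable, b_0 is the intercept (the predicted value of y when x is 0), b_1 is the slope, and e is the error term.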
Slope is the change in the y-variable over the change in the x-variable. That is, it represents the direction (positive or negative) and strength (amount of change) of the relationship of the two variables. The slope is kind of like the correlation coefficient except that the correlation coefficient is standardized and thus does not have units. If you are using the general linear model for regression, you can easily plug in a value of x to get a value of y. If you are interested in testing statistically significant change from one group to another, a standardized value may be easier to work with. A standardized slope is represented as β.
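To make the comparison between the slope and the correlation coefficient concrete, here is a minimal sketch using made-up data (hours studied and exam scores, which are assumptions for illustration only). It computes the slope in its original units, the correlation coefficient, and the standardized slope. With a single predictor, the standardized slope and the correlation coefficient come out the same.

```python
import math

# Hypothetical data: hours studied (x) and exam score (y).
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(score) / n

# Sums of squares and cross-products around the means.
sxx = sum((x - mean_x) ** 2 for x in hours)
syy = sum((y - mean_y) ** 2 for y in score)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, score))

# Unstandardized slope: change in score (points) per one-hour change in x.
slope = sxy / sxx

# Correlation coefficient: standardized, so it has no units.
r = sxy / math.sqrt(sxx * syy)

# Standardized slope: rescale the slope by the two standard deviations.
sd_x = math.sqrt(sxx / (n - 1))
sd_y = math.sqrt(syy / (n - 1))
beta = slope * sd_x / sd_y

print(f"slope = {slope:.2f} points per hour, r = {r:.3f}, beta = {beta:.3f}")
```

The slope answers "how many points per hour?", while the standardized slope answers "how many standard deviations of score per standard deviation of hours?".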
The error term is the difference between what the model predicts (e.g., multiplying the slope by a value of x and adding the intercept) and the value of y that was actually observed.
In null hypothesis significance testing, we assume that the slope is 0 (i.e., no change in outcome variable across values of the predictor variable). We’ll highlight the similarity of this approach with t-tests in the next section.
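As a preview of that similarity, the sketch below uses made-up scores for two groups (an assumption for illustration). It tests the slope against 0 in a regression where group membership is coded 0 or 1, and then computes the usual pooled-variance independent-samples t statistic. The helper names are hypothetical. The two t values match.

```python
import math

# Hypothetical scores for two groups (made up for illustration).
control = [4, 5, 6, 5]
treatment = [7, 8, 6, 7]

# Regression view: code group membership as 0 (control) or 1 (treatment).
x = [0] * len(control) + [1] * len(treatment)
y = control + treatment
n = len(y)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
slope = sxy / sxx                      # equals the difference in group means
intercept = mean_y - slope * mean_x    # equals the control group mean

# Standard error of the slope, built from the squared errors of prediction.
ss_error = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
se_slope = math.sqrt(ss_error / (n - 2) / sxx)
t_regression = slope / se_slope        # tests the null hypothesis slope = 0


# t-test view: pooled-variance independent-samples t statistic.
def mean(values):
    return sum(values) / len(values)


def sum_sq(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


pooled_var = (sum_sq(control) + sum_sq(treatment)) / (n - 2)
se_diff = math.sqrt(pooled_var * (1 / len(control) + 1 / len(treatment)))
t_pooled = (mean(treatment) - mean(control)) / se_diff

print(f"t from regression = {t_regression:.4f}, t from t-test = {t_pooled:.4f}")
```

Either t value would be compared against a t distribution with n - 2 degrees of freedom to obtain the p-value.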
Although the linear regression model pre-dates the general linear model, it turns out to be just one specific case or implementation of the general linear model. With the more general form, we can include multiple predictors and multiple outcomes. The formula looks very similar but the letters represent something different.
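$$Y = XB + E$$
Here’s what changed. Rather than single variables, the letters now represent matrices: Y holds the outcome variables, X holds the predictor variables, B holds the intercepts and slopes, and E holds the errors.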
Matrix: a way to organize numbers into columns and rows where the columns typically represent some grouping factor. In statistics, the columns often represent variables.
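By changing the number and type of variables that give rise to the predictor and outcome matrices, we can use the same general formula for many common analyses, including t-tests, ANOVA, and regression.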
We will not need to worry about how to change the general formula to fit each analysis in this course because we will follow similar steps each time and jamovi will set up the equations for us. However, if you are particularly curious about the magic behind the scenes, this site gives a nice overview of a few of the simpler tests. It also relates these to “non-parametric” versions of the tests (in case you do not have normally distributed data).
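For the curious, here is a rough sketch of what setting up the equations could look like when the predictors and outcomes are arranged as matrices. It is not how jamovi is implemented. The data, the 0/1 group coding, and the small helper functions are assumptions made for illustration. Each row is one participant, the predictor matrix has a column of 1s (for the intercept) and a column for group, and the outcome matrix has two outcome variables.

```python
def transpose(m):
    """Swap the rows and columns of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)]
            for row in a]


def inverse_2x2(m):
    """Invert a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical predictor matrix: intercept column, then group (0 or 1).
X = [[1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1]]

# Hypothetical outcome matrix: two outcome variables per participant.
Y = [[4, 10], [5, 12], [6, 11], [7, 15], [8, 14], [9, 16]]

# Least-squares estimates: B = (X'X)^-1 X'Y.
Xt = transpose(X)
B = matmul(inverse_2x2(matmul(Xt, X)), matmul(Xt, Y))

# Row 0 of B: intercepts (the group-0 mean of each outcome).
# Row 1 of B: slopes (the group-1 mean minus the group-0 mean).
for label, row in zip(["intercepts", "slopes"], B):
    print(label, [round(value, 3) for value in row])
```

With a single 0/1 predictor, the intercepts are the means of the group coded 0 and the slopes are the mean differences between groups, which is why a test of group differences fits the same formula as a regression.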
Like many statistical applications, jamovi has incorporated different analytical traditions. Experimental psychology has favored the Analysis of Variance approach. You'll read more about this next week, but here is a quick introduction. Analysis of variance takes all the differences in scores in an outcome variable and determines how much of the difference (i.e., variance) is due to different ways that we can group the data (i.e., predictor variables). As mentioned before, ANOVA is one of the approaches that can be modeled with a linear equation with categorical predictors.
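The partitioning idea can be sketched in a few lines. The example below uses made-up scores for three groups (an assumption for illustration). It splits the total variability into the part due to group membership and the part left over within groups, then forms the F ratio from the two.

```python
# Hypothetical scores for three groups (made up for illustration).
groups = {
    "low": [3, 4, 5],
    "medium": [5, 6, 7],
    "high": [8, 9, 10],
}

all_scores = [score for scores in groups.values() for score in scores]
grand_mean = sum(all_scores) / len(all_scores)

# Total variability: every score's squared distance from the grand mean.
ss_total = sum((score - grand_mean) ** 2 for score in all_scores)

# Between-groups variability: how far each group mean is from the grand mean.
ss_between = sum(
    len(scores) * (sum(scores) / len(scores) - grand_mean) ** 2
    for scores in groups.values()
)

# Within-groups variability: how far each score is from its own group mean.
ss_within = sum(
    (score - sum(scores) / len(scores)) ** 2
    for scores in groups.values()
    for score in scores
)

# Degrees of freedom and the F ratio.
df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)
f_ratio = (ss_between / df_between) / (ss_within / df_within)

print(f"SS total = {ss_total:.2f}, SS between = {ss_between:.2f}, "
      f"SS within = {ss_within:.2f}, F = {f_ratio:.2f}")
```

The between-groups and within-groups pieces add up to the total, which is the sense in which ANOVA divides up the differences in scores.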
jamovi has an "ANOVA" button that contains different types of ANOVA.
For this course, we will focus on the "ANOVA" and "Repeated Measures ANOVA" options for all the analyses we'll cover. This will limit the number of steps we'll need to learn and will standardize some of the output we'll encounter.