Structural equation modeling (SEM) is a label for a diverse set of methods used in both experimental and observational research across the sciences, business, and other fields, most often in the social and behavioral sciences. A definition of SEM is difficult without reference to highly technical language, but a good starting place is the name itself.
SEM involves the construction of a model, an informative representation of some observable or theoretical phenomenon. In this model, different aspects of a phenomenon are theorized to be related to one another with a structure. This structure is a system of equations, but it is usually designed on paper or on a computer using arrows and symbols (known as path notation, as shown in Figure 1). The structure implies statistical, and often causal, relationships among variables and error terms, and it can include multiple equations. The equation (or equations) in SEM express mathematical and statistical properties that are implied by the model and its structural features; their parameters are then estimated with statistical algorithms (usually based on matrix algebra and generalized linear models) using experimental or observational data.
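As an illustration of what such a system of equations can look like, one widely used convention is the LISREL-style matrix notation. The symbols below do not appear in this article and are introduced here only as an assumption for the sketch: \eta and \xi stand for endogenous and exogenous latent variables, y and x for their observed indicators, and \zeta, \varepsilon, and \delta for error terms.

```latex
\begin{aligned}
\eta &= B\,\eta + \Gamma\,\xi + \zeta          && \text{(structural equations among latent variables)} \\
y    &= \Lambda_y\,\eta + \varepsilon          && \text{(measurement equations for indicators of } \eta\text{)} \\
x    &= \Lambda_x\,\xi + \delta                && \text{(measurement equations for indicators of } \xi\text{)}
\end{aligned}
```

In this notation, each arrow in a path diagram corresponds to one free entry in B, \Gamma, \Lambda_y, or \Lambda_x, and estimation chooses those entries so that the covariances implied by the model are close to the covariances observed in the data.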
Criticisms of SEM methods point to mathematical formulation problems, a tendency to accept models without establishing external validity, and philosophical bias inherent in the standard procedures.
Although there are not always clear boundaries of what is and what is not SEM, it generally involves path models (see also path analysis) and measurement models (see also factor analysis) and always employs statistical models and computer programs to investigate the structural connections between latent variables underlying the actual variables taken from observed data.
The SEM toolkit includes confirmatory factor analysis, confirmatory composite analysis, path analysis, multi-group modeling, longitudinal modeling, partial least squares path modeling, latent growth modeling and hierarchical or multi-level modeling. Use of SEM is commonly justified in the social sciences because it is a way to identify latent variables that are believed to exist, but cannot be directly observed in reality.
Researchers using SEM employ software programs (such as Mplus, lavaan (in R), LISREL, SPSS, Stata) to estimate the strength and sign of a coefficient for each modeled arrow (the numbers shown in Figure 1 for example), and to provide diagnostic clues suggesting which indicators or model components might produce inconsistency between the model and the data.
A hypothetical model suggesting that intelligence (as measured by four questions) can predict academic performance (as measured by SAT, ACT, and high school GPA) is shown in Figure 1. The concept of human intelligence cannot be measured directly in the way that one could measure height or weight. Instead, researchers have a theory and conceptualization of intelligence and then design measurement instruments such as a questionnaire or test that provides them with multiple indicators of intelligence. These indicators are then combined in a model to create a plausible way of measuring intelligence as a latent variable (the circle for intelligence in Figure 1) from the indicators (square boxes with scale 1-4 in Figure 1).
In SEM diagrams, latent variables are commonly shown as ovals and observed variables as rectangles. The diagram above shows how error (e) influences each intelligence question and the SAT, ACT, and GPA scores, but does not influence the latent variables. When applying this model to observed data generated from the instruments, the researcher can recover a measure of intelligence and academic performance for each individual observed with the instruments, with a margin of error that is implied by the instruments. The researcher can then use intelligence to test a hypothesis, for example that intelligence causes academic performance (which is another latent variable in Figure 1), defined by a path model drawing an arrow from intelligence to performance. Figure 1 is therefore a general example of an SEM involving measurement of latent variables and estimation of a hypothesized effect between at least one latent variable and another observed or latent variable (in this case latent academic performance).
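The logic of the Figure 1 model can be made concrete with a small simulation. The following is a minimal sketch, not an estimation procedure: it assumes standardized variables, and the loadings and the path coefficient (`IQ_LOADINGS`, `PERF_LOADINGS`, `BETA`) are hypothetical values chosen for illustration, not the numbers shown in Figure 1. It generates data from the assumed structure and compares the correlation the model implies between one intelligence question and the SAT score with the correlation found in the simulated sample.

```python
import math
import random

# Hypothetical standardized values; none of these are taken from Figure 1.
IQ_LOADINGS = [0.8, 0.7, 0.6, 0.5]   # four intelligence questions
PERF_LOADINGS = [0.9, 0.8, 0.7]      # SAT, ACT, GPA
BETA = 0.6                           # path: intelligence -> academic performance


def simulate(n, seed=0):
    """Draw n cases of the seven observed variables from the assumed model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        intelligence = rng.gauss(0.0, 1.0)  # latent, never observed directly
        # Structural equation; the disturbance keeps Var(performance) = 1.
        performance = BETA * intelligence + rng.gauss(0.0, math.sqrt(1 - BETA ** 2))
        # Measurement equations: indicator = loading * latent + error (e).
        questions = [l * intelligence + rng.gauss(0.0, math.sqrt(1 - l ** 2))
                     for l in IQ_LOADINGS]
        scores = [l * performance + rng.gauss(0.0, math.sqrt(1 - l ** 2))
                  for l in PERF_LOADINGS]
        rows.append(questions + scores)
    return rows


def sample_correlation(xs, ys):
    """Pearson correlation of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def implied_correlation(i, j):
    """Model-implied correlation between question i and score j:
    trace the path question <- intelligence -> performance -> score."""
    return IQ_LOADINGS[i] * BETA * PERF_LOADINGS[j]


rows = simulate(5000)
q1 = [r[0] for r in rows]    # first intelligence question
sat = [r[4] for r in rows]   # SAT score
print("implied:", round(implied_correlation(0, 0), 3))
print("sample: ", round(sample_correlation(q1, sat), 3))
```

SEM software works in the opposite direction: given only the observed correlations or covariances, it searches for the loadings and path coefficients whose implied values best reproduce them.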
A great advantage of SEM is that all of these measurements and tests occur simultaneously in one statistical estimation procedure, where the errors throughout the model are calculated using all information from the model. This means the errors are more accurate than if a researcher were to calculate each part of the model separately.