
Application of Structural Equation Modeling to the Social Sciences: A Brief Guide for Researchers


  • Vaithehy Shanmugam
    University of Central Lancashire

  • John E. Marsh
    University of Central Lancashire

Authors’ Note

For correspondence: Vaithehy Shanmugam, School of Psychology, University of Central Lancashire, Darwin Building, Preston, Lancashire, United Kingdom, PR1 2HE, Phone: (+44) 177 28 3257, E-mail: [vshanmugam@uclan.ac.uk].

Mesure et évaluation en éducation, Volume 37, Number 3, 2015, pp. 1-148



Structural Equation Modeling (SEM) belongs to the family of multivariate analyses and serves a purpose very similar to that of multiple regression. While multiple regression is used to examine the independent predictors of a dependent variable out of a set of independent variables, SEM translates a series of hypothesized ‘cause and effect’ relationships between variables into a composite hypothesis concerning patterns of statistical dependencies (Shipley, 2000) and offers a comprehensive method for the simultaneous quantification and testing of theoretical models (Pugesek, Tomer, & von Eye, 2003). Specifically, the theoretical model represents causal processes that generate observations of multiple variables (Bentler, 1980), and the relationships between such variables are described as parameters that indicate the magnitude of the effect (either direct or indirect) that the independent/exogenous variable(s) have on the dependent/endogenous variable(s). As such, if the model achieves acceptable ‘goodness of fit’, then it is argued that the postulated relations of the model are plausible; however, if the goodness-of-fit indices are inadequate/unacceptable, then the tenability of such relations is rejected (Byrne, 2006). SEM is an extension of multiple regression in that it can assess several regressions simultaneously and allows variables to be classified as both exogenous and endogenous within the same model (Schumacker & Lomax, 2004). Moreover, it takes a confirmatory, as opposed to exploratory, approach to data analysis by demanding that the pattern of relationships between variables be specified prior to model testing (Byrne, 2006).
Finally, SEM can account and correct for measurement error, be it random (e.g., sampling error) or systematic (e.g., underlying psychometric properties related to the measure). It does so at the measurement level by incorporating the error/residual variance in the estimated model, of which traditional multivariate analyses such as regression are not capable (Kline, 2005), and at the structural level by incorporating disturbances.

Although initially developed for use in genetics (Wright, 1921), since its introduction, the use of SEM as a statistical tool to evaluate theoretical and conceptual models and/or to test empirical relations between psychological constructs has gained momentum and grown in popularity in several disciplines such as psychology, sociology and economics. Although there appear to be a number of books dedicated to this topic within education (e.g., Teo & Khine, 2009), the application of such analyses within this research area is considered to be limited (Karadag, 2012). This is quite surprising given that the measures, research questions, and research designs used within education have become more complex, thus calling for more sophisticated and robust methods of analysis. Therefore the purpose of the current paper is to introduce and explain the key concepts and principles of SEM, discuss the advantages of SEM over other multivariate analyses, and integrate a research example to demonstrate the various stages involved in SEM.

Key concepts and principles in SEM

It is generally accepted that a two-step approach is undertaken when conducting SEM (e.g., James, Mulaik, & Brett, 1982; Kline, 2005; Schumacker & Lomax, 2004). Specifically, this approach involves the testing of two models: the measurement model and the structural model. Before proceeding to these, it is important to note that there are two primary types of variables in SEM: observed (indicators; e.g., individual items pertaining to psychometric instruments) and latent (constructs; e.g., subscales of psychometric instruments). Latent variables are not measured directly; rather, they are inferred constructs measured indirectly through the observed variables. Typically, multiple observed variables serve as indicators of each latent variable, and the benefit of this is that measurement errors related to the reliability or the validity of the observed variables are accounted for (Kline, 2005).

Measurement model

The measurement model is a confirmatory factor model and is usually tested first in SEM. Its main objective is to establish the reliability and validity of the observed variables in relation to the latent variable (e.g., are the observed variables accurately measuring the construct under examination?). Traditionally, each latent variable should be represented by multiple indicators (three as a minimum). The relationship between the latent variable and each observed variable is indicated by a factor loading (Byrne, 2006), which shows the extent to which the observed variable is able to measure the latent variable. In addition to producing factor loadings, the measurement model also estimates the measurement error associated with the observed variables. Measurement error specifically highlights the extent to which the observed variables are measuring something other than the latent variable they are proposed to measure (Kline, 2005). A factor loading of at least .40 per observed variable is deemed acceptable (Ford, MacCallum, & Tait, 1986).

Structural model

The second step in SEM involves the structural model. While the measurement model is concerned with the reliability and validity of the measurement of the latent variables, the structural model is primarily concerned with the interrelations between the latent variables. Specifically, the structural model tests the extent to which the hypothesized or theorized relations between the latent variables are supported within the current sample under investigation.

Prior to conducting SEM analyses, it is advised that three preliminary issues related to sample size and convergence, model specification, and model identification are addressed (Marsh, 2007; Schumacker & Lomax, 2004). Each issue will now be discussed accordingly.

Sample size and item parceling

In order for the model to converge (run), it is recommended that there be between five and ten participants per observed variable (e.g., Bentler & Chou, 1987; Byrne, 2006), with a total of 200 participants as the minimum (Bentler, 1999). However, this may not always be feasible within the research setting, especially if a large number of psychological constructs or complicated theoretical models are being tested. A common method to overcome this shortage in participant numbers is item parceling, whereby the items of the underlying latent variable are grouped together to produce parcels of two to six items (Marsh & Hau, 1999; Yang, Nay, & Hoyle, 2010).

Several methods for parceling have been suggested, including parceling all items into a single parcel (mean of each latent variable), splitting all odd and even items into two parcels, randomly selecting a certain number of items to create three or four parcels (e.g., Yang et al., 2010), parceling items that have similar factor loadings (Cattell & Burdsal, 1975), parceling items with high factor loadings together with items with low factor loadings to equalize the loadings (Russell, Kahn, Spoth, & Altmaier, 1998), and parceling items according to their skew (Hau & Marsh, 2004; Nasser-Abu Alhija & Wisenbaker, 2006; Thompson & Melancon, 1996). Although no one method of parceling is advocated over another, the guidelines of Hau and Marsh (2004) and Nasser-Abu Alhija and Wisenbaker (2006) are seen as favourable because items are parceled according to the size and direction of their skew. Specifically, the most skewed item is parceled with the least skewed item, then the next most skewed with the next least skewed, and so on. In addition, this process is counterbalanced in that negatively skewed items are parceled with positively skewed items.
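The skew-based pairing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure as implemented by the cited authors: the function names (`skewness`, `parcel_by_skew`) and the item data are hypothetical, skew is computed as the population third standardized moment, items are ranked by the absolute size of their skew, and the counterbalancing of skew direction is left out for brevity.

```python
from statistics import mean


def skewness(values):
    """Population skewness (third standardized moment) of a list of scores."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5


def parcel_by_skew(items):
    """Pair the most skewed item with the least skewed item, and so on.

    `items` maps an item name to its list of scores (one per participant).
    Returns a list of (member_names, parcel_scores) tuples, where each
    parcel score is the mean of the member items for that participant.
    """
    # Rank item names from most to least skewed (absolute size of skew).
    ranked = sorted(items, key=lambda name: abs(skewness(items[name])), reverse=True)
    parcels = []
    while ranked:
        members = [ranked.pop(0)]         # most skewed remaining item
        if ranked:
            members.append(ranked.pop())  # least skewed remaining item
        scores = [mean(row) for row in zip(*(items[m] for m in members))]
        parcels.append((members, scores))
    return parcels


# Hypothetical responses of five participants to four items of one scale.
example_items = {
    "q1": [1, 1, 1, 2, 5],
    "q2": [1, 2, 3, 4, 5],
    "q3": [1, 1, 2, 3, 5],
    "q4": [1, 2, 3, 3, 5],
}
for members, scores in parcel_by_skew(example_items):
    print(members, scores)
```

With an odd number of items, the sketch leaves the last (middle-ranked) item in a parcel of its own; in practice a researcher would decide how to allocate it.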

To parcel or not to parcel?

The use of parcels instead of individual indicators has sparked debate among researchers (see Little, Cunningham, Shahar, & Widaman, 2002). There are numerous empirical justifications for using parceling, including increased reliability (Kishton & Widaman, 1994), achieving normality within the data (Bandalos, 2002; Nasser & Wisenbaker, 2003), remedying small sample sizes and unstable parameter estimates (Bandalos & Finney, 2001), as well as a greater likelihood of achieving a proper model solution (Bandalos, 2002; Marsh, Hau, Balla, & Grayson, 1998). However, such advantageous properties of item parceling are only said to be effective if the observed items of the underlying latent factor are unidimensional (Bandalos, 2002; Hall, Snell, & Foust, 1999; Yang et al., 2010).

Empirically, the effects of parceling over individual indicators of the latent factors have been documented in several simulation studies (e.g., Bandalos, 2002; Marsh et al., 1998; Yuan, Bentler, & Kano, 1997). These results have demonstrated that it is more beneficial to parcel than to use the same number of individual items: when parcels were used, not only were the fit indices more adequate, but the results were also more likely to yield a proper solution. However, parceling has been likened to ‘cheating’, as it creates bias in the individual’s responses by changing their original scores, which could subsequently manufacture a false structure (Little et al., 2002). Moreover, many of the measures often employed within research have established population norms; by parceling items, the meaningfulness of these norms can be lost (Little et al., 2002; Violato & Hecker, 2007). For example, if one were to compare the eating disordered symptoms of female students and male students using the Eating Disorder Examination Questionnaire (EDEQ 6.0; Fairburn & Beglin, 2008), the use of parcels would prohibit any meaningful comparisons with pre-existing norms; thus, in terms of applied implications, the use of parcels produces arbitrary data.

Model Specification

Model specification relates to the process where assertions are made about which effects are null, which are fixed, and which are freely estimated (see Figure 1). SEM operates only using a priori hypotheses. Thus any research question developed should be guided by relevant theory and empirical evidence as well as reflected in reliable and valid psychometric measures. Using theory and empirical evidence, testable model(s) are developed and subsequently specified. Specifically, the relations between variables at both the measurement (e.g., pathways, covariances, etc.) and structural level are clarified and defined.

Model Identification

Model identification refers to whether a unique set of parameter values is consistent with the data, that is, whether it is possible to attain unique values for the parameters of the model (Violato & Hecker, 2007). Specifically, identification relates to the transposition of the variance–covariance matrix of the observed variables (the data points; number of observed variables) into the structural parameters of the model under examination (Byrne, 2006).

There are three variants of model identification: under-identified, just-identified, and over-identified. An under-identified model is one where the number of parameters to be estimated exceeds the number of data points. This type of model is problematic because it contains insufficient information to attain a determinate solution for the parameter estimates, meaning that there is an infinite number of possible solutions (Byrne, 2006). Moreover, the parameter estimates are considered to be untrustworthy (Kline, 2005). A just-identified model is one where the number of data points equals the number of parameters to be estimated (e.g., a saturated model). This type of model is also considered problematic, as it will always achieve a perfect fit to the empirical data (Pugesek et al., 2003) and can never be rejected. The final variant is an over-identified model, where the number of available data points is greater than the number of parameters to be estimated, thus resulting in positive degrees of freedom and allowing for model rejection.

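To calculate whether a model is identified, the number of data points, p(p + 1)/2, is compared with the number of parameters to be estimated, where p = the number of observed variables. The model is over-identified when p(p + 1)/2 exceeds the number of parameters to be estimated, and the difference between the two gives the degrees of freedom.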
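As a rough illustration of this counting rule, the sketch below classifies a model from the number of observed variables and the number of freely estimated parameters. The function names are hypothetical, and the one-factor examples are illustrative rather than taken from the paper. Note that the counting rule is a necessary condition only; a model can pass it and still fail to be identified for other reasons.

```python
def data_points(p):
    """Unique variances and covariances among p observed variables."""
    return p * (p + 1) // 2


def identification_status(p, free_parameters):
    """Classify a model by comparing data points with free parameters."""
    df = data_points(p) - free_parameters
    if df < 0:
        return "under-identified", df
    if df == 0:
        return "just-identified", df
    return "over-identified", df


# Illustrative one-factor models with one loading fixed to 1:
# three indicators -> 2 loadings + 3 error variances + 1 factor variance = 6
print(identification_status(3, 6))
# four indicators -> 3 loadings + 4 error variances + 1 factor variance = 8
print(identification_status(4, 8))
```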

Model Estimation

Following model specification and identification, the hypothesized model is then estimated. Model estimation determines how well the tested model fits the data, based on the extent to which the observed covariance matrix (data generated) is equivalent to the model-implied covariance matrix (e.g., hypothetical model; Lei & Wu, 2007). This comparison between two covariance matrices can be expressed in the following equation: ∑ = ∑(θ). In this equation, ∑ (sigma) represents the population covariance matrix of the observed variables, θ (theta) represents the vector comprised of the population parameters, and ∑(θ) is the covariance matrix presented as a function of θ (Violato & Hecker, 2007). Violato and Hecker proposed a “hand in glove” metaphor, which may be useful in understanding this process. In this metaphor, the glove is the model (∑(θ)), while the hand is the data (∑). In the attempt to find the perfect fitting glove (i.e., model or theory) for the hand (i.e., the data), the lack of fit (i.e., too big, too small) is represented by the θ vector.

Unlike ANOVAs and regressions, which tend to use least squares methods of estimation, SEM uses iterative estimation methods, in which calculations are repeated until the best-fitting estimates of the parameters are obtained. There are a number of estimation methods, including Maximum Likelihood (ML), Generalised Least Squares (GLS), and Asymptotic Distribution Free (ADF). However, the most frequently employed method is ML, which is often the default estimation procedure in SEM programmes. The ML procedure provides the parameter estimates that maximize the likelihood that the predicted model fits the observed model based on the covariance matrix (Bollen, 1989; Violato & Hecker, 2007), and it rests on the assumptions that the data are normally distributed and that the sample size is large.
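For readers who want to see what the iterative ML procedure minimizes, the standard ML discrepancy function is given below as a sketch. The symbols S (the sample covariance matrix) and p (the number of observed variables) are notation introduced here for illustration; they do not appear in the equation above.

```latex
F_{ML}(\theta) = \ln\lvert\Sigma(\theta)\rvert
  + \operatorname{tr}\!\left(S\,\Sigma(\theta)^{-1}\right)
  - \ln\lvert S\rvert - p
```

The function equals zero when the model-implied matrix reproduces S exactly and grows as the two matrices diverge. Under the usual formulation, the model chi-square is obtained as (N − 1) times the minimized value of this function, where N is the sample size.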

Model Evaluation and Respecification

This process of model estimation leads to the Goodness-of-Fit (GOF) testing. The GOF is critical to conducting SEM, as it allows the adequacy of the tested model to be evaluated and permits comparison of the efficacy of multiple competing models. Specifically, GOF reflects the extent to which the model fits the data. In order to find a statistically significant theoretical model with practical and substantive meaning, multiple goodness-of-fit indices to assess model fit have been put forward. Although there are no concrete rules about which fit statistics to use to evaluate models, a combination of fit statistics is typically employed when comparing and contrasting models.

The first is the non-statistical significance of the chi-square (𝜒2). A non-significant 𝜒2 suggests that the sample covariance matrix (the observed data) and the reproduced model-implied covariance matrix (the tested model) are similar. However, it should be noted that 𝜒2 is considered to be highly sensitive to sample size (Cheung & Rensvold, 2002). Specifically, the larger the sample size (generally over 200), the greater the tendency for 𝜒2 to be significant, whereas with a lower sample size (below 100), the 𝜒2 test has a tendency to indicate a non-significant probability level. As such, Kline (2005) recommended employing the normed 𝜒2, which is calculated by dividing the 𝜒2 value by the degrees of freedom. A normed 𝜒2 value of less than three (3) has been suggested to indicate a reasonable fit to the data (Bollen, 1989).

In addition to the 𝜒2 statistic, other incremental fit indices have also been proposed to supplement it, which are said to be designed to avoid the problems associated with sample size as related to the 𝜒2 test (Bentler & Bonett, 1980). These include the Root-Mean-Square Error of Approximation (RMSEA), Standardised Root-Mean-Square Residual (SRMR), Comparative Fit Index (CFI), the Non-Normed Fit Index (NNFI), Tucker-Lewis Index (TLI), Goodness-of-Fit Index (GFI), and the Akaike Information Criterion (AIC). Specifically, an RMSEA value of < 0.05 indicates a good-fitting model (Browne & Cudeck, 1993). For CFI, NNFI, TLI, and GFI, a value > 0.90 is regarded as an acceptable fit to the data, while for the SRMR a value of < 0.01 is considered a good fit (e.g., Kline, 2005; Marsh, 2007; Marsh, Hau, & Wen, 2004). The AIC is used to compare a number of competing models and, in these instances, the model that generates the lowest AIC value is regarded as the best-fitting model. The actual AIC value is not relevant, although AIC values that are close to zero are considered to be more favourable. When selecting a model out of a number of possibilities, parsimony should be employed, with the simplest model being selected (Bollen & Long, 1993).
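The cutoffs quoted above can be collected into a small screening helper. The sketch below is illustrative only: the function names are hypothetical, the input values are invented for demonstration, and the thresholds are simply those cited in the text rather than universal rules.

```python
def screen_fit(chi_square, df, rmsea, cfi, nnfi):
    """Compare fit indices with the cutoffs quoted in the text."""
    return {
        "normed chi-square < 3": chi_square / df < 3,
        "RMSEA < 0.05": rmsea < 0.05,
        "CFI > 0.90": cfi > 0.90,
        "NNFI > 0.90": nnfi > 0.90,
    }


def lowest_aic(models):
    """Return the name of the competing model with the smallest AIC."""
    return min(models, key=models.get)


# Hypothetical fit results, not taken from any study.
checks = screen_fit(chi_square=250.0, df=100, rmsea=0.04, cfi=0.95, nnfi=0.93)
for rule, passed in checks.items():
    print(f"{rule}: {'met' if passed else 'not met'}")

# Hypothetical AIC values for two competing models.
print(lowest_aic({"model A": 310.2, "model B": 295.7}))
```

A helper of this kind only flags which cutoffs are met; deciding whether a model is acceptable still requires weighing the indices together and against theory.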

If the model’s fit is acceptable, this suggests that the proposed relationships or the hypothesized model fits the data. If the model’s fit is not adequate, then the model needs to be respecified. However, the respecification needs to be grounded in theoretical relevance, as opposed to empirical relevance alone. Specifically, the respecification of causal relationships needs to be theoretically meaningful and supported by empirical evidence. It should not be purely data-driven, as this can result in a good-fitting model in the absence of any theoretical value. Respecification can be conducted in a number of ways. Firstly, non-significant pathways can be deleted or trimmed. Secondly, parameters can be added or deleted in the model to improve the fit. SEM programs provide modification indices such as the Lagrange Multiplier tests and Wald tests, which offer suggestions for this; however, proceeding with such suggestions should be driven by theory and consistent with the research hypotheses. Once a good-fitting model is achieved through respecification, the newly formulated model should ideally be tested on a new sample/data.

Is SEM always appropriate for use?

As the research questions being tested have become more complex, there has been a concomitant rise in the demand by reviewers and journal editors for authors to undertake more sophisticated modes of analyses. However, caution must be exercised here as SEM may not be suitable for all research questions. Reviewers and journal editors often want an author to use SEM but do not always understand that it is inappropriate in some cases. In this respect, it is important to clearly understand the nature of the research question being examined, as well as the answers that one would like to generate. Therefore, prior to applying SEM, it is important to consider the strengths and weaknesses of SEM over other multivariate analyses.

Advantages of SEM

  1. Results generated by SEM can provide evidence for causal relationships between variables. However, as SEM is a priori dependent on theory and previous empirical evidence, researchers must be aware and confident of a relationship between the variables (observed and measured) as well as the direction of that relationship. Moreover, such a relationship should occur in isolation and not be influenced by other variables (Kline, 2005). It is important to note that SEM does not prove causality; rather, it only highlights whether the hypothesized relations or model are consistent with the empirical data.

  2. SEM allows researchers to test and compare a number of competing/alternative models, promoting robust theory-building and validation.

  3. SEM can test models with multiple dependent and independent variables, as well as mediating and interactive effects.

  4. SEM is able to handle difficult data (e.g., non-normal, incomplete, multi-level and longitudinal data). For example, SEM programs have procedures that are robust against violations of normality (e.g., Maximum Likelihood Estimation) and procedures for dealing with missing data (e.g., multisample analysis; multiple imputation; expectation maximization algorithm; full information maximum likelihood; see Tomarken & Waller, 2005). It can also work with experimental and non-experimental data, as well as continuous, dichotomous, and interval data.

  5. SEM uses CFA to partial out measurement error from multiple indicators underlying each latent variable, and therefore subsequently enables the relationships between “error free” latent variables to be tested (Violato & Hecker, 2007).

Disadvantages of SEM

  1. Given that SEM is dependent on theory and previous empirical literature, there is scope for investigators to misinterpret the causal relationships between variables, especially if the model being tested is exploratory, is grounded in weak theory, employs poor research designs, or is guided by ambiguous hypotheses (Violato & Hecker, 2007).

  2. SEM is an approximation of reality, in that it omits variables implicated in the model or causal processes to achieve goodness-of-fit (Tomarken & Waller, 2005). In doing so, it can create a misrepresentation of the measurement and/or structural processes, resulting in more biased and/or inaccurate parameter estimates and standard errors (e.g., Reichardt, 2002).

  3. SEM is unable to compensate for inadequate psychometric properties of measures (Byrne, 2010; Kline, 2005), in particular measures that are underpinned by poor reliability. The employment of unreliable measures or the use of a single measure to reflect the latent variable is likely to reduce the amount of variability in the latent variable, thus increasing measurement error. Similarly, it cannot compensate for the limitations of the research design nor its methodology.

  4. SEM requires a large sample size. A minimum of 200 participants is considered sufficient; however, a rule of thumb of 5–10 participants per indicator has been proposed (e.g., Byrne, 2006). Still, it should be noted that in populations where this is not always feasible, there are ways to overcome the shortage of participants (see the section on parceling above).

  5. SEM rejects theory and models on the basis of the global fit statistics. It is possible for the relations between variables to be significant although the model yields a poor fit, thus indicating that the model does not fit the data. Before rejecting the model, researchers should consider checking for errors in data or violations of SEM assumptions. Another proposed method to improve fit indices is to estimate as many parameters as there are data-points (just identified model); however, this renders the data meaningless, explains nothing more about the tested model and, as such, should be avoided (Mulaik et al., 1989). In cases where poor global fit indices persist, researchers can rely on the effect size of the association, confidence intervals, and other lower-order components when evaluating a model (Tomarken & Waller, 2003, 2005). However, researchers should be aware of alternative modes of analyses such as the MACROS developed by Preacher and Hayes (2004, 2008; Hayes, 2013) available on SPSS, which can be used to test for similar relations (e.g., mediation, moderation, temporal patterns, etc.), and are not dependent on fit statistics.

Step 1: Model specification: Outline and define the research problem

In this step researchers should develop and formulate a research question that is grounded in theory and underpinned by empirical evidence. Moreover, as SEM functions using a priori hypotheses, it is critical that the measurements used to capture and reflect the chosen construct are valid and reliable for use within the given population. Accordingly, based on theory and evidence, researchers should formulate a testable model (or a number of competing testable models). This testable model is then specified. Specifically, the relationships between variables at both the measurement and structural model should be noted.

In the current example, the research problem was aimed at examining the applicability of the components underlying the transdiagnostic cognitive-behavioural theory of eating disorders within an athletic population. (For a more comprehensive outline of the theory and literature, see Shanmugam, Jowett, & Meyer, 2011.) The transdiagnostic cognitive-behavioural theory of eating disorders proposes that the mechanisms that cause and maintain eating disorders (be it Anorexia Nervosa, Bulimia Nervosa, or Eating Disorder Not Otherwise Specified) are the same (Fairburn et al., 2003). Specifically, Fairburn et al. postulated that the four core psychopathological processes of clinical perfectionism, unconditional and pervasive low self-esteem, mood intolerance, and interpersonal difficulties all interrelate with the core psychopathology of eating disorders (over-evaluation of eating, shape, weight, and their control) to instigate both the development and the maintenance of the disorder. While their transdiagnostic cognitive-behavioural theory of eating disorders provides a grounded conceptual framework to understand how eating disorders may arise, with relevant evidence to support the associations among its main components within the general population (e.g., Collins & Read, 1990; Dunkley, Zuroff, & Blankstein, 2003; Dunkley & Grilo, 2007; Leveridge, Stoltenberg, & Beesley, 2005; Stirling & Kerr, 2006), there is an observable gap in the scientific understanding of such processes within the athletic population, as well as a poor understanding of the concomitant interrelationships among the processes involved. Thus, the purpose of the present example was to test the main components of Fairburn et al.’s transdiagnostic theory in a sample of athletes to further understand eating psychopathology.

Guided by Fairburn et al.’s (2003) theory and relevant empirical research, the first objective was to test a model that proposed linkages between interpersonal difficulties, clinical perfectionism, self-esteem, depression, and eating psychopathology (see Figure 1). Specifically, it was hypothesized that dispositional interpersonal difficulties as reflected in athletes’ insecure attachment styles [1] would negatively affect their perceptions of situational interpersonal difficulties as reflected in the quality of the athletes’ relationships with parents and coaches (e.g., decreased perceived support and increased perceived conflict) [2]. It was further hypothesized that poor relationship quality would lead to higher levels of clinical perfectionism (personal standards and self-criticism) [3]. Subsequently, athletes’ levels of personal-standards perfectionism were expected to negatively predict their levels of self-esteem, while athletes’ levels of self-critical perfectionism were expected to negatively predict their levels of self-esteem [4], but to positively predict depressive symptoms [5] and eating psychopathology [6]. Finally, it was hypothesized that athletes’ levels of self-esteem would negatively predict their levels of depressive symptoms, which in turn were expected to be positively associated with athletes’ eating psychopathology.

Figure 1

The hypothesized transdiagnostic cognitive-behavioural model of athletes’ eating psychopathology. (from Shanmugam et al., 2011) Copyright Human Kinetics. Reprinted with permission


Step 2: Model identification: Review model for identification

In this step, the constructed model is reviewed for identification. The process of identification is achieved by establishing the number of observed variables and the number of parameters to be calculated. As previously mentioned, an over-identified model is recommended. Specifically, in the testable model, the number of known data points (i.e., variances, covariances) should exceed the number of data points that are unknown or being estimated (i.e., factor loadings, measurement error, disturbances, etc.). In the current example, 127 observed items were utilized. Using the recommended 10:1 ratio of participants to observed variables (Byrne, 2006) would have required a total of 1270 athletes. However, only 588 athletes participated in the study. Thus parceling was conducted, following the guidelines of Hau and Marsh (2004) and Nasser-Abu Alhija and Wisenbaker (2006), whereby the observed items were parceled according to the size and direction of their skew per latent variable, thus reducing the number of observed variables to 48. Employing the aforementioned equation, the model identification of the hypothesized model was tested prior to model estimation (see Figure 2), and revealed an over-identified model, with 1057 degrees of freedom.

p(p + 1)/2 information points > parameters to be estimated
48(49)/2 information points > 37 factor loadings, 48 errors, 31 path coefficients and disturbances, and 1 covariance
1176 information points > 119 parameters to be estimated
df = 1176 − 119 = 1057
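The arithmetic in this step can be checked with a short script. It is only a sketch: the helper names are hypothetical, and the inputs (127 items, 48 parcels, 588 athletes, 119 parameters, and the 10:1 ratio) are the figures reported above.

```python
PARTICIPANTS = 588  # athletes in the example


def required_sample(n_observed, ratio=10):
    """Participants needed under a participants-per-variable rule of thumb."""
    return n_observed * ratio


def degrees_of_freedom(n_observed, n_parameters):
    """Information points p(p + 1)/2 minus the parameters to be estimated."""
    return n_observed * (n_observed + 1) // 2 - n_parameters


# Before parceling: 127 items would require more athletes than were available.
print(required_sample(127), required_sample(127) <= PARTICIPANTS)
# After parceling: 48 observed variables fall within the available sample.
print(required_sample(48), required_sample(48) <= PARTICIPANTS)
# Degrees of freedom for 48 observed variables and 119 estimated parameters.
print(degrees_of_freedom(48, 119))
```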

Figure 2

Model specification of the hypothesized model using the parcelled items. Note: 1 = fixed parameter; pointed arrows = parameters to be freely estimated; E = error variance; D = disturbance


Step 3: Model estimation: how well does the model fit the data?

In this step, the hypothesized model is estimated. One of the advantages of SEM is the number of commercial SEM software packages that are available and regularly updated. These include, but are not limited to, AMOS for SPSS (Arbuckle, 2012), EQS (Bentler, 2006), LISREL (Jöreskog & Sörbom, 1996), and MPlus (Muthén & Muthén, 1998–2010). It is beyond the scope of the current paper to provide an overview of the underlying features of each program; thus, readers are directed to Lei and Wu (2007) for an overview.

Prior to model estimation, it is critical that descriptive and univariate analyses are conducted on the collected data to ensure that the data fulfill the assumptions of SEM. These include multivariate normality, independence of observations, and homoscedasticity (Violato & Hecker, 2007). In the current example, following model specification and identification, the hypothesized model was estimated using the Maximum Likelihood Estimation procedure within EQS 6.0. Due to the violation of multivariate normality, corrections for non-normality were employed and robust statistics were attained. Moreover, only the variables that were significantly correlated to the dependent variable were included in the SEM.

Step 4: Model evaluation and respecification: Establish the fit of the model to the data

In this step, the hypothesized model is evaluated using a number of GOF indices. If the GOF indices obtained are acceptable, this indicates that the hypothesized model fits the data. However, if the GOF indices are not satisfactory, then the model needs to be respecified. In the current example, the significance of 𝜒2, the normed 𝜒2, the Root-Mean-Square Error of Approximation (RMSEA), the Non-Normed Fit Index (NNFI), and the Comparative Fit Index (CFI) were all used to evaluate the fit of the model. GOF indices revealed that the measurement model of the hypothesized model fit the data well: 𝜒2 = 2159.95, df = 1025, p < 0.0001, RMSEA = 0.043 (90% CI = 0.041–0.046), NNFI = 0.92, and CFI = 0.93. The factor loadings were satisfactory (see Table 1), all above the recommended value of 0.40 (Ford et al., 1986).

Table 1

Standardised factor loadings from the measurement model

Note: All loadings are significant at the .05 level. ECR-AV = avoidant attachment, ECR-ANX = anxious attachment, S-SQRI-PS = parental support, S-SQRI-PC = parental conflict, S-SQRI-CS = coach support, S-SQRI-CC = coach conflict, FMPS-PS = personal standards perfectionism, DAS-SC = self-critical perfectionism, RSES = self-esteem, SCL-Depression = depression, EDEQ = eating psychopathology.


However, the predicted structural model failed to achieve an acceptable goodness-of-fit: χ² = 2645.57, df = 1057, p < 0.0001, RMSEA = 0.051 (90% CI = 0.048–0.053), NNFI = 0.89, and CFI = 0.90. Thus, the model needed to be respecified. Guided by the output of the Lagrange Multiplier tests, all empirical suggestions that were conceptually and theoretically meaningful were implemented. In particular, two sets of changes improved the model fit and yielded an acceptable goodness-of-fit and a parsimonious model. First, all the non-significant paths were removed (the pathways from Self-Critical perfectionism to Depression and eating psychopathology, and the parameters associated with Personal Standards and Anxious Attachment). Second, a linear pathway was created between Parental Support and Coach Support, and between Parental Conflict and Coach Conflict, respectively.

The fit of the respecified model was χ² = 1367.94, df = 693, p < 0.0001, RMSEA = 0.041 (90% CI = 0.038–0.044), NNFI = 0.94, and CFI = 0.94 (see Figure 3). The normed χ² value was 1.97 (1367.94/693). Thus, the normed χ² value and the other fit indices provide good support for the final model.
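The normed χ² arithmetic can be reproduced with a short Python sketch. The χ² and df values below are those reported in the text for the three models; only the value of 1.97 for the respecified model is stated in the article, so the other two ratios are computed here for comparison. The function name and the model labels are hypothetical names introduced for illustration.

```python
def normed_chi_square(chi_square, df):
    """Return the normed chi-square, i.e. chi-square divided by its degrees of freedom."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return chi_square / df


# (chi-square, df) pairs as reported in the text for each model.
reported_models = {
    "measurement model": (2159.95, 1025),
    "initial structural model": (2645.57, 1057),
    "respecified model": (1367.94, 693),
}

for label, (chi_square, df) in reported_models.items():
    print(f"{label}: normed chi-square = {normed_chi_square(chi_square, df):.2f}")

# measurement model: normed chi-square = 2.11
# initial structural model: normed chi-square = 2.50
# respecified model: normed chi-square = 1.97
```

The ratio is lowest for the respecified model, which agrees with the direction of the other indices reported for it.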

As shown in Figure 3, in the current example, avoidant attachment was associated with poor quality relationships (characterized by decreased perceived support and increased perceived conflict) between athletes and their influential parent and principal coach. Moreover, higher levels of conflict in the parent–athlete and coach–athlete relationships were related to higher levels of self-criticism. High levels of self-criticism were related to low self-esteem and feelings of worthlessness. Subsequently, low self-esteem was linked to higher depressive symptoms, which in turn were linked to elevated eating psychopathology. The findings also suggested that the same processes that are likely to lead to elevated eating psychopathology are also likely to prevent it. In particular, secure attachment was associated with high quality parent–athlete and coach–athlete relationships, resulting in low levels of self-criticism, which in turn were associated with higher levels of self-esteem.

Figure 3

A structural representation of the transdiagnostic cognitive behavioural model of athletes’ eating psychopathology. The standardized coefficients presented are significant at the .05 level. Taken from Shanmugam et al. (2011). Copyright Human Kinetics; reprinted with permission.


Subsequently, high levels of self-esteem were associated with low levels of depression, which in turn were linked to healthy eating. Collectively, these findings are consistent with the assumptions of the transdiagnostic cognitive-behavioural theory and with previous findings that have linked avoidant attachment (e.g., Ramacciotti et al., 2001), poor quality relationships (e.g., McIntosh, Bulik, McKenzie, Luty, & Jordan, 2000), low levels of self-esteem (e.g., Shea & Pritchard, 2007), high levels of self-critical perfectionism (e.g., Dunkley, Blankstein, Masheb, & Grilo, 2006), and depression (e.g., Stice & Bearman, 2001) to disturbed eating behaviors.


The aim of this article was to provide an overview of SEM and to complement this with a worked empirical example. The dichotomy of models (one constituting a theorized organization of indicator variables and how they identify the latent variables, and another referring to the relationships between the latent variables) and the sequential steps involved in theory and model testing, including model specification, model identification, model estimation, and model evaluation, were outlined. SEM is a theory-strong approach underpinned by established research methods, but it must be used with caution. As a prerequisite for the proper use of SEM, a substantial base of empirical evidence must exist, combined with a strong conceptual understanding of the theory relevant to the research question and access to large samples, which may be difficult to obtain. Skills training is also a necessity, so that researchers can begin to apprehend the advanced theoretical and statistical methods required to test complex, integrated theoretical models within the social sciences.

Appendices