Exercise 6: Multidimensional IRT in R

This exercise requires mirt for estimating the multidimensional IRT models in Exercises 1-6 and eRm for Exercise 7. Exercise 1 additionally uses mokken.

For the first four exercises we will continue with the 1990 Workplace Industrial Relations Survey. The data are available via data(WIRS,package="ltm"). In the previous exercise we did a lot of parametric and nonparametric modelling of this difficult data set. We found that unidimensional parametric IRT does not fit the data and that some assumptions are violated: there are intersecting IRFs and non-monotonic IRFs (Item 1, perhaps Item 2). Nonparametric IRT gave us a clear indication that something was wrong. However, we also saw that the nonparametric model fit is not yet satisfactory, as shown by the item fit of the spline model, the credibility curves and the distribution of the latent trait.

One assumption we have not yet checked is whether it makes sense to assume a single latent trait. Use plot(ksm,plottype="PCA") to investigate a two-dimensional representation of the item space. What do you find? Use aisp(WIRS,search="ga") from mokken to partition the data into sub-scales that fit the Monotone Homogeneity Model. What do you find?
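
As a starting point for the Mokken part, the following minimal sketch runs the automated item selection with the genetic algorithm search. The object name scales_ga and the seed are our own choices; the PCA plot is not shown because it needs the ksm object from the previous exercise.

```r
library(mokken)

# WIRS ships with the ltm package; only the data set is loaded here
data(WIRS, package = "ltm")

# The genetic algorithm search is stochastic, so fix a seed for reproducibility
set.seed(1)

# Partition the six items into Mokken scales (default lower bound for H is 0.3)
scales_ga <- aisp(WIRS, search = "ga")

# One row per item; the entry is the scale the item is assigned to, 0 = unscalable
scales_ga
```

Items that share a non-zero number form one scale under the Monotone Homogeneity Model; an item with a 0 does not fit any of the scales found.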
We saw that the items do not seem to lie on a unidimensional scale, but there is some indication that a two-dimensional parametric MHM would fit Items 2-6. Use mirt() to fit two-dimensional 2PL, 3PL and 4PL models to Items 2-6 (if there are convergence issues, set technical=list(NCYCLES=2000), use method="MHRM", or combine both). Assess the fit with statistics and plots. Which model fits best, and which would you choose? Look at the summary() to see the factor loadings.
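
One possible way to organize these fits is sketched below. The helper fit_2d and the object names are our own, and the sketch only covers the information criteria and a likelihood ratio comparison; item-level statistics from itemfit() or M2() can be added where the degrees of freedom allow.

```r
library(mirt)

data(WIRS, package = "ltm")
items26 <- WIRS[, 2:6]  # Items 2-6 only

# Hypothetical helper: exploratory two-dimensional model of a given item type,
# with more EM cycles than the default to help convergence
fit_2d <- function(type) {
  mirt(items26, 2, itemtype = type,
       technical = list(NCYCLES = 2000), verbose = FALSE)
}

fit2pl <- fit_2d("2PL")
fit3pl <- fit_2d("3PL")
fit4pl <- fit_2d("4PL")

# Likelihood ratio comparisons of the nested models
anova(fit2pl, fit3pl)
anova(fit3pl, fit4pl)

# Information criteria side by side
fits <- list(`2PL` = fit2pl, `3PL` = fit3pl, `4PL` = fit4pl)
sapply(fits, extract.mirt, what = "AIC")
sapply(fits, extract.mirt, what = "BIC")

# Rotated factor loadings and the response surface of the first fitted item
summary(fit2pl)
itemplot(fit2pl, item = 1)
```

With only five items the 3PL and 4PL carry many parameters, so expect convergence warnings and treat the comparison with some caution.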
What should we do with Item 1? We can use the results from the Mokken analysis and specify a confirmatory model with an ideal point or a nominal specification for Item 1 (loading on Dimension 1) and a two-dimensional 3PL for the other items (Items 2 and 4 on Dimension 1; Items 3, 5 and 6 on Dimension 2). Compare the fit of both specifications and inspect the IRFs.
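
The confirmatory structure can be written with mirt.model(). The sketch below shows the ideal point variant only; the helper fit_conf is our own, and the nominal variant would be obtained by passing "nominal" instead of "ideal" (not run here). Estimating the covariance between the two dimensions is an assumption on our part.

```r
library(mirt)

data(WIRS, package = "ltm")

# Items 1, 2, 4 load on the first dimension, Items 3, 5, 6 on the second;
# the COV line frees the covariance between the two dimensions (our assumption)
spec <- mirt.model("F1 = 1,2,4
  F2 = 3,5,6
  COV = F1*F2")

# Hypothetical helper: Item 1 gets its own item type, Items 2-6 are 3PL
fit_conf <- function(item1_type) {
  mirt(WIRS, spec, itemtype = c(item1_type, rep("3PL", 5)),
       technical = list(NCYCLES = 2000), verbose = FALSE)
}

fit_ideal <- fit_conf("ideal")

# Item parameters in one table, then the response function of Item 1
coef(fit_ideal, simplify = TRUE)
itemplot(fit_ideal, item = 1)
```

Fitting the second specification the same way gives two models on the same data, which can then be compared with anova() or through their information criteria.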
Neither of the two previous models fits very well (although the nominal specification fits a bit better). One reason could be that there is a nonlinear effect of the latent traits. Indeed, when looking at the IRFs, this could explain the non-monotonic effect revealed before. Fit a confirmatory 2D model with a nominal specification for Item 1 and a 2PL specification for all other items, where each item can load on each latent trait, and include a multiplicative effect of the two latent dimensions (Hint: see the help file of mirt.model() for how to do this). Fit a compensatory and a partially compensatory version. Which model fits best? Compare this to an exploratory 3D 3PL. Explore the overall best model.
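
The multiplicative effect is written as a product term in the model syntax. The following is a minimal sketch of that syntax only: it uses a 2PL for every item, including Item 1, and it leaves Item 1 off the second dimension to pin down the rotation. Both simplifications are our assumptions, not part of the exercise.

```r
library(mirt)

data(WIRS, package = "ltm")

# Two dimensions plus their product; Item 1 is left off F2 so that the
# orientation of the two dimensions is fixed (an assumption of this sketch)
spec_int <- mirt.model("F1 = 1-6
  F2 = 2-6
  (F1*F2) = 1-6")

# Compensatory version with 2PL items throughout (simplified on purpose)
fit_int <- mirt(WIRS, spec_int, itemtype = "2PL",
                technical = list(NCYCLES = 2000), verbose = FALSE)

# The slope of the product term appears as an extra column of slopes
coef(fit_int, simplify = TRUE)
```

The item type vector is the place to change the specification of Item 1 or to switch to a partially compensatory item type.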
We now turn to polytomous MIRT and mixed MIRT. For this, load the data file familiness.rda and inspect it. You’ll find data on 11 Likert items (6-point scale) for three dimensions of family involvement that characterize family businesses (“Ownership, Management, Control” = OMC, “Transgenerational Orientation” = TGO, “Family Business Involvement” = FBI) and two explanatory variables (the age of the firm and the size of the firm). Fit an exploratory 1-dimensional GPCM and a 3-dimensional GPCM. Is the 3D solution better? Fit a confirmatory model with no item cross-loadings on other dimensions (so OMC items only load on OMC, etc.). Would you say that the hypothesized model is worse than the exploratory model?
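
Since the column layout of familiness.rda is not described here, the following sketch uses simulated stand-in data to show the sequence of calls. The simulated items, the 4/4/3 split of the items over OMC, TGO and FBI, and all object names are hypothetical; replace likert and the item ranges with the real ones.

```r
library(mirt)
set.seed(2017)

# Simulated stand-in for the 11 six-category items (NOT the familiness data);
# the 4/4/3 split over the three dimensions is a hypothetical layout
a <- matrix(0, nrow = 11, ncol = 3)
a[1:4, 1] <- 1.5
a[5:8, 2] <- 1.5
a[9:11, 3] <- 1.5
d <- matrix(c(2, 1, 0, -1, -2), nrow = 11, ncol = 5, byrow = TRUE)
likert <- simdata(a, d, N = 500, itemtype = "graded")

# Exploratory 1D and 3D generalized partial credit models
gpcm1 <- mirt(likert, 1, itemtype = "gpcm", verbose = FALSE)
gpcm3 <- mirt(likert, 3, itemtype = "gpcm", verbose = FALSE)
anova(gpcm1, gpcm3)

# Confirmatory model without cross-loadings, all factor covariances estimated
spec <- mirt.model("OMC = 1-4
  TGO = 5-8
  FBI = 9-11
  COV = OMC*TGO*FBI")
gpcm_conf <- mirt(likert, spec, itemtype = "gpcm", verbose = FALSE)

# Hypothesized structure against the exploratory 3D solution
anova(gpcm_conf, gpcm3)
summary(gpcm_conf)
```

The confirmatory model is nested in the exploratory one, so a small loss in fit for far fewer parameters would speak for the hypothesized structure.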
For the familiness data, include fixed effects for firm Age and Size in a latent regression model on the confirmatory model structure (i.e., decomposing the latent trait via lr.fixed). Inspect the coefficients and the model fit and interpret them. Compare the explanatory model to the confirmatory model from the previous exercise.
Load the AMC data object. AMC 2 is an exam on accounting at WU. We look at the 12 dichotomously scored items. The students were assessed at two different points in time with a break in between. The first round used a multiple choice format; the second asked the same questions in an open format. It was of interest to see whether the multiple choice format differs from the open format. Fit an LLRA model to the AMCit data with two measurement time points. This quantifies the change over time (trend) for each item as its own dimension. Assess the model and plot it. What do the results mean in light of the design setup? What is problematic in the setup? From the AMCcov data frame, select HS, the type of high school (AHS is a standard high school, HAK is a high school with a focus on commerce, other is the rest). Include it as a covariate via the groups argument. Assess the model in comparison to the model without groups and plot it. What do we learn from this analysis?
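
The calls involved can be tried out first on llradat3, an artificial example data set that ships with eRm (three items at two time points). It is used here purely as a stand-in for AMCit, and the group labels below are made up for illustration; with the AMC data, the HS column of AMCcov would take their place.

```r
library(eRm)

# Stand-in data from eRm: columns are the items at time 1, then at time 2
data("llradat3")

# LLRA with two time points and no covariate: one trend effect per item
llra_trend <- LLRA(llradat3, mpoints = 2)
summary(llra_trend)
plotTR(llra_trend)  # trend effects over time

# Hypothetical grouping: first half "TG", second half "CG"
grp <- rep(c("TG", "CG"), each = nrow(llradat3) / 2)

# Same model with the grouping as a covariate
llra_grp <- LLRA(llradat3, mpoints = 2, groups = grp)
summary(llra_grp)
plotGR(llra_grp)    # covariate (group) effects
```

The two summaries report the conditional log-likelihood and the number of parameters, which gives a first basis for comparing the models; see ?anova.llra for a formal comparison of nested LLRA models.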

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Thomas Rusch

July 17, 2017