R: Item Response Theory simulation and estimation

rirt {rirt}

R Documentation

Item Response Theory simulation and estimation

Description

This package contains functions to estimate item parameters and latent variables from the responses of subjects to a questionnaire.

The supported dichotomous IRT (Item Response Theory) models are the 1PLM (one parameter logistic model), the 2PLM (two parameter logistic model) and the 3PLM (three parameter logistic model). For polytomous items, Bock's nominal model and Samejima's graded model are supported. Two nonparametric methods (kernel regression and PMMLE) are also supported.
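
As an illustration of how the three dichotomous models nest, here is a minimal base R sketch of the usual 3PLM response function. The function name `icc_3plm` and the slope/threshold/asymptote parameterization are assumptions made for this sketch; this page does not state the exact parameterization used internally by the package.

```r
# Sketch of a 3PLM item characteristic curve (assumed parameterization):
#   P(z) = c + (1 - c) / (1 + exp(-a * (z - b)))
# a: slope, b: threshold, c: lower asymptote, z: latent variable values.
icc_3plm <- function(z, a = 1, b = 0, c = 0) {
  c + (1 - c) * plogis(a * (z - b))
}

z <- seq(-4, 4, length.out = 64)                     # grid matching the fitirt default for z
p_1plm <- icc_3plm(z, a = 1, b = 0.5)                # 1PLM: slope fixed, no asymptote
p_2plm <- icc_3plm(z, a = 1.702, b = 0.5)            # 2PLM: free slope, no asymptote
p_3plm <- icc_3plm(z, a = 1.702, b = 0.5, c = 0.2)   # 3PLM: adds a lower asymptote
```

Setting `c = 0` recovers the 2PLM, and additionally fixing `a` at a common value recovers the 1PLM.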

The estimation methods available are the MMLE (marginal maximum likelihood estimator) and the BME (Bayes modal estimator, for dichotomous models only) for the parametric estimation of items, the PMMLE (penalized MMLE) and the kernel regression (Nadaraya-Watson) for the nonparametric estimation of items, and the EAP (expected a posteriori) and WMLE (Warm's maximum likelihood estimator) for the latent variable predictions.
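
To make the EAP idea concrete, the following base R sketch computes the posterior mean of the latent variable for one subject on a fixed grid. It is only an illustration: the helper `eap_2plm`, the 2PLM parameterization and the standard normal prior are assumptions of this sketch, not a description of the package's internals.

```r
# EAP sketch for one subject under a 2PLM (assumed parameterization and
# assumed standard normal prior). resp: 0/1/NA responses; a, b: item slopes
# and thresholds; z: grid of latent variable values.
eap_2plm <- function(resp, a, b, z = seq(-4, 4, length.out = 64)) {
  keep <- !is.na(resp)                      # missing responses are skipped
  x <- resp[keep]
  # eta[i, q]: logit of success on item i at grid point z[q]
  eta <- outer(a[keep], z) - a[keep] * b[keep]
  # Log-likelihood of the response vector at each grid point
  loglik <- colSums(x * plogis(eta, log.p = TRUE) +
                    (1 - x) * plogis(-eta, log.p = TRUE))
  post <- exp(loglik - max(loglik)) * dnorm(z)   # unnormalised posterior
  post <- post / sum(post)
  est <- sum(z * post)                           # posterior mean (EAP)
  se <- sqrt(sum((z - est)^2 * post))            # posterior standard deviation
  c(z = est, se = se)
}

a <- c(1.0, 1.5, 0.8, 1.2)
b <- c(-1, 0, 0.5, 1)
eap_2plm(c(1, 1, 0, NA), a, b)
```

The posterior standard deviation is reported here as a rough standard error; whether this matches what `show.se` returns is not stated on this page.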

Methods to generate random items and responses are also provided.

Usage

fitirt(data, model="2PLM", key=NULL, graded=FALSE, options.weights=NULL,
  max.em.iter=100, max.nr.iter=200, precision=1e-5, smooth.factor=NULL,
  z=seq(-4, 4, length=64), grouping=TRUE, init=NULL,
  verbose=0, continue.on.error=1,
  slope.init=1.702, thresh.init=0, asymp.init=0.2,
  slope.prior=NULL, thresh.prior=NULL, asymp.prior=NULL)

genirt(nbr.item, model="2PLM", nbr.options=rep(2, nbr.item),
  genslope=function(n)runif(n,0.5,2), genthresh=function(n)runif(n,-2,2),
  genasymp=function(n)runif(n,0,0.2), genslopenom=function(n)runif(n,-2,2),
  fixed.slope=1, items.label=NULL, options.label=NULL, key=NULL)

predict.rirt(object, select=NULL, type=NULL,
  z=NULL, data=NULL, normal.ogive=FALSE, z.method="EAP",
  show.se=FALSE, max.nr.iter=100, precision=0.001,
  verbose=0, continue.on.error=1)

plot.rirt(object, select=NULL, type=NULL,
  z=NULL, data=NULL, col=NULL, lty=NULL, pch=NULL, lwd=2,
  xlim=NULL, ylim=NULL,  xlab=NULL, ylab=NULL, main=NULL,
  leg=TRUE, leg.loc="bottomright", leg.title=NULL, leg.lab=NULL,
  leg.width=10, plot.type="l", add=FALSE, ...)

Arguments

`data`	A matrix or data.frame (subjects x items) of responses. For dichotomous models, the allowed values are TRUE or 1 for success, FALSE or 0 for failure, and NA for a missing value. For polytomous models the allowed values are NA for a missing value and anything else for a valid option.
`model`	A string with the model name or nonparametric estimator, or a vector of models (one per item) for a mixed model. In a mixed model, only the NOMINAL and GRADED models are supported. "1PLM" for the one parameter logistic model (dichotomous), "2PLM" for the two parameter logistic model (dichotomous), "3PLM" for the three parameter logistic model (dichotomous), "GRADED" for Samejima's graded model (polytomous), "NOMINAL" for Bock's nominal model (polytomous), "KERNEL" for the kernel regression method (dichotomous or polytomous), "PMMLE" for the penalized marginal maximum likelihood method (dichotomous or polytomous).
`key`	The answer key, a vector with the correct answer to each item. For polytomous models only.
`graded`	Enable the graded scoring of options. For polytomous models only.
`options.weights`	The weights of each option, as a list of vectors (one per item) or as a single vector. For polytomous models only.
`max.em.iter`	The maximum number of EM iterations. At least min(20, max.em.iter) iterations are always performed.
`max.nr.iter`	The maximum number of Newton iterations.
`precision`	The desired precision.
`smooth.factor`	If using PMMLE, the smoothing factor (defaults to 4 times the number of subjects to the power of 0.2). If using kernel regression, the bandwidth (defaults to 2.7 times the number of subjects to the power of -0.2).
`z`	The latent variable values used.
`grouping`	Enable the grouping of identical response vectors to speed up the processing of large samples with few items and few options.
`init`	An object of class `rirt` as returned by `fitirt` or `genirt` to use as initial value.
`verbose`	Set the verbosity level. Debugging messages are shown only when R is run from a command prompt, not in the GUI.
`continue.on.error`	Set the default error handling for errors in the GSL library.
`slope.init`	The initial slope.
`thresh.init`	The initial threshold.
`asymp.init`	The initial asymptote.
`slope.prior`	A two-value vector giving the mean and standard deviation of the log-normal prior used to estimate the slopes. For dichotomous parametric models only.
`thresh.prior`	A two-value vector giving the mean and standard deviation of the normal prior used to estimate the thresholds. For dichotomous parametric models only.
`asymp.prior`	A two-value vector giving the mean and weight of the beta prior used to estimate the asymptotes. For dichotomous parametric models only.
`nbr.item`	The number of items.
`nbr.options`	A vector with the number of options for each item.
`genslope`	Function to generate random slopes.
`genthresh`	Function to generate random thresholds.
`genasymp`	Function to generate random asymptotes.
`genslopenom`	Function to generate random slopes in a nominal model.
`fixed.slope`	The fixed slope in a 1PLM.
`items.label`	A vector with the label of each item.
`options.label`	A list of vectors with the label of each option.
`object`	An object of class `rirt` as returned by `fitirt` or `genirt`.
`select`	The items to use. Defaults to all the items.
`type`	The type of values to predict or plot: "COEFFICIENTS" for the parameters, "OCC" for the option characteristic curves, "ICC" for the item characteristic curves, "TCC" for the test characteristic curve, "BOUNDARY" for the boundary curves (graded model only), "INFORMATION" for the information functions, "RANDOM_DATA" to generate random data with the model, "LIKELIHOOD" for the likelihood functions, "Z" for the latent variable estimates, "FIT_TEST" for the likelihood ratio goodness-of-fit test, "LOCAL_INDEPENDENCE_TEST" for the likelihood ratio tests of local independence, "KEY" for the answer key, "CTT" for the classical test theory statistics.
`normal.ogive`	If 'TRUE' then the slopes are divided by 1.702 to approximate the normal ogive model.
`z.method`	The estimator used to predict the latent variables: "EAP" for the Expected A Posteriori estimator, "WMLE" for Warm's Maximum Likelihood Estimator.
`show.se`	To display the standard errors if possible.
`leg`	To display a legend in the plot.
`leg.title`	The title of the legend.
`leg.lab`	The label of each curve.
`leg.width`	The width of the legend in number of characters.
`leg.loc`	See `legend`.
`col`	See `matplot`.
`lty`	See `matplot`.
`lwd`	See `matplot`.
`pch`	See `matplot`.
`xlim`	See `matplot`.
`ylim`	See `matplot`.
`xlab`	See `matplot`.
`ylab`	See `matplot`.
`main`	See `matplot`.
`add`	See `matplot`.
`...`	Other arguments to the matplot function.
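
The effect of the `grouping` argument described above can be pictured with a short base R sketch: identical response vectors are collapsed into unique patterns with a count each, so that the likelihood only has to be evaluated once per pattern. The helper `group_patterns` is hypothetical and is not part of the package; how `fitirt` groups responses internally is not documented here.

```r
# Collapse identical response vectors into unique patterns with counts
# (hypothetical helper, for illustration only).
group_patterns <- function(data) {
  data <- as.matrix(data)
  # One string key per subject; NA is pasted as "NA", so missing values
  # are grouped like any other value
  keys <- apply(data, 1, paste, collapse = "|")
  first <- !duplicated(keys)
  # Count each pattern, in order of first appearance
  counts <- table(factor(keys, levels = keys[first]))
  list(patterns = data[first, , drop = FALSE], counts = as.vector(counts))
}

set.seed(1)
data <- matrix(rbinom(1000 * 4, 1, 0.6), ncol = 4)   # 1000 subjects, 4 items
g <- group_patterns(data)
nrow(g$patterns)   # number of distinct patterns, at most 2^4 = 16
sum(g$counts)      # equals the number of subjects
```

With few items and few options the number of distinct patterns is bounded regardless of sample size, which is why grouping pays off for large samples.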

Details

The function fitirt fits an IRT model to the data. The estimation method used depends on the model:

MMLE (Marginal Maximum Likelihood Estimator) for parametric models without priors,

BME (Bayes Modal Estimator) for dichotomous parametric models with priors,

KERNEL (kernel regression method),

PMMLE (Penalized Marginal Maximum Likelihood Estimator).

For big datasets, the estimation process can be time consuming. It is advisable to try model="KERNEL" first for a fast preview of the curves and to make sure the data is appropriate.
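
The kernel preview is fast because it involves no iterative optimisation. The base R sketch below illustrates the general Nadaraya-Watson approach in the spirit of Ramsay (1991): subjects are ranked by total score, ranks are mapped to normal quantiles, and the item responses are smoothed over those quantiles. The helper `kernel_icc`, the Gaussian kernel and the reading of the default bandwidth as 2.7 * n^(-0.2) are assumptions of this sketch, not the package's implementation.

```r
# Nadaraya-Watson sketch for dichotomous items (illustration only).
kernel_icc <- function(data, z = seq(-4, 4, length.out = 64), h = NULL) {
  data <- as.matrix(data)
  n <- nrow(data)
  if (is.null(h)) h <- 2.7 * n^(-0.2)       # assumed reading of the default
  total <- rowSums(data, na.rm = TRUE)
  # Rank subjects by total score (ties broken at random), then map the
  # ranks to standard normal quantiles
  theta <- qnorm(rank(total, ties.method = "random") / (n + 1))
  # w[q, s]: Gaussian kernel weight of subject s at grid point z[q]
  w <- dnorm(outer(z, theta, "-") / h)
  ok <- !is.na(data)
  num <- w %*% ifelse(ok, data, 0)          # weighted number of successes
  den <- w %*% (ok * 1)                     # weighted number of responses
  num / den                                 # length(z) x items matrix
}

# Simulated 2PLM responses to try the smoother on
set.seed(42)
a <- runif(10, 0.5, 2)
b <- runif(10, -2, 2)
true_z <- rnorm(1000)
eta <- sweep(outer(true_z, a), 2, a * b)    # a[i] * (z[s] - b[i])
data <- matrix(rbinom(length(eta), 1, 1 / (1 + exp(-eta))), nrow = 1000)

icc <- kernel_icc(data)
matplot(seq(-4, 4, length.out = 64), icc[, 1:5], type = "l",
        xlab = "z", ylab = "P(success)")
```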

If the EM algorithm did not converge, or if the Newton algorithm did not converge in the last EM step for one or more items, a warning is given. In such cases, the first thing to try is to restart the estimation process with the additional parameter init=object, where object is the result of the first run. Increasing max.em.iter and max.nr.iter, or decreasing precision, might also help. If the problem persists, it might be necessary to remove the items that did not converge.

The function genirt generates a parametric IRT model with random parameters. The model can then be used to generate random responses with the function predict.rirt (type="RANDOM_DATA", z=...).
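
For readers who want to see what this two-step simulation amounts to, here is a base R sketch for the 2PLM case. It draws slopes and thresholds from the same uniform ranges as the genslope and genthresh defaults, then draws Bernoulli responses. The logistic parameterization is an assumption of this sketch and may differ from the one used by genirt and predict.rirt.

```r
# Simulate 2PLM items and responses without the package (illustration only).
set.seed(123)
nbr.item <- 10
slope  <- runif(nbr.item, 0.5, 2)    # same range as the genslope default
thresh <- runif(nbr.item, -2, 2)     # same range as the genthresh default

z <- rnorm(1000)                     # latent variable of 1000 subjects

# eta[s, i] = slope[i] * (z[s] - thresh[i])
eta <- sweep(outer(z, slope), 2, slope * thresh)
prob <- 1 / (1 + exp(-eta))          # assumed 2PLM success probabilities

# One Bernoulli draw per subject and item (subjects x items matrix)
data <- matrix(rbinom(length(prob), 1, prob), nrow = length(z))
colMeans(data)                       # observed proportion of success per item
```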

The functions irt and mirt are deprecated.

See below for examples.

Value

The functions fitirt and genirt return an object of type rirt.

The function predict.rirt returns a data.frame.

Author(s)

Stephane Germain <germste@gmail.com>,

Pierre Valois <pierre.valois@fse.ulaval.ca> and

Belkacem Abdous <belkacem.abdous@uresp.ulaval.ca>

References

Baker, F.B. & Kim, S.-H. (2004). Item response theory: parameter estimation techniques. Second Edition. Dekker, New York.

Ramsay, J.O. (1991). Kernel smoothing approaches to nonparametric item characteristic curve estimation. Psychometrika, 56, 611-630.

Examples

  #############################################
  ###### First example : simulated 2PLM #######
  #############################################

  # Generate 10 random 2PLM items
  item <- genirt(10)

  # Generate 1000 subjects
  z <- rnorm(1000)

  # Generate a matrix of responses
  data <- predict(item, type="random", z=z)

  # Fit a 2PLM
  fit <- fitirt(data)

  # Show the estimation summary and the parameters
  fit

  # Get the likelihood ratio tests
  predict(fit, type="fit", data=data)

  # Get the parameters of the item 1
  predict(fit, 1, type="coef")

  # Plot the ICC of the first 5 items
  plot(fit, 1:5)

  # Plot the information function of item 1
  plot(fit, 1, type="info")

  # Plot the ICC of item 1 on the given z levels
  plot(fit, 1, z=seq(-3,3,0.1))

  # Plot the likelihood and predict the latent variables
  # of the first 5 subjects
  plot(fit, type="like", data=data[1:5,])
  predict(fit, type="z", data=data[1:5,])

  # Plot the likelihood and predict the latent variables
  # of the given response vector
  plot(fit, type="like", data=c(1,0,1,1,0,0,1,1,1,1))
  predict(fit, type="z", data=c(1,0,1,1,0,0,1,1,1,1))

  #######################################################
  ###### Second example : simulated nominal model #######
  #######################################################

  # Generate 10 random nominal items with 4 options each
  item <- genirt(10, "nominal", rep(4, 10))

  # Get the generated answer key
  key <- attributes(item)$key

  # Generate 2000 subjects
  z <- rnorm(2000)

  # Generate a matrix of responses
  data <- predict(item, type="random", z=z)

  # Fit a nominal model
  fit <- fitirt(data, "nominal", key=key)

  # Show the model summary
  summary(fit)

  # Get the likelihood ratio tests
  predict(fit, type="fit", data=data)

  # Get the parameters of the item 1
  predict(fit, 1, type="coef")

  # Plot the ICC of the first 5 items
  plot(fit, 1:5)

  # Plot the OCCs of item 1
  plot(fit, 1, type="occ")

  # Plot the information function of item 1
  plot(fit, 1, type="info")

  # Plot the ICC of item 1 on the given z levels
  plot(fit, 1, z=seq(-3,3,0.1))

  # Plot the likelihood and predict the latent variables
  # of the first 5 subjects
  plot(fit, type="like", data=data[1:5,])
  predict(fit, type="z", data=data[1:5,])

  # Plot the likelihood and predict the latent variables
  # of the given response vector
  predict(fit, type="z", data=c("A","B","C","D","A","A","B","C","D","A"))
  plot(fit, type="like", data=c("A","B","C","D","A","A","B","C","D","A"))

  #######################################################
  ###### Third example : simulated graded model #########
  #######################################################

  # Generate 10 random graded items with 3 options each
  item <- genirt(10, "graded", rep(3, 10))

  # Generate 2000 subjects
  z <- rnorm(2000)

  # Generate a matrix of responses
  data <- predict(item, type="random", z=z)

  # Fit a graded model
  fit <- fitirt(data, "graded", graded=TRUE)

  # Show the estimation summary
  summary(fit)

  # Get the likelihood ratio tests
  predict(fit, type="fit", data=data)

  # Get the parameters of the item 1
  predict(fit, 1, type="coef")

  # Plot the ICC of the first 5 items
  plot(fit,1:5)

  # Plot the OCCs of item 1
  plot(fit, 1, type="occ")

  # Plot the information function of item 1
  plot(fit, 1, type="info")

  # Plot the ICC of item 1 on the given z levels
  plot(fit, 1, z=seq(-3,3,0.1))

  # Plot the likelihood and predict the latent variables
  # of the first 5 subjects
  plot(fit, type="like", data=data[1:5,])
  predict(fit, type="z", data=data[1:5,])

  # Plot the likelihood and predict the latent variables
  # of the given response vector
  predict(fit, type="z", data=c(2,1,3,3,2,2,1,3,3,2))
  plot(fit, type="like", data=c(2,1,3,3,2,2,1,3,3,2))

  #######################################################
  ###### Fourth example : simulated mixed model #########
  #######################################################

  # Generate 5 random items, one nominal with 3 options, two graded with 4 options, and two nominal with 3 options
  item <- genirt(5, c("NOMINAL", "GRADED", "GRADED", "NOMINAL", "NOMINAL"), c(3,4,4,3,3))
  model <- attributes(item)$model
  key <- attributes(item)$key

  # Generate 2000 subjects
  z <- rnorm(2000)

  # Generate a matrix of responses
  data <- predict(item, type="random", z=z)

  # Fit the same models
  fit <- fitirt(data, model, key)
  fit

[Package rirt version 1.4.1 Index]