miceDS {dsBase} | R Documentation |
This function is a wrapper for the mice function from the mice R package. The function creates multiple imputations (replacement values) for multivariate missing data. The method is based on Fully Conditional Specification, where each incomplete variable is imputed by a separate model. The MICE algorithm can impute mixes of continuous, binary, unordered categorical and ordered categorical data. In addition, MICE can impute continuous two-level data, and maintain consistency between imputations by means of passive imputation.
miceDS(
  data = data,
  m = m,
  maxit = maxit,
  method = method,
  post = post,
  seed = seed,
  predictorMatrix = predictorMatrix,
  ncol.pred.mat = ncol.pred.mat,
  newobj_mids = newobj_mids,
  newobj_df = newobj_df
)
data |
A data frame or a matrix containing the incomplete data. |
m |
Number of multiple imputations. The default is m=5. The maximum allowed number in DataSHIELD is m=20. |
maxit |
A scalar giving the number of iterations. The default is 5. The maximum allowed number in DataSHIELD is maxit=30. |
method |
Can be either a single string, or a vector of strings with length ncol(data), specifying the imputation method to be used for each column in data. If specified as a single string, the same method will be used for all blocks. The default imputation method (when no argument is specified) depends on the measurement level of the target column, as regulated by the defaultMethod argument of the native R mice function. Columns that need not be imputed have the empty method "". |
post |
A vector of strings with length ncol(data) specifying expressions as strings. Each string is parsed and executed within the sampler() function to post-process imputed values during the iterations. The default is a vector of empty strings, indicating no post-processing. Multivariate (block) imputation methods ignore the post parameter. |
seed |
Either NA (default) or "fixed". If seed is set to "fixed", a study-specific fixed seed is used for the random number generator. |
predictorMatrix |
A numeric matrix of ncol(data) rows and ncol(data) columns, containing 0/1 data specifying the set of predictors to be used for each target column. Each row corresponds to a variable to be imputed. A value of 1 means that the column variable is used as a predictor for the target variable (in the row). By default, the predictorMatrix is a square matrix of ncol(data) rows and columns with all 1's, except for the diagonal, which is 0. |
ncol.pred.mat |
The number of columns of the predictorMatrix. |
newobj_mids |
A character string that provides the name for the output mids object that is stored on the data servers. Default |
newobj_df |
A character string that provides the name for the output dataframes that are stored on the data servers. Default |
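As an illustration of the argument shapes described above, the following base R sketch builds a method vector, a predictor matrix and a post vector for a small toy data frame. The data frame dat and its column names are hypothetical; "pmm" and "logreg" are built-in method names of the native mice package. This is only a sketch of how such inputs could be prepared, not the way miceDS itself constructs them.

```r
# Toy incomplete data (hypothetical; real data stay on the DataSHIELD servers).
dat <- data.frame(
  age    = c(25, 40, 52, 61, 33),                      # complete column
  bmi    = c(22.1, NA, 27.5, 30.2, NA),                # continuous, incomplete
  smoker = factor(c("no", "yes", NA, "no", "yes"))     # binary, incomplete
)
p <- ncol(dat)

# One imputation method per column; "" marks a column that is not imputed.
method <- c(age = "", bmi = "pmm", smoker = "logreg")

# Default-style predictor matrix: all 1's with a 0 diagonal.
predictorMatrix <- 1 - diag(p)
dimnames(predictorMatrix) <- list(names(dat), names(dat))

# Rows are targets, columns are predictors:
# here smoker is excluded as a predictor of bmi.
predictorMatrix["bmi", "smoker"] <- 0

# No post-processing: a vector of empty strings, one per column.
post <- rep("", p)
names(post) <- names(dat)

# Number of columns of the predictor matrix.
ncol.pred.mat <- ncol(predictorMatrix)

print(predictorMatrix)
```

Setting an entry to 0 removes that column variable from the imputation model of the row variable; the diagonal stays 0 because a variable is never used to predict itself.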
For additional details, see the help page of the mice function in the native R mice package.
The function returns a list with three elements: the method, the predictorMatrix and the post. The function also saves the mids object and all completed datasets, as dataframes, on each server.
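For intuition about these outputs, the sketch below runs the native mice function locally on the nhanes example dataset that ships with the mice package, then collects the same three settings and the completed data frames. It assumes the mice package is installed, and it is a local analogue only: it does not call miceDS, and the integer seed used here is the native mice seed argument rather than the DataSHIELD "fixed" option. The object names imp, settings and completed are chosen for illustration.

```r
# Local illustration only; runs if the mice package is installed.
if (requireNamespace("mice", quietly = TRUE)) {
  suppressPackageStartupMessages(library(mice))

  # nhanes is a small incomplete example dataset shipped with mice.
  imp <- mice(nhanes, m = 5, maxit = 5, seed = 123, printFlag = FALSE)

  # The three settings stored in the mids object.
  settings <- list(
    method          = imp$method,
    predictorMatrix = imp$predictorMatrix,
    post            = imp$post
  )
  str(settings)

  # One completed data frame per imputation (m of them in total).
  completed <- lapply(seq_len(imp$m), function(i) mice::complete(imp, i))
  print(length(completed))
}
```

Each element of completed is a copy of the original data with the missing values replaced by one set of imputations.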
Demetris Avraam for DataSHIELD Development Team