ds.gee {dsBaseClient}    R Documentation

Fits a Generalized Estimating Equation (GEE) model

Description

A function that fits generalized estimating equations (GEE) to handle the correlation structures that arise from repeated measures on individuals or from clustering, as in family data.

Usage

ds.gee(formula = NULL, family = NULL, data = NULL,
  corStructure = "ar1", clusterID = NULL, startCoeff = NULL,
  userMatrix = NULL, maxit = 20, checks = TRUE, display = FALSE,
  datasources = NULL)

Arguments

formula

a character string, the formula which describes the model to be fitted.

family

a character, the description of the error distribution: 'binomial', 'gaussian', 'Gamma' or 'poisson'.

data

the name of the data frame that holds the variables in the regression formula.

corStructure

a character, the correlation structure: 'ar1', 'exchangeable', 'independence', 'fixed' or 'unstructure'.

clusterID

a character, the name of the column that holds the cluster IDs.

startCoeff

a numeric vector, the starting values for the beta coefficients.

userMatrix

a list of user-defined matrices (one for each study). These matrices are required if the correlation structure is set to 'fixed'.

maxit

an integer, the maximum number of iterations allowed for convergence.

checks

a boolean. If TRUE (default), checks that take 1-3 min are carried out to verify that the variables in the model are defined (exist) on the server site and that they have the characteristics required to fit a GEE. If FALSE (not recommended unless you are an experienced user), only some very basic checks are carried out, and any resulting error messages might not clearly indicate the cause(s) of the error.

display

a boolean, whether to display the intermediate results. Default is FALSE.

datasources

a list of opal object(s) obtained after login to opal servers; these objects also hold the data assigned to R, as a dataframe, from opal datasources.
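
As an illustration of the corStructure options, the sketch below builds the standard working correlation matrices for a cluster of n measures in plain R. The helper name working_corr and the parameter alpha are hypothetical and are not part of dsBaseClient. The final list only suggests how per-study matrices for the 'fixed' structure might be arranged; the exact dimensions and ordering that ds.gee expects are an assumption here.

```r
# Hypothetical helper: working correlation matrix for a cluster of n measures.
# 'alpha' is the correlation parameter (ignored for 'independence').
working_corr <- function(n, alpha,
                         structure = c("independence", "exchangeable", "ar1")) {
  structure <- match.arg(structure)
  idx <- seq_len(n)
  lag <- abs(outer(idx, idx, "-"))  # |i - j| for every pair of measures
  switch(structure,
    independence = diag(n),                      # no within-cluster correlation
    exchangeable = ifelse(lag == 0, 1, alpha),   # same correlation for all pairs
    ar1          = alpha^lag                     # correlation decays with the lag
  )
}

working_corr(3, 0.5, "ar1")
working_corr(3, 0.5, "exchangeable")

# Assumption: for a 'fixed' structure, one matrix is supplied per study.
fixed_matrices <- list(working_corr(3, 0.4, "ar1"),
                       working_corr(3, 0.6, "ar1"))
```

Under 'ar1' the correlation between two measures falls off as a power of their distance within the cluster, whereas 'exchangeable' treats every pair of measures in a cluster alike.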

Details

It enables a parallelized analysis of individual-level data held on distinct servers by sending commands to each data computer to fit a GEE model. The estimates returned are then combined, and the updated coefficient estimates are sent back for a new fit. This iterative process continues until convergence is achieved. The input data should not contain missing values. The data must be in a data.frame object, and the variables must be referred to through the data.frame.
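
The combine-and-update loop described above can be sketched locally in plain R. This is a simplified illustration, not the dsBaseClient implementation: it assumes a binomial family with an independence working correlation (where the GEE estimating equations coincide with the GLM score equations), it omits the alpha and phi updates, and the helper names make_study and study_summaries are hypothetical. The "studies" are simulated data frames standing in for data held on separate servers.

```r
set.seed(1)

# Simulate two "studies"; in a real setting each data frame stays on its server.
make_study <- function(n) {
  x <- rnorm(n)
  y <- rbinom(n, 1, plogis(-0.5 + 0.8 * x))
  data.frame(y = y, x = x)
}
studies <- list(make_study(200), make_study(300))

# "Server side" (hypothetical helper): return only aggregate summaries
# evaluated at the current coefficient values.
study_summaries <- function(d, beta) {
  X  <- cbind(1, d$x)
  mu <- plogis(drop(X %*% beta))
  list(score = crossprod(X, d$y - mu),             # X'(y - mu)
       info  = crossprod(X, X * (mu * (1 - mu))))  # X' W X
}

# "Client side": combine the summaries, update beta, repeat until convergence.
beta <- c(0, 0)
for (iter in 1:20) {
  parts <- lapply(studies, study_summaries, beta = beta)
  score <- Reduce(`+`, lapply(parts, `[[`, "score"))
  info  <- Reduce(`+`, lapply(parts, `[[`, "info"))
  step  <- solve(info, score)
  beta  <- beta + drop(step)
  if (max(abs(step)) < 1e-8) break
}
pooled_beta <- beta
pooled_beta
```

Because only the summed score vectors and information matrices cross between "server" and "client", the pooled estimate matches what a single fit on the combined data would give, without the individual-level rows ever being merged.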

Value

a list which contains the final coefficient estimates (beta values), the pooled alpha value and the pooled phi value.

Author(s)

Gaye, A.; Jones EM.

References

Jones EM, Sheehan NA, Gaye A, Laflamme P, Burton P. Combined analysis of correlated data when data cannot be pooled. Stat 2013; 2: 72-85.

See Also

ds.glm for generalized linear models

Examples

## Not run: 

  # load the login data file for the correlated data
  data(geeLoginData)
  
  # login and assign all the stored variables to R
  opals <- datashield.login(logins=geeLoginData,assign=TRUE)
  
  # set some parameters for the function (the rest are set to default values)
  myformula <- 'response~1+sex+age.60'
  myfamily <- 'binomial'
  startbetas <- c(-1,1,0)
  clusters <- 'id'
  mycorr <- 'ar1'
  
  # run a GEE analysis with the above specified parameters
  ds.gee(data='D',formula=myformula,family=myfamily,corStructure=mycorr,clusterID=clusters,
         startCoeff=startbetas)
  
  # clear the DataSHIELD R sessions and logout
  datashield.logout(opals) 


## End(Not run)


[Package dsBaseClient version 5.0.0 ]