ds.meanByClass {dsBaseClient}R Documentation

Computes the mean and standard deviation across categories


This function calculates the mean and SD of a continuous variable for each class of up to 3 categorical variables.


ds.meanByClass(x = NULL, outvar = NULL, covar = NULL, type = "combine",
  datasources = NULL)



a character, the name of the dataset to get the subsets from or a text formula of the form 'A~B' where A is a single continuous vector and B a single factor vector


a character vector, the names of the continuous variables


a character vector, the names of up to 3 categorical variables


a character which represents the type of analysis to carry out. If type is set to 'combine', a pooled table of results is generated. If type is set to 'split', a table of results is genrated for each study.


a list of opal object(s) obtained after login in to opal servers; these objects hold also the data assign to R, as dataframe, from opal datasources.


The functions splits the input dataset into subsets (one for each category) and calculates the mean and SD of the specified numeric variables. It is important to note that the process of generating the final table(s) can be time consuming particularly if the subsetting is done across more than one categorical variable and the run-time lengthens if the parameter 'split' is set to 'split' as a table is then produced for each study. It is therefore advisable to run the function only for the studies of the user really interested in but including only those studies in the parameter 'datasources'.


a table or a list of tables that hold the length of the numeric variable(s) and their mean and standard deviation in each subgroup (subset).


Gaye, A.

See Also

ds.subsetByClass to subset by the classes of factor vector(s).

ds.subset to subset by complete cases (i.e. removing missing values), threshold, columns and rows.



  # load that contains the login details

  # Example 1: calculate the pooled mean proportion for LAB_HDL across GENDER categories where both vectors are in a tabe structure "D"
  # login and assign LAB_HDL and GENDER to a table "D"
  opals <- datashield.login(logins=logindata,assign=TRUE, variables=list('LAB_HDL', 'GENDER'))

  # Example 2: calculate the mean proportion for LAB_HDL across GENDER categories where both vectors are 'loose' (i.e. not in a table)
  # assign both LAB_HDL and GENDER to vectors not held in a table
  ds.assign("D$LAB_HDL", "ldl")
  ds.assign("D$GENDER", "sex")

  # Example 3: calculate the mean proportion for LAB_HDL across gender, bmi and diabetes status categories
  # login and assign all the variables stored on opal
  opals <- datashield.login(logins=logindata,assign=TRUE)
  ds.meanByClass(x='D', outvar=c('LAB_HDL','LAB_TSC'), covar=c('GENDER','PM_BMI_CATEGORICAL','DIS_DIAB'))

  # clear the Datashield R sessions and logout


[Package dsBaseClient version 4.1.0 ]