ds.subsetByClass {dsBaseClient}    R Documentation

Generates valid subset(s) of a data frame or a factor

Description

The function takes a categorical variable or a data frame as input and generates subset variables or data frames, one for each category.

Usage

ds.subsetByClass(x = NULL, subsets = "subClasses", variables = NULL,
  datasources = NULL)

Arguments

x

a character, the name of the dataframe or the vector to generate subsets from.

variables

a character vector, the name(s) of the variables to subset by.

subsets

the name of the output object, a list that holds the subset objects. If set to NULL the default name of this list is 'subClasses'.

datasources

a list of opal object(s) obtained after logging in to the opal servers; these objects also hold the data assigned to R, as data frames, from the opal datasources.

Details

If the input data object is a data frame, it is possible to specify the variables to subset on. If a subset is not 'valid', all its values are reported as missing (i.e. NA) and the name of the subset is labelled with the suffix '_INVALID'. A subset is considered invalid if the number of observations it holds is between 1 and the threshold allowed by the data owner. If a subset is empty (i.e. no entries), the name of the subset is labelled with the suffix '_EMPTY'.
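The labelling rule described above can be illustrated locally with a minimal base R sketch. This is not the server-side implementation: the helper name label_subsets, the threshold value of 5, the '.level_' naming pattern and the reading of "between 1 and the threshold" as 1 <= n < threshold are all assumptions made for illustration only.

```r
# Hypothetical local illustration of the '_INVALID' / '_EMPTY' labelling rule.
# Not the DataSHIELD server-side code; names and threshold are assumptions.
label_subsets <- function(df, by, threshold = 5) {
  # drop = FALSE keeps factor levels that have no rows, so empty
  # subsets are still produced and can be labelled
  parts <- split(df, df[[by]], drop = FALSE)
  out <- list()
  for (level in names(parts)) {
    part <- parts[[level]]
    n <- nrow(part)
    base <- paste0(by, ".level_", level)  # assumed naming pattern
    if (n == 0) {
      # empty subset: keep the zero-row table, flag it in the name
      out[[paste0(base, "_EMPTY")]] <- part
    } else if (n < threshold) {
      # too few observations: replace every value with NA
      part[] <- lapply(part, function(col) rep(NA, length(col)))
      out[[paste0(base, "_INVALID")]] <- part
    } else {
      out[[base]] <- part
    }
  }
  out
}

# Toy data: six rows in class "0", two in class "1", none in class "2"
df <- data.frame(
  GENDER = factor(c(rep("0", 6), rep("1", 2)), levels = c("0", "1", "2")),
  BMI    = c(21.5, 24.0, 30.2, 27.8, 22.1, 25.3, 29.9, 23.4)
)
res <- label_subsets(df, by = "GENDER", threshold = 5)
print(names(res))
```

With this toy data, the class with six rows is kept as is, the class with two rows is masked and flagged as invalid, and the class with no rows is flagged as empty. On a real DataSHIELD server the threshold is set by the data owner and is not chosen by the client.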

Value

No data are returned to the user, but messages are printed out.

Author(s)

Gaye, A.

See Also

ds.meanByClass to compute the mean and standard deviation across the categories of a factor vector.

ds.subset to subset by complete cases (i.e. removing missing values), threshold, columns and rows.

Examples

{

  # load the login data
  data(logindata)

  # login and assign some variables to R
  myvar <- list('DIS_DIAB','PM_BMI_CONTINUOUS','LAB_HDL', 'GENDER')
  opals <- datashield.login(logins=logindata,assign=TRUE,variables=myvar)

  # Example 1: generate all possible subsets from the table assigned above (one subset table for each class in each factor)
  ds.subsetByClass(x='D', subsets='subclasses')
  # display the names of the subset tables that were generated in each study
  ds.names('subclasses')

  # Example 2: subset the table initially assigned by the variable 'GENDER'
  ds.subsetByClass(x='D', subsets='subtables', variables='GENDER')
  # display the names of the subset tables that were generated in each study
  ds.names('subtables')

  # Example 3: generate a new variable 'gender' and split it into two vectors: males and females
  ds.assign(toAssign='D$GENDER', newobj='gender')
  ds.subsetByClass(x='gender', subsets='subvectors')
  # display the names of the subset vectors that were generated in each study
  ds.names('subvectors')

  # clear the DataSHIELD R sessions and logout
  datashield.logout(opals)

}

[Package dsBaseClient version 4.1.0 ]