ds.cor.o {dsBetaTestClient} | R Documentation |
This function calculates the correlation between two variables, or the correlation matrix for the variables of an input dataframe.
ds.cor.o(x = NULL, y = NULL, naAction = "pairwise.complete", type = "split", datasources = NULL)
x |
a character, the name of a vector, matrix or dataframe of variable(s) for which the correlation(s) is (are) calculated. |
y |
NULL (default) or the name of a vector, matrix or dataframe with compatible dimensions to x. |
naAction |
a character string giving a method for computing correlations in the
presence of missing values. This must be one of the strings "casewise.complete" or
"pairwise.complete". If |
type |
a character which represents the type of analysis to carry out. If |
datasources |
a list of opal object(s) obtained after logging in to opal servers;
these objects also hold the data assigned to R, as |
In addition to computing correlations, this function produces a table of the number of complete cases and a table of the number of missing values, so that the user can judge the 'relevance' of each correlation from the number of complete cases included in its calculation.
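The difference between the two missing-value strategies, and the counts that accompany the correlations, can be illustrated locally with base R. This is a minimal sketch on made-up data, not the DataSHIELD implementation: it assumes that "pairwise.complete" and "casewise.complete" behave like the `use = "pairwise.complete.obs"` and `use = "complete.obs"` options of `stats::cor`, and the data frame `d` and all object names are hypothetical.

```r
# Toy data with one missing value per variable (illustrative only)
d <- data.frame(
  a = c(1, 2, NA, 4, 5, 6),
  b = c(2, NA, 6, 8, 10, 13),
  c = c(6, 5, 4, NA, 2, 1)
)

# Number of missing values in each variable
n_missing <- colSums(is.na(d))

# Pairwise complete counts: for each pair of variables, the number of
# rows in which both are observed
observed <- !is.na(as.matrix(d))
storage.mode(observed) <- "numeric"  # keep dims, turn TRUE/FALSE into 1/0
n_pairwise <- crossprod(observed)

# Casewise complete count: rows in which every variable is observed
n_casewise <- sum(complete.cases(d))

# Base R analogues of the two strategies (an assumption, see above)
cor_pairwise <- cor(d, use = "pairwise.complete.obs")
cor_casewise <- cor(d, use = "complete.obs")

print(n_missing)
print(n_pairwise)
print(n_casewise)
```

In this toy example each pairwise correlation uses more rows than the casewise one, which is the kind of difference the two count tables are meant to expose.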
a list containing the number of missing values in each variable, the number of missing values
casewise or pairwise (depending on the naAction argument), the correlation matrix, the number of complete cases used,
and an error message which indicates whether or not the input variables pass the disclosure control (i.e. none of them
is dichotomous with a level having fewer counts than the pre-specified threshold). If any of the input variables does not
pass the disclosure control then all the output values are replaced with NAs. If all the variables are valid and pass
the control, then the output matrices are returned and the error message is set to NA.
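The disclosure rule described above can be sketched locally as follows. This is only an illustration of the stated rule under assumptions: the actual server-side check is not shown in this page, and the threshold value, the helper `fails_disclosure_check` and the example vectors are all hypothetical.

```r
# Hypothetical threshold; the real value is pre-specified on the server side
min_cell_count <- 3

# TRUE when a variable is dichotomous and one of its levels has fewer
# observations than the threshold (missing values are ignored by table())
fails_disclosure_check <- function(v, threshold = min_cell_count) {
  counts <- table(v)
  length(counts) == 2 && any(counts < threshold)
}

gender <- c(0, 0, 0, 0, 0, 1, 1)                  # level 1 occurs only twice
hdl    <- c(1.2, 1.5, 0.9, 1.7, 1.1, 1.4, 1.3)    # continuous variable

vars <- list(gender = gender, hdl = hdl)
invalid <- vapply(vars, fails_disclosure_check, logical(1))

# If any variable fails the check, every output value is replaced with NA;
# otherwise the correlation matrix is returned
result <- if (any(invalid)) {
  matrix(NA_real_, nrow = length(vars), ncol = length(vars),
         dimnames = list(names(vars), names(vars)))
} else {
  cor(cbind(gender, hdl))
}

print(invalid)
print(result)
```

Here `gender` fails the check because one of its two levels is too sparse, so the whole result is masked rather than only the affected cells.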
Gaye A; Avraam D; Burton PR
{
#  # load the dataset that contains the login details
#  data(glmLoginData)
#  library(opal)
#
#  # login and assign specific variable(s)
#  # (by default the assigned dataset is a dataframe named 'D')
#  myvar <- list('LAB_HDL', 'LAB_TSC', 'LAB_GLUC_ADJUSTED', 'GENDER')
#  opals <- datashield.login(logins=glmLoginData, assign=TRUE, variables=myvar)
#
#  # Example 1: generate the correlation matrix for the assigned dataset 'D'
#  # which contains 4 vectors (3 continuous and 1 categorical)
#  ds.cor.o(x='D')
#
#  # Example 2: generate the correlation matrix for the dataset 'D' combined for all
#  # studies and removing any missing values casewise
#  ds.cor.o(x='D', naAction='casewise.complete', type='combine')
#
#  # Example 3: calculate the correlation between two vectors
#  # (first assign the vectors from 'D')
#  ds.assign(newobj='labhdl', toAssign='D$LAB_HDL')
#  ds.assign(newobj='labtsc', toAssign='D$LAB_TSC')
#  ds.assign(newobj='gender', toAssign='D$GENDER')
#  ds.cor.o(x='labhdl', y='labtsc', naAction='pairwise.complete', type='combine')
#  ds.cor.o(x='labhdl', y='labtsc', naAction='casewise.complete', type='combine')
#  ds.cor.o(x='labhdl', y='gender', naAction='pairwise.complete', type='combine')
#  ds.cor.o(x='labhdl', y='gender', naAction='casewise.complete', type='combine')
#
#  # clear the Datashield R sessions and logout
#  datashield.logout(opals)
}