ds.heatmapPlot {dsBaseClient}    R Documentation

Generates a Heat Map plot

Description

Generates a heat map plot of the pooled data or one plot for each dataset.

Usage

ds.heatmapPlot(
  x = NULL,
  y = NULL,
  type = "combine",
  show = "all",
  numints = 20,
  method = "smallCellsRule",
  k = 3,
  noise = 0.25,
  datasources = NULL
)

Arguments

x

a character string specifying the name of a numerical vector.

y

a character string specifying the name of a numerical vector.

type

a character string that represents the type of graph to display. The type argument can be set to 'combine' or 'split'. Default 'combine'. For more information see Details.

show

a character string that represents where the plot should be focused. The show argument can be set to 'all' or 'zoomed'. Default 'all'. For more information see Details.

numints

the number of intervals for a density grid object. Default numints value is 20.

method

a character string that defines which heat map will be created. The method argument can be set as 'smallCellsRule', 'deterministic' or 'probabilistic'. Default 'smallCellsRule'. For more information see Details.

k

the number of nearest neighbours whose centroid is calculated. Default k value is 3. For more information see Details.

noise

the percentage of the initial variance that is used as the variance of the embedded noise if the argument method is set to 'probabilistic'. Default noise value is 0.25. For more information see Details.

datasources

a list of DSConnection-class objects obtained after login. If the datasources argument is not specified the default set of connections will be used: see datashield.connections_default.

Details

The ds.heatmapPlot function first generates a density grid and uses it to plot the graph. Cells of the grid density matrix that hold a count lower than the filter set by DataSHIELD (usually 5) are considered invalid and are set to 0 to avoid potential disclosure. A message is printed to inform the user about the number of invalid cells. The ranges returned by each study and used to build the grid density matrix are not the exact minimum and maximum values but close approximations of them. This reduces the risk of potential disclosure.
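
The small-cells rule described above can be sketched locally in base R. This is only an illustration under stated assumptions, not the code of heatmapPlotDS: the helper grid_counts() and its min_count argument are hypothetical names, the grid uses the exact ranges rather than the approximate ones returned by the servers, and only non-empty cells below the threshold are counted as invalid.

```r
# Illustrative only: a local, base-R sketch of the small-cells idea.
# grid_counts() and min_count are hypothetical names for this example;
# they are not part of dsBaseClient or dsBase.
grid_counts <- function(x, y, numints = 20, min_count = 5) {
  # Equal-width interval boundaries over the range of each variable
  x_breaks <- seq(min(x), max(x), length.out = numints + 1)
  y_breaks <- seq(min(y), max(y), length.out = numints + 1)

  # Assign every observation to an interval on each axis
  x_bin <- cut(x, breaks = x_breaks, include.lowest = TRUE)
  y_bin <- cut(y, breaks = y_breaks, include.lowest = TRUE)

  # numints x numints matrix of cell counts (empty cells are kept)
  counts <- unclass(table(x_bin, y_bin))

  # Assumption of this sketch: a cell is invalid when its count is
  # non-zero but below the threshold; such cells are set to 0
  invalid <- counts > 0 & counts < min_count
  counts[invalid] <- 0

  list(counts = counts, n_invalid = sum(invalid))
}

set.seed(1)
x <- rnorm(500)
y <- 0.5 * x + rnorm(500)
res <- grid_counts(x, y, numints = 10, min_count = 5)
message("Number of invalid cells: ", res$n_invalid)
```

Passing res$counts to a base graphics function such as image() would draw the suppressed grid as a heat map.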

The type argument selects one of two kinds of display: 'combine' produces a single heat map of the pooled data, and 'split' produces one heat map for each dataset.

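The show argument offers two options: 'all' uses the ranges of the variables as the plot limits, and 'zoomed' zooms the plot into the region where the data actually lie.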

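The method argument selects one of three heat maps: 'smallCellsRule' plots the actual variables, with grid cells that hold low counts replaced by zero counts; 'deterministic' plots the centroids of each k nearest neighbours of the original variables, where k is set by the user; 'probabilistic' plots the data with random noise added, where the variance of the noise is the share of the original variance given by the noise argument.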

The user can choose any value of k that is equal to or greater than the pre-specified threshold used as a disclosure control for this method and lower than the number of observations minus that threshold. By default k is set to 3 (we suggest a value of 3 or more). Note that the function fails if the user keeps the default value but the study has set a higher threshold. The value of k is used only if the argument method is set to 'deterministic'; it is ignored if method is set to 'probabilistic' or 'smallCellsRule'.
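
As a rough illustration of the 'deterministic' idea and of the allowed range for k, the following base-R sketch replaces each point with the centroid of its k nearest neighbours. The helpers k_is_allowed() and knn_centroids() are hypothetical names, not functions of dsBaseClient; standardising the variables and counting a point among its own neighbours are assumptions of this sketch, and the server-side code may differ.

```r
# Hypothetical helpers for illustration; not part of dsBaseClient or dsBase.

# TRUE when threshold <= k < n_obs - threshold, as described in the text
k_is_allowed <- function(k, n_obs, threshold = 3) {
  k >= threshold && k < (n_obs - threshold)
}

knn_centroids <- function(x, y, k = 3) {
  # Assumption: standardise so both variables weigh equally in distances
  pts <- scale(cbind(x, y))
  d <- as.matrix(dist(pts))  # pairwise Euclidean distances

  centroids <- t(sapply(seq_len(nrow(pts)), function(i) {
    # Indices of the k closest points to observation i; the point itself
    # (distance 0) is counted as one of them in this sketch
    nn <- order(d[i, ])[seq_len(k)]
    colMeans(pts[nn, , drop = FALSE])
  }))
  colnames(centroids) <- c("x", "y")
  centroids
}

set.seed(2)
x <- rnorm(100)
y <- 0.5 * x + rnorm(100)
stopifnot(k_is_allowed(7, n_obs = length(x)))
cen <- knn_centroids(x, y, k = 7)
head(cen)
```

A heat map built from cen instead of the raw points shows the overall shape of the data without plotting any individual observation.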

The value of noise is used only if the argument method is set to 'probabilistic'. Any value of noise is ignored if the argument method is set to 'deterministic' or 'smallCellsRule'. The user can choose any value for noise equal to or greater than the pre-specified threshold 'nfilter.noise'.
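
The role of noise can be illustrated with a small base-R sketch. The helper add_noise() is a hypothetical name, and drawing normally distributed noise with mean 0 is an assumption of this example; it is not the server-side implementation.

```r
# Illustrative sketch only; add_noise() is a hypothetical helper,
# not part of dsBaseClient or dsBase.
add_noise <- function(v, noise = 0.25) {
  # The noise variance is the fraction `noise` of the variable's variance
  v + rnorm(length(v), mean = 0, sd = sqrt(noise * var(v)))
}

set.seed(42)
x <- rnorm(1000, mean = 5, sd = 2)
x_noisy <- add_noise(x, noise = 0.25)

# Ratio of variances; roughly 1 + noise when the noise is independent of x
ratio <- var(x_noisy) / var(x)
ratio
```

A larger value of noise blurs the plotted points more strongly, at the cost of a heat map that follows the original data less closely.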

Server function called: heatmapPlotDS

Value

ds.heatmapPlot returns to the client-side a heat map plot and a message specifying the number of invalid cells in each study.

Author(s)

DataSHIELD Development Team

Examples

## Not run: 

## Version 6, for version 5 see the Wiki
  # Connecting to the Opal servers

  require('DSI')
  require('DSOpal')
  require('dsBaseClient')

  builder <- DSI::newDSLoginBuilder()
  builder$append(server = "study1", 
                 url = "http://192.168.56.100:8080/", 
                 user = "administrator", password = "datashield_test&", 
                 table = "CNSIM.CNSIM1", driver = "OpalDriver")
  builder$append(server = "study2", 
                 url = "http://192.168.56.100:8080/", 
                 user = "administrator", password = "datashield_test&", 
                 table = "CNSIM.CNSIM2", driver = "OpalDriver")
  builder$append(server = "study3",
                 url = "http://192.168.56.100:8080/", 
                 user = "administrator", password = "datashield_test&", 
                 table = "CNSIM.CNSIM3", driver = "OpalDriver")
  logindata <- builder$build()
  
  # Log onto the remote Opal training servers
  connections <- DSI::datashield.login(logins = logindata, assign = TRUE, symbol = "D") 
  
  # Compute the heat map plot 
  # Example 1: Plot a combined (default) heat map plot of the variables 'LAB_TSC'
  # and 'LAB_HDL' using the method 'smallCellsRule' (default)
  ds.heatmapPlot(x = 'D$LAB_TSC',
                 y = 'D$LAB_HDL',
                 datasources = connections) #all servers are used
                 
  # Example 2: Plot a split heat map plot of the variables 'LAB_TSC'
  # and 'LAB_HDL' using the method 'smallCellsRule' (default)
  ds.heatmapPlot(x = 'D$LAB_TSC', 
                 y = 'D$LAB_HDL',
                 method = 'smallCellsRule', 
                 type = 'split',
                 datasources = connections[1]) #only the first server is used (study1)
                 
  # Example 3: Plot a split heat map plot using the method 'deterministic' (centroids of each
  # k = 7 nearest neighbours) for numints = 40
  ds.heatmapPlot(x = 'D$LAB_TSC',
                 y = 'D$LAB_HDL', 
                 numints = 40, 
                 method = 'deterministic',
                 k = 7,
                 type = 'split',
                 datasources = connections[2]) #only the second server is used (study2)


  # clear the Datashield R sessions and logout
  datashield.logout(connections)


## End(Not run)


[Package dsBaseClient version 6.3.0 ]