ds.histogram {dsBaseClient} | R Documentation |

This function plots a non-disclosive histogram.

ds.histogram(x = NULL, type = "split", num.breaks = 10, method = "smallCellsRule", k = 3, noise = 0.25, vertical.axis = "Frequency", datasources = NULL)

`x` |
a character, the name of the vector of values for which the histogram is desired. |

`type` |
a character which represents the type of graph to display. If |

`num.breaks` |
a numeric specifying the number of breaks of the histogram. The default value is set to 10. |

`method` |
a character which defines which histogram will be created. If |

`k` |
the number of nearest neighbours whose centroid is calculated.
The user can choose any value for k that is equal to or greater than the pre-specified threshold
used as a disclosure control for this method and lower than the number of observations
minus the value of this threshold. By default k is set to 3
(we suggest k to be equal to, or greater than, 3). Note that the function fails if the user
uses the default value but the study has set a higher threshold. The value of k is used only
if the argument |

`noise` |
the percentage of the initial variance that is used as the variance of the embedded
noise if the argument |

`vertical.axis` |
a character which defines what is shown in the vertical axis of the
plot. If |

`datasources` |
a list of opal object(s) obtained after logging in to opal servers;
these objects also hold the data assigned to R, as |
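The constraints on `k` and `noise` described in the argument list can be illustrated locally in base R. This is only a sketch: the helper names `k_is_valid` and `noisy_copy`, the threshold value of 3, and the use of zero-mean normal noise are assumptions made for illustration, not the server-side implementation.

```r
# Hypothetical helper: TRUE when k satisfies the documented range, i.e.
# threshold <= k < (number of observations - threshold).
k_is_valid <- function(k, n.obs, threshold = 3) {
  k >= threshold && k < (n.obs - threshold)
}

# Hypothetical helper: add noise whose variance is a fraction `noise`
# of the variance of x (zero-mean normal noise is an assumption).
noisy_copy <- function(x, noise = 0.25) {
  x + rnorm(length(x), mean = 0, sd = sqrt(noise * var(x)))
}

set.seed(1)
x <- rnorm(1000)

k_is_valid(3, n.obs = length(x))   # default k with an assumed threshold of 3
k_is_valid(2, n.obs = length(x))   # below the assumed threshold
y <- noisy_copy(x, noise = 0.25)
var(y - x) / var(x)                # close to the requested fraction
```

With a study-specific threshold larger than 3, the same check shows why the default `k` is rejected: `k_is_valid(3, n.obs = 1000, threshold = 5)` is `FALSE`.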

It calls a DataSHIELD server-side function that produces the
histogram objects to plot. Two options are described here, as identified by the argument
`method`. The first option creates a histogram that excludes bins with
counts smaller than the allowed threshold. The second option creates a histogram
of the centroids of each k nearest neighbours. The examples also use a third value,
'probabilistic', together with the argument `noise`. The user can plot
distinct histograms (one for each study) or a combined histogram that merges
the single plots.
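The two options described above can be sketched on local data with base R. The function names `suppress_small_cells` and `knn_centroids` are hypothetical, the threshold of 3 is an assumed value, and treating each point as one of its own k nearest neighbours is an assumption; the actual server-side code may differ.

```r
# Option 1 (sketch): zero out bins whose non-zero count is below the threshold.
suppress_small_cells <- function(x, num.breaks = 10, threshold = 3) {
  h <- hist(x, breaks = num.breaks, plot = FALSE)
  small <- h$counts > 0 & h$counts < threshold
  h$counts[small] <- 0
  h$density[small] <- 0
  h
}

# Option 2 (sketch): replace each value by the centroid (mean) of its
# k nearest values, counting the value itself as one of the k.
knn_centroids <- function(x, k = 3) {
  vapply(seq_along(x), function(i) {
    nearest <- order(abs(x - x[i]))[seq_len(k)]
    mean(x[nearest])
  }, numeric(1))
}

x <- c(rep(1, 5), rep(2, 5), 9)        # made-up values; 9 sits alone in its bin
h <- suppress_small_cells(x)
h$counts                               # the bin holding the single 9 is now 0

centroids <- knn_centroids(c(1, 2, 3, 10), k = 2)
centroids
```

A histogram of `knn_centroids(x, k)` loses detail as `k` grows, since every value is pulled towards the mean of a larger neighbourhood.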

One or more histogram objects and plots, depending on the argument `type`.
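As a rough guide to what a histogram object contains, the sketch below inspects one produced by base R's `hist()` on made-up values. That the objects returned by `ds.histogram` carry the same components is an assumption, suggested only by the last example on this page, which reads `$mids` and `$density` from the result.

```r
set.seed(42)
bmi <- rnorm(200, mean = 27, sd = 4)      # made-up values, not study data

h <- hist(bmi, breaks = 10, plot = FALSE) # base R histogram object
class(h)                                  # "histogram"
names(h)                                  # includes breaks, counts, density, mids

# One midpoint per bin, and the densities integrate to 1 over the bins.
length(h$mids) == length(h$counts)
sum(h$density * diff(h$breaks))
```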

Amadou Gaye, Demetris Avraam for DataSHIELD Development Team

    ## Not run:

    # load the dataset that contains the login details
    data(logindata)

    # login to the servers
    opals <- opal::datashield.login(logins=logindata, assign=TRUE)

    # Example 1: generate a histogram for each study separately (the default behaviour)
    ds.histogram(x='LD$PM_BMI_CONTINUOUS', type="split")

    # Example 2: generate a combined histogram with the default small cells counts suppression rule
    ds.histogram(x='LD$PM_BMI_CONTINUOUS', method='smallCellsRule', type='combine')

    # Example 3: if a variable is of type factor then the function returns an error
    ds.histogram(x='LD$PM_BMI_CATEGORICAL')

    # Example 4: generate a combined histogram with the deterministic method
    ds.histogram(x='LD$PM_BMI_CONTINUOUS', method='deterministic', type='combine')

    # Example 5: same as Example 4 but with k=50
    ds.histogram(x='LD$PM_BMI_CONTINUOUS', k=50, method='deterministic', type='combine')

    # Example 6: same as Example 4 but with k=1740 (as k increases there is a big loss of utility)
    ds.histogram(x='LD$PM_BMI_CONTINUOUS', k=1740, method='deterministic', type='combine')

    # Example 7: same as Example 6 but for split analysis
    ds.histogram(x='LD$PM_BMI_CONTINUOUS', k=1740, method='deterministic', type='split')

    # Example 8: if k is less than the pre-specified threshold then the function returns an error
    ds.histogram(x='LD$PM_BMI_CONTINUOUS', k=2, method='deterministic')

    # Example 9: generate a combined histogram with the probabilistic method
    ds.histogram(x='LD$PM_BMI_CONTINUOUS', method='probabilistic', type='combine')

    # Example 10: generate a histogram with the probabilistic method for each study separately
    ds.histogram(x='LD$PM_BMI_CONTINUOUS', method='probabilistic', type='split')

    # Example 11: same as Example 10 but with a higher level of noise
    ds.histogram(x='LD$PM_BMI_CONTINUOUS', method='probabilistic', noise=0.5, type='split')

    # Example 12: if 'noise' is less than the pre-specified threshold then the function returns an error
    ds.histogram(x='LD$PM_BMI_CONTINUOUS', method='probabilistic', noise=0.1, type='split')

    # Example 13: same as Example 10 but with a bigger number of breaks
    ds.histogram(x='LD$PM_BMI_CONTINUOUS', method='probabilistic', type='split', num.breaks=30)

    # Example 14: same as Example 13 but the vertical axis shows densities instead of frequencies
    ds.histogram(x='LD$PM_BMI_CONTINUOUS', method='probabilistic', type='split',
                 num.breaks=30, vertical.axis='Density')

    # Example 15: create a histogram and overlay the probability density on the plot
    hist <- ds.histogram(x='LD$PM_BMI_CONTINUOUS', method='probabilistic', type='combine',
                         num.breaks=30, vertical.axis='Density')
    lines(hist$mids, hist$density)

    # clear the DataSHIELD R sessions and logout
    opal::datashield.logout(opals)

    ## End(Not run)

[Package *dsBaseClient* version 5.0.0 ]