rm_compactsum
Outputs a table of summary statistics formatted for PDF, Word or HTML output
rm_compactsum(
data,
xvars,
grp,
use_mean,
caption = NULL,
tableOnly = FALSE,
covTitle = "",
digits = 1,
digits.cat = 0,
nicenames = TRUE,
iqr = TRUE,
all.stats = FALSE,
pvalue = TRUE,
effSize = FALSE,
p.adjust = "none",
unformattedp = FALSE,
show.sumstats = FALSE,
show.tests = FALSE,
full = TRUE,
percentage = "col"
)
data: dataframe containing data
xvars: character vector with the names of covariates to include in the table
grp: character with the name of the grouping variable
use_mean: logical indicating whether the mean and standard deviation are reported for continuous variables instead of the median. Alternatively, a character vector naming the covariates to summarise with mean and sd. If use_mean is not supplied, all covariates are summarised with the median. See examples.
caption: character containing the table caption (default is no caption)
tableOnly: logical, if TRUE a dataframe is returned, otherwise a formatted printed object is returned (default is FALSE)
covTitle: character with the title of the covariate (predictor) column. By default the title is left empty for formatted output and set to 'Covariate' for table-only output.
digits: numeric specifying the number of digits for summarising continuous data. Digits can be specified for individual variables using a named vector in the format digits=c("var1"=2,"var2"=3). Variables not named in the vector use the default (default is 1). See examples.
digits.cat: numeric specifying the number of digits for the proportions when summarising categorical data (default is 0)
nicenames: logical indicating if . and _ in strings should be replaced with a space
iqr: logical indicating if the interquartile range (Q1-Q3) should be displayed instead of (min-max) in the summary for continuous variables
all.stats: logical indicating if all summary statistics (Q1, Q3 + min, max on a separate line) should be displayed. Overrides iqr.
pvalue: logical indicating if p-values should be included in the table
effSize: logical indicating if effect sizes and their 95% confidence intervals should be included in the table. Effect sizes calculated include Cramer's V for categorical variables, and Cohen's d, Wilcoxon r, Epsilon-squared, or Omega-squared for numeric/continuous variables.
p.adjust: p-adjustments to be performed
unformattedp: logical indicating if the p-value should be returned unformatted (i.e. not rounded or prefixed with '<'). Best used with tableOnly = TRUE and the outTable function. See examples.
show.sumstats: logical indicating if the type of statistical summary (mean, median, etc.) used should be shown
show.tests: logical indicating if the type of statistical test and effect size (if effSize = TRUE) used should be shown in a column beside the p-values
full: logical indicating if the full sample should be included in the table; ignored if grp is not specified
percentage: choice of how percentages are presented, either column (default) or row
A character vector of the table source code, unless tableOnly = TRUE in which case a data frame is returned. The output has the following attribute:
"description", which describes what is included in the output table and the type of statistical summary for each covariate. When applicable, the types of statistical tests used will be included. If effSize = TRUE, the effect sizes for each covariate will also be mentioned.
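As a minimal sketch of reading that attribute, the call below requests a data frame with tableOnly = TRUE and then inspects it with base R's attr(). It assumes the reportRmd package and its pembrolizumab data are installed; the object name tab is only illustrative.

```r
library(reportRmd)
data("pembrolizumab")

# Request the summary as a data frame rather than formatted table source
tab <- rm_compactsum(
  data = pembrolizumab,
  xvars = c("age", "pdl1"),
  grp = "sex",
  use_mean = "age",
  tableOnly = TRUE
)

# The returned object is a data frame ...
class(tab)

# ... and the text describing the summaries and tests is stored as an attribute
attr(tab, "description")
```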
Comparisons for categorical variables default to chi-square tests, but if any counts are <5 then Fisher's exact test is used. For grouping variables with two levels, either t-tests (mean) or Wilcoxon tests (median) are used for numerical variables. For grouping variables with more than two levels, ANOVA (mean) or Kruskal-Wallis tests (median) are used. The statistical test used can be displayed by specifying show.tests = TRUE. Statistical tests and effect sizes are not shown when grp and/or xvars have fewer than 2 counts in any level.
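The selection rules above can be sketched in base R. This is an illustration only, not the package's implementation: choose_test is a hypothetical helper, the returned labels are illustrative, and it assumes that "counts of <5" refers to observed cell counts in the cross-tabulation.

```r
# Hypothetical helper (not part of reportRmd): name the test that the rules
# described above would select for a covariate x and grouping variable g.
choose_test <- function(x, g, use_mean = FALSE) {
  g <- factor(g)
  if (is.numeric(x)) {
    if (nlevels(g) == 2) {
      # Two groups: t.test() for means, wilcox.test() for medians
      if (use_mean) "t-test" else "Wilcoxon Rank Sum"
    } else {
      # More than two groups: aov() for means, kruskal.test() for medians
      if (use_mean) "ANOVA" else "Kruskal-Wallis"
    }
  } else {
    # Categorical: chisq.test() unless a cell is sparse, then fisher.test()
    # (assumption: the <5 rule is applied to observed cell counts)
    counts <- table(x, g)
    if (any(counts < 5)) "Fisher Exact" else "ChiSq"
  }
}

choose_test(mtcars$mpg, mtcars$am, use_mean = TRUE)  # numeric, two groups, mean
choose_test(mtcars$mpg, mtcars$cyl)                  # numeric, three groups, median
choose_test(factor(mtcars$vs), mtcars$am)            # categorical, all cells >= 5
```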
Effect sizes for differences between two groups are calculated as Cohen's d if the variable is summarised with the mean, otherwise as Wilcoxon r if it is summarised with the median. Cramer's V is used for categorical variables, omega-squared for differences in means among more than two groups, and epsilon-squared for differences in medians among more than two groups. Confidence intervals are calculated using bootstrapping.
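For orientation, the two-group and categorical effect sizes named above can be written with their common textbook formulas in base R, together with a percentile bootstrap interval. This is a sketch under stated assumptions: the helpers cohen_d, wilcoxon_r, cramers_v and boot_ci are hypothetical, and the package may use different estimators or a different bootstrap scheme.

```r
# Cohen's d with a pooled standard deviation (sign follows factor level order)
cohen_d <- function(x, g) {
  g <- factor(g)
  x1 <- x[g == levels(g)[1]]
  x2 <- x[g == levels(g)[2]]
  n1 <- length(x1)
  n2 <- length(x2)
  sp <- sqrt(((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2))
  (mean(x1) - mean(x2)) / sp
}

# Wilcoxon r = |Z| / sqrt(N), with Z recovered from the normal-approximation p-value
wilcoxon_r <- function(x, g) {
  p <- wilcox.test(x ~ factor(g), exact = FALSE, correct = FALSE)$p.value
  abs(qnorm(p / 2)) / sqrt(length(x))
}

# Cramer's V from the Pearson chi-square statistic (no continuity correction)
cramers_v <- function(x, g) {
  tab <- table(x, g)
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  unname(sqrt(chi2 / (sum(tab) * (min(dim(tab)) - 1))))
}

# Percentile bootstrap interval: resample rows, recompute the statistic
boot_ci <- function(x, g, stat, B = 2000, level = 0.95) {
  est <- replicate(B, {
    i <- sample(seq_along(x), replace = TRUE)
    stat(x[i], g[i])
  })
  alpha <- (1 - level) / 2
  unname(quantile(est, c(alpha, 1 - alpha), na.rm = TRUE))
}

set.seed(1)
d  <- cohen_d(mtcars$mpg, mtcars$am)
ci <- boot_ci(mtcars$mpg, mtcars$am, cohen_d)
r  <- wilcoxon_r(mtcars$mpg, mtcars$am)
v  <- cramers_v(mtcars$vs, mtcars$am)
```

For a 2 x 2 table, Cramer's V computed this way equals the absolute Pearson correlation between the two binary variables, which gives a quick sanity check.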
Tidyselect syntax can be used only for the xvars and grp arguments. Other arguments that refer to variables (digits, use_mean) must be given as character strings.
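A short sketch of this distinction, assuming the reportRmd package and its pembrolizumab data are installed: xvars and grp take bare column names, while use_mean and the names in digits are quoted. The object name tab is only illustrative.

```r
library(reportRmd)
data("pembrolizumab")

tab <- pembrolizumab |>
  rm_compactsum(
    xvars = c(age, l_size, pdl1),          # tidyselect: bare column names
    grp = sex,                             # tidyselect: bare column name
    use_mean = "age",                      # must be a character string
    digits = c("age" = 2, "l_size" = 3),   # names must be character strings
    tableOnly = TRUE
  )
```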
Smithson, M. (2002). Noncentral Confidence Intervals for Standardized Effect Sizes. (07/140 ed., Vol. 140). SAGE Publications. doi:10.4135/9781412983761.n4
Steiger, J. H. (2004). Beyond the F Test: Effect Size Confidence Intervals and Tests of Close Fit in the Analysis of Variance and Contrast Analysis. Psychological Methods, 9(2), 164–182. doi:10.1037/1082-989X.9.2.164
Kelley, T. L. (1935). An Unbiased Correlation Ratio Measure. Proceedings of the National Academy of Sciences - PNAS, 21(9), 554–559. doi:10.1073/pnas.21.9.554
Okada, K. (2013). Is Omega Squared Less Biased? A Comparison of Three Major Effect Size Indices in One-Way ANOVA. Behaviormetrika, 40(2), 129-147.
Breslow, N. (1970). A generalized Kruskal-Wallis test for comparing K samples subject to unequal patterns of censorship. Biometrika, 57(3), 579-594.
Fritz, C. O., Morris, P. E., & Richler, J. J. (2012). Effect Size Estimates: Current Use, Calculations, and Interpretation. Journal of Experimental Psychology: General, 141(1), 2–18. doi:10.1037/a0024338
data("pembrolizumab")
rm_compactsum(data = pembrolizumab, xvars = c("age",
"change_ctdna_group", "l_size", "pdl1"), grp = "sex", use_mean = "age",
digits = c("age" = 2, "l_size" = 3), digits.cat = 1, iqr = TRUE,
show.tests = TRUE)
#> <table class="table table" style="margin-left: auto; margin-right: auto; margin-left: auto; margin-right: auto;">
#> <thead>
#> <tr>
#> <th style="text-align:left;position: sticky; top:0; background-color: #FFFFFF;"> </th>
#> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Full Sample (n=94) </th>
#> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Female (n=58) </th>
#> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Male (n=36) </th>
#> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> p-value </th>
#> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Missing </th>
#> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> pTest </th>
#> </tr>
#> </thead>
#> <tbody>
#> <tr>
#> <td style="text-align:left;"> <span style="font-weight: bold;">Age at study entry</span> </td>
#> <td style="text-align:right;"> 57.86 (12.75) </td>
#> <td style="text-align:right;"> 56.95 (12.59) </td>
#> <td style="text-align:right;"> 59.32 (13.05) </td>
#> <td style="text-align:right;"> 0.39 </td>
#> <td style="text-align:right;"> 0 </td>
#> <td style="text-align:right;"> t-test </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> <span style="font-weight: bold;">Did ctDNA increase or decrease from baseline to cycle 3 - Increase from baseline</span> </td>
#> <td style="text-align:right;"> 40 (54.8%) </td>
#> <td style="text-align:right;"> 21 (52.5%) </td>
#> <td style="text-align:right;"> 19 (57.6%) </td>
#> <td style="text-align:right;"> 0.84 </td>
#> <td style="text-align:right;"> 21 </td>
#> <td style="text-align:right;"> ChiSq </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> <span style="font-weight: bold;">Target lesion size at baseline</span> </td>
#> <td style="text-align:right;"> 73.500 (49.250-108.750) </td>
#> <td style="text-align:right;"> 68.000 (44.250-97.750) </td>
#> <td style="text-align:right;"> 93.000 (65.500-121.000) </td>
#> <td style="text-align:right;"> 0.066 </td>
#> <td style="text-align:right;"> 0 </td>
#> <td style="text-align:right;"> Wilcoxon Rank Sum </td>
#> </tr>
#> <tr>
#> <td style="text-align:left;"> <span style="font-weight: bold;">PD L1 percent</span> </td>
#> <td style="text-align:right;"> 0.0 (0.0-10.0) </td>
#> <td style="text-align:right;"> 0.5 (0.0-13.8) </td>
#> <td style="text-align:right;"> 0.0 (0.0-4.5) </td>
#> <td style="text-align:right;"> 0.76 </td>
#> <td style="text-align:right;"> 1 </td>
#> <td style="text-align:right;"> Wilcoxon Rank Sum </td>
#> </tr>
#> </tbody>
#> </table>
# Other Examples (not run)
## Include the summary statistic in the variable column
#rm_compactsum(data = pembrolizumab, xvars = c("age",
#"change_ctdna_group"), grp = "sex", use_mean = "age", show.sumstats=TRUE)
## To show effect sizes
#rm_compactsum(data = pembrolizumab, xvars = c("age",
#"change_ctdna_group"), grp = "sex", use_mean = "age", digits = 2,
#effSize = TRUE, show.tests = TRUE)
## To return unformatted p-values
#rm_compactsum(data = pembrolizumab, xvars = c("l_size",
#"change_ctdna_group"), grp = "cohort", effSize = TRUE, unformattedp = TRUE)
## Using tidyselect
#pembrolizumab |> rm_compactsum(xvars = c(age, sex, pdl1), grp = cohort,
#effSize = TRUE)