Title: | Presentation-Ready Data Summary and Analytic Result Tables |
---|---|
Description: | Creates presentation-ready tables summarizing data sets, regression models, and more. The code to create the tables is concise and highly customizable. Data frames can be summarized with any function, e.g. mean(), median(), even user-written functions. Regression models are summarized and include the reference rows for categorical variables. Common regression models, such as logistic regression and Cox proportional hazards regression, are automatically identified and the tables are pre-filled with appropriate column headers. |
Authors: | Daniel D. Sjoberg [aut, cre], Joseph Larmarange [aut], Michael Curry [aut], Jessica Lavery [aut], Karissa Whiting [aut], Emily C. Zabor [aut], Xing Bai [ctb], Esther Drill [ctb], Jessica Flynn [ctb], Margie Hannum [ctb], Stephanie Lobaugh [ctb], Shannon Pileggi [ctb], Amy Tin [ctb], Gustavo Zapata Wainberg [ctb] |
Maintainer: | Daniel D. Sjoberg <[email protected]> |
License: | MIT + file LICENSE |
Version: | 2.0.3.9008 |
Built: | 2024-11-25 18:29:55 UTC |
Source: | https://github.com/ddsjoberg/gtsummary |
Add a new column with the confidence intervals for proportions, means, etc.
add_ci(x, ...)

## S3 method for class 'tbl_summary'
add_ci(
  x,
  method = list(all_continuous() ~ "t.test", all_categorical() ~ "wilson"),
  include = everything(),
  statistic = list(all_continuous() ~ "{conf.low}, {conf.high}",
    all_categorical() ~ "{conf.low}%, {conf.high}%"),
  conf.level = 0.95,
  style_fun = list(all_continuous() ~ label_style_sigfig(),
    all_categorical() ~ label_style_sigfig(scale = 100)),
  pattern = NULL,
  ...
)
x |
( |
... |
These dots are for future extensions and must be empty. |
method |
( |
include |
( |
statistic |
( |
conf.level |
(scalar |
style_fun |
( |
pattern |
( |
gtsummary table
Must be one of
"wilson", "wilson.no.correct" calculated via prop.test(correct = c(TRUE, FALSE)) for categorical variables
"exact" calculated via stats::binom.test() for categorical variables
"wald", "wald.no.correct" calculated via cardx::proportion_ci_wald(correct = c(TRUE, FALSE)) for categorical variables
"agresti.coull" calculated via cardx::proportion_ci_agresti_coull() for categorical variables
"jeffreys" calculated via cardx::proportion_ci_jeffreys() for categorical variables
"t.test" calculated via stats::t.test() for continuous variables
"wilcox.test" calculated via stats::wilcox.test() for continuous variables
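To make the method names above concrete, the sketch below computes a few of these intervals directly with the base R functions the list names. The counts and the simulated vector are made-up illustration data. The mapping of "wilson" to correct = TRUE and "wilson.no.correct" to correct = FALSE is read off the list above; the exact arguments add_ci() passes internally are an assumption.

```r
# Illustration data (hypothetical): 28 responders among 90 patients
x <- 28
n <- 90

# "wilson": Wilson score interval with continuity correction
ci_wilson <- stats::prop.test(x, n, conf.level = 0.95, correct = TRUE)$conf.int

# "wilson.no.correct": the same interval without the continuity correction
ci_wilson_nc <- stats::prop.test(x, n, conf.level = 0.95, correct = FALSE)$conf.int

# "exact": Clopper-Pearson interval
ci_exact <- stats::binom.test(x, n, conf.level = 0.95)$conf.int

# Continuous variable (simulated values, hypothetical)
set.seed(1)
y <- rnorm(50, mean = 10, sd = 2)

# "t.test": t-based interval for the mean
ci_t <- stats::t.test(y, conf.level = 0.95)$conf.int

# "wilcox.test": interval for the pseudomedian; conf.int = TRUE is required
ci_wilcox <- stats::wilcox.test(y, conf.int = TRUE, conf.level = 0.95)$conf.int

# Scale proportions by 100 to mirror the "{conf.low}%, {conf.high}%" statistic
round(100 * as.numeric(ci_wilson), 1)
```

The continuity-corrected interval is the wider of the two Wilson intervals, which is worth keeping in mind when choosing between "wilson" and "wilson.no.correct".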
# Example 1 ----------------------------------
trial |>
  tbl_summary(
    missing = "no",
    statistic = all_continuous() ~ "{mean} ({sd})",
    include = c(marker, response, trt)
  ) |>
  add_ci()

# Example 2 ----------------------------------
trial |>
  select(response, grade) |>
  tbl_summary(
    statistic = all_categorical() ~ "{p}%",
    missing = "no",
    include = c(response, grade)
  ) |>
  add_ci(pattern = "{stat} ({ci})") |>
  modify_footnote(everything() ~ NA)
Add a new column with the confidence intervals for proportions, means, etc.
## S3 method for class 'tbl_svysummary'
add_ci(
  x,
  method = list(all_continuous() ~ "svymean", all_categorical() ~ "svyprop.logit"),
  include = everything(),
  statistic = list(all_continuous() ~ "{conf.low}, {conf.high}",
    all_categorical() ~ "{conf.low}%, {conf.high}%"),
  conf.level = 0.95,
  style_fun = list(all_continuous() ~ label_style_sigfig(),
    all_categorical() ~ label_style_sigfig(scale = 100)),
  pattern = NULL,
  df = survey::degf(x$inputs$data),
  ...
)
x |
( |
method |
( |
include |
( |
statistic |
( |
conf.level |
(scalar |
style_fun |
( |
pattern |
( |
df |
( |
... |
These dots are for future extensions and must be empty. |
gtsummary table
Must be one of
"svyprop.logit", "svyprop.likelihood", "svyprop.asin", "svyprop.beta", "svyprop.mean", "svyprop.xlogit" calculated via survey::svyciprop() for categorical variables
"svymean" calculated via survey::svymean() for continuous variables
"svymedian.mean", "svymedian.beta", "svymedian.xlogit", "svymedian.asin", "svymedian.score" calculated via survey::svyquantile(quantiles = 0.5) for continuous variables
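As a rough illustration of the survey functions named in this list, the sketch below computes one interval of each kind on the apiclus1 data shipped with the survey package. It assumes a recent survey release in which svyquantile() accepts interval.type. The pairing of each method name with a particular method or interval.type value is inferred from the names, not confirmed by this page.

```r
library(survey)

# Example clustered design from the survey package
data(api, package = "survey")
design <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# "svyprop.logit": CI for a proportion, computed on the logit scale
prop_ci <- svyciprop(~I(stype == "E"), design, method = "logit", level = 0.95)
confint(prop_ci)

# "svymean": CI for a mean, using the design degrees of freedom
mean_ci <- confint(svymean(~api00, design), level = 0.95, df = degf(design))
mean_ci

# "svymedian.mean": CI for the median (quantiles = 0.5)
median_est <- svyquantile(~api00, design, quantiles = 0.5, interval.type = "mean")
median_est
```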
data(api, package = "survey")
survey::svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc) |>
  tbl_svysummary(
    by = "both",
    include = c(api00, stype),
    statistic = all_continuous() ~ "{mean} ({sd})"
  ) |>
  add_stat_label() |>
  add_ci(pattern = "{stat} (95% CI {ci})") |>
  modify_header(all_stat_cols() ~ "**{level}**") |>
  modify_spanning_header(all_stat_cols() ~ "**Survived**")
Adds difference to tables created by tbl_summary(). The difference between two groups (typically mean or rate difference) is added to the table along with the difference's confidence interval and a p-value (when applicable).
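The sketch below shows, in base R only, what a mean difference and a rate difference with their confidence intervals and p-values look like. The data are simulated, and the choice of Welch's t-test and a two-sample test of proportions is an assumption for illustration; it is not a statement of which tests add_difference() selects by default.

```r
# Simulated two-arm data (hypothetical, for illustration only)
set.seed(42)
arm <- rep(c("A", "B"), each = 100)
age <- c(rnorm(100, mean = 47, sd = 14), rnorm(100, mean = 49, sd = 14))
response <- c(rbinom(100, 1, 0.30), rbinom(100, 1, 0.45))

# Mean difference (A minus B) with CI and p-value from Welch's t-test
tt <- stats::t.test(age ~ arm, conf.level = 0.95)
mean_diff <- unname(tt$estimate[1] - tt$estimate[2])

# Rate difference (A minus B) with CI and p-value from a test of proportions
events <- as.vector(tapply(response, arm, sum))
totals <- as.vector(tapply(response, arm, length))
pt <- stats::prop.test(events, totals, conf.level = 0.95)
rate_diff <- unname(pt$estimate[1] - pt$estimate[2])

# One row per variable: difference, interval, p-value
data.frame(
  variable = c("age", "response"),
  difference = c(mean_diff, rate_diff),
  conf.low = c(tt$conf.int[1], pt$conf.int[1]),
  conf.high = c(tt$conf.int[2], pt$conf.int[2]),
  p.value = c(tt$p.value, pt$p.value)
)
```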
## S3 method for class 'tbl_summary'
add_difference(
  x,
  test = NULL,
  group = NULL,
  adj.vars = NULL,
  test.args = NULL,
  conf.level = 0.95,
  include = everything(),
  pvalue_fun = label_style_pvalue(digits = 1),
  estimate_fun = list(c(all_continuous(), all_categorical(FALSE)) ~ label_style_sigfig(),
    all_dichotomous() ~ label_style_sigfig(scale = 100, suffix = "%"),
    all_tests("smd") ~ label_style_sigfig()),
  ...
)
x |
( |
test |
( See below for details on default tests and ?tests for details on available tests and creating custom tests. |
group |
( |
adj.vars |
( |
test.args |
( |
conf.level |
( |
include |
( |
pvalue_fun |
( |
estimate_fun |
( |
... |
These dots are for future extensions and must be empty. |
a gtsummary table of class "tbl_summary"
# Example 1 ----------------------------------
trial |>
  select(trt, age, marker, response, death) |>
  tbl_summary(
    by = trt,
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_dichotomous() ~ "{p}%"
    ),
    missing = "no"
  ) |>
  add_n() |>
  add_difference()

# Example 2 ----------------------------------
# ANCOVA adjusted for grade and stage
trial |>
  select(trt, age, marker, grade, stage) |>
  tbl_summary(
    by = trt,
    statistic = list(all_continuous() ~ "{mean} ({sd})"),
    missing = "no",
    include = c(age, marker, trt)
  ) |>
  add_n() |>
  add_difference(adj.vars = c(grade, stage))
Adds difference to tables created by tbl_svysummary(). The difference between two groups (typically mean or rate difference) is added to the table along with the difference's confidence interval and a p-value (when applicable).
## S3 method for class 'tbl_svysummary'
add_difference(
  x,
  test = NULL,
  group = NULL,
  adj.vars = NULL,
  test.args = NULL,
  conf.level = 0.95,
  include = everything(),
  pvalue_fun = label_style_pvalue(digits = 1),
  estimate_fun = list(c(all_continuous(), all_categorical(FALSE)) ~ label_style_sigfig(),
    all_dichotomous() ~ label_style_sigfig(scale = 100, suffix = "%"),
    all_tests("smd") ~ label_style_sigfig()),
  ...
)
x |
( |
test |
( See below for details on default tests and ?tests for details on available tests and creating custom tests. |
group |
( |
adj.vars |
( |
test.args |
( |
conf.level |
( |
include |
( |
pvalue_fun |
( |
estimate_fun |
( |
... |
These dots are for future extensions and must be empty. |
a gtsummary table of class "tbl_summary"
Add model statistics returned from broom::glance(). Statistics can either be appended to the table (add_glance_table()), or added as a table source note (add_glance_source_note()).
add_glance_table(
  x,
  include = everything(),
  label = NULL,
  fmt_fun = list(everything() ~ label_style_sigfig(digits = 3),
    any_of("p.value") ~ label_style_pvalue(digits = 1),
    c(where(is.integer), starts_with("df")) ~ label_style_number()),
  glance_fun = glance_fun_s3(x$inputs$x)
)

add_glance_source_note(
  x,
  include = everything(),
  label = NULL,
  fmt_fun = list(everything() ~ label_style_sigfig(digits = 3),
    any_of("p.value") ~ label_style_pvalue(digits = 1),
    c(where(is.integer), starts_with("df")) ~ label_style_number()),
  glance_fun = glance_fun_s3(x$inputs$x),
  text_interpret = c("md", "html"),
  sep1 = " = ",
  sep2 = "; "
)
x |
( |
include |
( |
label |
( |
fmt_fun |
( |
glance_fun |
( |
text_interpret |
( |
sep1 |
( |
sep2 |
( |
gtsummary table
When combining add_glance_table() with tbl_merge(), the ordering of the model terms and the glance statistics may become jumbled. To re-order the rows with glance statistics on bottom, use the script below:
tbl_merge(list(tbl1, tbl2)) %>% modify_table_body(~.x %>% arrange(row_type == "glance_statistic"))
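The re-ordering script relies on sorting by a logical key: FALSE sorts before TRUE, so rows that are not glance statistics come first. The base R sketch below shows the same idea on a toy data frame. The rows and labels are hypothetical stand-ins for a merged table body, and order() is used here as an analogue of arrange().

```r
# Toy stand-in for a merged table body (hypothetical rows and labels)
table_body <- data.frame(
  row_type = c("label", "glance_statistic", "label", "level", "glance_statistic"),
  label = c("Age", "R-squared", "Grade", "II", "AIC")
)

# Sorting on the logical key puts FALSE (model terms) before TRUE (glance rows);
# rows with the same key keep their original relative order
reordered <- table_body[order(table_body$row_type == "glance_statistic"), ]
reordered$label
```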
mod <- lm(age ~ marker + grade, trial) |> tbl_regression()

# Example 1 ----------------------------------
mod |>
  add_glance_table(
    label = list(sigma = "\U03C3"),
    include = c(r.squared, AIC, sigma)
  )

# Example 2 ----------------------------------
mod |>
  add_glance_source_note(
    label = list(sigma = "\U03C3"),
    include = c(r.squared, AIC, sigma)
  )
This function uses car::Anova() (by default) to calculate global p-values for model covariates. Output from tbl_regression and tbl_uvregression objects is supported.
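A global p-value is a single test for a covariate as a whole, in contrast to one p-value per level of a categorical covariate. The sketch below illustrates the idea in base R on the built-in mtcars data by comparing nested models. It does not call car::Anova(); for a linear model with main effects only, this single-term F-test should agree with a type III test, but that equivalence is stated here as an assumption of the sketch.

```r
# Built-in data; cyl is treated as a three-level categorical covariate
dat <- transform(mtcars, cyl = factor(cyl))
full <- lm(mpg ~ wt + cyl, data = dat)

# Per-level p-values: one row for each non-reference level of cyl
summary(full)$coefficients

# Global p-value for cyl: compare the full model with the model that drops cyl
reduced <- update(full, . ~ . - cyl)
global_p_cyl <- anova(reduced, full)[2, "Pr(>F)"]
global_p_cyl

# drop1() runs the same single-term F-test for every covariate at once
drop1(full, test = "F")
```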
add_global_p(x, ...)

## S3 method for class 'tbl_regression'
add_global_p(
  x,
  include = everything(),
  keep = FALSE,
  anova_fun = global_pvalue_fun,
  type = "III",
  quiet,
  ...
)

## S3 method for class 'tbl_uvregression'
add_global_p(
  x,
  include = everything(),
  keep = FALSE,
  anova_fun = global_pvalue_fun,
  type = "III",
  quiet,
  ...
)
x |
( |
... |
Additional arguments to be passed to |
include |
( |
keep |
(scalar |
anova_fun |
( To pass a custom function, it must accept a model as its first argument.
Note that anything passed in |
type |
Type argument passed to |
quiet |
Daniel D. Sjoberg
# Example 1 ----------------------------------
lm(marker ~ age + grade, trial) |>
  tbl_regression() |>
  add_global_p()

# Example 2 ----------------------------------
trial[c("response", "age", "trt", "grade")] |>
  tbl_uvregression(
    method = glm,
    y = response,
    method.args = list(family = binomial),
    exponentiate = TRUE
  ) |>
  add_global_p()
Add N to regression table
## S3 method for class 'tbl_regression'
add_n(x, location = "label", ...)

## S3 method for class 'tbl_uvregression'
add_n(x, location = "label", ...)
x |
( |
location |
( When |
... |
These dots are for future extensions and must be empty. |
# Example 1 ----------------------------------
trial |>
  select(response, age, grade) |>
  tbl_uvregression(
    y = response,
    exponentiate = TRUE,
    method = glm,
    method.args = list(family = binomial),
    hide_n = TRUE
  ) |>
  add_n(location = "label")

# Example 2 ----------------------------------
glm(response ~ age + grade, trial, family = binomial) |>
  tbl_regression(exponentiate = TRUE) |>
  add_n(location = "level")
For each variable in a tbl_summary table, the add_n function adds a column with the total number of non-missing (or missing) observations.
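As a minimal base R sketch of the quantity being reported, the code below counts non-missing and missing observations per variable in a small made-up data frame. The data and the output column names are hypothetical; only the "{N_nonmiss}" placeholder comes from the usage shown on this page.

```r
# Hypothetical data frame with some missing values
dat <- data.frame(
  age = c(34, NA, 51, 47, NA, 62),
  grade = c("I", "II", NA, "III", "I", "II")
)

n_nonmiss <- colSums(!is.na(dat))  # analogous to the "{N_nonmiss}" statistic
n_miss <- colSums(is.na(dat))      # the corresponding missing counts

# One row per variable
data.frame(
  variable = names(dat),
  N_nonmiss = n_nonmiss,
  N_miss = n_miss,
  row.names = NULL
)
```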
## S3 method for class 'tbl_summary'
add_n(
  x,
  statistic = "{N_nonmiss}",
  col_label = "**N**",
  footnote = FALSE,
  last = FALSE,
  ...
)

## S3 method for class 'tbl_svysummary'
add_n(
  x,
  statistic = "{N_nonmiss}",
  col_label = "**N**",
  footnote = FALSE,
  last = FALSE,
  ...
)

## S3 method for class 'tbl_likert'
add_n(
  x,
  statistic = "{N_nonmiss}",
  col_label = "**N**",
  footnote = FALSE,
  last = FALSE,
  ...
)
x |
( |
statistic |
(
The argument uses |
col_label |
( |
footnote |
(scalar |
last |
(scalar |
... |
These dots are for future extensions and must be empty. |
A table of class c('tbl_summary', 'gtsummary')
Daniel D. Sjoberg
# Example 1 ----------------------------------
trial |>
  tbl_summary(by = trt, include = c(trt, age, grade, response)) |>
  add_n()

# Example 2 ----------------------------------
survey::svydesign(~1, data = as.data.frame(Titanic), weights = ~Freq) |>
  tbl_svysummary(by = Survived, percent = "row", include = c(Class, Age)) |>
  add_n()
For each survfit() object summarized with tbl_survfit(), this function will add the total number of observations in a new column.
## S3 method for class 'tbl_survfit'
add_n(x, ...)
x |
object of class " |
... |
Not used |
library(survival)
fit1 <- survfit(Surv(ttdeath, death) ~ 1, trial)
fit2 <- survfit(Surv(ttdeath, death) ~ trt, trial)

# Example 1 ----------------------------------
list(fit1, fit2) |>
  tbl_survfit(times = c(12, 24)) |>
  add_n()
Add event N
add_nevent(x, ...)

## S3 method for class 'tbl_regression'
add_nevent(x, location = "label", ...)

## S3 method for class 'tbl_uvregression'
add_nevent(x, location = "label", ...)
x |
( |
... |
These dots are for future extensions and must be empty. |
location |
( When |
# Example 1 ----------------------------------
trial |>
  select(response, trt, grade) |>
  tbl_uvregression(
    y = response,
    exponentiate = TRUE,
    method = glm,
    method.args = list(family = binomial)
  ) |>
  add_nevent()

# Example 2 ----------------------------------
glm(response ~ age + grade, trial, family = binomial) |>
  tbl_regression(exponentiate = TRUE) |>
  add_nevent(location = "level")
For each survfit() object summarized with tbl_survfit(), this function will add the total number of events observed in a new column.
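The totals described here can be read off a survfit() object directly. The sketch below uses the lung data from the survival package (not the trial data used elsewhere on this page) and sums the per-stratum sizes and per-time event counts. How gtsummary derives the same numbers internally is not shown on this page, so treat this as an illustration of the quantities only.

```r
library(survival)

# Built-in lung data: status == 2 marks a death
fit_overall <- survfit(Surv(time, status) ~ 1, data = lung)
fit_by_sex <- survfit(Surv(time, status) ~ sex, data = lung)

# Total observations and total events for each fit;
# for a stratified fit, n has one entry per stratum
totals <- data.frame(
  fit = c("overall", "by sex"),
  n = c(sum(fit_overall$n), sum(fit_by_sex$n)),
  nevent = c(sum(fit_overall$n.event), sum(fit_by_sex$n.event))
)
totals
```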
## S3 method for class 'tbl_survfit'
add_nevent(x, ...)
x |
object of class 'tbl_survfit' |
... |
Not used |
Other tbl_survfit tools: add_p.tbl_survfit()
library(survival)
fit1 <- survfit(Surv(ttdeath, death) ~ 1, trial)
fit2 <- survfit(Surv(ttdeath, death) ~ trt, trial)

# Example 1 ----------------------------------
list(fit1, fit2) |>
  tbl_survfit(times = c(12, 24)) |>
  add_n() |>
  add_nevent()
Adds a column with overall summary statistics to tables created by tbl_summary(), tbl_svysummary(), tbl_continuous() or tbl_custom_summary().
add_overall(x, ...)

## S3 method for class 'tbl_summary'
add_overall(
  x,
  last = FALSE,
  col_label = "**Overall** \nN = {style_number(N)}",
  statistic = NULL,
  digits = NULL,
  ...
)

## S3 method for class 'tbl_continuous'
add_overall(
  x,
  last = FALSE,
  col_label = "**Overall** \nN = {style_number(N)}",
  statistic = NULL,
  digits = NULL,
  ...
)

## S3 method for class 'tbl_svysummary'
add_overall(
  x,
  last = FALSE,
  col_label = "**Overall** \nN = {style_number(N)}",
  statistic = NULL,
  digits = NULL,
  ...
)

## S3 method for class 'tbl_custom_summary'
add_overall(
  x,
  last = FALSE,
  col_label = "**Overall** \nN = {style_number(N)}",
  statistic = NULL,
  digits = NULL,
  ...
)

## S3 method for class 'tbl_hierarchical'
add_overall(
  x,
  last = FALSE,
  col_label = "**Overall** \nN = {style_number(N)}",
  statistic = NULL,
  digits = NULL,
  ...
)

## S3 method for class 'tbl_hierarchical_count'
add_overall(
  x,
  last = FALSE,
  col_label = ifelse(rlang::is_empty(x$inputs$denominator), "**Overall**",
    "**Overall** \nN = {style_number(N)}"),
  statistic = NULL,
  digits = NULL,
  ...
)
x |
( |
... |
These dots are for future extensions and must be empty. |
last |
(scalar |
col_label |
( |
statistic |
( |
digits |
( |
A gtsummary of same class as x
Daniel D. Sjoberg
# Example 1 ----------------------------------
trial |>
  tbl_summary(include = c(age, grade), by = trt) |>
  add_overall()

# Example 2 ----------------------------------
trial |>
  tbl_summary(
    include = grade,
    by = trt,
    percent = "row",
    statistic = ~"{p}%",
    digits = ~1
  ) |>
  add_overall(
    last = TRUE,
    statistic = ~"{p}% (n={n})",
    digits = ~ c(1, 0)
  )

# Example 3 ----------------------------------
trial |>
  tbl_continuous(
    variable = age,
    by = trt,
    include = grade
  ) |>
  add_overall(last = TRUE)
Adds a column with overall summary statistics to tables created by tbl_ard_summary().
## S3 method for class 'tbl_ard_summary'
add_overall(
  x,
  cards,
  last = FALSE,
  col_label = "**Overall**",
  statistic = NULL,
  ...
)
x |
( |
cards |
( |
last |
(scalar |
col_label |
( |
statistic |
( |
... |
These dots are for future extensions and must be empty. |
A gtsummary of same class as x
Daniel D. Sjoberg
# Example 1 ----------------------------------
# build primary table
tbl <-
  cards::ard_stack(
    trial,
    .by = trt,
    cards::ard_continuous(variables = age),
    cards::ard_categorical(variables = grade),
    .missing = TRUE,
    .attributes = TRUE,
    .total_n = TRUE
  ) |>
  tbl_ard_summary(by = trt)

# create ARD with overall results
ard_overall <-
  cards::ard_stack(
    trial,
    cards::ard_continuous(variables = age),
    cards::ard_categorical(variables = grade),
    .missing = TRUE,
    .attributes = TRUE,
    .total_n = TRUE
  )

# add an overall column
tbl |>
  add_overall(cards = ard_overall)
Add p-values
## S3 method for class 'tbl_continuous'
add_p(
  x,
  test = NULL,
  pvalue_fun = label_style_pvalue(digits = 1),
  include = everything(),
  test.args = NULL,
  group = NULL,
  ...
)
x |
( |
test |
List of formulas specifying statistical tests to perform for each variable.
Default is two-way ANOVA when |
pvalue_fun |
( |
include |
( |
test.args |
( |
group |
( |
... |
These dots are for future extensions and must be empty. |
'tbl_continuous' object
# Example 1 ----------------------------------
trial |>
  tbl_continuous(variable = age, by = trt, include = grade) |>
  add_p(pvalue_fun = label_style_pvalue(digits = 2))

# Example 2 ----------------------------------
trial |>
  tbl_continuous(variable = age, include = grade) |>
  add_p(test = everything() ~ "kruskal.test")
Calculate and add a p-value comparing the two variables in the cross table. If missing levels are included in the tables, they are also included in p-value calculation.
## S3 method for class 'tbl_cross'
add_p(
  x,
  test = NULL,
  pvalue_fun = ifelse(source_note,
    label_style_pvalue(digits = 1, prepend_p = TRUE),
    label_style_pvalue(digits = 1)),
  source_note = FALSE,
  test.args = NULL,
  ...
)
x |
( |
test |
( |
pvalue_fun |
( |
source_note |
(scalar |
test.args |
(named |
... |
These dots are for future extensions and must be empty. |
Karissa Whiting, Daniel D. Sjoberg
# Example 1 ----------------------------------
trial |>
  tbl_cross(row = stage, col = trt) |>
  add_p()

# Example 2 ----------------------------------
trial |>
  tbl_cross(row = stage, col = trt) |>
  add_p(source_note = TRUE)
Adds p-values to tables created by tbl_summary() by comparing values across groups.
## S3 method for class 'tbl_summary'
add_p(
  x,
  test = NULL,
  pvalue_fun = label_style_pvalue(digits = 1),
  group = NULL,
  include = everything(),
  test.args = NULL,
  adj.vars = NULL,
  ...
)
x |
( |
test |
( See below for details on default tests and ?tests for details on available tests and creating custom tests. |
pvalue_fun |
( |
group |
( |
include |
( |
test.args |
( |
adj.vars |
( |
... |
These dots are for future extensions and must be empty. |
a gtsummary table of class "tbl_summary"
See the ?tests help file for details on available tests and creating custom tests. The ?tests help file also includes pseudo-code for each test to make clear precisely how each calculation is performed.
The default test used in add_p() primarily depends on these factors:
whether the variable is categorical/dichotomous vs continuous
number of levels in the tbl_summary(by) variable
whether the add_p(group) argument is specified
whether the add_p(adj.vars) argument is specified
Specified neither add_p(group) nor add_p(adj.vars)
"wilcox.test" when by variable has two levels and variable is continuous.
"kruskal.test" when by variable has more than two levels and variable is continuous.
"chisq.test.no.correct" for categorical variables with all expected cell counts >=5, and "fisher.test" for categorical variables with any expected cell count <5.
Specified add_p(group) and not add_p(adj.vars)
"lme4" when by variable has two levels for all summary types.
There is no default for grouped data when by variable has more than two levels. Users must create custom tests for this scenario.
Specified add_p(adj.vars) and not add_p(group)
"ancova" when variable is continuous and by variable has two levels.
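The rules for the case where neither group nor adj.vars is specified can be written down as a small decision function. The helper below, default_test_sketch(), is hypothetical and is not gtsummary's internal code. As a simplification it treats every numeric vector as continuous and every other vector as categorical, and it ignores the grouped and adjusted cases.

```r
# Hypothetical helper mirroring the listed rules for the case where
# neither `group` nor `adj.vars` is supplied; not gtsummary's internal code
default_test_sketch <- function(x, by) {
  by <- factor(by)
  if (is.numeric(x)) {
    # continuous variable: the choice depends on the number of by levels
    if (nlevels(by) == 2) "wilcox.test" else "kruskal.test"
  } else {
    # categorical variable: the choice depends on the expected cell counts
    expected <- suppressWarnings(
      stats::chisq.test(table(x, by), correct = FALSE)$expected
    )
    if (all(expected >= 5)) "chisq.test.no.correct" else "fisher.test"
  }
}

default_test_sketch(mtcars$mpg, mtcars$am)          # continuous, two groups
default_test_sketch(mtcars$mpg, mtcars$cyl)         # continuous, three groups
default_test_sketch(factor(mtcars$cyl), mtcars$am)  # sparse cross table
default_test_sketch(rep(c("a", "b"), each = 50), rep(c("u", "v"), times = 50))
```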
# Example 1 ----------------------------------
trial |>
  tbl_summary(by = trt, include = c(age, grade)) |>
  add_p()

# Example 2 ----------------------------------
trial |>
  select(trt, age, marker) |>
  tbl_summary(by = trt, missing = "no") |>
  add_p(
    # perform t-test for all variables
    test = everything() ~ "t.test",
    # assume equal variance in the t-test
    test.args = all_tests("t.test") ~ list(var.equal = TRUE)
  )
Calculate and add a p-value to stratified tbl_survfit() tables.
## S3 method for class 'tbl_survfit'
add_p(
  x,
  test = "logrank",
  test.args = NULL,
  pvalue_fun = label_style_pvalue(digits = 1),
  include = everything(),
  quiet,
  ...
)
x |
( |
test |
( |
test.args |
(named |
pvalue_fun |
( |
include |
( |
quiet |
|
... |
These dots are for future extensions and must be empty. |
The most common way to specify test= is by using a single string indicating the test name. However, if you need to specify different tests within the same table, the input is flexible using the list notation common throughout the gtsummary package. For example, the following code would call the log-rank test, and a second test of the G-rho family.
... |> add_p(test = list(trt ~ "logrank", grade ~ "survdiff"), test.args = grade ~ list(rho = 0.5))
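For orientation, the two tests named above can be reproduced directly with survival::survdiff(), where rho = 0 gives the log-rank test and other values of rho give other members of the G-rho family. The sketch uses the lung data from the survival package, not the trial data. The helper p_from_survdiff() is hypothetical and simply recomputes the p-value from the chi-squared statistic.

```r
library(survival)

# rho = 0 is the log-rank test; rho = 0.5 is another member of the G-rho family
logrank <- survdiff(Surv(time, status) ~ sex, data = lung, rho = 0)
grho <- survdiff(Surv(time, status) ~ sex, data = lung, rho = 0.5)

# Hypothetical helper: p-value from the chi-squared statistic,
# with degrees of freedom equal to the number of groups minus one
p_from_survdiff <- function(fit) {
  stats::pchisq(fit$chisq, df = length(fit$n) - 1, lower.tail = FALSE)
}

c(logrank = p_from_survdiff(logrank), grho_0.5 = p_from_survdiff(grho))
```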
To calculate the p-values, the formula is reconstructed from the call in the original survfit() object. When the survfit() object is created in a for loop, lapply(), or purrr::map() setting, the call may not reflect the true formula, which may result in an error or an incorrect calculation.
To ensure correct results, the call formula in survfit() must represent the formula that will be used in survival::survdiff(). If you utilize the tbl_survfit.data.frame() S3 method, this is handled for you.
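As a rough illustration of the caveat above, the sketch below uses `survival::lung` (not gtsummary internals) to show that a `survfit()` object built inside `lapply()` stores the unevaluated expression in its call rather than the formula itself. The `do.call()` workaround is an assumption for illustration, not something this page prescribes; the object names are hypothetical.

```r
library(survival)

vars <- c("sex", "ph.ecog")

# Fitted inside lapply(): the stored call keeps the unevaluated expression,
# so the formula cannot be recovered from the call alone.
fits_loop <- lapply(vars, function(v) {
  survfit(as.formula(paste("Surv(time, status) ~", v)), data = lung)
})
fits_loop[[1]]$call$formula

# Workaround sketch (an assumption, not the documented approach):
# do.call() places the evaluated formula in the call, and quote(lung)
# keeps the data argument as a symbol rather than the full data frame.
fits_ok <- lapply(vars, function(v) {
  f <- as.formula(paste("Surv(time, status) ~", v))
  do.call("survfit", list(formula = f, data = quote(lung)))
})
fits_ok[[1]]$call$formula
```

Inspecting `$call$formula` before passing the objects to `tbl_survfit()` is a quick way to check whether the call carries a usable formula.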
Other tbl_survfit tools:
add_nevent.tbl_survfit()
library(survival)

gts_survfit <-
  list(
    survfit(Surv(ttdeath, death) ~ grade, trial),
    survfit(Surv(ttdeath, death) ~ trt, trial)
  ) |>
  tbl_survfit(times = c(12, 24))

# Example 1 ----------------------------------
gts_survfit |>
  add_p()

# Example 2 ----------------------------------
# Pass `rho=` argument to `survdiff()`
gts_survfit |>
  add_p(test = "survdiff", test.args = list(rho = 0.5))
Adds p-values to tables created by tbl_svysummary()
by comparing values across groups.
## S3 method for class 'tbl_svysummary'
add_p(
  x,
  test = list(all_continuous() ~ "svy.wilcox.test", all_categorical() ~ "svy.chisq.test"),
  pvalue_fun = label_style_pvalue(digits = 1),
  include = everything(),
  test.args = NULL,
  ...
)
x |
( |
test |
( See below for details on default tests and ?tests for details on available tests and creating custom tests. |
pvalue_fun |
( |
include |
( |
test.args |
( |
... |
These dots are for future extensions and must be empty. |
a gtsummary table of class "tbl_svysummary"
# Example 1 ----------------------------------
# A simple weighted dataset
survey::svydesign(~1, data = as.data.frame(Titanic), weights = ~Freq) |>
  tbl_svysummary(by = Survived, include = c(Sex, Age)) |>
  add_p()

# A dataset with a complex design
data(api, package = "survey")
d_clust <- survey::svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Example 2 ----------------------------------
tbl_svysummary(d_clust, by = both, include = c(api00, api99)) |>
  add_p()

# Example 3 ----------------------------------
# change tests to svy t-test and Wald test
tbl_svysummary(d_clust, by = both, include = c(api00, api99, stype)) |>
  add_p(
    test = list(
      all_continuous() ~ "svy.t.test",
      all_categorical() ~ "svy.wald.test"
    )
  )
Adjustments to p-values are performed with stats::p.adjust()
.
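To make the adjustment concrete, here is a minimal base R sketch of what `stats::p.adjust()` does with the default `method = "fdr"`. The p-values are made up for illustration and do not come from any table on this page.

```r
# Hypothetical raw p-values, for illustration only
p <- c(0.01, 0.04, 0.03, 0.20)

# "fdr" is an alias for the Benjamini-Hochberg method ("BH")
q <- stats::p.adjust(p, method = "fdr")
q

# Other adjustment methods accepted by p.adjust()
stats::p.adjust.methods
```

Any of the method names in `stats::p.adjust.methods` should be usable as the `method=` argument, since the adjustment is delegated to `stats::p.adjust()`.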
add_q(x, method = "fdr", pvalue_fun = NULL, quiet = NULL)
x |
( |
method |
( |
pvalue_fun |
( |
quiet |
Daniel D. Sjoberg, Esther Drill
# Example 1 ----------------------------------
add_q_ex1 <-
  trial |>
  tbl_summary(by = trt, include = c(trt, age, grade, response)) |>
  add_p() |>
  add_q()

# Example 2 ----------------------------------
trial |>
  tbl_uvregression(
    y = response,
    include = c("trt", "age", "grade"),
    method = glm,
    method.args = list(family = binomial),
    exponentiate = TRUE
  ) |>
  add_global_p() |>
  add_q()
Add significance stars to estimates with small p-values
add_significance_stars( x, pattern = ifelse(inherits(x, c("tbl_regression", "tbl_uvregression")), "{estimate}{stars}", "{p.value}{stars}"), thresholds = c(0.001, 0.01, 0.05), hide_ci = TRUE, hide_p = inherits(x, c("tbl_regression", "tbl_uvregression")), hide_se = FALSE )
x |
( |
pattern |
( |
thresholds |
( |
hide_ci |
(scalar |
hide_p |
(scalar |
hide_se |
(scalar |
a 'gtsummary' table
tbl <-
  lm(time ~ ph.ecog + sex, survival::lung) |>
  tbl_regression(label = list(ph.ecog = "ECOG Score", sex = "Sex"))

# Example 1 ----------------------------------
tbl |>
  add_significance_stars(hide_ci = FALSE, hide_p = FALSE)

# Example 2 ----------------------------------
tbl |>
  add_significance_stars(
    pattern = "{estimate} ({conf.low}, {conf.high}){stars}",
    hide_ci = TRUE, hide_se = TRUE
  ) |>
  modify_header(estimate = "**Beta (95% CI)**") |>
  modify_footnote(estimate = "CI = Confidence Interval", abbreviation = TRUE)

# Example 3 ----------------------------------
# Use ' \n' to put a line break between beta and SE
tbl |>
  add_significance_stars(
    hide_se = TRUE,
    pattern = "{estimate}{stars} \n({std.error})"
  ) |>
  modify_header(estimate = "**Beta \n(SE)**") |>
  modify_footnote(estimate = "SE = Standard Error", abbreviation = TRUE) |>
  as_gt() |>
  gt::fmt_markdown(columns = everything()) |>
  gt::tab_style(
    style = "vertical-align:top",
    locations = gt::cells_body(columns = label)
  )

# Example 4 ----------------------------------
lm(marker ~ stage + grade, data = trial) |>
  tbl_regression() |>
  add_global_p() |>
  add_significance_stars(
    hide_p = FALSE,
    pattern = "{p.value}{stars}"
  )
The function allows a user to add a new column (or columns) of statistics to an
existing tbl_summary
, tbl_svysummary
, or tbl_continuous
object.
add_stat(x, fns, location = everything() ~ "label")
x |
( |
fns |
( |
location |
( |
A 'gtsummary' of the same class as the input
The returns from custom functions passed in fns= are required to follow a specified format. Each of these functions will execute on a single variable.
Each function must return a tibble or a vector. If a vector is returned, it will be converted to a tibble with one column and number of rows equal to the length of the vector.
When location='label'
, the returned statistic from the custom function
must be a tibble with one row. When location='level'
the tibble must have
the same number of rows as there are levels in the variable (excluding the
row for unknown values).
Each function may take the following arguments: foo(data, variable, by, tbl, ...)
data=
is the input data frame passed to tbl_summary()
variable=
is a string indicating the variable to perform the calculation on. This is the variable in the label column of the table.
by=
is a string indicating the by variable from tbl_summary()
, if present
tbl=
the original tbl_summary()
/tbl_svysummary()
object is also available to utilize
The user-defined function does not need to utilize each of these inputs. We encourage the user-defined function to accept ... because every argument will be passed to the function, even if the user's function does not use all of them, e.g. foo(data, variable, by, ...)
Use modify_header()
to update the column headers
Use modify_fmt_fun()
to update the functions that format the statistics
Use modify_footnote()
to add an explanatory footnote
If you return a tibble with column names p.value
or q.value
, default
p-value formatting will be applied, and you may take advantage of subsequent
p-value formatting functions, such as bold_p()
or add_q()
.
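The return-format rules above can be sketched for the `location = "level"` case, where the function must return one value per level of the variable. The helper name `n_per_level` is hypothetical; it returns a plain vector, which the text above says is converted to a one-column tibble.

```r
library(gtsummary)

# Hypothetical helper: returns one value per level of a categorical variable.
# table() drops NA by default, so unknown values are excluded.
n_per_level <- function(data, variable, ...) {
  as.integer(table(data[[variable]]))
}

trial |>
  tbl_summary(include = grade, missing = "no") |>
  add_stat(fns = grade ~ n_per_level, location = grade ~ "level") |>
  modify_header(add_stat_1 = "**N per level**")
```

Accepting `...` in the helper means the unused `by` and `tbl` arguments are silently absorbed.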
# Example 1 ----------------------------------
# fn returns t-test pvalue
my_ttest <- function(data, variable, by, ...) {
  t.test(data[[variable]] ~ as.factor(data[[by]]))$p.value
}

trial |>
  tbl_summary(
    by = trt,
    include = c(trt, age, marker),
    missing = "no"
  ) |>
  add_stat(fns = everything() ~ my_ttest) |>
  modify_header(add_stat_1 = "**p-value**", all_stat_cols() ~ "**{level}**")

# Example 2 ----------------------------------
# fn returns t-test test statistic and pvalue
my_ttest2 <- function(data, variable, by, ...) {
  t.test(data[[variable]] ~ as.factor(data[[by]])) |>
    broom::tidy() |>
    dplyr::mutate(
      stat = glue::glue("t={style_sigfig(statistic)}, {style_pvalue(p.value, prepend_p = TRUE)}")
    ) |>
    dplyr::pull(stat)
}

trial |>
  tbl_summary(
    by = trt,
    include = c(trt, age, marker),
    missing = "no"
  ) |>
  add_stat(fns = everything() ~ my_ttest2) |>
  modify_header(add_stat_1 = "**Treatment Comparison**")

# Example 3 ----------------------------------
# return test statistic and p-value in separate columns
my_ttest3 <- function(data, variable, by, ...) {
  t.test(data[[variable]] ~ as.factor(data[[by]])) |>
    broom::tidy() |>
    dplyr::select(statistic, p.value)
}

trial |>
  tbl_summary(
    by = trt,
    include = c(trt, age, marker),
    missing = "no"
  ) |>
  add_stat(fns = everything() ~ my_ttest3) |>
  modify_header(statistic = "**t-statistic**", p.value = "**p-value**") |>
  modify_fmt_fun(statistic = label_style_sigfig(), p.value = label_style_pvalue(digits = 2))
Adds or modifies labels describing the summary statistics presented for
each variable in a tbl_summary()
table.
add_stat_label(x, ...)

## S3 method for class 'tbl_summary'
add_stat_label(x, location = c("row", "column"), label = NULL, ...)

## S3 method for class 'tbl_svysummary'
add_stat_label(x, location = c("row", "column"), label = NULL, ...)

## S3 method for class 'tbl_ard_summary'
add_stat_label(x, location = c("row", "column"), label = NULL, ...)
x |
( |
... |
These dots are for future extensions and must be empty. |
location |
( |
label |
( |
A tbl_summary
or tbl_svysummary
object
When using add_stat_label(location='row')
with subsequent tbl_merge()
,
it's important to have somewhat of an understanding of the underlying
structure of the gtsummary table.
add_stat_label(location='row')
works by adding a new column called
"stat_label"
to x$table_body
. The "label"
and "stat_label"
columns are merged when the gtsummary table is printed.
The tbl_merge()
function merges on the "label"
column (among others),
which is typically the first column you see in a gtsummary table.
Therefore, when you want to merge a table that has run add_stat_label(location='row'), you need to match the "label" column values before the "stat_label" column is merged with it.
For example, the following two tables merge properly:
tbl1 <-
  trial |>
  select(age, grade) |>
  tbl_summary() |>
  add_stat_label()
tbl2 <-
  lm(marker ~ age + grade, trial) |>
  tbl_regression()

tbl_merge(list(tbl1, tbl2))
The addition of the new "stat_label" column requires a default label for categorical variables, which is "No. (%)". This can be changed to either desired text or left blank using NA_character_. The blank option is useful in the location="row" case to keep the output for categorical variables identical to what was produced without an add_stat_label() function call.
Daniel D. Sjoberg
tbl <-
  trial |>
  dplyr::select(trt, age, grade, response) |>
  tbl_summary(by = trt)

# Example 1 ----------------------------------
# Add statistic presented to the variable label row
tbl |>
  add_stat_label(
    # update default statistic label for continuous variables
    label = all_continuous() ~ "med. (iqr)"
  )

# Example 2 ----------------------------------
tbl |>
  add_stat_label(
    # add a new column with statistic labels
    location = "column"
  )

# Example 3 ----------------------------------
trial |>
  select(age, grade, trt) |>
  tbl_summary(
    by = trt,
    type = all_continuous() ~ "continuous2",
    statistic = all_continuous() ~ c("{median} ({p25}, {p75})", "{min} - {max}")
  ) |>
  add_stat_label(label = age ~ c("IQR", "Range"))
Add the variance inflation factor (VIF) or
generalized VIF (GVIF) to the regression table.
Function uses car::vif()
to calculate the VIF.
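For orientation, the sketch below calls `car::vif()` directly on the model used in the examples for this function. Because `grade` is a factor, `car::vif()` returns generalized VIFs with a degrees-of-freedom column and an adjusted column, `GVIF^(1/(2*Df))`. The object names are hypothetical, and how `add_vif()` maps these columns internally is not shown here.

```r
library(gtsummary)

# `grade` is a factor, so car::vif() reports generalized VIFs
mod <- lm(age ~ grade + marker, trial)
vif_raw <- car::vif(mod)
vif_raw
```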
add_vif(x, statistic = NULL, estimate_fun = label_style_sigfig(digits = 2))
x |
|
statistic |
|
estimate_fun |
Default is |
Review list, formula, and selector syntax used throughout gtsummary
# Example 1 ----------------------------------
lm(age ~ grade + marker, trial) |>
  tbl_regression() |>
  add_vif()

# Example 2 ----------------------------------
lm(age ~ grade + marker, trial) |>
  tbl_regression() |>
  add_vif(c("aGVIF", "df"))
Function converts a gtsummary object to a flextable object. A user can use this function if they wish to add customized formatting available via the flextable functions. The flextable output is particularly useful when combined with R markdown with Word output, since the gt package does not support Word.
as_flex_table(x, include = everything(), return_calls = FALSE, ...)
x |
( |
include |
Commands to include in output. Input may be a vector of
quoted or unquoted names. tidyselect and gtsummary select helper
functions are also accepted.
Default is |
return_calls |
Logical. Default is |
... |
Not used |
The as_flex_table()
function supports bold and italic markdown syntax in column headers
and spanning headers ('**'
and '_'
only).
Text wrapped in double stars ('**bold**'
) will be made bold, and text between single
underscores ('_italic_'
) will be made italic.
No other markdown syntax is supported and the double-star and underscore cannot be combined.
To further style your table, you may convert the table to flextable with
as_flex_table()
, then utilize any of the flextable functions.
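A minimal sketch of that workflow, assuming the flextable package is installed: convert the table, apply a couple of flextable functions, and write the result to Word. The object name and output path are placeholders.

```r
library(gtsummary)

ft <-
  trial |>
  tbl_summary(by = trt, include = c(age, grade)) |>
  as_flex_table() |>
  # flextable functions applied after conversion
  flextable::fontsize(size = 9, part = "all") |>
  flextable::autofit()

# Write to Word; the file path here is only a placeholder
flextable::save_as_docx(ft, path = file.path(tempdir(), "table1.docx"))
```

Styling applied after `as_flex_table()` lives on the flextable object only, so any gtsummary modifications need to happen before the conversion.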
A 'flextable' object
Daniel D. Sjoberg
trial |> select(trt, age, grade) |> tbl_summary(by = trt) |> add_p() |> as_flex_table()
Function converts a gtsummary object to a "gt_tbl"
object,
that is, a table created with gt::gt()
.
Function is used in the background when the results are printed or knit.
A user can use this function if they wish to add customized formatting
available via the gt package.
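As a small sketch of adding gt formatting after conversion, the example below appends a title and shrinks the font with two gt functions. The title text and object name are arbitrary choices for illustration.

```r
library(gtsummary)

gt_tbl <-
  trial |>
  tbl_summary(by = trt, include = c(age, grade)) |>
  as_gt() |>
  # gt functions applied to the converted table
  gt::tab_header(title = "Patient characteristics") |>
  gt::tab_options(table.font.size = "small")

gt_tbl
```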
as_gt(x, include = everything(), return_calls = FALSE, ...)
x |
( |
include |
Commands to include in output. Input may be a vector of
quoted or unquoted names. tidyselect and gtsummary select helper
functions are also accepted.
Default is |
return_calls |
Logical. Default is |
... |
Arguments passed on to |
A gt_tbl
object
As of 2024-08-15, line breaks (e.g. '\n'
) do not render properly for PDF output.
For now, these line breaks are stripped when rendering to PDF with Quarto and R markdown.
Daniel D. Sjoberg
# Example 1 ----------------------------------
trial |>
  tbl_summary(by = trt, include = c(age, grade, response)) |>
  as_gt()
Function converts a gtsummary object to a huxtable object. A user can use this function if they wish to add customized formatting available via the huxtable functions. The huxtable package supports output to PDF via LaTeX, as well as HTML and Word.
as_hux_table(
  x,
  include = everything(),
  return_calls = FALSE,
  strip_md_bold = FALSE
)

as_hux_xlsx(x, file, include = everything(), bold_header_rows = TRUE)
x |
( |
include |
Commands to include in output. Input may be a vector of
quoted or unquoted names. tidyselect and gtsummary select helper
functions are also accepted.
Default is |
return_calls |
Logical. Default is |
strip_md_bold |
|
file |
File path for the output. |
bold_header_rows |
(scalar |
A {huxtable} object
Use the as_hux_xlsx() function to save a copy of the table in an Excel file. The file is saved using huxtable::quick_xlsx().
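A short sketch of the Excel export, assuming the huxtable package and its Excel-writing dependency (openxlsx) are installed. The temporary file path is a placeholder.

```r
library(gtsummary)

# Placeholder output location
xlsx_path <- tempfile(fileext = ".xlsx")

trial |>
  tbl_summary(by = trt, include = c(age, grade)) |>
  as_hux_xlsx(file = xlsx_path)

# Confirm the file was written
file.exists(xlsx_path)
```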
David Hugh-Jones, Daniel D. Sjoberg
trial |> tbl_summary(by = trt, include = c(age, grade)) |> add_p() |> as_hux_table()
Output from knitr::kable() is less full-featured than summary tables produced with gt.
For example, kable summary tables do not include indentation, footnotes,
or spanning header rows.
Line breaks (\n
) are removed from column headers and table cells.
as_kable(x, ..., include = everything(), return_calls = FALSE)
x |
( |
... |
Additional arguments passed to |
include |
Commands to include in output. Input may be a vector of
quoted or unquoted names. tidyselect and gtsummary select helper
functions are also accepted.
Default is |
return_calls |
Logical. Default is |
Tip: To better distinguish variable labels and level labels when
indenting is not supported, try bold_labels()
or italicize_levels()
.
A knitr_kable
object
Daniel D. Sjoberg
trial |> tbl_summary(by = trt) |> bold_labels() |> as_kable()
Function converts a gtsummary object to a knitr_kable + kableExtra object.
This allows the customized formatting available via knitr::kable()
and {kableExtra}; as_kable_extra()
supports arguments in knitr::kable()
.
as_kable_extra()
output via gtsummary supports
bold and italic cells for table bodies. Users
are encouraged to leverage as_kable_extra()
for enhanced pdf printing; for html
output options there is better support via as_gt()
.
as_kable_extra( x, escape = FALSE, format = NULL, ..., include = everything(), addtl_fmt = TRUE, return_calls = FALSE )
x |
( |
format , escape , ...
|
arguments passed to |
include |
Commands to include in output. Input may be a vector of
quoted or unquoted names. tidyselect and gtsummary select helper
functions are also accepted.
Default is |
addtl_fmt |
logical indicating whether to include additional formatting.
Default is |
return_calls |
Logical. Default is |
A {kableExtra} table
This section shows options intended for use with output: pdf_document
in yaml of .Rmd
.
When the default values of as_kable_extra(escape = FALSE, addtl_fmt = TRUE)
are utilized, the following formatting occurs.
Markdown bold, italic, and underline syntax in the headers, spanning headers, caption, and footnote will be converted to escaped LaTeX code
Special characters in the table body, headers, spanning headers, caption,
and footnote will be escaped with .escape_latex()
or .escape_latex2()
The "\n"
symbol will be recognized as a line break in the table
headers, spanning headers, caption, and the table body
The "\n"
symbol is removed from the footnotes
To suppress these additional formats, set as_kable_extra(addtl_fmt = FALSE)
Additional styling is available with
kableExtra::kable_styling()
as shown in Example 2, which implements row
striping and repeated column headers in the presence of page breaks.
This section discusses options intended for use with output: html_document
in yaml of .Rmd
.
When the default values of as_kable_extra(escape = FALSE, addtl_fmt = TRUE)
are utilized, the following formatting occurs.
The default markdown syntax in the headers and spanning headers is removed
Special characters in the table body, headers, spanning headers, caption,
and footnote will be escaped with .escape_html()
The "\n"
symbol is removed from the footnotes
To suppress the additional formatting, set as_kable_extra(addtl_fmt = FALSE)
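For the HTML case, a minimal sketch (assuming kableExtra is installed) that requests HTML output explicitly and adds bootstrap styling might look like the following. The object name is hypothetical.

```r
library(gtsummary)

kbl_html <-
  trial |>
  tbl_summary(by = trt, include = c(age, stage)) |>
  bold_labels() |>
  # `format` is passed through to knitr::kable()
  as_kable_extra(format = "html") |>
  kableExtra::kable_styling(
    bootstrap_options = c("striped", "condensed"),
    full_width = FALSE
  )
```

In an R Markdown document the format is normally detected from the output type, so `format = "html"` is mostly useful when running interactively.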
Daniel D. Sjoberg
# basic gtsummary tbl to build upon
as_kable_extra_base <-
  trial |>
  tbl_summary(by = trt, include = c(age, stage)) |>
  bold_labels()

# Example 1 (PDF via LaTeX) ---------------------
# add linebreak in table header with '\n'
as_kable_extra_ex1_pdf <-
  as_kable_extra_base |>
  modify_header(all_stat_cols() ~ "**{level}** \n*N = {n}*") |>
  as_kable_extra()

# Example 2 (PDF via LaTeX) ---------------------
# additional styling in `knitr::kable()` and with
# call to `kableExtra::kable_styling()`
as_kable_extra_ex2_pdf <-
  as_kable_extra_base |>
  as_kable_extra(
    booktabs = TRUE,
    longtable = TRUE,
    linesep = ""
  ) |>
  kableExtra::kable_styling(
    position = "left",
    latex_options = c("striped", "repeat_header"),
    stripe_color = "gray!15"
  )
Function converts a gtsummary object to a tibble.
## S3 method for class 'gtsummary'
as_tibble(
  x,
  include = everything(),
  col_labels = TRUE,
  return_calls = FALSE,
  fmt_missing = FALSE,
  ...
)

## S3 method for class 'gtsummary'
as.data.frame(...)
x |
( |
include |
Commands to include in output. Input may be a vector of
quoted or unquoted names. tidyselect and gtsummary select helper
functions are also accepted.
Default is |
col_labels |
(scalar |
return_calls |
Logical. Default is |
fmt_missing |
(scalar |
... |
Arguments passed on to |
a tibble
Daniel D. Sjoberg
tbl <-
  trial |>
  tbl_summary(by = trt, include = c(age, grade, response))

as_tibble(tbl)

# without column labels
as_tibble(tbl, col_labels = FALSE)
Used to assign the default formatting for variables summarized with
tbl_summary()
.
assign_summary_digits(data, statistic, type, digits = NULL)
data |
( |
statistic |
( |
type |
( |
digits |
( |
a named list
assign_summary_digits( mtcars, statistic = list(mpg = "{mean}"), type = list(mpg = "continuous") )
Function inspects data and assigns a summary type when not specified
in the type
argument.
assign_summary_type(data, variables, value, type = NULL, cat_threshold = 10L)
data |
( |
variables |
( |
value |
( |
type |
( |
cat_threshold |
( |
named list
assign_summary_type( data = trial, variables = c("age", "grade", "response"), value = NULL )
This function is used to assign default tests for add_p()
and add_difference()
.
assign_tests(x, ...)

## S3 method for class 'tbl_summary'
assign_tests(
  x,
  include,
  by = x$inputs$by,
  test = NULL,
  group = NULL,
  adj.vars = NULL,
  summary_type = x$inputs$type,
  calling_fun = c("add_p", "add_difference"),
  ...
)

## S3 method for class 'tbl_svysummary'
assign_tests(
  x,
  include,
  by = x$inputs$by,
  test = NULL,
  group = NULL,
  adj.vars = NULL,
  summary_type = x$inputs$type,
  calling_fun = c("add_p", "add_difference"),
  ...
)

## S3 method for class 'tbl_continuous'
assign_tests(x, include, by, cont_variable, test = NULL, group = NULL, ...)

## S3 method for class 'tbl_survfit'
assign_tests(x, include, test = NULL, ...)
x |
( |
... |
Passed to |
include |
( |
by |
( |
test |
(named |
group |
( |
adj.vars |
( |
summary_type |
(named |
calling_fun |
( |
cont_variable |
( |
A table of class 'gtsummary'
trial |> tbl_summary( by = trt, include = c(age, stage) ) |> assign_tests(include = c("age", "stage"), calling_fun = "add_p")
Bold or italicize labels or levels in gtsummary tables
bold_labels(x)

italicize_labels(x)

bold_levels(x)

italicize_levels(x)

## S3 method for class 'gtsummary'
bold_labels(x)

## S3 method for class 'gtsummary'
bold_levels(x)

## S3 method for class 'gtsummary'
italicize_labels(x)

## S3 method for class 'gtsummary'
italicize_levels(x)

## S3 method for class 'tbl_cross'
bold_labels(x)

## S3 method for class 'tbl_cross'
bold_levels(x)

## S3 method for class 'tbl_cross'
italicize_labels(x)

## S3 method for class 'tbl_cross'
italicize_levels(x)
x |
( |
Functions return the same class of gtsummary object supplied
Daniel D. Sjoberg
# Example 1 ----------------------------------
tbl_summary(trial, include = c("trt", "age", "response")) |>
  bold_labels() |>
  bold_levels() |>
  italicize_labels() |>
  italicize_levels()
Bold values below a chosen threshold (e.g. <0.05) in a gtsummary table.
bold_p(x, t = 0.05, q = FALSE)
x |
( |
t |
(scalar |
q |
(scalar |
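A brief sketch of the `q=` argument, under the assumption that `q = TRUE` applies the bolding to the q-value column created by `add_q()` rather than to the p-value column. The threshold of 0.10 and the object name are arbitrary.

```r
library(gtsummary)

tbl_bold_q <-
  trial |>
  tbl_summary(by = trt, include = c(age, grade, response), missing = "no") |>
  add_p() |>
  # q-values must exist before they can be bolded
  add_q() |>
  bold_p(t = 0.10, q = TRUE)

tbl_bold_q
```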
Daniel D. Sjoberg, Esther Drill
# Example 1 ----------------------------------
trial |>
  tbl_summary(by = trt, include = c(response, marker, trt), missing = "no") |>
  add_p() |>
  bold_p(t = 0.1)

# Example 2 ----------------------------------
glm(response ~ trt + grade, trial, family = binomial(link = "logit")) |>
  tbl_regression(exponentiate = TRUE) |>
  bold_p(t = 0.65)
Bridge function for converting tbl_continuous()
cards to basic gtsummary objects.
This bridge function converts the 'cards' object to a format suitable to
pass to brdg_summary()
: no pier_*()
functions required.
brdg_continuous(cards, by = NULL, statistic, include, variable, type)
cards |
( |
by |
( |
statistic |
(named |
include |
( |
variable |
( |
type |
(named |
a gtsummary object
library(cards)

bind_ard(
  # the primary ARD with the results
  ard_continuous(trial, by = grade, variables = age),
  # add missing and attributes ARD
  ard_missing(trial, by = grade, variables = age),
  ard_attributes(trial, variables = c(grade, age))
) |>
  # adding the column name
  dplyr::mutate(
    gts_column = ifelse(!context %in% "attributes", "stat_0", NA_character_)
  ) |>
  brdg_continuous(
    variable = "age",
    include = "grade",
    statistic = list(grade = "{median} ({p25}, {p75})"),
    type = list(grade = "categorical")
  ) |>
  as_tibble()
Bridge function for converting tbl_hierarchical() (and similar) cards to basic gtsummary objects. All bridge functions begin with prefix brdg_*().
This file also contains helper functions for constructing the bridge, referred to as the piers (supports for a bridge), which begin with pier_*().
brdg_hierarchical(): The bridge function ingests an ARD data frame and returns a gtsummary table that includes .$table_body and a basic .$table_styling. The .$table_styling$header data frame includes the header statistics. Based on context, this function adds a column to the ARD data frame named "gts_column". This column is used during the reshaping in the pier_*() functions defining column names.
pier_*(): These functions accept a cards tibble and return a tibble that is a piece of the .$table_body. Typically these will be stacked to construct the final table body data frame. The ARD object passed here will have two primary parts: the calculated summary statistics and the attributes ARD. The attributes ARD is used for labeling. The ARD data frame passed to this function must include a "gts_column" column, which is added in brdg_hierarchical().
brdg_hierarchical( cards, variables, by, include, statistic, overall_row, count, is_ordered, label ) pier_summary_hierarchical(cards, variables, include, statistic)
cards |
( |
variables |
( |
by |
( |
include |
( |
statistic |
(named |
overall_row |
(scalar |
count |
(scalar |
is_ordered |
(scalar |
label |
(named |
a gtsummary object
Review list, formula, and selector syntax used throughout gtsummary
Bridge function for converting tbl_summary() (and similar) cards to basic gtsummary objects. All bridge functions begin with prefix brdg_*().
This file also contains helper functions for constructing the bridge, referred to as the piers (supports for a bridge), which begin with pier_*().
brdg_summary(): The bridge function ingests an ARD data frame and returns a gtsummary table that includes .$table_body and a basic .$table_styling. The .$table_styling$header data frame includes the header statistics. Based on context, this function adds a column to the ARD data frame named "gts_column". This column is used during the reshaping in the pier_*() functions defining column names.
pier_*(): These functions accept a cards tibble and return a tibble that is a piece of the .$table_body. Typically these will be stacked to construct the final table body data frame. The ARD object passed here will have two primary parts: the calculated summary statistics and the attributes ARD. The attributes ARD is used for labeling. The ARD data frame passed to this function must include a "gts_column" column, which is added in brdg_summary().
brdg_summary( cards, variables, type, statistic, by = NULL, missing = "no", missing_stat = "{N_miss}", missing_text = "Unknown" ) pier_summary_dichotomous(cards, variables, statistic) pier_summary_categorical(cards, variables, statistic) pier_summary_continuous2(cards, variables, statistic) pier_summary_continuous(cards, variables, statistic) pier_summary_missing_row( cards, variables, missing = "no", missing_stat = "{N_miss}", missing_text = "Unknown" )
cards |
( |
variables |
( |
type |
(named |
statistic |
(named |
by |
( |
missing , missing_text , missing_stat
|
Arguments dictating how and if missing values are presented:
|
a gtsummary object
library(cards)

# first build ARD data frame
cards <-
  ard_stack(
    mtcars,
    ard_continuous(variables = c("mpg", "hp")),
    ard_categorical(variables = "cyl"),
    ard_dichotomous(variables = "am"),
    .missing = TRUE,
    .attributes = TRUE
  ) |>
  # this column is used by the `pier_*()` functions
  dplyr::mutate(gts_column = ifelse(context == "attributes", NA, "stat_0"))

brdg_summary(
  cards = cards,
  variables = c("cyl", "am", "mpg", "hp"),
  type = list(
    cyl = "categorical",
    am = "dichotomous",
    mpg = "continuous",
    hp = "continuous2"
  ),
  statistic = list(
    cyl = "{n} / {N}",
    am = "{n} / {N}",
    mpg = "{mean} ({sd})",
    hp = c("{median} ({p25}, {p75})", "{mean} ({sd})")
  )
) |>
  as_tibble()

pier_summary_dichotomous(
  cards = cards,
  variables = "am",
  statistic = list(am = "{n} ({p})")
)

pier_summary_categorical(
  cards = cards,
  variables = "cyl",
  statistic = list(cyl = "{n} ({p})")
)

pier_summary_continuous2(
  cards = cards,
  variables = "hp",
  statistic = list(hp = c("{median}", "{mean}"))
)

pier_summary_continuous(
  cards = cards,
  variables = "mpg",
  statistic = list(mpg = "{median}")
)
Bridge function for converting tbl_wide_summary() (and similar) cards to basic gtsummary objects. All bridge functions begin with prefix brdg_*().
brdg_wide_summary(cards, variables, statistic, type)
cards |
( |
variables |
( |
statistic |
(named |
type |
(named |
a gtsummary object
library(cards)

bind_ard(
  ard_continuous(trial, variables = c(age, marker)),
  ard_attributes(trial, variables = c(age, marker))
) |>
  brdg_wide_summary(
    variables = c("age", "marker"),
    statistic = list(age = c("{mean}", "{sd}"), marker = c("{mean}", "{sd}")),
    type = list(age = "continuous", marker = "continuous")
  )
The function combines terms from a regression model, and replaces the terms with a single row in the output table. The p-value is calculated using stats::anova().
combine_terms(x, formula_update, label = NULL, quiet, ...)
x |
( |
formula_update |
( |
label |
( |
quiet |
|
... |
Additional arguments passed to stats::anova |
tbl_regression object
Daniel D. Sjoberg
# Example 1 ----------------------------------
# Logistic Regression Example, LRT p-value
glm(
  response ~ marker + I(marker^2) + grade,
  trial[c("response", "marker", "grade")] |> na.omit(), # keep complete cases only!
  family = binomial
) |>
  tbl_regression(label = grade ~ "Grade", exponentiate = TRUE) |>
  # collapse non-linear terms to a single row in output using anova
  combine_terms(
    formula_update = . ~ . - marker - I(marker^2),
    label = "Marker (non-linear terms)",
    test = "LRT"
  )
Collection of tidiers that can be utilized in gtsummary. See details below.
tidy_standardize( x, exponentiate = FALSE, conf.level = 0.95, conf.int = TRUE, ..., quiet = FALSE ) tidy_bootstrap( x, exponentiate = FALSE, conf.level = 0.95, conf.int = TRUE, ..., quiet = FALSE ) tidy_robust( x, exponentiate = FALSE, conf.level = 0.95, conf.int = TRUE, vcov = NULL, vcov_args = NULL, ..., quiet = FALSE ) pool_and_tidy_mice(x, pool.args = NULL, ..., quiet = FALSE) tidy_gam(x, conf.int = FALSE, exponentiate = FALSE, conf.level = 0.95, ...) tidy_wald_test(x, tidy_fun = NULL, vcov = stats::vcov(x), ...)
x |
( |
exponentiate |
(scalar |
conf.level |
(scalar |
conf.int |
(scalar |
... |
Arguments passed to method;
|
quiet |
|
vcov , vcov_args
|
|
pool.args |
(named |
tidy_fun |
( |
These tidiers are passed to tbl_regression() and tbl_uvregression() to obtain modified results.
tidy_standardize(): tidier to report standardized coefficients. The parameters package includes a function to estimate standardized coefficients. The tidier uses the output from parameters::standardize_parameters(), and merely takes the result and puts it in broom::tidy() format.
tidy_bootstrap(): tidier to report bootstrapped coefficients. The parameters package includes a function to estimate bootstrapped coefficients. The tidier uses the output from parameters::bootstrap_parameters(test = "p"), and merely takes the result and puts it in broom::tidy() format.
tidy_robust(): tidier to report robust standard errors, confidence intervals, and p-values. The parameters package includes a function to calculate robust standard errors, confidence intervals, and p-values. The tidier uses the output from parameters::model_parameters(), and merely takes the result and puts it in broom::tidy() format. To use this function with tbl_regression(), pass a function with the arguments for tidy_robust() populated.
pool_and_tidy_mice(): tidier to report models resulting from multiply imputed data using the mice package. Pass the mice model object before the model results have been pooled. See example.
tidy_wald_test(): tidier to report Wald p-values, wrapping the aod::wald.test() function.
Use this tidier with add_global_p(anova_fun = tidy_wald_test)
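The note on tidy_robust() above can be sketched as follows: wrap the tidier in an anonymous function that populates its arguments, then pass the wrapper to tbl_regression(). The value vcov = "HC3" is an assumption chosen for illustration (it is forwarded to parameters::model_parameters(), and the sketch assumes the parameters and sandwich packages are installed); it is not prescribed by the text above.

```r
library(gtsummary)

mod <- lm(age ~ marker + grade, trial)

# wrap tidy_robust() so that `vcov` is populated;
# "HC3" is an illustrative choice, not a recommendation
tbl_robust <-
  tbl_regression(
    mod,
    tidy_fun = \(x, ...) tidy_robust(x, vcov = "HC3", ...)
  )

tbl_robust
```

The same wrapping pattern is used for tidy_standardize() in Example 2 of this section.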
# Example 1 ----------------------------------
mod <- lm(age ~ marker + grade, trial)

tbl_stnd <- tbl_regression(mod, tidy_fun = tidy_standardize)
tbl <- tbl_regression(mod)

tidy_standardize_ex1 <-
  tbl_merge(
    list(tbl_stnd, tbl),
    tab_spanner = c("**Standardized Model**", "**Original Model**")
  )

# Example 2 ----------------------------------
# use "posthoc" method for coef calculation
tbl_regression(mod, tidy_fun = \(x, ...) tidy_standardize(x, method = "posthoc", ...))

# Example 3 ----------------------------------
# Multiple Imputation using the mice package
set.seed(1123)
pool_and_tidy_mice_ex3 <-
  suppressWarnings(mice::mice(trial, m = 2)) |>
  with(lm(age ~ marker + grade)) |>
  tbl_regression()
Extract the ARDs from a gtsummary table. If needed, results may be combined with cards::bind_ard().
gather_ard(x)
x |
( |
list
tbl_summary(trial, by = trt, include = age) |>
  add_overall() |>
  add_p() |>
  gather_ard()

glm(response ~ trt, data = trial, family = binomial()) |>
  tbl_regression() |>
  gather_ard()
Report statistics from summary tables inline
## S3 method for class 'gtsummary' inline_text(x, variable, level = NULL, column = NULL, pattern = NULL, ...)
x |
( |
variable |
( |
level |
( |
column |
( |
pattern |
( |
... |
These dots are for future extensions and must be empty. |
A string
Some gtsummary tables report multiple statistics in a single cell, e.g. "{mean} ({sd})" in tbl_summary() or tbl_svysummary(). We often need to report just the mean or the SD, and that can be accomplished by using both the column= and pattern= arguments. When both of these arguments are specified, the column argument selects the column to report statistics from, and the pattern argument specifies which statistics to report, e.g. inline_text(x, column = "stat_1", pattern = "{mean}") reports just the mean from a tbl_summary(). This is not supported for all tables.
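A minimal sketch of combining column= and pattern=, assuming a tbl_summary() of the trial data where age is summarized as "{mean} ({sd})". The object names tbl, mean_only, and sd_only are illustrative.

```r
library(gtsummary)

tbl <-
  trial |>
  tbl_summary(
    by = trt,
    include = age,
    statistic = all_continuous() ~ "{mean} ({sd})",
    missing = "no"
  )

# the cell in column "stat_1" reports "{mean} ({sd})";
# `pattern` picks out a single statistic from that cell
mean_only <- inline_text(tbl, variable = age, column = "stat_1", pattern = "{mean}")
sd_only <- inline_text(tbl, variable = age, column = "stat_1", pattern = "{sd}")

mean_only
sd_only
```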
Extracts and returns statistics from a tbl_continuous() object for inline reporting in an R markdown document. Detailed examples in the inline_text vignette.
## S3 method for class 'tbl_continuous' inline_text( x, variable, column = NULL, level = NULL, pattern = NULL, pvalue_fun = label_style_pvalue(prepend_p = TRUE), ... )
x |
( |
variable |
( |
column |
( |
level |
( |
pattern |
( |
pvalue_fun |
( |
... |
These dots are for future extensions and must be empty. |
A string reporting results from a gtsummary table
Daniel D. Sjoberg
t1 <- trial |>
  tbl_summary(by = trt, include = grade) |>
  add_p()

inline_text(t1, variable = grade, level = "I", column = "Drug A", pattern = "{n}/{N} ({p}%)")
inline_text(t1, variable = grade, column = "p.value")
Extracts and returns statistics from a tbl_cross object for inline reporting in an R markdown document. Detailed examples in the inline_text vignette.
## S3 method for class 'tbl_cross' inline_text( x, col_level, row_level = NULL, pvalue_fun = label_style_pvalue(prepend_p = TRUE), ... )
x |
( |
col_level |
( |
row_level |
( |
pvalue_fun |
( |
... |
These dots are for future extensions and must be empty. |
A string reporting results from a gtsummary table
tbl_cross <-
  tbl_cross(trial, row = trt, col = response) %>%
  add_p()

inline_text(tbl_cross, row_level = "Drug A", col_level = "1")
inline_text(tbl_cross, row_level = "Total", col_level = "1")
inline_text(tbl_cross, col_level = "p.value")
Takes an object with class tbl_regression and the location of the statistic to report, and returns statistics for reporting inline in an R markdown document. Detailed examples in the inline_text vignette.
## S3 method for class 'tbl_regression' inline_text( x, variable, level = NULL, pattern = "{estimate} ({conf.level*100}% CI {conf.low}, {conf.high}; {p.value})", estimate_fun = x$inputs$estimate_fun, pvalue_fun = label_style_pvalue(prepend_p = TRUE), ... )
x |
( |
variable |
( |
level |
( |
pattern |
( |
estimate_fun |
( |
pvalue_fun |
function to style p-values and/or q-values.
Default is |
... |
These dots are for future extensions and must be empty. |
A string reporting results from a gtsummary table
The following items (and more) are available to print. Use print(x$table_body) to print the table the estimates are extracted from.
{estimate}: coefficient estimate formatted with 'estimate_fun'
{conf.low}: lower limit of confidence interval formatted with 'estimate_fun'
{conf.high}: upper limit of confidence interval formatted with 'estimate_fun'
{p.value}: p-value formatted with 'pvalue_fun'
{N}: number of observations in model
{label}: variable/variable level label
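As a sketch of how the items listed above can be combined, the pattern= argument accepts a glue-style string built from them. The wording of the pattern below ("OR", "to") and the object names are illustrative choices, not defaults.

```r
library(gtsummary)

tbl <-
  glm(response ~ age + grade, trial, family = binomial) |>
  tbl_regression(exponentiate = TRUE)

# build a custom sentence fragment from the available items
or_text <-
  inline_text(
    tbl,
    variable = age,
    pattern = "OR {estimate} (95% CI {conf.low} to {conf.high})"
  )

or_text
```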
Daniel D. Sjoberg
inline_text_ex1 <-
  glm(response ~ age + grade, trial, family = binomial(link = "logit")) %>%
  tbl_regression(exponentiate = TRUE)

inline_text(inline_text_ex1, variable = age)
inline_text(inline_text_ex1, variable = grade, level = "III")
Extracts and returns statistics from a tbl_summary() object for inline reporting in an R markdown document. Detailed examples in the inline_text vignette.
## S3 method for class 'tbl_summary' inline_text( x, variable, column = NULL, level = NULL, pattern = NULL, pvalue_fun = label_style_pvalue(prepend_p = TRUE), ... ) ## S3 method for class 'tbl_svysummary' inline_text( x, variable, column = NULL, level = NULL, pattern = NULL, pvalue_fun = label_style_pvalue(prepend_p = TRUE), ... )
x |
( |
variable |
( |
column |
( |
level |
( |
pattern |
( |
pvalue_fun |
( |
... |
These dots are for future extensions and must be empty. |
A string reporting results from a gtsummary table
Daniel D. Sjoberg
t1 <- trial |>
  tbl_summary(by = trt, include = grade) |>
  add_p()

inline_text(t1, variable = grade, level = "I", column = "Drug A", pattern = "{n}/{N} ({p}%)")
inline_text(t1, variable = grade, column = "p.value")
Extracts and returns statistics from a tbl_survfit object for inline reporting in an R markdown document. Detailed examples in the inline_text vignette.
## S3 method for class 'tbl_survfit' inline_text( x, variable = NULL, level = NULL, pattern = NULL, time = NULL, prob = NULL, column = NULL, estimate_fun = x$inputs$estimate_fun, pvalue_fun = label_style_pvalue(prepend_p = TRUE), ... )
x |
( |
variable |
( |
level |
( |
pattern |
( |
time , prob
|
( |
column |
( |
estimate_fun |
( |
pvalue_fun |
( |
... |
These dots are for future extensions and must be empty. |
A string reporting results from a gtsummary table
Daniel D. Sjoberg
library(survival)

# fit survfit
fit1 <- survfit(Surv(ttdeath, death) ~ trt, trial)
fit2 <- survfit(Surv(ttdeath, death) ~ 1, trial)

# summarize survfit objects
tbl1 <-
  tbl_survfit(
    fit1,
    times = c(12, 24),
    label = ~"Treatment",
    label_header = "**{time} Month**"
  ) %>%
  add_p()

tbl2 <-
  tbl_survfit(
    fit2,
    probs = 0.5,
    label_header = "**Median Survival**"
  )

# report results inline
inline_text(tbl1, time = 24, level = "Drug B")
inline_text(tbl1, time = 24, level = "Drug B", pattern = "{estimate} [95% CI {conf.low}, {conf.high}]")
inline_text(tbl1, column = p.value)
inline_text(tbl2, prob = 0.5)
Extracts and returns statistics from a table created by the tbl_uvregression function for inline reporting in an R markdown document. Detailed examples in the inline_text vignette.
## S3 method for class 'tbl_uvregression' inline_text( x, variable, level = NULL, pattern = "{estimate} ({conf.level*100}% CI {conf.low}, {conf.high}; {p.value})", estimate_fun = x$inputs$estimate_fun, pvalue_fun = label_style_pvalue(prepend_p = TRUE), ... )
x |
( |
variable |
( |
level |
( |
pattern |
( |
estimate_fun |
( |
pvalue_fun |
function to style p-values and/or q-values.
Default is |
... |
These dots are for future extensions and must be empty. |
A string reporting results from a gtsummary table
The following items (and more) are available to print. Use print(x$table_body) to print the table the estimates are extracted from.
{estimate}: coefficient estimate formatted with 'estimate_fun'
{conf.low}: lower limit of confidence interval formatted with 'estimate_fun'
{conf.high}: upper limit of confidence interval formatted with 'estimate_fun'
{p.value}: p-value formatted with 'pvalue_fun'
{N}: number of observations in model
{label}: variable/variable level label
Daniel D. Sjoberg
inline_text_ex1 <-
  trial[c("response", "age", "grade")] %>%
  tbl_uvregression(
    method = glm,
    method.args = list(family = binomial),
    y = response,
    exponentiate = TRUE
  )

inline_text(inline_text_ex1, variable = age)
inline_text(inline_text_ex1, variable = grade, level = "III")
Similar to the style_*() family of functions, but these functions return a style_*() function rather than performing the styling.
label_style_number( digits = 0, big.mark = ifelse(decimal.mark == ",", " ", ","), decimal.mark = getOption("OutDec"), scale = 1, prefix = "", suffix = "", ... ) label_style_sigfig( digits = 2, scale = 1, big.mark = ifelse(decimal.mark == ",", " ", ","), decimal.mark = getOption("OutDec"), prefix = "", suffix = "", ... ) label_style_pvalue( digits = 1, prepend_p = FALSE, big.mark = ifelse(decimal.mark == ",", " ", ","), decimal.mark = getOption("OutDec"), ... ) label_style_ratio( digits = 2, big.mark = ifelse(decimal.mark == ",", " ", ","), decimal.mark = getOption("OutDec"), prefix = "", suffix = "", ... ) label_style_percent( prefix = "", suffix = "", digits = 0, big.mark = ifelse(decimal.mark == ",", " ", ","), decimal.mark = getOption("OutDec"), ... )
digits , big.mark , decimal.mark , scale , prepend_p , prefix , suffix , ...
|
arguments
passed to the |
a function
Other style tools:
style_sigfig()
my_style <- label_style_number(digits = 1)
my_style(3.14)
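To illustrate the "returns a function" behavior described above, the sketch below builds a formatter once, applies it to values, and compares it with the equivalent direct style_number() call. The name fmt_pct and the chosen arguments are illustrative.

```r
library(gtsummary)

# build a reusable formatter: one decimal place, scaled by 100, "%" appended
fmt_pct <- label_style_number(digits = 1, scale = 100, suffix = "%")

# calling the returned function performs the styling
fmt_pct(c(0.2534, 0.5))

# the equivalent direct call with the style_*() counterpart
style_number(c(0.2534, 0.5), digits = 1, scale = 100, suffix = "%")

# the returned function can be passed wherever a styling function is accepted
trial |>
  tbl_summary(by = trt, include = age) |>
  add_p(pvalue_fun = label_style_pvalue(digits = 2))
```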
These functions assist with modifying the aesthetics/style of a table.
modify_header(): update column headers
modify_footnote(): update/add table footnotes
modify_spanning_header(): update/add spanning headers
The functions often require users to know the underlying column names. Run show_header_names() to print the column names to the console.
modify_header(x, ..., text_interpret = c("md", "html"), quiet, update) modify_footnote( x, ..., abbreviation = FALSE, text_interpret = c("md", "html"), update, quiet ) modify_spanning_header(x, ..., text_interpret = c("md", "html"), quiet, update) show_header_names(x, include_example, quiet)
x |
( |
... |
Use Use the |
text_interpret |
( |
update , quiet
|
|
abbreviation |
(scalar |
include_example |
Updated gtsummary object
tbl_summary(), tbl_svysummary(), and tbl_cross()
When assigning column headers, footnotes, and spanning headers, you may use {N} to insert the number of observations. tbl_svysummary objects additionally have {N_unweighted} available.
When there is a stratifying by= argument present, the following fields are additionally available to stratifying columns: {level}, {n}, and {p} ({n_unweighted} and {p_unweighted} for tbl_svysummary objects).
Syntax follows glue::glue(), e.g. all_stat_cols() ~ "**{level}**, N = {n}".
When assigning column headers for tbl_regression tables, you may use {N} to insert the number of observations, and {N_event} for the number of events (when applicable).
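A brief sketch of the glue fields described above, assuming a tbl_summary() stratified by trt. The header wording and the object names (tbl, hdr) are illustrative; the header data frame is inspected directly only to show the filled-in labels.

```r
library(gtsummary)

tbl <-
  trial |>
  tbl_summary(by = trt, include = age, missing = "no") |>
  # {N} is the overall number of observations
  modify_header(label = "**Characteristic** (N = {N})") |>
  # {level}, {n}, and {p} are available for the stratifying columns
  modify_header(all_stat_cols() ~ "**{level}**, n = {n} ({style_percent(p)}%)")

# inspect the header labels after the glue fields have been filled in
hdr <- tbl$table_styling$header
hdr[hdr$column %in% c("label", "stat_1", "stat_2"), c("column", "label")]
```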
Daniel D. Sjoberg
# create summary table
tbl <-
  trial |>
  tbl_summary(by = trt, missing = "no", include = c("age", "grade", "trt")) |>
  add_p()

# print the column names that can be modified
show_header_names(tbl)

# Example 1 ----------------------------------
# updating column headers and footnote
tbl |>
  modify_header(label = "**Variable**", p.value = "**P**") |>
  modify_footnote(all_stat_cols() ~ "median (IQR) for Age; n (%) for Grade")

# Example 2 ----------------------------------
# updating headers, remove all footnotes, add spanning header
tbl |>
  modify_header(all_stat_cols() ~ "**{level}**, N = {n} ({style_percent(p)}%)") |>
  modify_footnote(everything() ~ NA) |>
  modify_spanning_header(all_stat_cols() ~ "**Treatment Received**")

# Example 3 ----------------------------------
# updating an abbreviation in table footnote
glm(response ~ age + grade, trial, family = binomial) |>
  tbl_regression(exponentiate = TRUE) |>
  modify_footnote(conf.low = "CI = Credible Interval", abbreviation = TRUE)
Update column alignment/justification in a gtsummary table.
modify_column_alignment(x, columns, align = c("left", "right", "center"))
x |
( |
columns |
( |
align |
( |
# Example 1 ----------------------------------
lm(age ~ marker + grade, trial) %>%
  tbl_regression() %>%
  modify_column_alignment(columns = everything(), align = "left")
Use these functions to hide or unhide columns in a gtsummary table. Use show_header_names(show_hidden=TRUE) to print available columns to update.
modify_column_hide(x, columns) modify_column_unhide(x, columns)
x |
( |
columns |
( |
Daniel D. Sjoberg
# Example 1 ----------------------------------
# hide 95% CI, and replace with standard error
lm(age ~ marker + grade, trial) |>
  tbl_regression() |>
  modify_column_hide(conf.low) |>
  modify_column_unhide(columns = std.error)
Add, increase, or reduce indentation for columns.
modify_column_indent(x, columns, rows = NULL, indent = 4L, double_indent, undo)
x |
( |
columns |
( |
rows |
(predicate |
indent |
( |
double_indent , undo
|
a gtsummary table
Other Advanced modifiers: modify_column_merge(), modify_table_styling()
# remove indentation from `tbl_summary()`
trial |>
  tbl_summary(include = grade) |>
  modify_column_indent(columns = label, indent = 0L)

# increase indentation in `tbl_summary`
trial |>
  tbl_summary(include = grade) |>
  modify_column_indent(columns = label, rows = !row_type %in% 'label', indent = 8L)
Merge two or more columns in a gtsummary table.
Use show_header_names()
to print underlying column names.
modify_column_merge(x, pattern, rows = NULL)
x |
( |
pattern |
glue syntax string indicating how to merge columns in
|
rows |
(predicate |
gtsummary table
Calling this function merely records the instructions to merge columns.
The actual merging occurs when the gtsummary table is printed or converted
with a function like as_gt()
.
Because the column merging is delayed, it is recommended to perform
major modifications to the table, such as those with tbl_merge()
and
tbl_stack()
, before assigning merging instructions. Otherwise,
unexpected formatting may occur in the final table.
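The ordering advice above can be sketched as follows. This is a minimal illustration rather than a prescribed workflow: the two models are arbitrary, and the `_1` suffix on the merged column names is an assumption about how tbl_merge() renames columns, which show_header_names() can confirm for a given table.

```r
library(gtsummary)

# two illustrative models fit to the `trial` data shipped with gtsummary
t1 <- lm(marker ~ age + grade, trial) |> tbl_regression()
t2 <- lm(ttdeath ~ age + grade, trial) |> tbl_regression()

# perform the major modification (merging the tables) first ...
tbl <- tbl_merge(list(t1, t2))

# ... then record the column-merging instruction for the first model.
# The `_1` suffix is an assumption; run show_header_names(tbl) to check
# the column names in your own table.
tbl <- tbl |>
  modify_column_merge(
    pattern = "{estimate_1} ({conf.low_1}, {conf.high_1})",
    rows = !is.na(estimate_1)
  )
```

Because the instruction is only recorded at this point, nothing is merged until the table is printed or converted.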
If this functionality is used in conjunction with tbl_stack()
(which
includes tbl_uvregression()
), there may be potential issues with printing.
When columns are stacked AND when the column-merging is
defined with a quosure, you may run into issues due to the loss of the
environment when 2 or more quosures are combined. If the expression
version of the quosure is the same as the quosure (i.e. no evaluated
objects), there should be no issues.
This function is used internally with care, and it is not recommended for users.
There are planned updates to the implementation of this function
with respect to the pattern=
argument.
Currently, this function replaces a numeric column with a
formatted character column following pattern=
.
Once gt::cols_merge()
gains the rows=
argument the
implementation will be updated to use it, which will keep
numeric columns numeric. For the vast majority of users,
the planned change will go unnoticed.
Other Advanced modifiers:
modify_column_indent()
,
modify_table_styling()
# Example 1 ----------------------------------
trial |>
  tbl_summary(by = trt, missing = "no", include = c(age, marker, trt)) |>
  add_p(all_continuous() ~ "t.test", pvalue_fun = label_style_pvalue(prepend_p = TRUE)) |>
  modify_fmt_fun(statistic ~ label_style_sigfig()) |>
  modify_column_merge(pattern = "t = {statistic}; {p.value}") |>
  modify_header(statistic = "**t-test**")

# Example 2 ----------------------------------
lm(marker ~ age + grade, trial) |>
  tbl_regression() |>
  modify_column_merge(
    pattern = "{estimate} ({conf.low}, {conf.high})",
    rows = !is.na(estimate)
  )
Use this function to update the way numeric columns and rows of .$table_body
are formatted.
modify_fmt_fun(x, ..., rows = NULL, update, quiet)
x |
( |
... |
Use Use the |
rows |
(predicate |
update , quiet
|
The rows argument accepts a predicate expression that is used to specify
rows to apply formatting. The expression must evaluate to a logical when
evaluated in x$table_body
. For example, to apply formatting to the age rows
pass rows = variable == "age"
. A vector of row numbers is NOT acceptable.
A couple of things to note when using the rows
argument.
You can use saved objects to create the predicate argument, e.g.
rows = variable == letters[1]
.
The saved object cannot share a name with a column in x$table_body
.
The reason for this is that in tbl_merge()
the columns are renamed,
and the renaming process cannot disambiguate the variable
column from
an external object named variable
in the following expression
rows = .data$variable == .env$variable
.
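As a small sketch of the rows predicate used with a saved object: the name `target_var` below is hypothetical, picked only because it does not match any column of x$table_body, and the model is the same one used in the example for this function.

```r
library(gtsummary)

# a saved object used inside the predicate; `target_var` is a hypothetical
# name chosen so that it does not clash with a column of x$table_body
target_var <- "grade"

tbl <- lm(age ~ marker + grade, trial) |>
  tbl_regression() |>
  modify_fmt_fun(
    estimate ~ label_style_sigfig(digits = 4),
    rows = variable == target_var
  )

# the predicate is evaluated against the table body and yields one
# logical value per row
selected <- tbl$table_body$variable == target_var
```

Only the rows where `selected` is TRUE receive the new formatting function.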
# Example 1 ----------------------------------
# show 'grade' p-values to 3 decimal places and estimates to 4 sig figs
lm(age ~ marker + grade, trial) |>
  tbl_regression() |>
  modify_fmt_fun(
    p.value = label_style_pvalue(digits = 3),
    c(estimate, conf.low, conf.high) ~ label_style_sigfig(digits = 4),
    rows = variable == "grade"
  )
Add and remove source notes from a table. Source notes are similar to footnotes, except they are not linked to a cell in the table.
modify_source_note(x, source_note, text_interpret = c("md", "html"))

remove_source_note(x, source_note_id)
x |
( |
source_note |
( |
text_interpret |
( |
source_note_id |
( |
Source notes are not supported by as_kable_extra()
.
gtsummary object
This function is for advanced manipulation of gtsummary tables.
It allows users to modify the .$table_body
data frame included
in each gtsummary object.
If a new column is added to the table, default printing instructions will then
be added to .$table_styling
. By default, columns are hidden.
To show a column, add a column header with modify_header()
or call
modify_column_unhide()
.
modify_table_body(x, fun, ...)
x |
( |
fun |
( |
... |
Additional arguments passed on to the function |
A 'gtsummary' object
# Example 1 --------------------------------
# Add number of cases and controls to regression table
trial |>
  tbl_uvregression(
    y = response,
    include = c(age, marker),
    method = glm,
    method.args = list(family = binomial),
    exponentiate = TRUE,
    hide_n = TRUE
  ) |>
  # adding number of non-events to table
  modify_table_body(
    ~ .x |>
      dplyr::mutate(N_nonevent = N_obs - N_event) |>
      dplyr::relocate(c(N_event, N_nonevent), .before = estimate)
  ) |>
  # assigning header labels
  modify_header(N_nonevent = "**Control N**", N_event = "**Case N**") |>
  modify_fmt_fun(c(N_event, N_nonevent) ~ style_number)
This is a function meant for advanced users to gain
more control over the characteristics of the resulting
gtsummary table by directly modifying .$table_styling
.
This function is primarily used in the development of other gtsummary
functions, and very little checking of the passed arguments is performed.
modify_table_styling( x, columns, rows = NULL, label = NULL, spanning_header = NULL, hide = NULL, footnote = NULL, footnote_abbrev = NULL, align = NULL, missing_symbol = NULL, fmt_fun = NULL, text_format = NULL, undo_text_format = NULL, indent = NULL, text_interpret = c("md", "html"), cols_merge_pattern = NULL )
x |
( |
columns |
( |
rows |
(predicate |
label |
( |
spanning_header |
( |
hide |
(scalar |
footnote |
( |
footnote_abbrev |
( |
align |
( |
missing_symbol |
( |
fmt_fun |
( |
text_format , undo_text_format
|
( |
indent |
( |
text_interpret |
( |
cols_merge_pattern |
( |
Review the
gtsummary definition
vignette for information on .$table_styling
objects.
The rows argument accepts a predicate expression that is used to specify
rows to apply formatting. The expression must evaluate to a logical when
evaluated in x$table_body
. For example, to apply formatting to the age rows
pass rows = variable == "age"
. A vector of row numbers is NOT acceptable.
A couple of things to note when using the rows
argument.
You can use saved objects to create the predicate argument, e.g.
rows = variable == letters[1]
.
The saved object cannot share a name with a column in x$table_body
.
The reason for this is that in tbl_merge()
the columns are renamed,
and the renaming process cannot disambiguate the variable
column from
an external object named variable
in the following expression
rows = .data$variable == .env$variable
.
There are planned updates to the implementation of column merging.
Currently, this function replaces the numeric column with a
formatted character column following cols_merge_pattern=
.
Once gt::cols_merge()
gains the rows=
argument the
implementation will be updated to use it, which will keep
numeric columns numeric. For the vast majority of users,
the planned change will go unnoticed.
If this functionality is used in conjunction with tbl_stack()
(which
includes tbl_uvregression()
), there may be potential issues with printing.
When columns are stacked AND when the column-merging is
defined with a quosure, you may run into issues due to the loss of the
environment when 2 or more quosures are combined. If the expression
version of the quosure is the same as the quosure (i.e. no evaluated
objects), there should be no issues. Regardless, this argument is used
internally with care, and it is not recommended for users.
See gtsummary internals vignette
Other Advanced modifiers:
modify_column_indent()
,
modify_column_merge()
The plot()
function extracts x$table_body
and passes it to
ggstats::ggcoef_plot()
along with formatting options.
## S3 method for class 'tbl_regression'
plot(x, remove_header_rows = TRUE, remove_reference_rows = FALSE, ...)

## S3 method for class 'tbl_uvregression'
plot(x, remove_header_rows = TRUE, remove_reference_rows = FALSE, ...)
x |
( |
remove_header_rows |
(scalar |
remove_reference_rows |
(scalar |
... |
arguments passed to |
a ggplot
glm(response ~ marker + grade, trial, family = binomial) |> tbl_regression( add_estimate_to_reference_rows = TRUE, exponentiate = TRUE ) |> plot()
This helper, to be used with tbl_custom_summary()
, creates a function
computing a proportion and its confidence interval.
proportion_summary( variable, value, weights = NULL, na.rm = TRUE, conf.level = 0.95, method = c("wilson", "wilson.no.correct", "wald", "wald.no.correct", "exact", "agresti.coull", "jeffreys") )
variable |
( |
value |
( |
weights |
( |
na.rm |
(scalar |
conf.level |
(scalar |
method |
( |
Computed statistics:
{n}
numerator, number of observations equal to value
{N}
denominator, number of observations
{prop}
proportion, i.e. n/N
{conf.low}
lower confidence interval
{conf.high}
upper confidence interval
Methods c("wilson", "wilson.no.correct")
are calculated with
stats::prop.test()
(with correct = c(TRUE, FALSE)
). The default method,
"wilson"
, includes the Yates continuity correction.
Methods c("exact", "asymptotic")
are calculated with Hmisc::binconf()
and the corresponding method.
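To make the difference between "wilson" and "wilson.no.correct" concrete, the sketch below calls stats::prop.test() directly with and without the continuity correction. The counts are hypothetical, and the code illustrates the underlying base R call rather than the internals of proportion_summary().

```r
# hypothetical counts: 12 events among 40 observations
n_events <- 12
n_total <- 40

# "wilson": score interval with the continuity correction
ci_wilson <- stats::prop.test(
  n_events, n_total, conf.level = 0.95, correct = TRUE
)$conf.int

# "wilson.no.correct": score interval without the correction
ci_wilson_nc <- stats::prop.test(
  n_events, n_total, conf.level = 0.95, correct = FALSE
)$conf.int

# the continuity-corrected interval is the wider of the two
c(corrected = diff(ci_wilson), uncorrected = diff(ci_wilson_nc))
```

The correction matters most for small samples; with large N the two intervals are nearly identical.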
Joseph Larmarange
# Example 1 ----------------------------------
Titanic |>
  as.data.frame() |>
  tbl_custom_summary(
    include = c("Age", "Class"),
    by = "Sex",
    stat_fns = ~ proportion_summary("Survived", "Yes", weights = "Freq"),
    statistic = ~ "{prop}% ({n}/{N}) [{conf.low}-{conf.high}]",
    digits = ~ list(
      prop = label_style_percent(digits = 1),
      n = 0,
      N = 0,
      conf.low = label_style_percent(),
      conf.high = label_style_percent()
    ),
    overall_row = TRUE,
    overall_row_last = TRUE
  ) |>
  bold_labels() |>
  modify_footnote(all_stat_cols() ~ "Proportion (%) of survivors (n/N) [95% CI]")
This helper, to be used with tbl_custom_summary()
, creates a function
computing the ratio of two continuous variables and its confidence interval.
ratio_summary(numerator, denominator, na.rm = TRUE, conf.level = 0.95)
numerator |
( |
denominator |
( |
na.rm |
(scalar |
conf.level |
(scalar |
Computed statistics:
{num}
sum of the variable defined by numerator
{denom}
sum of the variable defined by denominator
{ratio}
ratio of num
by denom
{conf.low}
lower confidence interval
{conf.high}
upper confidence interval
Confidence interval is computed with stats::poisson.test()
, if and only if
num
is an integer.
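The interval described above can be reproduced by hand with stats::poisson.test(). This is a sketch under assumptions: the totals are hypothetical, and passing the numerator total as the event count and the denominator total as the time base `T` is an assumed mapping that the text does not spell out.

```r
# hypothetical totals: 25 events (an integer count) over 480.5 units of follow-up
num <- 25
denom <- 480.5

ratio <- num / denom

# exact Poisson interval for the rate; `T` is the time base (the denominator)
ci <- stats::poisson.test(num, T = denom, conf.level = 0.95)$conf.int

c(ratio = ratio, conf.low = ci[1], conf.high = ci[2])
```

stats::poisson.test() expects an event count, which is consistent with the requirement that num be an integer.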
Joseph Larmarange
# Example 1 ----------------------------------
trial |>
  tbl_custom_summary(
    include = c("stage", "grade"),
    by = "trt",
    stat_fns = ~ ratio_summary("response", "ttdeath"),
    statistic = ~"{ratio} [{conf.low}; {conf.high}] ({num}/{denom})",
    digits = ~ c(ratio = 3, conf.low = 2, conf.high = 2),
    overall_row = TRUE,
    overall_row_label = "All stages & grades"
  ) |>
  bold_labels() |>
  modify_footnote(all_stat_cols() ~ "Ratio [95% CI] (n/N)")
Removes either the header, reference, or missing rows from a gtsummary table.
remove_row_type( x, variables = everything(), type = c("header", "reference", "missing", "level", "all"), level_value = NULL )
x |
( |
variables |
( |
type |
( |
level_value |
( |
Modified gtsummary table
# Example 1 ----------------------------------
trial |>
  dplyr::mutate(
    age60 = ifelse(age < 60, "<60", "60+")
  ) |>
  tbl_summary(by = trt, missing = "no", include = c(trt, age, age60)) |>
  remove_row_type(age60, type = "header")
Set of functions to supplement the {tidyselect} set of functions for selecting columns of data frames (and other items as well).
all_continuous()
selects continuous variables
all_continuous2()
selects only type "continuous2"
all_categorical()
selects categorical (including "dichotomous"
) variables
all_dichotomous()
selects only type "dichotomous"
all_tests()
selects variables by the name of the test performed
all_stat_cols()
selects columns from tbl_summary
/tbl_svysummary
object with summary statistics (i.e. "stat_0"
, "stat_1"
, "stat_2"
, etc.)
all_interaction()
selects interaction terms from a regression model
all_intercepts()
selects intercept terms from a regression model
all_contrasts()
selects variables in regression model based on their type of contrast
all_continuous(continuous2 = TRUE)

all_continuous2()

all_categorical(dichotomous = TRUE)

all_dichotomous()

all_tests(tests)

all_intercepts()

all_interaction()

all_contrasts(
  contrasts_type = c("treatment", "sum", "poly", "helmert", "sdif", "other")
)

all_stat_cols(stat_0 = TRUE)
continuous2 |
(scalar |
dichotomous |
(scalar |
tests |
( |
contrasts_type |
( |
stat_0 |
(scalar |
A character vector of column names selected
Review list, formula, and selector syntax used throughout gtsummary
select_ex1 <- trial |> select(age, response, grade) |> tbl_summary( statistic = all_continuous() ~ "{mean} ({sd})", type = all_dichotomous() ~ "categorical" )
The usual presentation of footnotes for p-values on a gtsummary table is
to have a single footnote that lists all statistical tests that were used to
compute p-values on a given table. The separate_p_footnotes()
function
separates aggregated p-value footnotes to individual footnotes that denote
the specific test used for each of the p-values.
separate_p_footnotes(x)
x |
( |
# Example 1 ----------------------------------
trial |>
  tbl_summary(by = trt, include = c(age, grade)) |>
  add_p() |>
  separate_p_footnotes()
Functions to set, reset, get, and evaluate with gtsummary themes.
set_gtsummary_theme()
set a theme
reset_gtsummary_theme()
reset themes
get_gtsummary_theme()
get a named list with all active theme elements
with_gtsummary_theme()
evaluate an expression with a theme temporarily set
check_gtsummary_theme()
checks if passed theme is valid
set_gtsummary_theme(x, quiet)

reset_gtsummary_theme()

get_gtsummary_theme()

with_gtsummary_theme(
  x,
  expr,
  env = rlang::caller_env(),
  msg_ignored_elements = NULL
)

check_gtsummary_theme(x)
x |
(named |
quiet |
|
expr |
( |
env |
( |
msg_ignored_elements |
( |
The default formatting and styling throughout the gtsummary package are taken from the published reporting guidelines of the top four urology journals: European Urology, The Journal of Urology, Urology and the British Journal of Urology International. Use this function to change the default reporting style to match another journal, or your own personal style.
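For a one-off change of style, with_gtsummary_theme() applies a theme only while an expression is evaluated. The sketch below assumes that the theme_gtsummary_*() constructors accept `set_theme = FALSE` to return the theme list without activating it; theme_gtsummary_mean_sd() is used purely as an example theme.

```r
library(gtsummary)

# build a theme object without activating it; `set_theme = FALSE` is
# assumed to be supported by the theme_gtsummary_*() constructors
mean_sd_theme <- theme_gtsummary_mean_sd(set_theme = FALSE)

# the theme applies only while `expr` is evaluated
tbl_themed <- with_gtsummary_theme(
  x = mean_sd_theme,
  expr = tbl_summary(trial, include = c(age, grade))
)

# built outside the call, so the settings in effect before the call apply
tbl_default <- tbl_summary(trial, include = c(age, grade))
```

This avoids having to pair set_gtsummary_theme() with reset_gtsummary_theme() when only one table needs the alternate style.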
Available gtsummary themes
# Setting JAMA theme for gtsummary
set_gtsummary_theme(theme_gtsummary_journal("jama"))
# Themes can be combined by including more than one
set_gtsummary_theme(theme_gtsummary_compact())

set_gtsummary_theme_ex1 <-
  trial |>
  tbl_summary(by = trt, include = c(age, grade, trt)) |>
  add_stat_label() |>
  as_gt()

# reset gtsummary theme
reset_gtsummary_theme()
Sort/filter by p-values
sort_p(x, q = FALSE)

filter_p(x, q = FALSE, t = 0.05)
x |
( |
q |
(scalar |
t |
(scalar |
Karissa Whiting, Daniel D. Sjoberg
# Example 1 ----------------------------------
trial %>%
  select(age, grade, response, trt) %>%
  tbl_summary(by = trt) %>%
  add_p() %>%
  filter_p(t = 0.8) %>%
  sort_p()

# Example 2 ----------------------------------
glm(response ~ trt + grade, trial, family = binomial(link = "logit")) %>%
  tbl_regression(exponentiate = TRUE) %>%
  sort_p()
Style numbers
style_number( x, digits = 0, big.mark = ifelse(decimal.mark == ",", " ", ","), decimal.mark = getOption("OutDec"), scale = 1, prefix = "", suffix = "", ... )
x |
( |
digits |
(non-negative |
big.mark |
( |
decimal.mark |
( |
scale |
(scalar |
prefix |
( |
suffix |
( |
... |
Arguments passed on to |
formatted character vector
c(0.111, 12.3) |> style_number(digits = 1)
c(0.111, 12.3) |> style_number(digits = c(1, 0))
Style percentages
style_percent( x, digits = 0, big.mark = ifelse(decimal.mark == ",", " ", ","), decimal.mark = getOption("OutDec"), prefix = "", suffix = "", symbol, ... )
x |
numeric vector of percentages |
digits |
number of digits to round large percentages (i.e. greater than 10%).
Smaller percentages are rounded to |
big.mark |
( |
decimal.mark |
( |
prefix |
( |
suffix |
( |
symbol |
Logical indicator to include percent symbol in output.
Default is |
... |
Arguments passed on to |
A character vector of styled percentages
Daniel D. Sjoberg
percent_vals <- c(-1, 0, 0.0001, 0.005, 0.01, 0.10, 0.45356, 0.99, 1.45)
style_percent(percent_vals)
style_percent(percent_vals, suffix = "%", digits = 1)
Style p-values
style_pvalue( x, digits = 1, prepend_p = FALSE, big.mark = ifelse(decimal.mark == ",", " ", ","), decimal.mark = getOption("OutDec"), ... )
x |
( |
digits |
( |
prepend_p |
(scalar |
big.mark |
( |
decimal.mark |
( |
... |
Arguments passed on to |
A character vector of styled p-values
Daniel D. Sjoberg
pvals <- c(
  1.5, 1, 0.999, 0.5, 0.25, 0.2, 0.197, 0.12, 0.10, 0.0999, 0.06,
  0.03, 0.002, 0.001, 0.00099, 0.0002, 0.00002, -1
)
style_pvalue(pvals)
style_pvalue(pvals, digits = 2, prepend_p = TRUE)
When reporting ratios, such as relative risk or an odds ratio, we'll often
want the rounding to be similar on each side of the number 1. For example,
if we report an odds ratio of 0.95 with a confidence interval of 0.70 to 1.24,
we would want to round to two decimal places for all values. In other words,
2 significant figures for numbers less than 1 and 3 significant figures for numbers 1 and
larger. style_ratio()
performs significant figure-like rounding in this manner.
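The odds ratio example from the paragraph above can be passed to style_ratio() directly. This is a brief sketch; the element names in the vector are illustrative only.

```r
library(gtsummary)

# the odds ratio and confidence limits from the text above
or_values <- c(estimate = 0.95, conf.low = 0.70, conf.high = 1.24)

# default digits = 2: values on both sides of 1 are rounded alike
styled <- style_ratio(or_values)
styled
```

The result is a character vector, so it is intended for display rather than further arithmetic.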
style_ratio( x, digits = 2, big.mark = ifelse(decimal.mark == ",", " ", ","), decimal.mark = getOption("OutDec"), prefix = "", suffix = "", ... )
x |
( |
digits |
( |
big.mark |
( |
decimal.mark |
( |
prefix |
( |
suffix |
( |
... |
Arguments passed on to |
A character vector of styled ratios
Daniel D. Sjoberg
c(0.123, 0.9, 1.1234, 12.345, 101.234, -0.123, -0.9, -1.1234, -12.345, -101.234) |> style_ratio()
Converts a numeric argument into a string that has been rounded to a significant figure-like number. Scientific notation output is avoided, however, and additional significant figures may be displayed for large numbers. For example, if the number of significant digits requested is 2, 123 will be displayed (rather than 120 or 1.2x10^2).
style_sigfig( x, digits = 2, scale = 1, big.mark = ifelse(decimal.mark == ",", " ", ","), decimal.mark = getOption("OutDec"), prefix = "", suffix = "", ... )
x |
Numeric vector |
digits |
Integer specifying the minimum number of significant digits to display |
scale |
(scalar |
big.mark |
( |
decimal.mark |
( |
prefix |
( |
suffix |
( |
... |
Arguments passed on to |
A character vector of styled numbers
Scientific notation output is avoided.
If 2 significant figures are requested, the number is rounded to no more than 2 decimal places.
For example, a number will be rounded to 2 decimal places when abs(x) < 1
,
1 decimal place when abs(x) >= 1 & abs(x) < 10
,
and to the nearest integer when abs(x) >= 10
.
Additional significant figures may be displayed for large numbers. For example, if the number of significant digits requested is 2, 123 will be displayed (rather than 120 or 1.2x10^2).
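A short sketch of the rounding rules listed above, using one value from each magnitude range plus a large value. The inputs are arbitrary, and printing the results is left to the reader.

```r
library(gtsummary)

# one value from each magnitude range described above, plus a large value
x <- c(0.1234, 1.234, 12.34, 123.4)

# digits = 2 is the default: 2 decimals below 1, 1 decimal from 1 up to 10,
# and the nearest integer from 10 upward
styled <- style_sigfig(x, digits = 2)

# requesting 3 digits is expected to add one decimal place in each range
styled3 <- style_sigfig(x, digits = 3)
```

Comparing `styled` with `styled3` shows how the digits argument sets a minimum precision without truncating large numbers.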
Daniel D. Sjoberg
Other style tools:
label_style
c(0.123, 0.9, 1.1234, 12.345, -0.123, -0.9, -1.1234, -132.345, NA, -0.001) %>% style_sigfig()
Summarize a continuous variable by one or more categorical variables
tbl_ard_continuous( cards, variable, include, by = NULL, label = NULL, statistic = everything() ~ "{median} ({p25}, {p75})", value = NULL )
cards |
( |
variable |
( |
include |
( |
by |
( |
label |
( |
statistic |
( |
value |
( |
a gtsummary table of class "tbl_ard_summary"
library(cards)

# Example 1 ----------------------------------
# the primary ARD with the results
ard_continuous(
  # the order variables are passed is important for the `by` variable.
  # 'trt' is the column stratifying variable and needs to be listed first.
  trial,
  by = c(trt, grade),
  variables = age
) |>
  # adding OPTIONAL information about the summary variables
  bind_ard(
    # add univariate trt tabulation
    ard_categorical(trial, variables = trt),
    # add missing and attributes ARD
    ard_missing(trial, by = c(trt, grade), variables = age),
    ard_attributes(trial, variables = c(trt, grade, age))
  ) |>
  tbl_ard_continuous(by = "trt", variable = "age", include = "grade")

# Example 2 ----------------------------------
# the primary ARD with the results
ard_continuous(trial, by = grade, variables = age) |>
  # adding OPTIONAL information about the summary variables
  bind_ard(
    # add missing and attributes ARD
    ard_missing(trial, by = grade, variables = age),
    ard_attributes(trial, variables = c(grade, age))
  ) |>
  tbl_ard_continuous(variable = "age", include = "grade")
This is a preview of this function. There will be changes in the coming releases, and changes will not undergo a formal deprecation cycle.
Constructs tables from nested or hierarchical data structures (e.g. adverse events).
tbl_ard_hierarchical( cards, variables, by = NULL, include = everything(), statistic = ~"{n} ({p}%)", label = NULL )
cards |
( |
variables |
( |
by |
( |
include |
( |
statistic |
( |
label |
( |
a gtsummary table of class "tbl_ard_hierarchical"
ADAE_subset <- cards::ADAE |>
  dplyr::filter(
    AESOC %in% unique(cards::ADAE$AESOC)[1:5],
    AETERM %in% unique(cards::ADAE$AETERM)[1:5]
  )

# Example 1: Event Rates --------------------
# First, build the ARD
ard <-
  cards::ard_stack_hierarchical(
    data = ADAE_subset,
    variables = c(AESOC, AETERM),
    by = TRTA,
    denominator = cards::ADSL |> mutate(TRTA = ARM),
    id = USUBJID
  )

# Second, build table from the ARD
tbl_ard_hierarchical(
  cards = ard,
  variables = c(AESOC, AETERM),
  by = TRTA
)

# Example 2: Event Counts -------------------
ard <-
  cards::ard_stack_hierarchical_count(
    data = ADAE_subset,
    variables = c(AESOC, AETERM),
    by = TRTA,
    denominator = cards::ADSL |> mutate(TRTA = ARM)
  )

tbl_ard_hierarchical(
  cards = ard,
  variables = c(AESOC, AETERM),
  by = TRTA,
  statistic = ~"{n}"
)
The tbl_ard_summary()
function tables descriptive statistics for
continuous, categorical, and dichotomous variables.
The function accepts an ARD object.
tbl_ard_summary( cards, by = NULL, statistic = list(all_continuous() ~ "{median} ({p25}, {p75})", all_categorical() ~ "{n} ({p}%)"), type = NULL, label = NULL, missing = c("no", "ifany", "always"), missing_text = "Unknown", missing_stat = "{N_miss}", include = everything(), overall = FALSE )
cards |
( |
by |
( |
statistic |
( |
type |
( |
label |
( |
missing , missing_text , missing_stat
|
Arguments dictating how and if missing values are presented:
|
include |
( |
overall |
(scalar |
There are three types of additional data that can be included in the ARD to improve the default appearance of the table.
Attributes: When attributes are included, the default labels will be the variable labels, when available. Attributes can be included in an ARD with cards::ard_attributes() or ard_stack(.attributes = TRUE).
Missing: When missing results are included, users can include missing counts or rates for variables with tbl_ard_summary(missing = c("ifany", "always")). The missing statistics can be included in an ARD with cards::ard_missing() or ard_stack(.missing = TRUE).
Total N: The total N is saved internally when available, and it can be calculated with cards::ard_total_n() or ard_stack(.total_n = TRUE).
a gtsummary table of class "tbl_ard_summary"
library(cards)

ard_stack(
  data = ADSL,
  ard_categorical(variables = "AGEGR1"),
  ard_continuous(variables = "AGE"),
  .attributes = TRUE,
  .missing = TRUE,
  .total_n = TRUE
) |>
  tbl_ard_summary()

ard_stack(
  data = ADSL,
  .by = ARM,
  ard_categorical(variables = "AGEGR1"),
  ard_continuous(variables = "AGE"),
  .attributes = TRUE,
  .missing = TRUE,
  .total_n = TRUE
) |>
  tbl_ard_summary(by = ARM)

ard_stack(
  data = ADSL,
  .by = ARM,
  ard_categorical(variables = "AGEGR1"),
  ard_continuous(variables = "AGE"),
  .attributes = TRUE,
  .missing = TRUE,
  .total_n = TRUE,
  .overall = TRUE
) |>
  tbl_ard_summary(by = ARM, overall = TRUE)
This function is similar to tbl_ard_summary(), but places summary statistics wide, in separate columns. All included variables must be of the same summary type, e.g. all continuous summaries or all categorical summaries (which encompasses dichotomous variables).
tbl_ard_wide_summary( cards, statistic = switch(type[[1]], continuous = c("{median}", "{p25}, {p75}"), c("{n}", "{p}%")), type = NULL, label = NULL, value = NULL, include = everything() )
cards |
( |
statistic |
( |
type |
( |
label |
( |
value |
( |
include |
( |
a gtsummary table of class 'tbl_wide_summary'
library(cards)

ard_stack(
  trial,
  ard_continuous(variables = age),
  .missing = TRUE,
  .attributes = TRUE,
  .total_n = TRUE
) |>
  tbl_ard_wide_summary()

ard_stack(
  trial,
  ard_dichotomous(variables = response),
  ard_categorical(variables = grade),
  .missing = TRUE,
  .attributes = TRUE,
  .total_n = TRUE
) |>
  tbl_ard_wide_summary()
Some gtsummary objects can become large, and the size becomes cumbersome when working with the object. The function removes all elements from a gtsummary object, except those required to print the table. As a result, gtsummary functions that add information or modify the table, such as add_global_p(), may no longer execute after the excess elements have been removed (aka butchered). Of note, the majority of inline_text() calls will continue to execute properly.
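A minimal sketch of the effect, assuming gtsummary is installed and using its bundled trial data. The object names tbl_full and tbl_small are illustrative only. It compares the elements stored in the object before and after butchering.

```r
library(gtsummary)

# Build an ordinary summary table from the bundled `trial` data
tbl_full <- tbl_summary(trial, include = c(age, grade))

# Keep only the elements needed to print the table (the default `include`)
tbl_small <- tbl_butcher(tbl_full)

# Compare which elements each object stores
names(tbl_full)
names(tbl_small)
```

If a later step needs one of the removed elements, it is usually simplest to run that step before calling tbl_butcher().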
tbl_butcher(x, include = c("table_body", "table_styling"))
x |
( |
include |
( |
a gtsummary object
tbl_large <-
  trial |>
  tbl_uvregression(
    y = age,
    method = lm
  )

tbl_butchered <-
  tbl_large |>
  tbl_butcher()

# size comparison
object.size(tbl_large) |> format(units = "Mb")
object.size(tbl_butchered) |> format(units = "Mb")
Summarize a continuous variable by one or more categorical variables
tbl_continuous( data, variable, include = everything(), digits = NULL, by = NULL, statistic = everything() ~ "{median} ({p25}, {p75})", label = NULL, value = NULL )
data |
( |
variable |
( |
include |
( |
digits |
( |
by |
( |
statistic |
( |
label |
( |
value |
( |
a gtsummary table
# Example 1 ----------------------------------
tbl_continuous(
  data = trial,
  variable = age,
  by = trt,
  include = grade
)

# Example 2 ----------------------------------
trial |>
  dplyr::mutate(all_subjects = 1) |>
  tbl_continuous(
    variable = age,
    statistic = ~"{mean} ({sd})",
    by = trt,
    include = c(all_subjects, stage, grade),
    value = all_subjects ~ 1,
    label = list(all_subjects = "All Subjects")
  )
The function creates a cross table of categorical variables.
tbl_cross( data, row = 1L, col = 2L, label = NULL, statistic = ifelse(percent == "none", "{n}", "{n} ({p}%)"), digits = NULL, percent = c("none", "column", "row", "cell"), margin = c("column", "row"), missing = c("ifany", "always", "no"), missing_text = "Unknown", margin_text = "Total" )
data |
( |
row |
( |
col |
( |
label |
( |
statistic |
( |
digits |
( |
percent |
( |
margin |
( |
missing |
( |
missing_text |
( |
margin_text |
( |
A tbl_cross
object
Karissa Whiting, Daniel D. Sjoberg
# Example 1 ----------------------------------
trial |>
  tbl_cross(row = trt, col = response) |>
  bold_labels()

# Example 2 ----------------------------------
trial |>
  tbl_cross(row = stage, col = trt, percent = "cell") |>
  add_p() |>
  bold_labels()
The tbl_custom_summary() function calculates descriptive statistics for continuous, categorical, and dichotomous variables. This function is similar to tbl_summary() but allows you to provide a custom function in charge of computing the statistics (see Details).
tbl_custom_summary( data, by = NULL, label = NULL, stat_fns, statistic, digits = NULL, type = NULL, value = NULL, missing = c("ifany", "no", "always"), missing_text = "Unknown", missing_stat = "{N_miss}", include = everything(), overall_row = FALSE, overall_row_last = FALSE, overall_row_label = "Overall" )
data |
( |
by |
( |
label |
( |
stat_fns |
( |
statistic |
( |
digits |
( |
type |
( |
value |
( |
missing , missing_text , missing_stat
|
Arguments dictating how and if missing values are presented:
|
include |
( |
overall_row |
(scalar |
overall_row_last |
(scalar |
overall_row_label |
( |
A tbl_custom_summary
object
tbl_summary()
Please refer to the help file of tbl_summary() regarding the use of select helpers, and arguments include, by, type, value, digits, missing and missing_text.
stat_fns argument
The stat_fns argument specifies the custom function(s) used to compute the summary statistics. For example, stat_fns = everything() ~ foo.
Each function may take the following arguments: foo(data, full_data, variable, by, type, ...)
data= is the input data frame passed to tbl_custom_summary(), subset according to the level of by or variable if any, excluding NA values of the current variable
full_data= is the full input data frame passed to tbl_custom_summary()
variable= is a string indicating the variable to perform the calculation on
by= is a string indicating the by variable from tbl_custom_summary(), if present
type= is a string indicating the type of variable (continuous, categorical, ...)
stat_display= is a string indicating the statistic to display (for the statistic argument, for that variable)
The user-defined function does not need to use each of these inputs. The user-defined function is encouraged to accept ... because every argument is passed to the function, even those it does not use, e.g. foo(data, ...) (see examples).
The user-defined function should return a one-row dplyr::tibble() with one column per summary statistic (see examples).
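The contract above can be sketched with a small function called directly, outside of any table. This is an illustration only: range_stats is a hypothetical name, and the built-in mtcars data stands in for the subset that tbl_custom_summary() would supply.

```r
library(dplyr)

# Hypothetical user-defined function: it uses `data` and `variable`,
# and absorbs the remaining inputs (full_data, by, type, ...) through `...`
range_stats <- function(data, variable, ...) {
  x <- data[[variable]]
  # Return a one-row tibble with one column per summary statistic
  dplyr::tibble(
    min = min(x, na.rm = TRUE),
    max = max(x, na.rm = TRUE)
  )
}

# Call the function by hand, passing extra arguments that it ignores
res <- range_stats(data = mtcars, variable = "mpg", by = NULL, type = "continuous")
res
```

The column names of the returned tibble (min and max here) are the names that can then appear between curly brackets in the statistic argument, e.g. "{min} to {max}".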
The statistic argument specifies the statistics presented in the table. The input is a list of formulas that specify the statistics to report. For example, statistic = list(age ~ "{mean} ({sd})"). A statistic name that appears between curly brackets will be replaced with the numeric statistic (see glue::glue()). All the statistics indicated in the statistic argument should be returned by the functions defined in the stat_fns argument.
When the summary type is "continuous2", pass a vector of statistics. Each element of the vector will result in a separate row in the summary table.
For both categorical and continuous variables, statistics on the number of missing and non-missing observations and their proportions are also available to display.
{N_obs}: total number of observations
{N_miss}: number of missing observations
{N_nonmiss}: number of non-missing observations
{p_miss}: percentage of observations missing
{p_nonmiss}: percentage of observations not missing
Note that for categorical variables, {N_obs}, {N_miss} and {N_nonmiss} refer to the total number, number missing, and number non-missing of observations in the denominator, not at each level of the categorical variable.
It is recommended to use modify_footnote() to properly describe the displayed statistics (see examples).
The returned table is compatible with all gtsummary features applicable to a tbl_summary object, like add_overall(), modify_footnote() or bold_labels().
However, some of them could be inappropriate in such a case. In particular, add_p() does not take into account the type of displayed statistics and always returns the p-value of a comparison test of the current variable according to the by groups, which may be incorrect if the displayed statistics refer to a third variable.
Joseph Larmarange
# Example 1 ----------------------------------
my_stats <- function(data, ...) {
  marker_sum <- sum(data$marker, na.rm = TRUE)
  mean_age <- mean(data$age, na.rm = TRUE)
  dplyr::tibble(
    marker_sum = marker_sum,
    mean_age = mean_age
  )
}

my_stats(trial)

trial |>
  tbl_custom_summary(
    include = c("stage", "grade"),
    by = "trt",
    stat_fns = everything() ~ my_stats,
    statistic = everything() ~ "A: {mean_age} - S: {marker_sum}",
    digits = everything() ~ c(1, 0),
    overall_row = TRUE,
    overall_row_label = "All stages & grades"
  ) |>
  add_overall(last = TRUE) |>
  modify_footnote(
    all_stat_cols() ~ "A: mean age - S: sum of marker"
  ) |>
  bold_labels()

# Example 2 ----------------------------------
# Use `data[[variable]]` to access the current variable
mean_ci <- function(data, variable, ...) {
  test <- t.test(data[[variable]])
  dplyr::tibble(
    mean = test$estimate,
    conf.low = test$conf.int[1],
    conf.high = test$conf.int[2]
  )
}

trial |>
  tbl_custom_summary(
    include = c("marker", "ttdeath"),
    by = "trt",
    stat_fns = ~ mean_ci,
    statistic = ~ "{mean} [{conf.low}; {conf.high}]"
  ) |>
  add_overall(last = TRUE) |>
  modify_footnote(
    all_stat_cols() ~ "mean [95% CI]"
  )

# Example 3 ----------------------------------
# Use `full_data` to access the full datasets
# Returned statistic can also be a character
diff_to_great_mean <- function(data, full_data, ...) {
  mean <- mean(data$marker, na.rm = TRUE)
  great_mean <- mean(full_data$marker, na.rm = TRUE)
  diff <- mean - great_mean
  dplyr::tibble(
    mean = mean,
    great_mean = great_mean,
    diff = diff,
    level = ifelse(diff > 0, "high", "low")
  )
}

trial |>
  tbl_custom_summary(
    include = c("grade", "stage"),
    by = "trt",
    stat_fns = ~ diff_to_great_mean,
    statistic = ~ "{mean} ({level}, diff: {diff})",
    overall_row = TRUE
  ) |>
  bold_labels()
This is a preview of this function. There will be changes in the coming releases, and changes will not undergo a formal deprecation cycle.
Use these functions to generate hierarchical tables.
tbl_hierarchical(): Calculates rates of events (e.g. adverse events) utilizing the denominator and id arguments to identify the rows in data to include in each rate calculation. If variables contains more than one variable and the last variable in variables is an ordered factor, then rates of events by highest level will be calculated.
tbl_hierarchical_count(): Calculates counts of events utilizing all rows for each tabulation.
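The difference between the two tabulations can be sketched in base R on a toy data set. This illustrates the idea only and is not the package's implementation; the data frame ae and the denominator of 4 subjects are made up for the example.

```r
# Toy adverse event records: one row per event (hypothetical data)
ae <- data.frame(
  USUBJID = c("01", "01", "02", "03"),
  AETERM  = c("HEADACHE", "HEADACHE", "HEADACHE", "NAUSEA")
)
n_subjects <- 4  # assumed size of the denominator population

# Count-style tabulation: every row contributes
event_counts <- table(ae$AETERM)

# Rate-style tabulation: each subject contributes at most once per term
subjects_with_event <- tapply(
  ae$USUBJID, ae$AETERM,
  function(id) length(unique(id))
)
event_rates <- subjects_with_event / n_subjects

event_counts
event_rates
```

Subject "01" has two headache records, so the headache count is 3 while only 2 of the 4 subjects contribute to the headache rate.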
tbl_hierarchical( data, variables, id, denominator, by = NULL, include = everything(), statistic = everything() ~ "{n} ({p}%)", overall_row = FALSE, label = NULL, digits = NULL ) tbl_hierarchical_count( data, variables, denominator = NULL, by = NULL, include = everything(), overall_row = FALSE, statistic = everything() ~ "{n}", label = NULL, digits = NULL )
data |
( |
variables |
( |
id |
( |
denominator |
( |
by |
( |
include |
( |
statistic |
( |
overall_row |
(scalar |
label |
( |
digits |
( |
a gtsummary table of class "tbl_hierarchical" (for tbl_hierarchical()) or "tbl_hierarchical_count" (for tbl_hierarchical_count()).
An overall row can be added to the table as the first row by specifying overall_row = TRUE. Assuming that each row in data corresponds to one event record, this row will count the overall number of events recorded when used in tbl_hierarchical_count(), or the overall number of patients recorded with any event when used in tbl_hierarchical().
A label for this overall row can be specified by passing an '..ard_hierarchical_overall..' element in label. Similarly, the rounding for statistics in the overall row can be modified using the digits argument, again referencing the '..ard_hierarchical_overall..' name.
ADAE_subset <- cards::ADAE |>
  dplyr::filter(
    AESOC %in% unique(cards::ADAE$AESOC)[1:5],
    AETERM %in% unique(cards::ADAE$AETERM)[1:5]
  )

# Example 1 - Event Rates --------------------
tbl_hierarchical(
  data = ADAE_subset,
  variables = c(AESOC, AETERM),
  by = TRTA,
  denominator = cards::ADSL |> mutate(TRTA = ARM),
  id = USUBJID,
  digits = everything() ~ list(p = 1),
  overall_row = TRUE,
  label = list(..ard_hierarchical_overall.. = "Any Adverse Event")
)

# Example 2 - Rates by Highest Severity ------
tbl_hierarchical(
  data = ADAE_subset |> mutate(AESEV = factor(AESEV, ordered = TRUE)),
  variables = c(AESOC, AESEV),
  by = TRTA,
  id = USUBJID,
  denominator = cards::ADSL |> mutate(TRTA = ARM),
  include = AESEV,
  label = list(AESEV = "Highest Severity")
)

# Example 3 - Event Counts -------------------
tbl_hierarchical_count(
  data = ADAE_subset,
  variables = c(AESOC, AETERM, AESEV),
  by = TRTA,
  overall_row = TRUE,
  label = list(..ard_hierarchical_overall.. = "Total Number of AEs")
)
Create a table of ordered categorical variables in a wide format.
tbl_likert( data, statistic = ~"{n} ({p}%)", label = NULL, digits = NULL, include = everything(), sort = c("ascending", "descending") )
data |
( |
statistic |
( |
label |
( |
digits |
( |
include |
( |
sort |
( |
a 'tbl_likert' gtsummary table
levels <- c("Strongly Disagree", "Disagree", "Agree", "Strongly Agree")
df_likert <- data.frame(
  recommend_friend = sample(levels, size = 20, replace = TRUE) |> factor(levels = levels),
  regret_purchase = sample(levels, size = 20, replace = TRUE) |> factor(levels = levels)
)

# Example 1 ----------------------------------
tbl_likert_ex1 <-
  df_likert |>
  tbl_likert(include = c(recommend_friend, regret_purchase)) |>
  add_n()
tbl_likert_ex1

# Example 2 ----------------------------------
# Add continuous summary of the likert scores
list(
  tbl_likert_ex1,
  tbl_wide_summary(
    df_likert |> dplyr::mutate(dplyr::across(everything(), as.numeric)),
    statistic = c("{mean}", "{sd}"),
    type = ~"continuous",
    include = c(recommend_friend, regret_purchase)
  )
) |>
  tbl_merge(tab_spanner = FALSE)
Merge gtsummary tables, e.g. tbl_regression, tbl_uvregression, tbl_stack, tbl_summary, tbl_svysummary, etc.
tbl_merge(tbls, tab_spanner = NULL)
tbls |
( |
tab_spanner |
( |
A 'tbl_merge'
object
Daniel D. Sjoberg
# Example 1 ----------------------------------
# Side-by-side Regression Models
library(survival)

t1 <-
  glm(response ~ trt + grade + age, trial, family = binomial) %>%
  tbl_regression(exponentiate = TRUE)
t2 <-
  coxph(Surv(ttdeath, death) ~ trt + grade + age, trial) %>%
  tbl_regression(exponentiate = TRUE)

tbl_merge(
  tbls = list(t1, t2),
  tab_spanner = c("**Tumor Response**", "**Time to Death**")
)

# Example 2 ----------------------------------
# Descriptive statistics alongside univariate regression, with no spanning header
t3 <-
  trial[c("age", "grade", "response")] %>%
  tbl_summary(missing = "no") %>%
  add_n() %>%
  modify_header(stat_0 ~ "**Summary Statistics**")
t4 <-
  tbl_uvregression(
    trial[c("ttdeath", "death", "age", "grade", "response")],
    method = coxph,
    y = Surv(ttdeath, death),
    exponentiate = TRUE,
    hide_n = TRUE
  )

tbl_merge(tbls = list(t3, t4)) %>%
  modify_spanning_header(everything() ~ NA_character_)
This function takes a regression model object and returns a formatted table that is publication-ready. The function is customizable, allowing the user to create bespoke regression model summary tables. Review the tbl_regression() vignette for detailed examples.
tbl_regression(x, ...)

## Default S3 method:
tbl_regression(
  x,
  label = NULL,
  exponentiate = FALSE,
  include = everything(),
  show_single_row = NULL,
  conf.level = 0.95,
  intercept = FALSE,
  estimate_fun = ifelse(exponentiate, label_style_ratio(), label_style_sigfig()),
  pvalue_fun = label_style_pvalue(digits = 1),
  tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
  add_estimate_to_reference_rows = FALSE,
  conf.int = TRUE,
  ...
)
x |
(regression model) |
... |
Additional arguments passed to |
label |
( |
exponentiate |
(scalar |
include |
( |
show_single_row |
( |
conf.level |
(scalar |
intercept |
(scalar |
estimate_fun |
( |
pvalue_fun |
( |
tidy_fun |
( |
add_estimate_to_reference_rows |
(scalar |
conf.int |
(scalar |
A tbl_regression
object
The default method for tbl_regression() model summary uses broom::tidy(x) to perform the initial tidying of the model object. There are, however, a few models that use modifications.
"parsnip/workflows": If the model was prepared using parsnip/workflows, the original model fit is extracted and the original x= argument is replaced with the model fit. This will typically go unnoticed; however, if you've provided a custom tidier in tidy_fun=, the tidier will be applied to the model fit object and not the parsnip/workflows object.
"survreg": The scale parameter is removed, broom::tidy(x) %>% dplyr::filter(term != "Log(scale)")
"multinom": This multinomial outcome is complex, with one line per covariate per outcome (less the reference group)
"gam": Uses the internal tidier tidy_gam() to print both parametric and smooth terms.
"lmerMod", "glmerMod", "glmmTMB", "glmmadmb", "stanreg", "brmsfit": These mixed effects models use broom.mixed::tidy(x, effects = "fixed"). Specify tidy_fun = broom.mixed::tidy to print the random components.
Daniel D. Sjoberg
# Example 1 ----------------------------------
glm(response ~ age + grade, trial, family = binomial()) |>
  tbl_regression(exponentiate = TRUE)
The tbl_split function splits a single gtsummary table into multiple tables. Updates to the print method are expected.
tbl_split(x, ...)

## S3 method for class 'gtsummary'
tbl_split(x, variables, ...)

## S3 method for class 'tbl_split'
print(x, ...)
x |
( |
... |
These dots are for future extensions and must be empty. |
variables |
( |
tbl_split
object
tbl <-
  tbl_summary(trial) |>
  tbl_split(variables = c(marker, grade))
Assists in patching together more complex tables. tbl_stack() appends two or more gtsummary tables. Column attributes, including number formatting and column footnotes, are retained from the first passed gtsummary object.
tbl_stack(tbls, group_header = NULL, quiet = FALSE)
tbls |
( |
group_header |
( |
quiet |
(scalar |
A tbl_stack
object
Daniel D. Sjoberg
# Example 1 ----------------------------------
# stacking two tbl_regression objects
t1 <-
  glm(response ~ trt, trial, family = binomial) %>%
  tbl_regression(
    exponentiate = TRUE,
    label = list(trt ~ "Treatment (unadjusted)")
  )

t2 <-
  glm(response ~ trt + grade + stage + marker, trial, family = binomial) %>%
  tbl_regression(
    include = "trt",
    exponentiate = TRUE,
    label = list(trt ~ "Treatment (adjusted)")
  )

tbl_stack(list(t1, t2))

# Example 2 ----------------------------------
# stacking two tbl_merge objects
library(survival)
t3 <-
  coxph(Surv(ttdeath, death) ~ trt, trial) %>%
  tbl_regression(
    exponentiate = TRUE,
    label = list(trt ~ "Treatment (unadjusted)")
  )

t4 <-
  coxph(Surv(ttdeath, death) ~ trt + grade + stage + marker, trial) %>%
  tbl_regression(
    include = "trt",
    exponentiate = TRUE,
    label = list(trt ~ "Treatment (adjusted)")
  )

# first merging, then stacking
row1 <- tbl_merge(list(t1, t3), tab_spanner = c("Tumor Response", "Death"))
row2 <- tbl_merge(list(t2, t4))

tbl_stack(list(row1, row2), group_header = c("Unadjusted Analysis", "Adjusted Analysis"))
Build a stratified gtsummary table. Any gtsummary table that accepts a data frame as its first argument can be stratified.
In tbl_strata(), the stratified or subset data frame is passed to the function in .tbl_fun=, e.g. purrr::map(data, .tbl_fun).
In tbl_strata2(), both the stratified data frame and the strata level are passed to .tbl_fun=, e.g. purrr::map2(data, strata, .tbl_fun)
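The one-argument versus two-argument calling pattern can be sketched in base R, with lapply() and Map() standing in for purrr::map() and purrr::map2(). This is an analogy only, using the built-in mtcars data, and it is not how the package is implemented; the object names are illustrative.

```r
# Split a data frame into one subset per stratum level
strata_data <- split(mtcars, mtcars$cyl)

# tbl_strata()-style: the function sees only the subset data frame
n_per_stratum <- lapply(strata_data, function(d) nrow(d))

# tbl_strata2()-style: the function sees the subset and the stratum level
labelled <- Map(
  function(d, level) paste0("cyl = ", level, ": n = ", nrow(d)),
  strata_data,
  names(strata_data)
)

n_per_stratum
labelled
```

The second form is the one to reach for when the stratum level itself is needed inside the table, for example as a label.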
tbl_strata( data, strata, .tbl_fun, ..., .sep = ", ", .combine_with = c("tbl_merge", "tbl_stack"), .combine_args = NULL, .header = ifelse(.combine_with == "tbl_merge", "**{strata}**", "{strata}"), .stack_group_header = NULL, .quiet = NULL ) tbl_strata2( data, strata, .tbl_fun, ..., .sep = ", ", .combine_with = c("tbl_merge", "tbl_stack"), .combine_args = NULL, .header = ifelse(.combine_with == "tbl_merge", "**{strata}**", "{strata}"), .stack_group_header = NULL, .quiet = TRUE )
data |
( |
strata |
( |
.tbl_fun |
( |
... |
Additional arguments passed on to the |
.sep |
( |
.combine_with |
( |
.combine_args |
(named |
.header |
( The evaluated value of |
.stack_group_header |
|
.quiet |
tbl_summary()
The number of digits continuous variables are rounded to is determined separately within each stratum of the data frame. Set the digits= argument to ensure continuous variables are rounded to the same number of decimal places.
If some levels of a categorical variable are unobserved within a stratum, convert the variable to a factor to ensure all levels appear in each stratum's summary table.
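The effect of converting to a factor can be seen with a base R tabulation on a toy data set. The data frame responses is made up for the illustration, and table() stands in for the per-stratum summary.

```r
# Toy data: level "II" is never observed in stratum "B" (hypothetical data)
responses <- data.frame(
  stratum = c("A", "A", "B"),
  grade   = c("I", "II", "I")
)

# As a character vector, unobserved levels vanish within a stratum
counts_chr <- lapply(split(responses$grade, responses$stratum), table)

# As a factor, every level is tabulated in every stratum, with zero counts
responses$grade <- factor(responses$grade, levels = c("I", "II"))
counts_fct <- lapply(split(responses$grade, responses$stratum), table)

counts_chr$B
counts_fct$B
```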
Daniel D. Sjoberg
# Example 1 ----------------------------------
trial |>
  select(age, grade, stage, trt) |>
  mutate(grade = paste("Grade", grade)) |>
  tbl_strata(
    strata = grade,
    .tbl_fun =
      ~ .x |>
        tbl_summary(by = trt, missing = "no") |>
        add_n(),
    .header = "**{strata}**, N = {n}"
  )

# Example 2 ----------------------------------
trial |>
  select(grade, response) |>
  mutate(grade = paste("Grade", grade)) |>
  tbl_strata2(
    strata = grade,
    .tbl_fun =
      ~ .x %>%
        tbl_summary(
          label = list(response = .y),
          missing = "no",
          statistic = response ~ "{p}%"
        ) |>
        add_ci(pattern = "{stat} ({ci})") |>
        modify_header(stat_0 = "**Rate (95% CI)**") |>
        modify_footnote(stat_0 = NA),
    .combine_with = "tbl_stack",
    .combine_args = list(group_header = NULL)
  ) |>
  modify_caption("**Response Rate by Grade**")
The tbl_summary() function calculates descriptive statistics for continuous, categorical, and dichotomous variables. Review the tbl_summary vignette for detailed examples.
tbl_summary(
  data,
  by = NULL,
  label = NULL,
  statistic = list(all_continuous() ~ "{median} ({p25}, {p75})",
                   all_categorical() ~ "{n} ({p}%)"),
  digits = NULL,
  type = NULL,
  value = NULL,
  missing = c("ifany", "no", "always"),
  missing_text = "Unknown",
  missing_stat = "{N_miss}",
  sort = all_categorical(FALSE) ~ "alphanumeric",
  percent = c("column", "row", "cell"),
  include = everything()
)
data |
by |
label |
statistic |
digits |
type |
value |
missing, missing_text, missing_stat | Arguments dictating how and if missing values are presented |
sort |
percent |
include |
A gtsummary table of class c('tbl_summary', 'gtsummary')
The statistic argument specifies the statistics presented in the table. The input is a list of formulas that specify the statistics to report. For example, statistic = list(age ~ "{mean} ({sd})") would report the mean and standard deviation for age; statistic = list(all_continuous() ~ "{mean} ({sd})") would report the mean and standard deviation for all continuous variables.
The values are interpreted using glue::glue() syntax: a name that appears between curly brackets will be interpreted as a function name and the formatted result of that function will be placed in the table.
For categorical variables, the following statistics are available to display: {n} (frequency), {N} (denominator), {p} (percent).
For continuous variables, any univariate function may be used. The most commonly used functions are {median}, {mean}, {sd}, {min}, and {max}. Additionally, {p##} is available for percentiles, where ## is an integer from 0 to 100. For example, {p25} is computed as quantile(probs=0.25, type=2).
When the summary type is "continuous2", pass a vector of statistics. Each element of the vector will result in a separate row in the summary table.
For both categorical and continuous variables, statistics on the number of missing and non-missing observations and their proportions are available to display.
{N_obs}: total number of observations
{N_miss}: number of missing observations
{N_nonmiss}: number of non-missing observations
{p_miss}: percentage of observations missing
{p_nonmiss}: percentage of observations not missing
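As an illustration of the glue-style syntax described above, the sketch below mixes several of the listed statistics in one call. It assumes the trial data set bundled with gtsummary; the particular combination of statistics is only an example.

```r
library(gtsummary)

tbl <-
  trial |>
  tbl_summary(
    include = c(age, grade),
    # "continuous2" lets age take a vector of statistics, one row each
    type = age ~ "continuous2",
    statistic = list(
      age ~ c("{mean} ({sd})", "{min}, {max}", "{N_nonmiss} observed"),
      grade ~ "{n} / {N} ({p}%)"
    ),
    missing = "no"
  )

tbl
```

Each element of the vector passed for age produces its own row, while grade shows the frequency, denominator, and percent on every level row.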
The digits argument specifies the number of digits (or formatting function) statistics are rounded to.
The values passed can either be a single integer, a vector of integers, a function, or a list of functions. If a single integer or function is passed, it is recycled to the length of the number of statistics presented. For example, if the statistic is "{mean} ({sd})", passing 1, c(1, 1), label_style_number(digits=1), or list(label_style_number(digits=1), label_style_number(digits=1)) is equivalent.
Named lists are also accepted to change the default formatting for a single statistic, e.g. list(sd = label_style_number(digits=1)).
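The equivalence between the different digits inputs can be checked with a small sketch. The helper age_stat() is hypothetical and exists only for this illustration; it returns the formatted summary cell for age from the trial data set.

```r
library(gtsummary)

# Hypothetical helper: summarize age with the given digits specification
# and return the formatted "{mean} ({sd})" cell as a string.
age_stat <- function(d) {
  tbl <- tbl_summary(
    trial,
    include = age,
    statistic = age ~ "{mean} ({sd})",
    digits = list(age = d),
    missing = "no"
  )
  tbl$table_body$stat_0[tbl$table_body$row_type == "label"]
}

one_integer  <- age_stat(1)                              # recycled to both statistics
two_integers <- age_stat(c(1, 1))                        # one value per statistic
one_function <- age_stat(label_style_number(digits = 1)) # formatting function, recycled
mixed        <- age_stat(c(0, 1))                        # mean to 0 places, sd to 1
```

The first three calls should produce the same string, while the last rounds the mean and the standard deviation differently.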
There are four summary types. Use the type argument to change the default summary types.
"continuous": summaries are shown on a single row. Most numeric variables default to summary type continuous.
"continuous2": summaries are shown on 2 or more rows.
"categorical": multi-line summaries of nominal data. Character variables, factor variables, and numeric variables with fewer than 10 unique levels default to type categorical. To change a numeric variable to continuous that defaulted to categorical, use type = list(varname ~ "continuous").
"dichotomous": categorical variables that are displayed on a single row, rather than one row per level of the variable. Variables coded as TRUE/FALSE, 0/1, or yes/no are assumed to be dichotomous, and the TRUE, 1, and yes rows are displayed. Otherwise, the value to display must be specified in the value argument, e.g. value = list(varname ~ "level to show").
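A short sketch of overriding the default summary types, assuming the trial data set. Here response (coded 0/1) is forced to show every level, and stage is collapsed to a single row for the level "T1"; both choices are arbitrary and only illustrate the type and value arguments.

```r
library(gtsummary)

tbl <-
  trial |>
  tbl_summary(
    include = c(age, response, stage),
    type = list(
      age ~ "continuous2",       # multi-row continuous summary
      response ~ "categorical",  # show both the 0 and 1 rows
      stage ~ "dichotomous"      # show a single level only
    ),
    value = list(stage ~ "T1"),  # the level displayed for stage
    missing = "no"
  )

tbl
```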
Daniel D. Sjoberg
See tbl_summary vignette for detailed tutorial
See table gallery for additional examples
Review list, formula, and selector syntax used throughout gtsummary
# Example 1 ----------------------------------
trial |>
  select(age, grade, response) |>
  tbl_summary()

# Example 2 ----------------------------------
trial |>
  select(age, grade, response, trt) |>
  tbl_summary(
    by = trt,
    label = list(age = "Patient Age"),
    statistic = list(all_continuous() ~ "{mean} ({sd})"),
    digits = list(age = c(0, 1))
  )

# Example 3 ----------------------------------
trial |>
  select(age, marker) |>
  tbl_summary(
    type = all_continuous() ~ "continuous2",
    statistic = all_continuous() ~ c("{median} ({p25}, {p75})", "{min}, {max}"),
    missing = "no"
  )
Takes a survfit object as an argument and returns a formatted summary table of the results.
tbl_survfit(x, ...)

## S3 method for class 'survfit'
tbl_survfit(x, ...)

## S3 method for class 'data.frame'
tbl_survfit(x, y, include = everything(), conf.level = 0.95, ...)

## S3 method for class 'list'
tbl_survfit(
  x,
  times = NULL,
  probs = NULL,
  statistic = "{estimate} ({conf.low}, {conf.high})",
  label = NULL,
  label_header = ifelse(!is.null(times), "**Time {time}**",
                        "**{style_sigfig(prob, scale=100)}% Percentile**"),
  estimate_fun = ifelse(!is.null(times), label_style_percent(suffix = "%"),
                        label_style_sigfig()),
  missing = "--",
  type = NULL,
  reverse = FALSE,
  quiet = TRUE,
  ...
)
x |
... | For |
y | outcome call, e.g. |
include | Variable to include as stratifying variables. |
conf.level |
times |
probs |
statistic |
label |
label_header |
estimate_fun |
missing |
type |
reverse |
quiet |
Daniel D. Sjoberg
library(survival)

# Example 1 ----------------------------------
# Pass single survfit() object
tbl_survfit(
  survfit(Surv(ttdeath, death) ~ trt, trial),
  times = c(12, 24),
  label_header = "**{time} Month**"
)

# Example 2 ----------------------------------
# Pass a data frame
tbl_survfit(
  trial,
  y = "Surv(ttdeath, death)",
  include = c(trt, grade),
  probs = 0.5,
  label_header = "**Median Survival**"
)

# Example 3 ----------------------------------
# Pass a list of survfit() objects
list(
  survfit(Surv(ttdeath, death) ~ 1, trial),
  survfit(Surv(ttdeath, death) ~ trt, trial)
) |>
  tbl_survfit(times = c(12, 24))

# Example 4 Competing Events Example ---------
# adding a competing event for death (cancer vs other causes)
set.seed(1123)
library(dplyr, warn.conflicts = FALSE, quietly = TRUE)
trial2 <- trial |>
  dplyr::mutate(
    death_cr =
      dplyr::case_when(
        death == 0 ~ "censor",
        runif(n()) < 0.5 ~ "death from cancer",
        TRUE ~ "death other causes"
      ) |>
      factor()
  )

survfit(Surv(ttdeath, death_cr) ~ grade, data = trial2) |>
  tbl_survfit(times = c(12, 24), label = "Tumor Grade")
The tbl_svysummary() function calculates descriptive statistics for continuous, categorical, and dichotomous variables, taking into account survey weights and design.
tbl_svysummary(
  data,
  by = NULL,
  label = NULL,
  statistic = list(all_continuous() ~ "{median} ({p25}, {p75})",
                   all_categorical() ~ "{n} ({p}%)"),
  digits = NULL,
  type = NULL,
  value = NULL,
  missing = c("ifany", "no", "always"),
  missing_text = "Unknown",
  missing_stat = "{N_miss}",
  sort = all_categorical(FALSE) ~ "alphanumeric",
  percent = c("column", "row", "cell"),
  include = everything()
)
data |
by |
label |
statistic |
digits |
type |
value |
missing, missing_text, missing_stat | Arguments dictating how and if missing values are presented |
sort |
percent |
include |
A 'tbl_svysummary' object
The statistic argument specifies the statistics presented in the table. The input is a list of formulas that specify the statistics to report. For example, statistic = list(age ~ "{mean} ({sd})") would report the mean and standard deviation for age; statistic = list(all_continuous() ~ "{mean} ({sd})") would report the mean and standard deviation for all continuous variables. A statistic name that appears between curly brackets will be replaced with the numeric statistic (see glue::glue()).
For categorical variables the following statistics are available to display.
{n}: frequency
{N}: denominator, or cohort size
{p}: proportion
{p.std.error}: standard error of the sample proportion (on the 0 to 1 scale) computed with survey::svymean()
{deff}: design effect of the sample proportion computed with survey::svymean()
{n_unweighted}: unweighted frequency
{N_unweighted}: unweighted denominator
{p_unweighted}: unweighted formatted percentage
For continuous variables the following statistics are available to display.
{median}: median
{mean}: mean
{mean.std.error}: standard error of the sample mean computed with survey::svymean()
{deff}: design effect of the sample mean computed with survey::svymean()
{sd}: standard deviation
{var}: variance
{min}: minimum
{max}: maximum
{p##}: any integer percentile, where ## is an integer from 0 to 100
{sum}: sum
Unlike tbl_summary(), it is not possible to pass a custom function.
For both categorical and continuous variables, statistics on the number of missing and non-missing observations and their proportions are available to display.
{N_obs}: total number of observations
{N_miss}: number of missing observations
{N_nonmiss}: number of non-missing observations
{p_miss}: percentage of observations missing
{p_nonmiss}: percentage of observations not missing
{N_obs_unweighted}: unweighted total number of observations
{N_miss_unweighted}: unweighted number of missing observations
{N_nonmiss_unweighted}: unweighted number of non-missing observations
{p_miss_unweighted}: unweighted percentage of observations missing
{p_nonmiss_unweighted}: unweighted percentage of observations not missing
Note that for categorical variables, {N_obs}, {N_miss}, and {N_nonmiss} refer to the total number, number missing, and number non-missing observations in the denominator, not at each level of the categorical variable.
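The weighted and unweighted statistics listed above can be combined in one cell. The sketch below assumes the survey package is installed and reuses the Titanic design from the examples for this function; the statistic string is only an illustration.

```r
library(gtsummary)

# Each row of as.data.frame(Titanic) is a cell count, used here as a weight
titanic_design <-
  survey::svydesign(ids = ~1, data = as.data.frame(Titanic), weights = ~Freq)

tbl <-
  titanic_design |>
  tbl_svysummary(
    include = c(Class, Age),
    # weighted frequency and percent, followed by the unweighted row count
    statistic = all_categorical() ~ "{n} ({p}%); unweighted n = {n_unweighted}"
  )

tbl
```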
There are four summary types. Use the type argument to change the default summary types.
"continuous": summaries are shown on a single row. Most numeric variables default to summary type continuous.
"continuous2": summaries are shown on 2 or more rows.
"categorical": multi-line summaries of nominal data. Character variables, factor variables, and numeric variables with fewer than 10 unique levels default to type categorical. To change a numeric variable to continuous that defaulted to categorical, use type = list(varname ~ "continuous").
"dichotomous": categorical variables that are displayed on a single row, rather than one row per level of the variable. Variables coded as TRUE/FALSE, 0/1, or yes/no are assumed to be dichotomous, and the TRUE, 1, and yes rows are displayed. Otherwise, the value to display must be specified in the value argument, e.g. value = list(varname ~ "level to show").
Joseph Larmarange
# Example 1 ----------------------------------
survey::svydesign(~1, data = as.data.frame(Titanic), weights = ~Freq) |>
  tbl_svysummary(by = Survived, percent = "row", include = c(Class, Age))

# Example 2 ----------------------------------
# A dataset with a complex design
data(api, package = "survey")
survey::svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc) |>
  tbl_svysummary(by = "both", include = c(api00, stype)) |>
  modify_spanning_header(all_stat_cols() ~ "**Met Both Targets**")
This function estimates univariable regression models and returns them in a publication-ready table. It can create regression models holding either a covariate or an outcome constant.
tbl_uvregression(data, ...)

## S3 method for class 'data.frame'
tbl_uvregression(
  data,
  y = NULL,
  x = NULL,
  method,
  method.args = list(),
  exponentiate = FALSE,
  label = NULL,
  include = everything(),
  tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
  hide_n = FALSE,
  show_single_row = NULL,
  conf.level = 0.95,
  estimate_fun = ifelse(exponentiate, label_style_ratio(), label_style_sigfig()),
  pvalue_fun = label_style_pvalue(digits = 1),
  formula = "{y} ~ {x}",
  add_estimate_to_reference_rows = FALSE,
  conf.int = TRUE,
  ...
)

## S3 method for class 'survey.design'
tbl_uvregression(
  data,
  y = NULL,
  x = NULL,
  method,
  method.args = list(),
  exponentiate = FALSE,
  label = NULL,
  include = everything(),
  tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
  hide_n = FALSE,
  show_single_row = NULL,
  conf.level = 0.95,
  estimate_fun = ifelse(exponentiate, label_style_ratio(), label_style_sigfig()),
  pvalue_fun = label_style_pvalue(digits = 1),
  formula = "{y} ~ {x}",
  add_estimate_to_reference_rows = FALSE,
  conf.int = TRUE,
  ...
)
data |
... | Additional arguments passed to |
y, x |
method |
method.args |
exponentiate |
label |
include |
tidy_fun |
hide_n |
show_single_row |
conf.level |
estimate_fun |
pvalue_fun |
formula |
add_estimate_to_reference_rows |
conf.int |
A tbl_uvregression object
x and y arguments
For models holding the outcome constant, the function takes as arguments a data frame, the type of regression model, and the outcome variable y=. The outcome is regressed on each column of the data frame in turn. The tbl_uvregression() function arguments are similar to the tbl_regression() arguments. Review the tbl_uvregression vignette for detailed examples.
You may alternatively hold a single covariate constant. For this, pass a data frame, the type of regression model, and a single covariate in the x= argument. Each column of the data frame will serve as the outcome in a univariate regression model. When using the x argument, take care that every column in the data frame is appropriate for the same type of model, e.g. they are all continuous variables appropriate for lm, or dichotomous variables appropriate for logistic regression with glm.
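Holding a covariate constant can be sketched as below, assuming the trial data set. Treatment (trt) is the fixed covariate, and two continuous columns serve as the outcomes so that lm is appropriate for each; the choice of columns is illustrative.

```r
library(gtsummary)

# One linear model per outcome: age ~ trt and marker ~ trt
tbl <-
  tbl_uvregression(
    trial,
    method = lm,
    x = trt,
    include = c(age, marker)
  )

tbl
```

Both outcomes are continuous here, which is the condition the paragraph above asks you to check before using x=.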
The default method for tbl_regression() model summary uses broom::tidy(x) to perform the initial tidying of the model object. There are, however, a few models that use modifications.
"parsnip/workflows": If the model was prepared using parsnip/workflows, the original model fit is extracted and the original x= argument is replaced with the model fit. This will typically go unnoticed; however, if you've provided a custom tidier in tidy_fun= the tidier will be applied to the model fit object and not the parsnip/workflows object.
"survreg": The scale parameter is removed, broom::tidy(x) %>% dplyr::filter(term != "Log(scale)")
"multinom": This multinomial outcome is complex, with one line per covariate per outcome (less the reference group)
"gam": Uses the internal tidier tidy_gam() to print both parametric and smooth terms.
"lmerMod", "glmerMod", "glmmTMB", "glmmadmb", "stanreg", "brmsfit": These mixed effects models use broom.mixed::tidy(x, effects = "fixed"). Specify tidy_fun = broom.mixed::tidy to print the random components.
Daniel D. Sjoberg
See tbl_regression vignette for detailed examples
# Example 1 ----------------------------------
tbl_uvregression(
  trial,
  method = glm,
  y = response,
  method.args = list(family = binomial),
  exponentiate = TRUE,
  include = c("age", "grade")
)

# Example 2 ----------------------------------
# rounding pvalues to 2 decimal places
library(survival)

tbl_uvregression(
  trial,
  method = coxph,
  y = Surv(ttdeath, death),
  exponentiate = TRUE,
  include = c("age", "grade", "response"),
  pvalue_fun = label_style_pvalue(digits = 2)
)
This function is similar to tbl_summary(), but places summary statistics wide, in separate columns. All included variables must be of the same summary type, e.g. all continuous summaries or all categorical summaries (which encompasses dichotomous variables).
tbl_wide_summary(
  data,
  label = NULL,
  statistic = switch(type[[1]],
                     continuous = c("{median}", "{p25}, {p75}"),
                     c("{n}", "{p}%")),
  digits = NULL,
  type = NULL,
  value = NULL,
  sort = all_categorical(FALSE) ~ "alphanumeric",
  include = everything()
)
data |
label |
statistic |
digits |
type |
value |
sort |
include |
a gtsummary table of class 'tbl_wide_summary'
trial |>
  tbl_wide_summary(include = c(response, grade))

trial |>
  tbl_strata(
    strata = trt,
    ~ tbl_wide_summary(.x, include = c(age, marker))
  )
The following themes are available to use within the gtsummary package. Print theme elements with theme_gtsummary_journal(set_theme = FALSE) |> print(). Review the themes vignette for details.
theme_gtsummary_journal(
  journal = c("jama", "lancet", "nejm", "qjecon"),
  set_theme = TRUE
)

theme_gtsummary_compact(set_theme = TRUE, font_size = NULL)

theme_gtsummary_printer(
  print_engine = c("gt", "kable", "kable_extra", "flextable", "huxtable", "tibble"),
  set_theme = TRUE
)

theme_gtsummary_language(
  language = c("de", "en", "es", "fr", "gu", "hi", "is", "ja", "kr", "mr", "nl", "no",
               "pt", "se", "zh-cn", "zh-tw"),
  decimal.mark = NULL,
  big.mark = NULL,
  iqr.sep = NULL,
  ci.sep = NULL,
  set_theme = TRUE
)

theme_gtsummary_continuous2(
  statistic = "{median} ({p25}, {p75})",
  set_theme = TRUE
)

theme_gtsummary_mean_sd(set_theme = TRUE)

theme_gtsummary_eda(set_theme = TRUE)
journal | String indicating the journal theme to follow. One of "jama", "lancet", "nejm", "qjecon" |
set_theme |
font_size |
print_engine | String indicating the print method. Must be one of "gt", "kable", "kable_extra", "flextable", "huxtable", "tibble" |
language | If a language is missing a translation for a word or phrase, please feel free to reach out on GitHub with the translated text. |
decimal.mark |
big.mark |
iqr.sep |
ci.sep |
statistic | Default statistic for continuous variables |
theme_gtsummary_journal(journal)
  "jama": The Journal of the American Medical Association
    Round large p-values to 2 decimal places; separate confidence intervals with "ll to ul".
    tbl_summary(): Doesn't show percent symbol; use em-dash to separate IQR; run add_stat_label()
    tbl_regression()/tbl_uvregression(): show coefficient and CI in same column
  "lancet": The Lancet
    Use mid-point as decimal separator; round large p-values to 2 decimal places; separate confidence intervals with "ll to ul".
    tbl_summary(): Doesn't show percent symbol; use em-dash to separate IQR
  "nejm": The New England Journal of Medicine
    Round large p-values to 2 decimal places; separate confidence intervals with "ll to ul".
    tbl_summary(): Doesn't show percent symbol; use em-dash to separate IQR
  "qjecon": The Quarterly Journal of Economics
    tbl_summary(): all percentages rounded to one decimal place
    tbl_regression(), tbl_uvregression(): add significance stars with add_significance_stars(); hides CI and p-value from output
    For flextable and huxtable output, the coefficients' standard error is placed below. For gt, it is placed to the right.
theme_gtsummary_compact()
  Tables printed with gt, flextable, kableExtra, or huxtable will be compact with smaller font size and reduced cell padding
theme_gtsummary_printer(print_engine)
  Use this theme to permanently change the default printer.
theme_gtsummary_continuous2()
  Set all continuous variables to summary type "continuous2" by default
theme_gtsummary_mean_sd()
  Set default summary statistics to mean and standard deviation in tbl_summary()
  Set default continuous tests in add_p() to t-test and ANOVA
theme_gtsummary_eda()
  Set all continuous variables to summary type "continuous2" by default
  In tbl_summary() show the median, mean, IQR, SD, and Range by default
Use reset_gtsummary_theme() to restore the default settings.
Review the themes vignette to create your own themes.
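The effect of setting and resetting a theme can be sketched as follows, assuming the trial data set. The helper age_stat() is hypothetical and only returns the formatted summary cell for age so that the three states can be compared.

```r
library(gtsummary)

# Hypothetical helper: the formatted summary cell for age
age_stat <- function() {
  tbl <- tbl_summary(trial, include = age, missing = "no")
  tbl$table_body$stat_0[tbl$table_body$row_type == "label"]
}

default_stat <- age_stat()    # median (Q1, Q3) under the default settings
theme_gtsummary_mean_sd()     # theme stays active until it is reset
themed_stat <- age_stat()     # mean (SD) while the theme is active
reset_gtsummary_theme()       # restore the default settings
restored_stat <- age_stat()
```

Because a theme stays active until reset_gtsummary_theme() is called, it is worth resetting at the end of a script that sets one.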
set_gtsummary_theme(), reset_gtsummary_theme()
# Setting JAMA theme for gtsummary
theme_gtsummary_journal("jama")
# Themes can be combined by including more than one
theme_gtsummary_compact()

trial |>
  select(age, grade, trt) |>
  tbl_summary(by = trt) |>
  as_gt()

# reset gtsummary themes
reset_gtsummary_theme()
A dataset containing the baseline characteristics of 200 patients who received Drug A or Drug B. Dataset also contains the outcome of tumor response to the treatment.
trial
A data frame with 200 rows, one row per patient
trt: Chemotherapy Treatment
age: Age
marker: Marker Level (ng/mL)
stage: T Stage
grade: Grade
response: Tumor Response
death: Patient Died
ttdeath: Months to Death/Censor
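A quick way to inspect the data set before building tables, as a minimal sketch using only base R functions on the trial data frame:

```r
library(gtsummary)

trial_dim   <- dim(trial)        # rows (patients) and columns (variables)
trial_names <- names(trial)      # variable names used throughout the examples
trt_counts  <- table(trial$trt)  # number of patients per treatment arm

trial_dim
trial_names
trt_counts
```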