Title: | Data Visualization for Statistics in Social Science |
---|---|
Description: | Collection of plotting and table output functions for data visualization. Results of various statistical analyses (that are commonly used in social sciences) can be visualized using this package, including simple and cross tabulated frequencies, histograms, box plots, (generalized) linear models, mixed effects models, principal component analysis and correlation matrices, cluster analyses, scatter plots, stacked scales, effects plots of regression models (including interaction terms) and much more. This package supports labelled data. |
Authors: | Daniel Lüdecke [aut, cre], Alexander Bartel [ctb], Carsten Schwemmer [ctb], Chuck Powell [ctb], Amir Djalovski [ctb], Johannes Titz [ctb] |
Maintainer: | Daniel Lüdecke <[email protected]> |
License: | GPL-3 |
Version: | 2.8.16.2 |
Built: | 2024-11-25 16:58:34 UTC |
Source: | https://github.com/strengejacke/sjPlot |
Collection of plotting and table output functions for data visualization. Results of various statistical analyses (that are commonly used in social sciences) can be visualized using this package, including simple and cross tabulated frequencies, histograms, box plots, (generalized) linear models, mixed effects models, PCA and correlation matrices, cluster analyses, scatter plots, Likert scales, effects plots of interaction terms in regression models, constructing index or score variables and much more.
The package supports labelled data, i.e. value and variable labels from labelled data (like vectors or data frames) are automatically used to label the output. Custom labels can be specified as well.
What does this package do?
In short, the functions in this package mostly do two things:
- compute basic or advanced statistical analyses
- either plot the results as a ggplot figure or print them as an HTML table
How does this package help me?
One of the more challenging tasks when working with R is to get nicely formatted output of statistical analyses, either in graphical or table format. The sjPlot-package takes over these tasks and makes it easy to create beautiful figures or tables.
There are many examples for each function in the related help files and a comprehensive online documentation at https://strengejacke.github.io/sjPlot/.
A note on the package functions
The main functions follow a naming convention: each name starts with a prefix that indicates what kind of task the function performs.
sjc - cluster analysis functions
sjp - plotting functions
sjt - (HTML) table output functions
Daniel Lüdecke [email protected]
This function plots a simple chi-squared distribution or a chi-squared distribution with shaded areas that indicate at which chi-squared value a significant p-level is reached.
dist_chisq( chi2 = NULL, deg.f = NULL, p = NULL, xmax = NULL, geom.colors = NULL, geom.alpha = 0.7 )
chi2 |
Numeric, optional. If specified, a chi-squared distribution with |
deg.f |
Numeric. The degrees of freedom for the chi-squared distribution. Needs to be specified. |
p |
Numeric, optional. If specified, a chi-squared distribution with |
xmax |
Numeric, optional. Specifies the maximum x-axis-value. If not specified, the x-axis ranges to a value where a p-level of 0.00001 is reached. |
geom.colors |
user defined color for geoms. See 'Details' in |
geom.alpha |
Specifies the alpha-level of the shaded area. Default is 0.7, range between 0 to 1. |
# a simple chi-squared distribution
# for 6 degrees of freedom
dist_chisq(deg.f = 6)

# a chi-squared distribution for 6 degrees of freedom,
# and a shaded area starting at chi-squared value of ten.
# With a df of 6, a chi-squared value of 12.59 would be "significant",
# thus the shaded area from 10 to 12.58 is filled as "non-significant",
# while the area starting from chi-squared value 12.59 is filled as
# "significant"
dist_chisq(chi2 = 10, deg.f = 6)

# a chi-squared distribution for 6 degrees of freedom,
# and a shaded area starting at that chi-squared value, which has
# a p-level of about 0.125 (which equals a chi-squared value of about 10).
# With a df of 6, a chi-squared value of 12.59 would be "significant",
# thus the shaded area from 10 to 12.58 (p-level 0.125 to p-level 0.05)
# is filled as "non-significant", while the area starting from chi-squared
# value 12.59 (p-level < 0.05) is filled as "significant".
dist_chisq(p = 0.125, deg.f = 6)
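The numbers quoted in the example comments can be reproduced with base R's chi-squared distribution functions. This is a minimal sketch, not sjPlot's internal code; it assumes a significance level of 0.05, and the variable names are chosen for illustration only.

```r
# Base R only: the values behind the shaded areas in the example above.
deg_f <- 6
alpha <- 0.05  # assumed significance level

# chi-squared value at which the "significant" area starts
crit_chi2 <- qchisq(1 - alpha, df = deg_f)

# p-level that belongs to a chi-squared value of 10
p_at_10 <- pchisq(10, df = deg_f, lower.tail = FALSE)

# chi-squared value that belongs to a p-level of 0.125
chi2_at_p <- qchisq(0.125, df = deg_f, lower.tail = FALSE)

round(c(critical = crit_chi2, p_at_10 = p_at_10, chi2_at_p = chi2_at_p), 3)
```

The first value corresponds to the 12.59 mentioned in the comments, the other two to the "p-level of about 0.125" and the "chi-squared value of about 10".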
This function plots a simple F distribution or an F distribution with shaded areas that indicate at which F value a significant p-level is reached.
dist_f( f = NULL, deg.f1 = NULL, deg.f2 = NULL, p = NULL, xmax = NULL, geom.colors = NULL, geom.alpha = 0.7 )
f |
Numeric, optional. If specified, an F distribution with |
deg.f1 |
Numeric. The first degrees of freedom for the F distribution. Needs to be specified. |
deg.f2 |
Numeric. The second degrees of freedom for the F distribution. Needs to be specified. |
p |
Numeric, optional. If specified, a F distribution with |
xmax |
Numeric, optional. Specifies the maximum x-axis-value. If not specified, the x-axis ranges to a value where a p-level of 0.00001 is reached. |
geom.colors |
user defined color for geoms. See 'Details' in |
geom.alpha |
Specifies the alpha-level of the shaded area. Default is 0.7, range between 0 to 1. |
# a simple F distribution for 6 and 45 degrees of freedom
dist_f(deg.f1 = 6, deg.f2 = 45)

# F distribution for 6 and 45 degrees of freedom,
# and a shaded area starting at F value of two.
# F-values equal or greater than 2.31 are "significant"
dist_f(f = 2, deg.f1 = 6, deg.f2 = 45)

# F distribution for 6 and 45 degrees of freedom,
# and a shaded area starting at a p-level of 0.2
# (F-Value about 1.5).
dist_f(p = 0.2, deg.f1 = 6, deg.f2 = 45)
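The thresholds named in the comments can be checked with base R's F distribution functions. The following sketch assumes a significance level of 0.05 and an upper-tail p-level; it illustrates the arithmetic and does not claim to be what dist_f() does internally.

```r
# Base R only: F values and p-levels for 6 and 45 degrees of freedom.
df1 <- 6
df2 <- 45

# F value at which the upper-tail p-level reaches 0.05 (assumed alpha)
crit_f <- qf(0.95, df1, df2)

# F value that belongs to an upper-tail p-level of 0.2
f_at_p <- qf(0.2, df1, df2, lower.tail = FALSE)

# upper-tail p-level of F = 2, which lies between the two values above
p_at_2 <- pf(2, df1, df2, lower.tail = FALSE)

round(c(critical = crit_f, f_at_p = f_at_p, p_at_2 = p_at_2), 3)
```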
This function plots a simple normal distribution or a normal distribution with shaded areas that indicate at which value a significant p-level is reached.
dist_norm( norm = NULL, mean = 0, sd = 1, p = NULL, xmax = NULL, geom.colors = NULL, geom.alpha = 0.7 )
norm |
Numeric, optional. If specified, a normal distribution with |
mean |
Numeric. Mean value for normal distribution. By default 0. |
sd |
Numeric. Standard deviation for normal distribution. By default 1. |
p |
Numeric, optional. If specified, a normal distribution with |
xmax |
Numeric, optional. Specifies the maximum x-axis-value. If not specified, the x-axis ranges to a value where a p-level of 0.00001 is reached. |
geom.colors |
user defined color for geoms. See 'Details' in |
geom.alpha |
Specifies the alpha-level of the shaded area. Default is 0.7, range between 0 to 1. |
# a simple normal distribution
dist_norm()

# a simple normal distribution with different mean and sd.
# note that curve looks similar to above plot, but axis range
# has changed.
dist_norm(mean = 2, sd = 4)

# a normal distribution with a shaded area, specified via 'norm'
dist_norm(norm = 1)

# a normal distribution with a shaded area, specified via 'p'
dist_norm(p = 0.2)
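The remark that the curve "looks similar" while the axis range changes follows from the normal distribution being a location-scale family: every quantile of N(mean, sd) equals mean + sd times the matching standard normal quantile. A short base R sketch, assuming an upper-tail p-level (an assumption, since the page does not state which tail dist_norm() uses):

```r
# Base R only: quantiles of the standard and a shifted/scaled normal.
z_crit <- qnorm(0.95)                    # standard normal, upper-tail p = 0.05
x_crit <- qnorm(0.95, mean = 2, sd = 4)  # same p-level for mean = 2, sd = 4

# the shifted/scaled quantile is a linear transformation of z_crit
rescaled <- 2 + 4 * z_crit

# value that belongs to an upper-tail p-level of 0.2 (standard normal)
z_at_p <- qnorm(0.2, lower.tail = FALSE)

round(c(z_crit = z_crit, x_crit = x_crit, rescaled = rescaled, z_at_p = z_at_p), 3)
```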
This function plots a simple t-distribution or a t-distribution with shaded areas that indicate at which t-value a significant p-level is reached.
dist_t( t = NULL, deg.f = NULL, p = NULL, xmax = NULL, geom.colors = NULL, geom.alpha = 0.7 )
t |
Numeric, optional. If specified, a t-distribution with |
deg.f |
Numeric. The degrees of freedom for the t-distribution. Needs to be specified. |
p |
Numeric, optional. If specified, a t-distribution with |
xmax |
Numeric, optional. Specifies the maximum x-axis-value. If not specified, the x-axis ranges to a value where a p-level of 0.00001 is reached. |
geom.colors |
user defined color for geoms. See 'Details' in |
geom.alpha |
Specifies the alpha-level of the shaded area. Default is 0.7, range between 0 to 1. |
# a simple t-distribution
# for 6 degrees of freedom
dist_t(deg.f = 6)

# a t-distribution for 6 degrees of freedom,
# and a shaded area starting at t-value of one.
# With a df of 6, a t-value of 1.94 would be "significant".
dist_t(t = 1, deg.f = 6)

# a t-distribution for 6 degrees of freedom,
# and a shaded area starting at p-level of 0.4
# (t-value of about 0.26).
dist_t(p = 0.4, deg.f = 6)
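The t-values in the comments match one-sided (upper-tail) quantiles of the t-distribution, which can be verified with base R. This is a sketch of the arithmetic only, with an assumed significance level of 0.05; the variable names are illustrative.

```r
# Base R only: t-values and p-levels for 6 degrees of freedom.
deg_f <- 6

# t-value at which the upper-tail p-level reaches 0.05 (assumed alpha)
crit_t <- qt(0.95, df = deg_f)

# t-value that belongs to an upper-tail p-level of 0.4
t_at_p <- qt(0.4, df = deg_f, lower.tail = FALSE)

# upper-tail p-level of t = 1
p_at_1 <- pt(1, df = deg_f, lower.tail = FALSE)

round(c(critical = crit_t, t_at_p = t_at_p, p_at_1 = p_at_1), 3)
```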
An SPSS sample data set, imported with the read_spss function.
Plot frequencies of a variable as bar graph, histogram, box plot etc.
plot_frq(
  data, ..., title = "", weight.by = NULL, title.wtd.suffix = NULL,
  sort.frq = c("none", "asc", "desc"),
  type = c("bar", "dot", "histogram", "line", "density", "boxplot", "violin"),
  geom.size = NULL, geom.colors = "#336699", errorbar.color = "darkred",
  axis.title = NULL, axis.labels = NULL, xlim = NULL, ylim = NULL,
  wrap.title = 50, wrap.labels = 20, grid.breaks = NULL, expand.grid = FALSE,
  show.values = TRUE, show.n = TRUE, show.prc = TRUE, show.axis.values = TRUE,
  show.ci = FALSE, show.na = FALSE, show.mean = FALSE, show.mean.val = TRUE,
  show.sd = TRUE, drop.empty = TRUE, mean.line.type = 2, mean.line.size = 0.5,
  inner.box.width = 0.15, inner.box.dotsize = 3, normal.curve = FALSE,
  normal.curve.color = "red", normal.curve.size = 0.8, normal.curve.alpha = 0.4,
  auto.group = NULL, coord.flip = FALSE, vjust = "bottom", hjust = "center",
  y.offset = NULL
)
data |
A data frame, or a grouped data frame. |
... |
Optional, unquoted names of variables that should be selected for
further processing. Required, if |
title |
Character vector, used as plot title. By default,
|
weight.by |
Vector of weights that will be applied to weight all cases.
Must be a vector of same length as the input vector. Default is
|
title.wtd.suffix |
Suffix (as string) for the title, if |
sort.frq |
Determines whether categories should be sorted
according to their frequencies or not. Default is |
type |
Specifies the plot type. May be abbreviated.
|
geom.size |
size resp. width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes. |
geom.colors |
User defined color for geoms, e.g. |
errorbar.color |
Color of confidence interval bars (error bars).
Only applies to |
axis.title |
Character vector of length one or two (depending on
the plot function and type), used as title(s) for the x and y axis.
If not specified, a default labelling is chosen.
Note: Some plot types do not support this argument. In such
cases, use the return value and add axis titles manually with
|
axis.labels |
character vector with labels used as axis labels. Optional argument, since in most cases, axis labels are set automatically. |
xlim |
Numeric vector of length two, defining lower and upper axis limits
of the x scale. By default, this argument is set to |
ylim |
numeric vector of length two, defining lower and upper axis limits
of the y scale. By default, this argument is set to |
wrap.title |
Numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
grid.breaks |
numeric; sets the distance between breaks for the axis,
i.e. at every |
expand.grid |
logical, if |
show.values |
Logical, whether values should be plotted or not. |
show.n |
logical, if |
show.prc |
logical, if |
show.axis.values |
logical, whether category, count or percentage values for the axis should be printed or not. |
show.ci |
Logical, if |
show.na |
logical, if |
show.mean |
Logical, if |
show.mean.val |
Logical, if |
show.sd |
Logical, if |
drop.empty |
Logical, if |
mean.line.type |
Numeric value, indicating the linetype of the mean
intercept line. Only applies to histogram-charts and
when |
mean.line.size |
Numeric, size of the mean intercept line. Only
applies to histogram-charts and when |
inner.box.width |
width of the inner box plot that is plotted inside of violin plots. Only applies
if |
inner.box.dotsize |
size of mean dot inside a violin or box plot. Applies only
when |
normal.curve |
Logical, if |
normal.curve.color |
Color of the normal curve line. Only
applies if |
normal.curve.size |
Numeric, size of the normal curve line. Only
applies if |
normal.curve.alpha |
Transparency level (alpha value) of the normal curve. Only
applies if |
auto.group |
numeric value, indicating the minimum amount of unique values
in the count variable, at which automatic grouping into smaller units
is done (see |
coord.flip |
logical, if |
vjust |
character vector, indicating the vertical position of value
labels. Allowed are same values as for |
hjust |
character vector, indicating the horizontal position of value
labels. Allowed are same values as for |
y.offset |
numeric, offset for text labels when their alignment is adjusted
to the top/bottom of the geom (see |
A ggplot-object.
This function only works with variables with integer values (or numeric factor levels), i.e. scales / centered variables with fractional part may result in unexpected behaviour.
library(sjlabelled)
data(efc)
data(iris)

# simple plots, two different notations
plot_frq(iris, Species)
plot_frq(efc$tot_sc_e)

# boxplot
plot_frq(efc$e17age, type = "box")

if (require("dplyr")) {
  # histogram, pipe-workflow
  efc %>%
    dplyr::select(e17age, c160age) %>%
    plot_frq(type = "hist", show.mean = TRUE)

  # bar plot(s)
  plot_frq(efc, e42dep, c172code)
}

if (require("dplyr") && require("gridExtra")) {
  # grouped data frame, all panels in one plot
  efc %>%
    group_by(e42dep) %>%
    plot_frq(c161sex) %>%
    plot_grid()
}

library(sjmisc)
# grouped variable
ageGrp <- group_var(efc$e17age)
ageGrpLab <- group_labels(efc$e17age)
plot_frq(ageGrp, title = get_label(efc$e17age), axis.labels = ageGrpLab)

# plotting confidence intervals. expand grid and v/hjust for text labels
plot_frq(
  efc$e15relat, type = "dot", show.ci = TRUE, sort.frq = "desc",
  coord.flip = TRUE, expand.grid = TRUE, vjust = "bottom", hjust = "left"
)

# histogram with overlaid normal curve
plot_frq(efc$c160age, type = "h", show.mean = TRUE, show.mean.val = TRUE,
         normal.curve = TRUE, show.sd = TRUE, normal.curve.color = "blue",
         normal.curve.size = 3, ylim = c(0, 50))
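Because plot_frq() is documented to work only with integer values (or numeric factor levels), it can help to check a variable for fractional parts before plotting. The helper below is hypothetical and not part of sjPlot; it is a base R sketch with made-up example values.

```r
# Hypothetical helper (not part of sjPlot): TRUE if any value has a
# fractional part. Factors are converted via their level labels, so
# numeric factor levels are handled as well.
has_fraction <- function(x) {
  x <- as.numeric(as.character(x))
  any(x %% 1 != 0, na.rm = TRUE)
}

age <- c(65, 70, 70, 82, NA)               # made-up integer-valued variable
centered <- age - mean(age, na.rm = TRUE)  # centering introduces fractions

has_fraction(age)
has_fraction(centered)

# one possible workaround: round before plotting
rounded <- round(centered)
has_fraction(rounded)
```

Whether rounding is appropriate depends on the variable; for truly continuous data, a histogram or density plot type may be the better choice.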
Plot grouped proportional crosstables, where the proportion of each level of x for the highest category in y is plotted, for each subgroup of grp.
plot_gpt(
  data, x, y, grp, colors = "metro", geom.size = 2.5,
  shape.fill.color = "#f0f0f0",
  shapes = c(15, 16, 17, 18, 21, 22, 23, 24, 25, 7, 8, 9, 10, 12),
  title = NULL, axis.labels = NULL, axis.titles = NULL,
  legend.title = NULL, legend.labels = NULL,
  wrap.title = 50, wrap.labels = 15,
  wrap.legend.title = 20, wrap.legend.labels = 20,
  axis.lim = NULL, grid.breaks = NULL, show.total = TRUE,
  annotate.total = TRUE, show.p = TRUE, show.n = TRUE
)
data |
A data frame, or a grouped data frame. |
x |
Categorical variable, where the proportion of each category in
|
y |
Categorical or numeric variable. If not a binary variable, |
grp |
Grouping variable, which will define the y-axis |
colors |
May be a character vector of color values in hex-format, valid
color value names (see
|
geom.size |
size resp. width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes. |
shape.fill.color |
Optional color vector, fill-color for non-filled shapes |
shapes |
Numeric vector with shape styles, used to map the different
categories of |
title |
Character vector, used as plot title. By default,
|
axis.labels |
character vector with labels used as axis labels. Optional argument, since in most cases, axis labels are set automatically. |
axis.titles |
character vector of length one or two, defining the title(s) for the x-axis and y-axis. |
legend.title |
Character vector, used as legend title for plots that have a legend. |
legend.labels |
character vector with labels for the guide/legend. |
wrap.title |
Numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
wrap.legend.title |
numeric, determines how many chars of the legend's title are displayed in one line and when a line break is inserted. |
wrap.legend.labels |
numeric, determines how many chars of the legend labels are displayed in one line and when a line break is inserted. |
axis.lim |
Numeric vector of length 2, defining the range of the plot axis.
Depending on plot type, may affect either x- or y-axis, or both.
For multiple plot outputs (e.g., from |
grid.breaks |
numeric; sets the distance between breaks for the axis,
i.e. at every |
show.total |
Logical, if |
annotate.total |
Logical, if |
show.p |
Logical, adds significance levels to values, or value and variable labels. |
show.n |
logical, if |
The p-values are based on chisq.test of x and y for each grp.
A ggplot-object.
if (requireNamespace("haven")) {
  data(efc)

  # the proportion of dependency levels in female
  # elderly, for each family carer's relationship
  # to elderly
  plot_gpt(efc, e42dep, e16sex, e15relat)

  # proportion of educational levels in highest
  # dependency category of elderly, for different
  # care levels
  plot_gpt(efc, c172code, e42dep, n4pstu)
}
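The per-group p-values described for plot_gpt() rest on a chi-squared test of x and y within each level of grp. A base R sketch of that idea follows; the toy data and all object names are made up for illustration, and this is not the package's own implementation.

```r
# Made-up cell counts: two groups, a 2 x 2 table of x and y in each.
counts <- data.frame(
  grp = rep(c("g1", "g2"), each = 4),
  x   = rep(c("a", "a", "b", "b"), times = 2),
  y   = rep(c("no", "yes"), times = 4),
  n   = c(10, 10, 10, 10, 18, 2, 4, 16)
)

# expand the counts to one row per case
toy <- counts[rep(seq_len(nrow(counts)), counts$n), c("grp", "x", "y")]

# one chi-squared test of x and y per subgroup
p_values <- vapply(
  split(toy, toy$grp),
  function(d) chisq.test(table(d$x, d$y))$p.value,
  numeric(1)
)

p_values
```

In group g1 the cells are perfectly balanced, so x and y show no association; in g2 they are strongly associated, which is reflected in a much smaller p-value.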
Plot multiple ggplot-objects as a grid-arranged single plot.
plot_grid(x, margin = c(1, 1, 1, 1), tags = NULL)
x |
A list of ggplot-objects. See 'Details'. |
margin |
A numeric vector of length 4, indicating the top, right, bottom and left margin for each plot, in centimetres. |
tags |
Add tags to your subfigures. Can be |
This function takes a list of ggplot-objects as argument. Plotting functions of this package that produce multiple plot objects (e.g., when there is an argument facet.grid) usually return multiple plots as list (the return value is named plot.list). To arrange these plots as grid as a single plot, use plot_grid.
An object of class gtable.
if (require("dplyr") && require("gridExtra")) {
  library(ggeffects)
  data(efc)

  # fit model
  fit <- glm(
    tot_sc_e ~ c12hour + e17age + e42dep + neg_c_7,
    data = efc,
    family = poisson
  )

  # plot marginal effects for each predictor, each as single plot
  p1 <- ggpredict(fit, "c12hour") %>%
    plot(show.y.title = FALSE, show.title = FALSE)
  p2 <- ggpredict(fit, "e17age") %>%
    plot(show.y.title = FALSE, show.title = FALSE)
  p3 <- ggpredict(fit, "e42dep") %>%
    plot(show.y.title = FALSE, show.title = FALSE)
  p4 <- ggpredict(fit, "neg_c_7") %>%
    plot(show.y.title = FALSE, show.title = FALSE)

  # plot grid
  plot_grid(list(p1, p2, p3, p4))

  # plot grid with tags
  plot_grid(list(p1, p2, p3, p4), tags = TRUE)
}
Plot grouped or stacked frequencies of variables as bar/dot, box or violin plots, or line plot.
plot_grpfrq(
  var.cnt, var.grp,
  type = c("bar", "dot", "line", "boxplot", "violin"),
  bar.pos = c("dodge", "stack"),
  weight.by = NULL, intr.var = NULL, title = "", title.wtd.suffix = NULL,
  legend.title = NULL, axis.titles = NULL, axis.labels = NULL,
  legend.labels = NULL, intr.var.labels = NULL,
  wrap.title = 50, wrap.labels = 15,
  wrap.legend.title = 20, wrap.legend.labels = 20,
  geom.size = NULL, geom.spacing = 0.15, geom.colors = "Paired",
  show.values = TRUE, show.n = TRUE, show.prc = TRUE, show.axis.values = TRUE,
  show.ci = FALSE, show.grpcnt = FALSE, show.legend = TRUE, show.na = FALSE,
  show.summary = FALSE, drop.empty = TRUE, auto.group = NULL, ylim = NULL,
  grid.breaks = NULL, expand.grid = FALSE,
  inner.box.width = 0.15, inner.box.dotsize = 3,
  smooth.lines = FALSE, emph.dots = TRUE, summary.pos = "r",
  facet.grid = FALSE, coord.flip = FALSE, y.offset = NULL,
  vjust = "bottom", hjust = "center"
)
var.cnt |
Vector of counts, for which frequencies or means will be plotted or printed. |
var.grp |
Factor with the cross-classifying variable, where |
type |
Specifies the plot type. May be abbreviated.
|
bar.pos |
Indicates whether bars should be positioned side-by-side (default),
or stacked ( |
weight.by |
Vector of weights that will be applied to weight all cases.
Must be a vector of same length as the input vector. Default is
|
intr.var |
An interaction variable which can be used for box plots. Divides each category indicated
by |
title |
character vector, used as plot title. Depending on plot type and function,
will be set automatically. If |
title.wtd.suffix |
Suffix (as string) for the title, if |
legend.title |
character vector, used as title for the plot legend. |
axis.titles |
character vector of length one or two, defining the title(s) for the x-axis and y-axis. |
axis.labels |
character vector with labels used as axis labels. Optional argument, since in most cases, axis labels are set automatically. |
legend.labels |
character vector with labels for the guide/legend. |
intr.var.labels |
a character vector with labels for the x-axis breaks
when having interaction variables included.
These labels replace the |
wrap.title |
numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
wrap.legend.title |
numeric, determines how many chars of the legend's title are displayed in one line and when a line break is inserted. |
wrap.legend.labels |
numeric, determines how many chars of the legend labels are displayed in one line and when a line break is inserted. |
geom.size |
size resp. width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes. |
geom.spacing |
the spacing between geoms (i.e. bar spacing) |
geom.colors |
user defined color for geoms. See 'Details' in |
show.values |
Logical, whether values should be plotted or not. |
show.n |
logical, if |
show.prc |
logical, if |
show.axis.values |
logical, whether category, count or percentage values for the axis should be printed or not. |
show.ci |
Logical, if |
show.grpcnt |
logical, if |
show.legend |
logical, if |
show.na |
logical, if |
show.summary |
logical, if |
drop.empty |
Logical, if |
auto.group |
numeric value, indicating the minimum amount of unique values
in the count variable, at which automatic grouping into smaller units
is done (see |
ylim |
numeric vector of length two, defining lower and upper axis limits
of the y scale. By default, this argument is set to |
grid.breaks |
numeric; sets the distance between breaks for the axis,
i.e. at every |
expand.grid |
logical, if |
inner.box.width |
width of the inner box plot that is plotted inside of violin plots. Only applies
if |
inner.box.dotsize |
size of mean dot inside a violin or box plot. Applies only
when |
smooth.lines |
prints a smooth line curve. Only applies, when argument |
emph.dots |
logical, if |
summary.pos |
position of the model summary which is printed when |
facet.grid |
|
coord.flip |
logical, if |
y.offset |
numeric, offset for text labels when their alignment is adjusted
to the top/bottom of the geom (see |
vjust |
character vector, indicating the vertical position of value
labels. Allowed are same values as for |
hjust |
character vector, indicating the horizontal position of value
labels. Allowed are same values as for |
geom.colors may be a character vector of color values in hex-format, valid color value names (see demo("colors")) or a name of a color brewer palette. The following options are valid for the geom.colors argument:
- If not specified, a default color brewer palette will be used, which is suitable for the plot style (e.g. diverging for likert scales, qualitative for grouped bars etc.).
- If "gs", a greyscale will be used.
- If "bw", and plot-type is a line-plot, the plot is black/white and uses different line types to distinguish groups (see this package-vignette).
- If geom.colors is any valid color brewer palette name, the related palette will be used. Use RColorBrewer::display.brewer.all() to view all available palette names.
- Else specify own color values or names as vector (e.g. geom.colors = c("#f00000", "#00ff00")).
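The geom.colors options listed here could be tried out along the following lines. This is a sketch only: it is guarded so that it does nothing when sjPlot is not installed, the choice of the efc variables and of the "Set1" palette is an assumption made for illustration, and the list name plots is arbitrary.

```r
# Sketch: one plot per geom.colors option, only if sjPlot is available.
if (requireNamespace("sjPlot", quietly = TRUE)) {
  data("efc", package = "sjPlot")

  plots <- list(
    # greyscale
    greyscale = sjPlot::plot_grpfrq(efc$e42dep, efc$e16sex, geom.colors = "gs"),
    # a color brewer palette name (assumed example: "Set1")
    brewer = sjPlot::plot_grpfrq(efc$e42dep, efc$e16sex, geom.colors = "Set1"),
    # own color values, one per group
    custom = sjPlot::plot_grpfrq(
      efc$e42dep, efc$e16sex,
      geom.colors = c("#f00000", "#00ff00")
    ),
    # black/white with different line types, line plots only
    bw_lines = sjPlot::plot_grpfrq(
      efc$neg_c_7, efc$e42dep,
      type = "line", geom.colors = "bw"
    )
  )
}
```

Printing an element, for example plots$greyscale, draws the corresponding figure.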
A ggplot-object.
data(efc)
plot_grpfrq(efc$e17age, efc$e16sex, show.values = FALSE)

# boxplot
plot_grpfrq(efc$e17age, efc$e42dep, type = "box")

# grouped bars
plot_grpfrq(efc$e42dep, efc$e16sex, title = NULL)

# box plots with interaction variable
plot_grpfrq(efc$e17age, efc$e42dep, intr.var = efc$e16sex, type = "box")

# Grouped bar plot
plot_grpfrq(efc$neg_c_7, efc$e42dep, show.values = FALSE)

# same data as line plot
plot_grpfrq(efc$neg_c_7, efc$e42dep, type = "line")

# show only categories where we have data (i.e. drop zero-counts)
library(dplyr)
efc <- dplyr::filter(efc, e42dep %in% c(3, 4))
plot_grpfrq(efc$c161sex, efc$e42dep, drop.empty = TRUE)

# show all categories, even if not in data
plot_grpfrq(efc$c161sex, efc$e42dep, drop.empty = FALSE)
This function plots the aggregated residuals of k-fold cross-validated models against the outcome. This makes it possible to evaluate how well the model performs with regard to over- or underestimation of the outcome.
plot_kfold_cv(data, formula, k = 5, fit)
data |
A data frame, used to split the data into |
formula |
A model formula, used to fit linear models ( |
k |
Number of folds. |
fit |
Model object, which will be used to compute cross validation. If
|
This function first generates k cross-validated test-training pairs and fits the same model, specified in the formula- or fit-argument, to each training data set.
Then, the test data is used to predict the outcome from all models that have been fit on the training data, and the residuals from all test data are plotted against the observed values (outcome) from the test data (note: for poisson or negative binomial models, the deviance residuals are calculated). This plot can be used to validate the model and see whether it over- (residuals > 0) or underestimates (residuals < 0) the model's outcome.
Currently, only linear, poisson and negative binomial regression models are supported.
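The procedure described above can be sketched in base R for a linear model. This is only an illustration of the general k-fold idea, not the code used by plot_kfold_cv(): the random fold assignment, the formula mpg ~ wt + hp and the definition of the residual as observed minus predicted are assumptions made for this example.

```r
set.seed(123)                       # reproducible fold assignment
k   <- 5
dat <- mtcars

# assign every row to one of k folds at random
fold <- sample(rep(seq_len(k), length.out = nrow(dat)))

cv <- do.call(rbind, lapply(seq_len(k), function(i) {
  train <- dat[fold != i, ]         # training data: all folds except fold i
  test  <- dat[fold == i, ]         # test data: fold i only
  fit   <- lm(mpg ~ wt + hp, data = train)   # same formula in every fold
  pred  <- predict(fit, newdata = test)
  data.frame(
    fold     = i,
    observed = test$mpg,
    # assumed convention for this sketch: observed minus predicted
    residual = test$mpg - pred
  )
}))

# residuals of all test folds against the observed outcome
plot(cv$observed, cv$residual,
     xlab = "Observed outcome (test data)", ylab = "Residual")
abline(h = 0, lty = 2)
```

Each observation appears in exactly one test fold, so the plot contains one point per row of the data.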
data(efc)
plot_kfold_cv(efc, neg_c_7 ~ e42dep + c172code + c12hour)
plot_kfold_cv(mtcars, mpg ~.)

# for poisson models. need to fit a model and use 'fit'-argument
fit <- glm(tot_sc_e ~ neg_c_7 + c172code, data = efc, family = poisson)
plot_kfold_cv(efc, fit = fit)

# and for negative binomial models
fit <- MASS::glm.nb(tot_sc_e ~ neg_c_7 + c172code, data = efc)
plot_kfold_cv(efc, fit = fit)
Plot likert scales as centered stacked bars.
plot_likert( items, groups = NULL, groups.titles = "auto", title = NULL, legend.title = NULL, legend.labels = NULL, axis.titles = NULL, axis.labels = NULL, catcount = NULL, cat.neutral = NULL, sort.frq = NULL, weight.by = NULL, title.wtd.suffix = NULL, wrap.title = 50, wrap.labels = 30, wrap.legend.title = 30, wrap.legend.labels = 28, geom.size = 0.6, geom.colors = "BrBG", cat.neutral.color = "grey70", intercept.line.color = "grey50", reverse.colors = FALSE, values = "show", show.n = TRUE, show.legend = TRUE, show.prc.sign = FALSE, grid.range = 1, grid.breaks = 0.2, expand.grid = TRUE, digits = 1, reverse.scale = FALSE, coord.flip = TRUE, sort.groups = TRUE, legend.pos = "bottom", rel_heights = 1, group.legend.options = list(nrow = NULL, byrow = TRUE), cowplot.options = list(label_x = 0.01, hjust = 0, align = "v") )
items |
Data frame, or a grouped data frame, with each column representing one item. |
groups |
(optional) Must be a vector of same length as |
groups.titles |
(optional, only used if groups are supplied) Titles for each factor group that will be used as table caption for each
component-table. Must be a character vector of same length as |
title |
character vector, used as plot title. Depending on plot type and function,
will be set automatically. If |
legend.title |
character vector, used as title for the plot legend. |
legend.labels |
character vector with labels for the guide/legend. |
axis.titles |
character vector of length one or two, defining the title(s) for the x-axis and y-axis. |
axis.labels |
character vector with labels used as axis labels. Optional argument, since in most cases, axis labels are set automatically. |
catcount |
optional, amount of categories of |
cat.neutral |
If there's a neutral category (like "don't know" etc.), specify
the index number (value) for this category. Else, set |
sort.frq |
Indicates whether the items of
|
weight.by |
Vector of weights that will be applied to weight all cases.
Must be a vector of same length as the input vector. Default is
|
title.wtd.suffix |
Suffix (as string) for the title, if |
wrap.title |
numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
wrap.legend.title |
numeric, determines how many chars of the legend's title are displayed in one line and when a line break is inserted. |
wrap.legend.labels |
numeric, determines how many chars of the legend labels are displayed in one line and when a line break is inserted. |
geom.size |
Size or width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes. |
geom.colors |
user defined color for geoms. See 'Details' in |
cat.neutral.color |
Color of the neutral category, if plotted (see |
intercept.line.color |
Color of the vertical intercept line that divides positive and negative values. |
reverse.colors |
logical, if |
values |
Determines style and position of percentage value labels on the bars:
|
show.n |
logical, if |
show.legend |
logical, if |
show.prc.sign |
logical, if |
grid.range |
Numeric, limits of the x-axis-range, as proportion of 100.
Default is 1, so the x-scale ranges from zero to 100% on both sides from the center.
Can alternatively be supplied as a vector of 2 positive numbers (e.g. |
grid.breaks |
numeric; sets the distance between breaks for the axis,
i.e. at every |
expand.grid |
logical, if |
digits |
Numeric, amount of digits after decimal point when rounding estimates or values. |
reverse.scale |
logical, if |
coord.flip |
logical, if |
sort.groups |
(optional, only used if groups are supplied) logical, if groups should be sorted according to the values supplied to |
legend.pos |
(optional, only used if groups are supplied) Defines the legend position. Possible values are |
rel_heights |
(optional, only used if groups are supplied) This option can be used to adjust the height of the subplots. The bars in subplots can have different heights due to a differing number of items or due to legend placement. This can be adjusted here. Takes a vector of numbers, one for each plot. Values are evaluated relative to each other. |
group.legend.options |
(optional, only used if groups are supplied) List of options to be passed to |
cowplot.options |
(optional, only used if groups are supplied) List of label options to be passed to |
A ggplot-object.
Note that only an even number of categories can be plotted, so that the "positive" and "negative" values can be split into two halves. A neutral category (like "don't know") can be used, but must be indicated by cat.neutral.
The catcount-argument indicates how many item categories are in the Likert scale. Normally, this argument can be ignored because the number of valid categories is retrieved automatically. However, sometimes (for instance, if a certain category is missing in all items), auto-detection of the number of categories fails. In such cases, specify the number of categories with the catcount-argument.
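To see why an even number of categories is needed and why the number of categories matters, the split into two halves can be worked through by hand. This is a simplified base R sketch, not the computation done inside plot_likert(); it assumes that the lower half of the categories forms the "negative" side and the upper half the "positive" side, and the item values are made up.

```r
# one made-up item with four answer categories
item     <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4)
catcount <- 4                       # number of valid categories
stopifnot(catcount %% 2 == 0)       # two equal halves need an even count

# tabulate over ALL categories, so unobserved ones still count as zero
prop <- prop.table(table(factor(item, levels = seq_len(catcount))))

half <- catcount / 2
neg  <- -sum(prop[seq_len(half)])          # drawn left of the centre line
pos  <-  sum(prop[(half + 1):catcount])    # drawn right of the centre line
print(c(negative = neg, positive = pos))
```

If a category never occurs in the data, counting the categories from the observed values alone would shift the midpoint, which is the situation where stating the number of categories explicitly helps.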
if (requireNamespace("ggrepel") && requireNamespace("sjmisc")) {
  library(sjmisc)
  data(efc)
  # find all variables from COPE-Index, which all have a "cop" in their
  # variable name, and then plot that subset as likert-plot
  mydf <- find_var(efc, pattern = "cop", out = "df")

  plot_likert(mydf)

  plot_likert(
    mydf,
    grid.range = c(1.2, 1.4),
    expand.grid = FALSE,
    values = "sum.outside",
    show.prc.sign = TRUE
  )

  # Plot in groups
  plot_likert(mydf, c(2,1,1,1,1,2,2,2,1))

  if (require("parameters") && require("nFactors")) {
    groups <- parameters::principal_components(mydf)
    plot_likert(mydf, groups = parameters::closest_component(groups))
  }

  plot_likert(mydf,
              c(rep("B", 4), rep("A", 5)),
              sort.groups = FALSE,
              grid.range = c(0.9, 1.1),
              geom.colors = "RdBu",
              rel_heights = c(6, 8),
              wrap.labels = 40,
              reverse.scale = TRUE)

  # control legend items
  six_cat_example = data.frame(
    matrix(sample(1:6, 600, replace = TRUE), ncol = 6)
  )

  ## Not run:
  six_cat_example <-
    six_cat_example %>%
    dplyr::mutate_all(~ordered(.,labels = c("+++","++","+","-","--","---")))

  # Old default
  plot_likert(
    six_cat_example,
    groups = c(1, 1, 1, 2, 2, 2),
    group.legend.options = list(nrow = 2, byrow = FALSE)
  )

  # New default
  plot_likert(six_cat_example, groups = c(1, 1, 1, 2, 2, 2))

  # Single row
  plot_likert(
    six_cat_example,
    groups = c(1, 1, 1, 2, 2, 2),
    group.legend.options = list(nrow = 1)
  )
  ## End(Not run)
}
plot_model() creates plots from regression models, either estimates (as so-called forest or dot whisker plots) or marginal effects.
plot_model( model, type = c("est", "re", "eff", "emm", "pred", "int", "std", "std2", "slope", "resid", "diag"), transform, terms = NULL, sort.est = NULL, rm.terms = NULL, group.terms = NULL, order.terms = NULL, pred.type = c("fe", "re"), mdrt.values = c("minmax", "meansd", "zeromax", "quart", "all"), ri.nr = NULL, title = NULL, axis.title = NULL, axis.labels = NULL, legend.title = NULL, wrap.title = 50, wrap.labels = 25, axis.lim = NULL, grid.breaks = NULL, ci.lvl = NULL, se = NULL, robust = FALSE, vcov.fun = NULL, vcov.type = NULL, vcov.args = NULL, colors = "Set1", show.intercept = FALSE, show.values = FALSE, show.p = TRUE, show.data = FALSE, show.legend = TRUE, show.zeroinf = TRUE, value.offset = NULL, value.size, jitter = NULL, digits = 2, dot.size = NULL, line.size = NULL, vline.color = NULL, p.threshold = c(0.05, 0.01, 0.001), p.val = NULL, p.adjust = NULL, grid, case, auto.label = TRUE, prefix.labels = c("none", "varname", "label"), bpe = "median", bpe.style = "line", bpe.color = "white", ci.style = c("whisker", "bar"), std.response = TRUE, ... ) get_model_data( model, type = c("est", "re", "eff", "pred", "int", "std", "std2", "slope", "resid", "diag"), transform, terms = NULL, sort.est = NULL, rm.terms = NULL, group.terms = NULL, order.terms = NULL, pred.type = c("fe", "re"), ri.nr = NULL, ci.lvl = NULL, colors = "Set1", grid, case = "parsed", digits = 2, ... )
model |
A regression model object. Depending on the |
type |
Type of plot. There are three groups of plot-types:
Marginal Effects (related vignette)
Model diagnostics
Note: For mixed models, the diagnostic plots like linear relationship or check for homoscedasticity do not take the uncertainty of random effects into account, but are only based on the fixed effects part of the model. |
transform |
A character vector, naming a function that will be applied
on estimates and confidence intervals. By default, |
terms |
Character vector with the names of those terms from
|
sort.est |
Determines in which way estimates are sorted in the plot:
|
rm.terms |
Character vector with names that indicate which terms should
be removed from the plot. Counterpart to |
group.terms |
Numeric vector with group indices, to group coefficients. Each group of coefficients gets its own color (see 'Examples'). |
order.terms |
Numeric vector, indicating in which order the coefficients should be plotted. See examples in this package-vignette. |
pred.type |
Character, only applies for Marginal Effects plots
with mixed effects models. Indicates whether predicted values should be
conditioned on random effects ( |
mdrt.values |
Indicates which values of the moderator variable should be
used when plotting interaction terms (i.e.
|
ri.nr |
Numeric vector. If |
title |
Character vector, used as plot title. By default,
|
axis.title |
Character vector of length one or two (depending on the
plot function and type), used as title(s) for the x and y axis. If not
specified, a default labelling is chosen. Note: Some plot types
may not support this argument sufficiently. In such cases, use the returned
ggplot-object and add axis titles manually with
|
axis.labels |
Character vector with labels for the model terms, used as
axis labels. By default, |
legend.title |
Character vector, used as legend title for plots that have a legend. |
wrap.title |
Numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
axis.lim |
Numeric vector of length 2, defining the range of the plot
axis. Depending on plot-type, may affect either x- or y-axis. For
Marginal Effects plots, |
grid.breaks |
Numeric value or vector; if |
ci.lvl |
Numeric, the level of the confidence intervals (error bars).
Use |
se |
Logical, if |
robust |
Deprecated. Please use |
vcov.fun |
Variance-covariance matrix used to compute uncertainty
estimates (e.g., for robust standard errors). This argument accepts a
covariance matrix, a function which returns a covariance matrix, or a
string which identifies the function to be used to compute the covariance
matrix. See |
vcov.type |
Deprecated. The |
vcov.args |
List of arguments to be passed to the function identified by
the |
colors |
May be a character vector of color values in hex-format, valid
color value names (see
|
show.intercept |
Logical, if |
show.values |
Logical, whether values should be plotted or not. |
show.p |
Logical, adds asterisks that indicate the significance level of estimates to the value labels. |
show.data |
Logical, for Marginal Effects plots, also plots the raw data points. |
show.legend |
For Marginal Effects plots, shows or hides the legend. |
show.zeroinf |
Logical, if |
value.offset |
Numeric, offset for text labels to adjust their position relative to the dots or lines. |
value.size |
Numeric, indicates the size of value labels. Can be used
for all plot types where the argument |
jitter |
Numeric, between 0 and 1. If |
digits |
Numeric, amount of digits after decimal point when rounding estimates or values. |
dot.size |
Numeric, size of the dots that indicate the point estimates. |
line.size |
Numeric, size of the lines that indicate the error bars. |
vline.color |
Color of the vertical "zero effect" line. Default color is inherited from the current theme. |
p.threshold |
Numeric vector of length 3, indicating the threshold for
annotating p-values with asterisks. Only applies if
|
p.val |
Character specifying method to be used to calculate p-values. Defaults to "profile" for glm/polr models, otherwise "wald". |
p.adjust |
Character vector, if not |
grid |
Logical, if |
case |
Desired target case. Labels will automatically be converted into the
specified character case. See |
auto.label |
Logical, if |
prefix.labels |
Indicates whether the value labels of categorical variables
should be prefixed, e.g. with the variable name or variable label. See
argument |
bpe |
For Stan-models (fitted with the rstanarm- or
brms-package), the Bayesian point estimate is, by default, the median
of the posterior distribution. Use |
bpe.style |
For Stan-models (fitted with the rstanarm- or
brms-package), the Bayesian point estimate is indicated as a small,
vertical line by default. Use |
bpe.color |
Character vector, indicating the color of the Bayesian
point estimate. Setting |
ci.style |
Character vector, defining whether inner and outer intervals
for Bayesian models are shown in boxplot-style ( |
std.response |
Logical, whether the response variable will also be
standardized if standardized coefficients are requested. Setting both
|
... |
Other arguments, passed down to various functions. Here is a list of supported arguments and their description in detail.
|
type = "std"
Plots standardized estimates. See details below.
type = "std2"
Plots standardized estimates, however, standardization follows Gelman's (2008) suggestion, rescaling the estimates by dividing them by two standard deviations instead of just one. Resulting coefficients are then directly comparable for untransformed binary predictors.
type = "pred"
Plots estimated marginal means (or marginal effects). Simply wraps ggpredict. See also this package-vignette.
type = "eff"
Plots estimated marginal means (or marginal effects). Simply wraps ggeffect. See also this package-vignette.
type = "int"
A shortcut for marginal effects plots, where interaction terms are automatically detected and used as terms-argument. Furthermore, if the moderator variable (the second - and third - term in an interaction) is continuous, type = "int" automatically chooses useful values based on the mdrt.values-argument, which are passed to terms. Then, ggpredict is called. type = "int" plots the interaction term that appears first in the formula along the x-axis, while the second (and possibly third) variable in an interaction is used as grouping factor(s) (moderating variable). Use type = "pred" or type = "eff" and specify a certain order in the terms-argument to indicate which variable(s) should be used as moderator. See also this package-vignette.
type = "slope" and type = "resid"
Simple diagnostic-plots, where a linear model for each single predictor is plotted against the response variable, or the model's residuals. Additionally, a loess-smoothed line is added to the plot. The main purpose of these plots is to check whether the relationship between outcome (or residuals) and a predictor is roughly linear or not. Since the plots are based on a simple linear regression with only one model predictor at the moment, the slopes (i.e. coefficients) may differ from the coefficients of the complete model.
type = "diag"
For Stan-models, plots the prior versus posterior samples. For linear (mixed) models, plots for multicollinearity-check (Variance Inflation Factors), QQ-plots, checks for normal distribution of residuals and homoscedasticity (constant variance of residuals) are shown. For generalized linear mixed models, returns the QQ-plot for random effects.
Default standardization is done by completely refitting the model on the standardized data. Hence, this approach is equal to standardizing the variables before fitting the model, which is particularly recommended for complex models that include interactions or transformations (e.g., polynomial or spline terms). When type = "std2", standardization of estimates follows Gelman's (2008) suggestion, rescaling the estimates by dividing them by two standard deviations instead of just one. Resulting coefficients are then directly comparable for untransformed binary predictors.
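The two standardization approaches can be compared by hand in base R. The sketch below is only an illustration under stated assumptions, not the code path used by plot_model(): `rescale2sd()` is a hypothetical helper, the model mpg ~ wt + hp is chosen arbitrarily, and in the two-standard-deviation variant only the predictors are rescaled while the response is left on its original scale.

```r
dat     <- mtcars
fit_raw <- lm(mpg ~ wt + hp, data = dat)

# 1) "refit" approach: standardize all variables, then fit the model again
dat_z <- as.data.frame(scale(dat[c("mpg", "wt", "hp")]))
fit_z <- lm(mpg ~ wt + hp, data = dat_z)

# 2) Gelman (2008): centre predictors and divide by TWO standard deviations
rescale2sd <- function(x) (x - mean(x)) / (2 * sd(x))   # hypothetical helper
dat_2sd <- transform(dat, wt = rescale2sd(wt), hp = rescale2sd(hp))
fit_2sd <- lm(mpg ~ wt + hp, data = dat_2sd)

# a rescaled slope equals the raw slope multiplied by 2 * sd(predictor)
b_raw <- unname(coef(fit_raw)["wt"])
b_2sd <- unname(coef(fit_2sd)["wt"])
print(all.equal(b_2sd, b_raw * 2 * sd(dat$wt)))

print(round(cbind(raw = coef(fit_raw), std = coef(fit_z), std2 = coef(fit_2sd)), 3))
```

Because the rescaling is linear, the fitted values of the raw and the two-standard-deviation model are identical; only the units in which the slopes are expressed change.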
Depending on the plot-type, plot_model() returns a ggplot-object or a list of such objects. get_model_data returns the associated data with the plot-object as tidy data frame, or (depending on the plot-type) a list of such data frames.
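As a brief, hedged illustration of the two return values, the following sketch fits a small linear model on mtcars (an arbitrary choice for this example) and retrieves both the plot and the data behind it. It only runs when sjPlot is installed, and the exact columns of the returned data frame are not assumed here.

```r
if (requireNamespace("sjPlot", quietly = TRUE)) {
  m <- lm(mpg ~ wt + hp, data = mtcars)

  # forest plot of the estimates (default plot type)
  p <- sjPlot::plot_model(m)
  print(class(p))

  # the data underlying that plot
  d <- sjPlot::get_model_data(m, type = "est")
  print(is.data.frame(d))
  print(head(d))
}
```

Working with the data frame instead of the plot can be convenient when the estimates should be passed on to a custom ggplot2 call or exported to a table.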
Gelman A (2008) "Scaling regression inputs by dividing by two
standard deviations." Statistics in Medicine 27: 2865-2873.
http://www.stat.columbia.edu/~gelman/research/published/standardizing7.pdf
Aiken and West (1991). Multiple Regression: Testing and Interpreting Interactions.
# prepare data
if (requireNamespace("haven")) {
  library(sjmisc)
  data(efc)
  efc <- to_factor(efc, c161sex, e42dep, c172code)
  m <- lm(neg_c_7 ~ pos_v_4 + c12hour + e42dep + c172code, data = efc)

  # simple forest plot
  plot_model(m)

  # grouped coefficients
  plot_model(m, group.terms = c(1, 2, 3, 3, 3, 4, 4))

  # keep only selected terms in the model: pos_v_4, the
  # levels 3 and 4 of factor e42dep and levels 2 and 3 for c172code
  plot_model(m, terms = c("pos_v_4", "e42dep [3,4]", "c172code [2,3]"))
}

# multiple plots, as returned from "diagnostic"-plot type,
# can be arranged with 'plot_grid()'
## Not run:
p <- plot_model(m, type = "diag")
plot_grid(p)
## End(Not run)

# plot random effects
if (require("lme4") && require("glmmTMB")) {
  m <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
  plot_model(m, type = "re")

  # plot marginal effects
  plot_model(m, type = "pred", terms = "Days")
}

# plot interactions
## Not run:
m <- glm(
  tot_sc_e ~ c161sex + c172code * neg_c_7,
  data = efc,
  family = poisson()
)
# type = "int" automatically selects groups for continuous moderator
# variables - see argument 'mdrt.values'. The following function call is
# identical to:
# plot_model(m, type = "pred", terms = c("c172code", "neg_c_7 [7,28]"))
plot_model(m, type = "int")

# switch moderator
plot_model(m, type = "pred", terms = c("neg_c_7", "c172code"))
# same as
# ggeffects::ggpredict(m, terms = c("neg_c_7", "c172code"))
## End(Not run)

# plot Stan-model
## Not run:
if (require("rstanarm")) {
  data(mtcars)
  m <- stan_glm(mpg ~ wt + am + cyl + gear, data = mtcars, chains = 1)
  plot_model(m, bpe.style = "dot")
}
## End(Not run)
Plot and compare regression coefficients with confidence intervals of multiple regression models in one plot.
plot_models( ..., transform = NULL, std.est = NULL, std.response = TRUE, rm.terms = NULL, title = NULL, m.labels = NULL, legend.title = "Dependent Variables", legend.pval.title = "p-level", axis.labels = NULL, axis.title = NULL, axis.lim = NULL, wrap.title = 50, wrap.labels = 25, wrap.legend.title = 20, grid.breaks = NULL, dot.size = 3, line.size = NULL, value.size = NULL, spacing = 0.4, colors = "Set1", show.values = FALSE, show.legend = TRUE, show.intercept = FALSE, show.p = TRUE, p.shape = FALSE, p.threshold = c(0.05, 0.01, 0.001), p.adjust = NULL, ci.lvl = 0.95, robust = FALSE, vcov.fun = NULL, vcov.type = c("HC3", "const", "HC", "HC0", "HC1", "HC2", "HC4", "HC4m", "HC5"), vcov.args = NULL, vline.color = NULL, digits = 2, grid = FALSE, auto.label = TRUE, prefix.labels = c("none", "varname", "label") )
... |
One or more regression models, including glm's or mixed models.
May also be a |
transform |
A character vector, naming a function that will be applied
on estimates and confidence intervals. By default, |
std.est |
Choose whether standardized coefficients should be used
for plotting. Default is no standardization ( |
std.response |
Logical, whether the response variable will also be
standardized if standardized coefficients are requested. Setting both
|
rm.terms |
Character vector with names that indicate which terms should
be removed from the plot. Counterpart to |
title |
Character vector, used as plot title. By default,
|
m.labels |
Character vector, used to indicate the different models in the plot's legend. If not specified, the labels of the dependent variables for each model are used. |
legend.title |
Character vector, used as legend title for plots that have a legend. |
legend.pval.title |
Character vector, used as title of the plot legend that
indicates the p-values. Default is |
axis.labels |
Character vector with labels for the model terms, used as
axis labels. By default, |
axis.title |
Character vector of length one or two (depending on the
plot function and type), used as title(s) for the x and y axis. If not
specified, a default labelling is chosen. Note: Some plot types
may not support this argument sufficiently. In such cases, use the returned
ggplot-object and add axis titles manually with
|
axis.lim |
Numeric vector of length 2, defining the range of the plot
axis. Depending on plot-type, may affect either x- or y-axis. For
Marginal Effects plots, |
wrap.title |
Numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
wrap.legend.title |
numeric, determines how many chars of the legend's title are displayed in one line and when a line break is inserted. |
grid.breaks |
Numeric value or vector; if |
dot.size |
Numeric, size of the dots that indicate the point estimates. |
line.size |
Numeric, size of the lines that indicate the error bars. |
value.size |
Numeric, indicates the size of value labels. Can be used
for all plot types where the argument |
spacing |
Numeric, spacing between the dots and error bars of the plotted fitted models. Default is 0.4. |
colors |
May be a character vector of color values in hex-format, valid
color value names (see
|
show.values |
Logical, whether values should be plotted or not. |
show.legend |
For Marginal Effects plots, shows or hides the legend. |
show.intercept |
Logical, if |
show.p |
Logical, adds asterisks that indicate the significance level of estimates to the value labels. |
p.shape |
Logical, if |
p.threshold |
Numeric vector of length 3, indicating the threshold for
annotating p-values with asterisks. Only applies if
|
p.adjust |
Character vector, if not |
ci.lvl |
Numeric, the level of the confidence intervals (error bars).
Use |
robust |
Deprecated. Please use |
vcov.fun |
Variance-covariance matrix used to compute uncertainty
estimates (e.g., for robust standard errors). This argument accepts a
covariance matrix, a function which returns a covariance matrix, or a
string which identifies the function to be used to compute the covariance
matrix. See |
vcov.type |
Deprecated. The |
vcov.args |
List of arguments to be passed to the function identified by
the |
vline.color |
Color of the vertical "zero effect" line. Default color is inherited from the current theme. |
digits |
Numeric, amount of digits after decimal point when rounding estimates or values. |
grid |
Logical, if |
auto.label |
Logical, if |
prefix.labels |
Indicates whether the value labels of categorical variables
should be prefixed, e.g. with the variable name or variable label. See
argument |
A ggplot-object.
data(efc)

# fit three models
fit1 <- lm(barthtot ~ c160age + c12hour + c161sex + c172code, data = efc)
fit2 <- lm(neg_c_7 ~ c160age + c12hour + c161sex + c172code, data = efc)
fit3 <- lm(tot_sc_e ~ c160age + c12hour + c161sex + c172code, data = efc)

# plot multiple models
plot_models(fit1, fit2, fit3, grid = TRUE)

# plot multiple models with legend labels and
# point shapes instead of value labels
plot_models(
  fit1, fit2, fit3,
  axis.labels = c(
    "Carer's Age",
    "Hours of Care",
    "Carer's Sex",
    "Educational Status"
  ),
  m.labels = c("Barthel Index", "Negative Impact", "Services used"),
  show.values = FALSE, show.p = FALSE, p.shape = TRUE
)

## Not run:
# plot multiple models from nested lists argument
all.models <- list()
all.models[[1]] <- fit1
all.models[[2]] <- fit2
all.models[[3]] <- fit3

plot_models(all.models)

# plot multiple models with different predictors (stepwise inclusion),
# standardized estimates
fit1 <- lm(mpg ~ wt + cyl + disp + gear, data = mtcars)
fit2 <- update(fit1, . ~ . + hp)
fit3 <- update(fit2, . ~ . + am)

plot_models(fit1, fit2, fit3, std.est = "std2")
## End(Not run)
data(efc) # fit three models fit1 <- lm(barthtot ~ c160age + c12hour + c161sex + c172code, data = efc) fit2 <- lm(neg_c_7 ~ c160age + c12hour + c161sex + c172code, data = efc) fit3 <- lm(tot_sc_e ~ c160age + c12hour + c161sex + c172code, data = efc) # plot multiple models plot_models(fit1, fit2, fit3, grid = TRUE) # plot multiple models with legend labels and # point shapes instead of value labels plot_models( fit1, fit2, fit3, axis.labels = c( "Carer's Age", "Hours of Care", "Carer's Sex", "Educational Status" ), m.labels = c("Barthel Index", "Negative Impact", "Services used"), show.values = FALSE, show.p = FALSE, p.shape = TRUE ) ## Not run: # plot multiple models from nested lists argument all.models <- list() all.models[[1]] <- fit1 all.models[[2]] <- fit2 all.models[[3]] <- fit3 plot_models(all.models) # plot multiple models with different predictors (stepwise inclusion), # standardized estimates fit1 <- lm(mpg ~ wt + cyl + disp + gear, data = mtcars) fit2 <- update(fit1, . ~ . + hp) fit3 <- update(fit2, . ~ . + am) plot_models(fit1, fit2, fit3, std.est = "std2") ## End(Not run)
This function plots observed and predicted values of the response of linear (mixed) models for each coefficient and highlights the observed values according to their distance (residuals) to the predicted values. This makes it possible to investigate how well actual and predicted values of the outcome fit across the predictor variables.
plot_residuals( fit, geom.size = 2, remove.estimates = NULL, show.lines = TRUE, show.resid = TRUE, show.pred = TRUE, show.ci = FALSE )
fit |
Fitted linear (mixed) regression model (including objects of class
|
geom.size |
Size or width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes. |
remove.estimates |
Numeric vector with indices (order equals to row index of |
show.lines |
Logical, if |
show.resid |
Logical, if |
show.pred |
Logical, if |
show.ci |
Logical, if |
A ggplot-object.
The actual (observed) values have a coloured fill, while the predicted values have a solid outline without filling.
data(efc)

# fit model
fit <- lm(neg_c_7 ~ c12hour + e17age + e42dep, data = efc)

# plot residuals for all independent variables
plot_residuals(fit)

# remove some independent variables from output
plot_residuals(fit, remove.estimates = c("e17age", "e42dep"))
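The quantities this plot contrasts (observed values, predicted values, and their difference) can be inspected directly in base R. The following is a minimal sketch, not the package's internal code; it uses the built-in `mtcars` data instead of `efc` so that it runs without extra packages.

```r
# Minimal base-R sketch (not sjPlot internals): observed values,
# predicted values and residuals of a linear model.
fit <- lm(mpg ~ wt + hp, data = mtcars)

observed  <- model.frame(fit)[[1]]   # response as used in the fit
predicted <- unname(fitted(fit))     # model predictions
resid_val <- unname(residuals(fit))  # observed minus predicted

# For a linear model, observed = predicted + residual
check <- all.equal(observed, predicted + resid_val)

# Row indices of the observations furthest from their prediction
head(order(abs(resid_val), decreasing = TRUE), 3)
```

Observations with large absolute residuals are the ones the plot would show far away from their predicted counterparts.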
Display scatter plot of two variables. Adding a grouping variable to the scatter plot is possible. Furthermore, fitted lines can be added for each group as well as for the overall plot.
plot_scatter( data, x, y, grp, title = "", legend.title = NULL, legend.labels = NULL, dot.labels = NULL, axis.titles = NULL, dot.size = 1.5, label.size = 3, colors = "metro", fit.line = NULL, fit.grps = NULL, show.rug = FALSE, show.legend = TRUE, show.ci = FALSE, wrap.title = 50, wrap.legend.title = 20, wrap.legend.labels = 20, jitter = 0.05, emph.dots = FALSE, grid = FALSE )
data |
A data frame, or a grouped data frame. |
x |
Name of the variable for the x-axis. |
y |
Name of the variable for the y-axis. |
grp |
Optional, name of the grouping-variable. If not missing, the scatter plot will be grouped. See 'Examples'. |
title |
Character vector, used as plot title. By default,
|
legend.title |
Character vector, used as legend title for plots that have a legend. |
legend.labels |
character vector with labels for the guide/legend. |
dot.labels |
Character vector with names for each coordinate pair given
by |
axis.titles |
character vector of length one or two, defining the title(s) for the x-axis and y-axis. |
dot.size |
Numeric, size of the dots that indicate the point estimates. |
label.size |
Size of text labels if argument |
colors |
May be a character vector of color values in hex-format, valid
color value names (see
|
fit.line, fit.grps |
Specifies the method to add a fitted line across
the data points. Possible values are for instance |
show.rug |
Logical, if |
show.legend |
For Marginal Effects plots, shows or hides the legend. |
show.ci |
Logical, if |
wrap.title |
Numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.legend.title |
numeric, determines how many chars of the legend's title are displayed in one line and when a line break is inserted. |
wrap.legend.labels |
numeric, determines how many chars of the legend labels are displayed in one line and when a line break is inserted. |
jitter |
Numeric, between 0 and 1. If |
emph.dots |
Logical, if |
grid |
Logical, if |
A ggplot-object. For grouped data frames, a list of ggplot-objects for each group in the data.
# load sample data
library(sjmisc)
library(sjlabelled)
data(efc)

# simple scatter plot
plot_scatter(efc, e16sex, neg_c_7)

# simple scatter plot, increased jittering
plot_scatter(efc, e16sex, neg_c_7, jitter = .4)

# grouped scatter plot
plot_scatter(efc, c160age, e17age, e42dep)

# grouped scatter plot with marginal rug plot
# and add fitted line for complete data
plot_scatter(
  efc, c12hour, c160age, c172code,
  show.rug = TRUE, fit.line = "lm"
)

# grouped scatter plot with marginal rug plot
# and add fitted line for each group
plot_scatter(
  efc, c12hour, c160age, c172code,
  show.rug = TRUE, fit.grps = "loess",
  grid = TRUE
)
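The examples above pass `"lm"` and `"loess"` as fitting methods. As a rough illustration of how these two smoothers differ, and of the difference between one overall line and one line per group, here is a base-R sketch. It uses `mtcars` for self-containment and is not how the package draws its lines internally.

```r
# Base-R sketch of the two fitting methods named in the examples
# ("lm" and "loess"); mtcars is used only for illustration.
lin_fit   <- lm(mpg ~ wt, data = mtcars)
loess_fit <- loess(mpg ~ wt, data = mtcars)

# Evaluate both fits on a grid inside the observed range of wt
grid <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50))
pred_lin   <- predict(lin_fit, newdata = grid)    # straight line
pred_loess <- predict(loess_fit, newdata = grid)  # local smoother

# One linear fit per group, analogous in spirit to a per-group line
by_group <- lapply(
  split(mtcars, mtcars$cyl),
  function(d) coef(lm(mpg ~ wt, data = d))
)
```

A linear fit imposes a single slope, while a loess fit can follow local curvature; fitting per group yields one set of coefficients for each level of the grouping variable.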
Plot items (variables) of a scale as stacked proportional bars. This function is useful when several items with identical scales/categories should be plotted to compare the distribution of answers.
plot_stackfrq( items, title = NULL, legend.title = NULL, legend.labels = NULL, axis.titles = NULL, axis.labels = NULL, weight.by = NULL, sort.frq = NULL, wrap.title = 50, wrap.labels = 30, wrap.legend.title = 30, wrap.legend.labels = 28, geom.size = 0.5, geom.colors = "Blues", show.prc = TRUE, show.n = FALSE, show.total = TRUE, show.axis.prc = TRUE, show.legend = TRUE, grid.breaks = 0.2, expand.grid = FALSE, digits = 1, vjust = "center", coord.flip = TRUE )
items |
Data frame, or a grouped data frame, with each column representing one item. |
title |
character vector, used as plot title. Depending on plot type and function,
will be set automatically. If |
legend.title |
character vector, used as title for the plot legend. |
legend.labels |
character vector with labels for the guide/legend. |
axis.titles |
character vector of length one or two, defining the title(s) for the x-axis and y-axis. |
axis.labels |
character vector with labels used as axis labels. Optional argument, since in most cases, axis labels are set automatically. |
weight.by |
Vector of weights that will be applied to weight all cases.
Must be a vector of same length as the input vector. Default is
|
sort.frq |
Indicates whether the
|
wrap.title |
numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
wrap.legend.title |
numeric, determines how many chars of the legend's title are displayed in one line and when a line break is inserted. |
wrap.legend.labels |
numeric, determines how many chars of the legend labels are displayed in one line and when a line break is inserted. |
geom.size |
Size or width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes. |
geom.colors |
user defined color for geoms. See 'Details' in |
show.prc |
Logical, whether percentage values should be plotted or not. |
show.n |
Logical, whether count values should be plotted or not. |
show.total |
logical, if |
show.axis.prc |
Logical, if |
show.legend |
logical, if |
grid.breaks |
numeric; sets the distance between breaks for the axis,
i.e. at every |
expand.grid |
logical, if |
digits |
Numeric, amount of digits after decimal point when rounding estimates or values. |
vjust |
character vector, indicating the vertical position of value
labels. Allowed are same values as for |
coord.flip |
logical, if |
A ggplot-object.
# Data from the EUROFAMCARE sample dataset
library(sjmisc)
data(efc)

# retrieve first item of COPE-index scale
start <- which(colnames(efc) == "c82cop1")
# retrieve last item of COPE-index scale
end <- which(colnames(efc) == "c90cop9")

# auto-detection of labels
plot_stackfrq(efc[, start:end])

# works on grouped data frames as well
library(dplyr)
efc %>%
  group_by(c161sex) %>%
  select(start:end) %>%
  plot_stackfrq()
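What a stacked proportional bar displays is, for each item, the share of answers falling into each category. The following base-R sketch computes these shares for simulated items; the data and names are made up for illustration and the code is not taken from the package.

```r
# Base-R sketch: share of each answer category per item, i.e. the
# quantities a stacked proportional bar chart displays.
set.seed(123)
items <- data.frame(
  item1 = sample(1:4, 100, replace = TRUE),
  item2 = sample(1:4, 100, replace = TRUE),
  item3 = sample(1:4, 100, replace = TRUE)
)

# Use a common set of levels so that all items share identical categories
props <- sapply(
  items,
  function(x) prop.table(table(factor(x, levels = 1:4)))
)

# One column per item, one row per category; each column sums to 1
round(props, 3)
```

Forcing a common set of levels matters because the comparison only makes sense when every item uses the same categories, as the description above requires.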
Plot proportional crosstables (contingency tables) of two variables as ggplot diagram.
plot_xtab( x, grp, type = c("bar", "line"), margin = c("col", "cell", "row"), bar.pos = c("dodge", "stack"), title = "", title.wtd.suffix = NULL, axis.titles = NULL, axis.labels = NULL, legend.title = NULL, legend.labels = NULL, weight.by = NULL, rev.order = FALSE, show.values = TRUE, show.n = TRUE, show.prc = TRUE, show.total = TRUE, show.legend = TRUE, show.summary = FALSE, summary.pos = "r", drop.empty = TRUE, string.total = "Total", wrap.title = 50, wrap.labels = 15, wrap.legend.title = 20, wrap.legend.labels = 20, geom.size = 0.7, geom.spacing = 0.1, geom.colors = "Paired", dot.size = 3, smooth.lines = FALSE, grid.breaks = 0.2, expand.grid = FALSE, ylim = NULL, vjust = "bottom", hjust = "center", y.offset = NULL, coord.flip = FALSE )
x |
A vector of values (variable) describing the bars which make up the plot. |
grp |
Grouping variable of same length as |
type |
Plot type. may be either |
margin |
Indicates which data of the proportional table should be plotted. Use |
bar.pos |
Indicates whether bars should be positioned side-by-side (default),
or stacked ( |
title |
character vector, used as plot title. Depending on plot type and function,
will be set automatically. If |
title.wtd.suffix |
Suffix (as string) for the title, if |
axis.titles |
character vector of length one or two, defining the title(s) for the x-axis and y-axis. |
axis.labels |
character vector with labels used as axis labels. Optional argument, since in most cases, axis labels are set automatically. |
legend.title |
character vector, used as title for the plot legend. |
legend.labels |
character vector with labels for the guide/legend. |
weight.by |
Vector of weights that will be applied to weight all cases.
Must be a vector of same length as the input vector. Default is
|
rev.order |
Logical, if |
show.values |
Logical, whether values should be plotted or not. |
show.n |
logical, if |
show.prc |
logical, if |
show.total |
When |
show.legend |
logical, if |
show.summary |
logical, if |
summary.pos |
position of the model summary which is printed when |
drop.empty |
Logical, if |
string.total |
String for the legend label when a total-column is added. Only applies
if |
wrap.title |
numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
wrap.legend.title |
numeric, determines how many chars of the legend's title are displayed in one line and when a line break is inserted. |
wrap.legend.labels |
numeric, determines how many chars of the legend labels are displayed in one line and when a line break is inserted. |
geom.size |
Size or width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes. |
geom.spacing |
the spacing between geoms (i.e. bar spacing) |
geom.colors |
user defined color for geoms. See 'Details' in |
dot.size |
Dot size, only applies, when argument |
smooth.lines |
prints a smooth line curve. Only applies, when argument |
grid.breaks |
numeric; sets the distance between breaks for the axis,
i.e. at every |
expand.grid |
logical, if |
ylim |
numeric vector of length two, defining lower and upper axis limits
of the y scale. By default, this argument is set to |
vjust |
character vector, indicating the vertical position of value
labels. Allowed are same values as for |
hjust |
character vector, indicating the horizontal position of value
labels. Allowed are same values as for |
y.offset |
numeric, offset for text labels when their alignment is adjusted
to the top/bottom of the geom (see |
coord.flip |
logical, if |
A ggplot-object.
# create 4-category-items
grp <- sample(1:4, 100, replace = TRUE)
# create 3-category-items
x <- sample(1:3, 100, replace = TRUE)

# plot "cross tabulation" of x and grp
plot_xtab(x, grp)

# plot "cross tabulation" of x and grp, including labels
plot_xtab(x, grp, axis.labels = c("low", "mid", "high"),
          legend.labels = c("Grp 1", "Grp 2", "Grp 3", "Grp 4"))

# plot "cross tabulation" of x and grp
# as stacked proportional bars
plot_xtab(x, grp, margin = "row", bar.pos = "stack",
          show.summary = TRUE, coord.flip = TRUE)

# example with vertical labels
library(sjmisc)
library(sjlabelled)
data(efc)
set_theme(geom.label.angle = 90)
plot_xtab(efc$e42dep, efc$e16sex, vjust = "center", hjust = "bottom")

# grouped bars with EUROFAMCARE sample dataset
# dataset was imported from an SPSS-file,
# see ?sjmisc::read_spss
data(efc)
efc.val <- get_labels(efc)
efc.var <- get_label(efc)

plot_xtab(efc$e42dep, efc$e16sex, title = efc.var['e42dep'],
          axis.labels = efc.val[['e42dep']], legend.title = efc.var['e16sex'],
          legend.labels = efc.val[['e16sex']])

plot_xtab(efc$e16sex, efc$e42dep, title = efc.var['e16sex'],
          axis.labels = efc.val[['e16sex']], legend.title = efc.var['e42dep'],
          legend.labels = efc.val[['e42dep']])

# -------------------------------
# auto-detection of labels works here
# so no need to specify labels. For
# title-auto-detection, use NULL
# -------------------------------
plot_xtab(efc$e16sex, efc$e42dep, title = NULL)

plot_xtab(efc$e16sex, efc$e42dep, margin = "row",
          bar.pos = "stack", coord.flip = TRUE)
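The `margin` argument accepts `"col"`, `"cell"` and `"row"`. The general distinction between row, column and cell proportions of a contingency table can be sketched in base R with `prop.table()`. This is an illustration of the concept only; which table dimension the package treats as rows is not stated here and is not assumed by the sketch.

```r
# Base-R sketch of row, column and cell proportions of a crosstable.
set.seed(42)
grp <- sample(1:4, 100, replace = TRUE)
x   <- sample(1:3, 100, replace = TRUE)

tab <- table(x, grp)

row_prc  <- prop.table(tab, margin = 1)  # each row sums to 1
col_prc  <- prop.table(tab, margin = 2)  # each column sums to 1
cell_prc <- prop.table(tab)              # all cells together sum to 1

round(100 * row_prc, 1)
```

Row and column proportions answer different questions (distribution within a row versus within a column), so the choice changes how the bars should be read.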
Convenience function to save the last ggplot-figure in high quality for publication.
save_plot( filename, fig = last_plot(), width = 12, height = 9, dpi = 300, theme = theme_get(), label.color = "black", label.size = 2.4, axis.textsize = 0.8, axis.titlesize = 0.75, legend.textsize = 0.6, legend.titlesize = 0.65, legend.itemsize = 0.5 )
filename |
Name of the output file; filename must end with one of the following accepted file types: ".png", ".jpg", ".svg" or ".tif". |
fig |
The plot that should be saved. By default, the last plot is saved. |
width |
Width of the figure, in centimetres. |
height |
Height of the figure, in centimetres. |
dpi |
Resolution in dpi (dots per inch). Ignored for vector formats, such as ".svg". |
theme |
The default theme to use when saving the plot. |
label.color |
Color value for labels (axis, plot, etc.). |
label.size |
Fontsize of value labels inside plot area. |
axis.textsize |
Fontsize of axis labels. |
axis.titlesize |
Fontsize of axis titles. |
legend.textsize |
Fontsize of legend labels. |
legend.titlesize |
Fontsize of legend title. |
legend.itemsize |
Size of legend's item (legend key), in centimetres. |
This is a convenience function with some default settings that should
come close to most of the needs for fontsize and scaling in figures
when saving them for printing or publishing. It uses cairographics
anti-aliasing (see png).
For adjusting plot appearance, see also sjPlot-themes.
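Since `width` and `height` are given in centimetres and `dpi` in dots per inch, the approximate pixel dimensions of a raster output follow from a simple unit conversion (1 inch = 2.54 cm). The helper `cm_to_px()` below is hypothetical and only illustrates the arithmetic; the exact rounding applied by the graphics device is not assumed.

```r
# Hypothetical helper: convert a size in centimetres to pixels
# at a given resolution (1 inch = 2.54 cm).
cm_to_px <- function(cm, dpi = 300) {
  cm / 2.54 * dpi
}

# Default figure size of 12 x 9 cm at 300 dpi
px <- cm_to_px(c(width = 12, height = 9), dpi = 300)
round(px)
```

With the defaults this gives roughly 1417 by 1063 pixels; doubling `dpi` doubles both dimensions while the physical size stays the same.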
Set global theme options for sjp-functions.
set_theme( base = theme_grey(), theme.font = NULL, title.color = "black", title.size = 1.2, title.align = "left", title.vjust = NULL, geom.outline.color = NULL, geom.outline.size = 0, geom.boxoutline.size = 0.5, geom.boxoutline.color = "black", geom.alpha = 1, geom.linetype = 1, geom.errorbar.size = 0.7, geom.errorbar.linetype = 1, geom.label.color = NULL, geom.label.size = 4, geom.label.alpha = 1, geom.label.angle = 0, axis.title.color = "grey30", axis.title.size = 1.1, axis.title.x.vjust = NULL, axis.title.y.vjust = NULL, axis.angle.x = 0, axis.angle.y = 0, axis.angle = NULL, axis.textcolor.x = "grey30", axis.textcolor.y = "grey30", axis.textcolor = NULL, axis.linecolor.x = NULL, axis.linecolor.y = NULL, axis.linecolor = NULL, axis.line.size = 0.5, axis.textsize.x = 1, axis.textsize.y = 1, axis.textsize = NULL, axis.tickslen = NULL, axis.tickscol = NULL, axis.ticksmar = NULL, axis.ticksize.x = NULL, axis.ticksize.y = NULL, panel.backcol = NULL, panel.bordercol = NULL, panel.col = NULL, panel.major.gridcol = NULL, panel.minor.gridcol = NULL, panel.gridcol = NULL, panel.gridcol.x = NULL, panel.gridcol.y = NULL, panel.major.linetype = 1, panel.minor.linetype = 1, plot.backcol = NULL, plot.bordercol = NULL, plot.col = NULL, plot.margins = NULL, legend.pos = "right", legend.just = NULL, legend.inside = FALSE, legend.size = 1, legend.color = "black", legend.title.size = 1, legend.title.color = "black", legend.title.face = "bold", legend.backgroundcol = "white", legend.bordercol = "white", legend.item.size = NULL, legend.item.backcol = "grey90", legend.item.bordercol = "white" )
base |
base theme where theme is built on. By default, all
metrics from |
theme.font |
base font family for the plot. |
title.color |
Color of plot title. Default is |
title.size |
size of plot title. Default is 1.2. |
title.align |
alignment of plot title. Must be one of |
title.vjust |
numeric, vertical adjustment for plot title. |
geom.outline.color |
Color of geom outline. Only applies, if |
geom.outline.size |
size of bar outlines. Default is 0. Use
size of |
geom.boxoutline.size |
size of outlines and median bar especially for boxplots.
Default is 0.5. Use size of |
geom.boxoutline.color |
Color of outlines and median bar especially for boxplots.
Only applies, if |
geom.alpha |
specifies the transparency (alpha value) of geoms |
geom.linetype |
linetype of line geoms. Default is |
geom.errorbar.size |
size (thickness) of error bars. Default is |
geom.errorbar.linetype |
linetype of error bars. Default is |
geom.label.color |
Color of geom's value and annotation labels |
geom.label.size |
size of geom's value and annotation labels |
geom.label.alpha |
alpha level of geom's value and annotation labels |
geom.label.angle |
angle of geom's value and annotation labels |
axis.title.color |
Color of x- and y-axis title labels |
axis.title.size |
size of x- and y-axis title labels |
axis.title.x.vjust |
numeric, vertical adjustment of x-axis-title. |
axis.title.y.vjust |
numeric, vertical adjustment of y-axis-title. |
axis.angle.x |
angle for x-axis labels |
axis.angle.y |
angle for y-axis labels |
axis.angle |
angle for x- and y-axis labels. If set, overrides both |
axis.textcolor.x |
Color for x-axis labels. If not specified, a default dark gray color palette will be used for the labels. |
axis.textcolor.y |
Color for y-axis labels. If not specified, a default dark gray color palette will be used for the labels. |
axis.textcolor |
Color for both x- and y-axis labels.
If set, overrides both |
axis.linecolor.x |
Color of x-axis border |
axis.linecolor.y |
Color of y-axis border |
axis.linecolor |
Color for both x- and y-axis borders.
If set, overrides both |
axis.line.size |
size (thickness) of axis lines. Only affected, if |
axis.textsize.x |
size of x-axis labels |
axis.textsize.y |
size of y-axis labels |
axis.textsize |
size for both x- and y-axis labels.
If set, overrides both |
axis.tickslen |
length of axis tick marks |
axis.tickscol |
Color of axis tick marks |
axis.ticksmar |
margin between axis labels and tick marks |
axis.ticksize.x |
size of tick marks at x-axis. |
axis.ticksize.y |
size of tick marks at y-axis. |
panel.backcol |
Color of the diagram's background |
panel.bordercol |
Color of whole diagram border (panel border) |
panel.col |
Color of both diagram's border and background.
If set, overrides both |
panel.major.gridcol |
Color of the major grid lines of the diagram background |
panel.minor.gridcol |
Color of the minor grid lines of the diagram background |
panel.gridcol |
Color for both minor and major grid lines of the diagram background.
If set, overrides both |
panel.gridcol.x |
See |
panel.gridcol.y |
See |
panel.major.linetype |
line type for major grid lines |
panel.minor.linetype |
line type for minor grid lines |
plot.backcol |
Color of the plot's background |
plot.bordercol |
Color of whole plot's border (panel border) |
plot.col |
Color of both plot's region border and background.
If set, overrides both |
plot.margins |
numeric vector of length 4, indicating the top, right, bottom and left margin of the plot region. |
legend.pos |
position of the legend, if a legend is drawn.
|
legend.just |
justification of legend, relative to its position ( |
legend.inside |
logical, use |
legend.size |
text size of the legend. Default is 1. Relative size, so recommended values are from 0.3 to 2.5 |
legend.color |
Color of the legend labels |
legend.title.size |
text size of the legend title |
legend.title.color |
Color of the legend title |
legend.title.face |
font face of the legend title. By default, |
legend.backgroundcol |
fill color of the legend's background. Default is |
legend.bordercol |
Color of the legend's border. Default is |
legend.item.size |
size of legend's item (legend key), in centimetres. |
legend.item.backcol |
fill color of the legend's item-background. Default is |
legend.item.bordercol |
Color of the legend's item-border. Default is |
The customized theme object, or NULL, if a ggplot-theme was used.
## Not run: 
library(sjmisc)
data(efc)

# set sjPlot-defaults, a slight modification
# of the ggplot base theme
set_theme()

# legends of all plots inside
set_theme(legend.pos = "top left", legend.inside = TRUE)
plot_xtab(efc$e42dep, efc$e16sex)

# Use classic-theme. you may need to
# load the ggplot2-library.
library(ggplot2)
set_theme(base = theme_classic())
plot_frq(efc$e42dep)

# adjust value labels
set_theme(
  geom.label.size = 3.5,
  geom.label.color = "#3366cc",
  geom.label.angle = 90
)

# hjust-aes needs adjustment for this
update_geom_defaults('text', list(hjust = -0.1))
plot_xtab(efc$e42dep, efc$e16sex, vjust = "center", hjust = "center")

# Create own theme based on classic-theme
set_theme(
  base = theme_classic(),
  axis.linecolor = "grey50",
  axis.textcolor = "#6699cc"
)
plot_frq(efc$e42dep)
## End(Not run)
Plot One-Way-Anova table sum of squares (SS) of each factor level (group) against the dependent variable. The SS of the factor variable against the dependent variable (variance within and between groups) is printed to the model summary.
sjp.aov1( var.dep, var.grp, meansums = FALSE, title = NULL, axis.labels = NULL, rev.order = FALSE, string.interc = "(Intercept)", axis.title = "", axis.lim = NULL, geom.colors = c("#3366a0", "#aa3333"), geom.size = 3, wrap.title = 50, wrap.labels = 25, grid.breaks = NULL, show.values = TRUE, digits = 2, y.offset = 0.15, show.p = TRUE, show.summary = FALSE )
var.dep |
Dependent variable. Will be used with following formula:
|
var.grp |
Factor with the cross-classifying variable, where |
meansums |
Logical, if |
title |
character vector, used as plot title. Depending on plot type and function,
will be set automatically. If |
axis.labels |
character vector with labels used as axis labels. Optional argument, since in most cases, axis labels are set automatically. |
rev.order |
Logical, if |
string.interc |
Character vector that indicates the reference group (intercept), that is appended to
the value label of the grouping variable. Default is |
axis.title |
Character vector of length one or two (depending on the
plot function and type), used as title(s) for the x and y axis. If not
specified, a default labelling is chosen. Note: Some plot types
may not support this argument sufficiently. In such cases, use the returned
ggplot-object and add axis titles manually with
|
axis.lim |
Numeric vector of length 2, defining the range of the plot axis.
Depending on plot type, may affect either x- or y-axis, or both.
For multiple plot outputs (e.g., from |
geom.colors |
user defined color for geoms. See 'Details' in |
geom.size |
Size or width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes. |
wrap.title |
numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
grid.breaks |
numeric; sets the distance between breaks for the axis,
i.e. at every |
show.values |
Logical, whether values should be plotted or not. |
digits |
Numeric, amount of digits after decimal point when rounding estimates or values. |
y.offset |
numeric, offset for text labels when their alignment is adjusted
to the top/bottom of the geom (see |
show.p |
Logical, adds significance levels to values, or value and variable labels. |
show.summary |
logical, if |
A ggplot-object.
data(efc)
# note: "var.grp" does not need to be a factor.
# coercion to factor is done by the function
sjp.aov1(efc$c12hour, efc$e42dep)
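To make the roles of var.dep, var.grp and string.interc more concrete, here is a minimal base-R sketch of a one-way ANOVA. It assumes (this is not stated above) that the plotted estimates correspond to a model with R's default treatment contrasts, where the first group acts as reference (intercept). PlantGrowth is only a stand-in data set, and the object names are made up for illustration.

```r
# One-way ANOVA on a built-in data set (PlantGrowth ships with base R).
fit <- aov(weight ~ group, data = PlantGrowth)
summary(fit)

# With treatment contrasts, the intercept is the mean of the reference
# group ("ctrl") and the other coefficients are differences from it.
group_means <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)
coefs <- coef(fit)
coefs
group_means - group_means[["ctrl"]]
```

The last two printed vectors agree apart from the intercept entry, which is why a label such as "(Intercept)" is useful to mark the reference group.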
Plot p-values of Pearson's Chi2-tests for multiple contingency tables as ellipses or tiles. Requires a data frame with dichotomous (dummy) variables. Calculation of Chi2-matrix taken from Tales of R.
sjp.chi2( df, title = "Pearson's Chi2-Test of Independence", axis.labels = NULL, wrap.title = 50, wrap.labels = 20, show.legend = FALSE, legend.title = NULL )
df |
A data frame with (dichotomous) factor variables. |
title |
character vector, used as plot title. Depending on plot type and function,
will be set automatically. If |
axis.labels |
character vector with labels used as axis labels. Optional argument, since in most cases, axis labels are set automatically. |
wrap.title |
numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
show.legend |
logical, if |
legend.title |
character vector, used as title for the plot legend. |
A ggplot-object.
# create data frame with 5 dichotomous (dummy) variables
mydf <- data.frame(as.factor(sample(1:2, 100, replace=TRUE)),
                   as.factor(sample(1:2, 100, replace=TRUE)),
                   as.factor(sample(1:2, 100, replace=TRUE)),
                   as.factor(sample(1:2, 100, replace=TRUE)),
                   as.factor(sample(1:2, 100, replace=TRUE)))
# create variable labels
items <- list(c("Item 1", "Item 2", "Item 3", "Item 4", "Item 5"))
# plot Chi2-contingency-table
sjp.chi2(mydf, axis.labels = items)
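The quantity being plotted, a matrix of p-values from pairwise Chi2-tests, can be sketched in base R. This is an illustration with simulated data and made-up names (pmat, item1 to item5), not the package's own calculation.

```r
# Five simulated dichotomous variables.
set.seed(123)
mydf <- data.frame(
  item1 = factor(sample(1:2, 100, replace = TRUE)),
  item2 = factor(sample(1:2, 100, replace = TRUE)),
  item3 = factor(sample(1:2, 100, replace = TRUE)),
  item4 = factor(sample(1:2, 100, replace = TRUE)),
  item5 = factor(sample(1:2, 100, replace = TRUE))
)

# Pearson's Chi2-test for every pair of variables; store the p-values.
k <- ncol(mydf)
pmat <- matrix(NA_real_, nrow = k, ncol = k,
               dimnames = list(names(mydf), names(mydf)))
for (i in seq_len(k)) {
  for (j in seq_len(k)) {
    pmat[i, j] <- chisq.test(table(mydf[[i]], mydf[[j]]))$p.value
  }
}
round(pmat, 3)
```

The matrix is symmetric, so a plot of it only needs one triangle.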
Plot correlation matrix as ellipses or tiles.
sjp.corr( data, title = NULL, axis.labels = NULL, sort.corr = TRUE, decimals = 3, na.deletion = c("listwise", "pairwise"), corr.method = c("pearson", "spearman", "kendall"), geom.colors = "RdBu", wrap.title = 50, wrap.labels = 20, show.legend = FALSE, legend.title = NULL, show.values = TRUE, show.p = TRUE, p.numeric = FALSE )
data |
Matrix with correlation coefficients as returned by the
|
title |
character vector, used as plot title. Depending on plot type and function,
will be set automatically. If |
axis.labels |
character vector with labels used as axis labels. Optional argument, since in most cases, axis labels are set automatically. |
sort.corr |
Logical, if |
decimals |
Indicates how many decimal places are printed when
the value labels are shown. Default is 3. Only applies when
|
na.deletion |
Indicates how missing values are treated. May be either
|
corr.method |
Indicates the correlation computation method. May be one of
|
geom.colors |
user defined color for geoms. See 'Details' in |
wrap.title |
numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
show.legend |
logical, if |
legend.title |
character vector, used as title for the plot legend. |
show.values |
Logical, whether values should be plotted or not. |
show.p |
Logical, adds significance levels to values, or value and variable labels. |
p.numeric |
Logical, if |
Required argument is either a data.frame or a matrix with correlation coefficients as returned by the cor-function. In case of ellipses, the ellipse size indicates the strength of the correlation. Furthermore, blue and red colors indicate positive or negative correlations, where stronger correlations are darker.
(Invisibly) returns the ggplot-object with the complete plot (plot) as well as the data frame that was used for setting up the ggplot-object (df) and the original correlation matrix (corr.matrix).
If data is a matrix with correlation coefficients as returned by the cor-function, p-values can't be computed. Thus, show.p and p.numeric only have an effect if data is a data.frame.
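The difference between the two input types can be sketched as follows. The columns taken from mtcars are stand-in data, and the sjPlot call is guarded so the snippet also runs where the package is not installed.

```r
# Three numeric columns from a built-in data set serve as stand-in data.
mydf <- mtcars[, c("mpg", "hp", "wt")]

# A correlation matrix as returned by cor(): coefficients only, so no
# p-values can be derived from it afterwards.
cm <- cor(mydf, method = "pearson")
round(cm, 3)

if (requireNamespace("sjPlot", quietly = TRUE)) {
  # Data frame input: p-values can be computed, so show.p has an effect.
  res <- sjPlot::sjp.corr(mydf, corr.method = "pearson", show.p = TRUE)
}
```

Passing cm instead of mydf would, according to the note above, produce the plot without significance information.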
This function plots a scatter plot of a term poly.term against a response variable x and adds one polynomial curve for each numeric value in poly.degree. A loess-smoothed line can be added to see which of the polynomial curves fits best to the data.
sjp.poly( x, poly.term, poly.degree, poly.scale = FALSE, fun = NULL, axis.title = NULL, geom.colors = NULL, geom.size = 0.8, show.loess = TRUE, show.loess.ci = TRUE, show.p = TRUE, show.scatter = TRUE, point.alpha = 0.2, point.color = "#404040", loess.color = "#808080" )
x |
A vector, representing the response variable of a linear (mixed) model; or
a linear (mixed) model as returned by |
poly.term |
If |
poly.degree |
Numeric, or numeric vector, indicating the degree of the polynomial.
If |
poly.scale |
Logical, if |
fun |
Linear function when modelling polynomial terms. Use |
axis.title |
Character vector of length one or two (depending on the
plot function and type), used as title(s) for the x and y axis. If not
specified, a default labelling is chosen. Note: Some plot types
may not support this argument sufficiently. In such cases, use the returned
ggplot-object and add axis titles manually with
|
geom.colors |
user defined color for geoms. See 'Details' in |
geom.size |
size resp. width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes. |
show.loess |
Logical, if |
show.loess.ci |
Logical, if |
show.p |
Logical, if |
show.scatter |
Logical, if TRUE (default), adds a scatter plot of data points to the plot. |
point.alpha |
Alpha value of point-geoms in the scatter plots. Only
applies, if |
point.color |
Color of point-geoms in the scatter plots. Only applies,
if |
loess.color |
Color of the loess-smoothed line. Only applies, if |
For each polynomial degree, a simple linear regression on x (resp. the extracted response, if x is a fitted model) is performed, where only the polynomial term poly.term is included as independent variable. Thus, lm(y ~ x + I(x^2) + ... + I(x^i)) is repeatedly computed for all values in poly.degree, and the predicted values of the response are plotted against the raw values of poly.term. If x is a fitted model, other covariates are ignored when finding the best fitting polynomial.
This function evaluates raw polynomials, not orthogonal polynomials. Polynomials are computed using the poly function, with argument raw = TRUE.
To find out which polynomial degree fits best to the data, a loess-smoothed line (in dark grey) can be added (with show.loess = TRUE). The polynomial curve that comes closest to the loess-smoothed line should be the best fit to the data.
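The repeated fitting described above can be sketched in base R. This is an illustration with simulated data, not the package's internal code; x, y, degrees and the other object names are made up.

```r
# Simulated data with a quadratic relationship.
set.seed(42)
x <- runif(100, min = 0, max = 10)
y <- 2 + 0.5 * x - 0.3 * x^2 + rnorm(100)

# One model per requested degree, using raw (not orthogonal) polynomials.
degrees <- 1:3
fits <- lapply(degrees, function(d) lm(y ~ poly(x, degree = d, raw = TRUE)))

# poly(x, 2, raw = TRUE) yields the same coefficients as x + I(x^2).
explicit <- lm(y ~ x + I(x^2))
all.equal(unname(coef(fits[[2]])), unname(coef(explicit)))

# Predicted values that would be drawn against the raw values of x.
pred <- lapply(fits, predict)

# R-squared per degree, as a rough numeric companion to the visual check.
r2 <- vapply(fits, function(m) summary(m)$r.squared, numeric(1))
round(r2, 3)
```

Because the models are nested, R-squared never decreases with the degree, so it cannot replace the visual comparison with the loess line.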
A ggplot-object.
library(sjmisc)
data(efc)
# linear fit. loess-smoothed line indicates a more
# or less cubic curve
sjp.poly(efc$c160age, efc$quol_5, 1)
# quadratic fit
sjp.poly(efc$c160age, efc$quol_5, 2)
# linear to quartic fit
sjp.poly(efc$c160age, efc$quol_5, 1:4, show.scatter = FALSE)
# fit sample model
fit <- lm(tot_sc_e ~ c12hour + e17age + e42dep, data = efc)
# inspect relationship between predictors and response
plot_model(fit, type = "slope")
# "e17age" does not seem to be linearly correlated with the response;
# try to find an appropriate polynomial. Grey line (loess smoothed)
# indicates best fit. Looks like x^4 has the best fit,
# however, only x^3 has significant p-values.
sjp.poly(fit, "e17age", 2:4, show.scatter = FALSE)
## Not run: 
# fit new model
fit <- lm(tot_sc_e ~ c12hour + e42dep + e17age + I(e17age^2) + I(e17age^3), data = efc)
# plot marginal effects of polynomial term
plot_model(fit, type = "pred", terms = "e17age")
## End(Not run)
This function has a pipe-friendly argument-structure, with the first argument always being the data, followed by variables that should be plotted or printed as table. The function then transforms the input and calls the requested sjp.- resp. sjt.-function to create a plot or table.
Both sjplot() and sjtab() support grouped data frames.
sjplot(data, ..., fun = c("grpfrq", "xtab", "aov1", "likert"))
sjtab(data, ..., fun = c("xtab", "stackfrq"))
data |
A data frame. May also be a grouped data frame (see 'Note' and 'Examples'). |
... |
Names of variables that should be plotted, and also further arguments passed down to the sjPlot-functions. See 'Examples'. |
fun |
Plotting function. Refers to the function name of sjPlot-functions. See 'Details' and 'Examples'. |
The following fun-values are currently supported:
"aov1" calls sjp.aov1. The first two variables in data are used (and required) to create the plot.
"grpfrq" calls plot_grpfrq. The first two variables in data are used (and required) to create the plot.
"likert" calls plot_likert. data must be a data frame with items to plot.
"stackfrq" calls tab_stackfrq. data must be a data frame with items to create the table.
"xtab" calls plot_xtab or tab_xtab. The first two variables in data are used (and required) to create the plot or table.
See related sjp. and sjt.-functions.
The ...-argument is used, first, to specify the variables from data that should be plotted, and, second, to name further arguments that are used in the subsequent plotting functions. Refer to the online-help of supported plotting-functions to see valid arguments.
data may also be a grouped data frame (see group_by) with up to two grouping variables. Plots are then created for each subgroup.
library(dplyr)
data(efc)
# Grouped frequencies
efc %>% sjplot(e42dep, c172code, fun = "grpfrq")
# Grouped frequencies, as box plots
efc %>% sjplot(e17age, c172code, fun = "grpfrq", type = "box", geom.colors = "Set1")
## Not run: 
# table output of grouped data frame
efc %>%
  group_by(e16sex, c172code) %>%
  select(e42dep, n4pstu, e16sex, c172code) %>%
  sjtab(fun = "xtab", use.viewer = FALSE) # open all tables in browser
## End(Not run)
Set default plot themes, use pre-defined color scales or modify plot or table appearance.
theme_sjplot(base_size = 12, base_family = "")
theme_sjplot2(base_size = 12, base_family = "")
theme_blank(base_size = 12, base_family = "")
theme_538(base_size = 12, base_family = "")
font_size( title, axis_title.x, axis_title.y, labels.x, labels.y, offset.x, offset.y, base.theme )
label_angle(angle.x, angle.y, base.theme)
legend_style(inside, pos, justify, base.theme)
scale_color_sjplot(palette = "metro", discrete = TRUE, reverse = FALSE, ...)
scale_fill_sjplot(palette = "metro", discrete = TRUE, reverse = FALSE, ...)
sjplot_pal(palette = "metro", n = NULL)
show_sjplot_pals()
css_theme(css.theme = "regression")
base_size |
Base font size. |
base_family |
Base font family. |
title |
Font size for plot titles. |
axis_title.x |
Font size for x-axis titles. |
axis_title.y |
Font size for y-axis titles. |
labels.x |
Font size for x-axis labels. |
labels.y |
Font size for y-axis labels. |
offset.x |
Offset for x-axis titles. |
offset.y |
Offset for y-axis titles. |
base.theme |
Optional ggplot-theme-object, which is needed in case multiple
functions should be combined, e.g. |
angle.x |
Angle for x-axis labels. |
angle.y |
Angle for y-axis labels. |
inside |
Logical, use |
pos |
Position of the legend, if a legend is drawn.
|
justify |
Justification of legend, relative to its position ( |
palette |
Character name of color palette. |
discrete |
Logical, if |
reverse |
Logical, if |
... |
Further arguments passed down to ggplot's |
n |
Numeric, number of colors to be returned. By default, the complete colour palette is returned. |
css.theme |
Name of the CSS pre-set theme-style. Can be used for table-functions. |
When using the colors argument in function calls (e.g. plot_model()) or when calling one of the predefined scale-functions (e.g. scale_color_sjplot()), there are pre-defined colour palettes in this package. Use show_sjplot_pals() to show all available colour palettes.
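The palette helpers also work with an ordinary ggplot that was not created by this package. The following is a minimal sketch under that assumption; iris is stand-in data, "metro" is the default palette named in the usage above, and the block is guarded so it is skipped when sjPlot or ggplot2 is missing.

```r
if (requireNamespace("sjPlot", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # First three colour values of the "metro" palette.
  cols <- sjPlot::sjplot_pal(palette = "metro", n = 3)
  cols

  # Apply the same palette and one of the package themes to a ggplot.
  p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, colour = Species)) +
    geom_point() +
    sjPlot::scale_color_sjplot(palette = "metro") +
    sjPlot::theme_sjplot2()
  print(p)
}
```

Use sjplot_pal() when the raw colour values are needed elsewhere, for example in a manual scale, and the scale-functions when the palette should simply be applied to a plot.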
# prepare data
if (requireNamespace("haven")) {
  library(sjmisc)
  data(efc)
  efc <- to_factor(efc, c161sex, e42dep, c172code)
  m <- lm(neg_c_7 ~ pos_v_4 + c12hour + e42dep + c172code, data = efc)
  # create plot-object
  p <- plot_model(m)
  # change theme
  p + theme_sjplot()
  # change font-size
  p + font_size(axis_title.x = 30)
  # apply color theme
  p + scale_color_sjplot()
  # show all available colour palettes
  show_sjplot_pals()
  # get colour values from specific palette
  sjplot_pal(palette = "breakfast club")
}
Shows the results of a computed correlation as HTML table. Requires either a data.frame or a matrix with correlation coefficients as returned by the cor-function.
tab_corr( data, na.deletion = c("listwise", "pairwise"), corr.method = c("pearson", "spearman", "kendall"), title = NULL, var.labels = NULL, wrap.labels = 40, show.p = TRUE, p.numeric = FALSE, fade.ns = TRUE, val.rm = NULL, digits = 3, triangle = "both", string.diag = NULL, CSS = NULL, encoding = NULL, file = NULL, use.viewer = TRUE, remove.spaces = TRUE )
data |
Matrix with correlation coefficients as returned by the
|
na.deletion |
Indicates how missing values are treated. May be either
|
corr.method |
Indicates the correlation computation method. May be one of
|
title |
String, will be used as table caption. |
var.labels |
Character vector with variable names, which will be used to label variables in the output. |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
show.p |
Logical, if |
p.numeric |
Logical, if |
fade.ns |
Logical, if |
val.rm |
Specify a number between 0 and 1 to suppress the output of correlation values
that are smaller than |
digits |
Amount of decimals for estimates |
triangle |
Indicates whether only the upper right (use |
string.diag |
A vector with string values of the same length as |
CSS |
A |
encoding |
Character vector, indicating the charset encoding used
for variable and value labels. Default is |
file |
Destination file, if the output should be saved as file.
If |
use.viewer |
Logical, if |
remove.spaces |
Logical, if |
Invisibly returns
the web page style sheet (page.style),
the web page content (page.content),
the complete html-output (page.complete) and
the html-table with inline-css for use with knitr (knitr)
for further use.
If data is a matrix with correlation coefficients as returned by the cor-function, p-values can't be computed. Thus, show.p, p.numeric and fade.ns only have an effect if data is a data.frame.
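The two na.deletion strategies can be illustrated in base R. The sketch below assumes that "listwise" and "pairwise" correspond to the "complete.obs" and "pairwise.complete.obs" options of cor(); that mapping is an assumption, and the small data frame is made up. The tab_corr() call is guarded and only runs if sjPlot is installed.

```r
# Made-up data with one missing value in column a and one in column b.
df <- data.frame(
  a = c(1, 2, 3, 4, NA, 6),
  b = c(2, 1, 4, 3, 6, NA),
  c = c(1, 3, 2, 5, 4, 6)
)

# Listwise: every row with at least one NA is dropped (rows 1 to 4 remain).
listwise <- cor(df, use = "complete.obs")
# Pairwise: each coefficient uses all rows complete for that pair.
pairwise <- cor(df, use = "pairwise.complete.obs")

round(listwise, 3)
round(pairwise, 3)

if (requireNamespace("sjPlot", quietly = TRUE)) {
  # Assigning the result avoids opening the viewer or browser.
  tab <- sjPlot::tab_corr(df, na.deletion = "pairwise", use.viewer = FALSE)
}
```

The a-b coefficient is identical in both matrices because the same four rows are used, while a-c and b-c differ because pairwise deletion keeps an extra row for each.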
## Not run: 
if (interactive()) {
  # Data from the EUROFAMCARE sample dataset
  library(sjmisc)
  data(efc)
  # retrieve variable and value labels
  varlabs <- get_label(efc)
  # retrieve first item of COPE-index scale
  start <- which(colnames(efc) == "c83cop2")
  # retrieve last item of COPE-index scale
  end <- which(colnames(efc) == "c88cop7")
  # create data frame with COPE-index scale
  mydf <- data.frame(efc[, c(start:end)])
  colnames(mydf) <- varlabs[c(start:end)]
  # we have high correlations here, because all items
  # belong to one factor.
  tab_corr(mydf, p.numeric = TRUE)
  # auto-detection of labels, only lower triangle
  tab_corr(efc[, c(start:end)], triangle = "lower")
  # auto-detection of labels, only lower triangle, all correlation
  # values smaller than 0.3 are not shown in the table
  tab_corr(efc[, c(start:end)], triangle = "lower", val.rm = 0.3)
  # auto-detection of labels, only lower triangle, all correlation
  # values smaller than 0.3 are printed in blue
  tab_corr(efc[, c(start:end)], triangle = "lower", val.rm = 0.3,
           CSS = list(css.valueremove = 'color:blue;'))
}
## End(Not run)
These functions print data frames as HTML-table, showing the results in RStudio's viewer pane or in a web browser.
tab_df( x, title = NULL, footnote = NULL, col.header = NULL, show.type = FALSE, show.rownames = FALSE, show.footnote = FALSE, alternate.rows = FALSE, sort.column = NULL, digits = 2, encoding = "UTF-8", CSS = NULL, file = NULL, use.viewer = TRUE, ... )
tab_dfs( x, titles = NULL, footnotes = NULL, col.header = NULL, show.type = FALSE, show.rownames = FALSE, show.footnote = FALSE, alternate.rows = FALSE, sort.column = NULL, digits = 2, encoding = "UTF-8", CSS = NULL, file = NULL, use.viewer = TRUE, rnames = NULL, ... )
x |
For |
title, titles, footnote, footnotes |
Character vector with table
caption(s) resp. footnote(s). For |
col.header |
Character vector with elements used as column header for
the table. If |
show.type |
Logical, if |
show.rownames |
Logical, if |
show.footnote |
Logical, if |
alternate.rows |
Logical, if |
sort.column |
Numeric vector, indicating the index of the column
that should be sorted. By default, the column is sorted in ascending order.
Use negative index for descending order, for instance,
|
digits |
Numeric, amount of digits after decimal point when rounding values. |
encoding |
Character vector, indicating the charset encoding used
for variable and value labels. Default is |
CSS |
A |
file |
Destination file, if the output should be saved as file.
If |
use.viewer |
Logical, if |
... |
Currently not used. |
rnames |
Character vector, can be used to set row names when |
How do I use the CSS-argument?
With the CSS-argument, the visual appearance of the tables can be modified. To get an overview of all style-sheet-classnames that are used in this function, see return value page.style for details. Arguments for this list have the following syntax:
the class-name as argument name and
each style-definition must end with a semicolon
You can add style information to the default styles by using a + (plus-sign) as initial character for the argument attributes. Examples:
table = 'border:2px solid red;' for a solid 2-pixel table border in red.
summary = 'font-weight:bold;' for a bold fontweight in the summary row.
lasttablerow = 'border-bottom: 1px dotted blue;' for a blue dotted border of the last table row.
colnames = '+color:green;' to add green color formatting to column names.
arc = 'color:blue;' for a blue text color each 2nd row.
caption = '+color:red;' to add red font-color to the default table caption style.
See further examples in this package-vignette.
A list with the following items:
the web page style sheet (page.style),
the HTML content of the data frame (page.content),
the complete HTML page, including header, style sheet and body (page.complete),
the HTML table with inline-css for use with knitr (knitr),
the file path, if the HTML page should be saved to disk (file)
The HTML tables can either be saved as file and manually opened (use argument file) or they can be saved as temporary files and will be displayed in the RStudio Viewer pane (if working with RStudio) or opened with the default web browser. Displaying resp. opening a temporary file is the default behaviour.
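The returned list can be inspected and reused without opening a viewer. The sketch below relies only on the elements named in the value description (page.style, page.complete); the object names tab and out are made up, and the block is guarded so it is skipped when sjPlot is not installed.

```r
if (requireNamespace("sjPlot", quietly = TRUE)) {
  # Assigning the result means nothing is displayed yet.
  tab <- sjPlot::tab_df(iris[1:5, ], title = "First rows of iris",
                        use.viewer = FALSE)

  # Elements of the returned list.
  names(tab)

  # The style sheet shows which class names the CSS-argument can address.
  cat(tab$page.style)

  # Write the complete HTML page to a temporary file by hand.
  out <- tempfile(fileext = ".html")
  writeLines(tab$page.complete, out)
}
```

Writing page.complete manually is shown here because it depends only on the documented return value; the file argument is the built-in route to the same result.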
## Not run: 
data(iris)
data(mtcars)
tab_df(iris[1:5, ])
tab_dfs(list(iris[1:5, ], mtcars[1:5, 1:5]))
# sort 2nd column ascending
tab_df(iris[1:5, ], sort.column = 2)
# sort 2nd column descending
tab_df(iris[1:5, ], sort.column = -2)
## End(Not run)
Performs a factor analysis on a data frame or matrix and displays the factors as HTML table, or saves them as file.
In case a data frame is used as parameter, the Cronbach's Alpha value for each factor scale will be calculated, i.e. all variables with the highest loading for a factor are taken for the reliability test. The result is an alpha value for each factor dimension.
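For readers unfamiliar with the statistic, Cronbach's Alpha for one set of items can be computed with the standard textbook formula. The helper cronbach_alpha() below is hypothetical and written in base R; it is not the routine this package calls internally.

```r
# Textbook formula: alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
cronbach_alpha <- function(items) {
  items <- stats::na.omit(items)          # listwise deletion of incomplete rows
  k <- ncol(items)                        # number of items
  item_var <- sum(apply(items, 2, stats::var))
  total_var <- stats::var(rowSums(items))
  (k / (k - 1)) * (1 - item_var / total_var)
}

# Made-up example: three items that measure a similar construct.
scale_items <- data.frame(
  i1 = c(1, 2, 3, 4, 5, 4, 3, 2),
  i2 = c(2, 2, 3, 5, 5, 4, 2, 1),
  i3 = c(1, 3, 3, 4, 4, 5, 3, 2)
)
cronbach_alpha(scale_items)
```

Applied per factor, the function would receive only those columns that load highest on that factor.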
tab_fa( data, rotation = "promax", method = c("ml", "minres", "wls", "gls", "pa", "minchi", "minrank"), nmbr.fctr = NULL, fctr.load.tlrn = 0.1, sort = FALSE, title = "Factor Analysis", var.labels = NULL, wrap.labels = 40, show.cronb = TRUE, show.comm = FALSE, alternate.rows = FALSE, digits = 2, CSS = NULL, encoding = NULL, file = NULL, use.viewer = TRUE, remove.spaces = TRUE )
data |
A data frame that should be used to compute a PCA, or a |
rotation |
Rotation of the factor loadings. May be one of
|
method |
the factoring method to be used. |
nmbr.fctr |
Number of factors used for calculating the rotation. By
default, this value is |
fctr.load.tlrn |
Specifies the minimum difference a variable needs to have between factor loadings (components) in order to indicate a clear loading on just one factor and not diffusing over all factors. For instance, a variable with 0.8, 0.82 and 0.84 factor loading on 3 possible factors can not be clearly assigned to just one factor and thus would be removed from the principal component analysis. By default, the minimum difference of loading values between the highest and 2nd highest factor should be 0.1 |
sort |
logical, if |
title |
String, will be used as table caption. |
var.labels |
Character vector with variable names, which will be used to label variables in the output. |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
show.cronb |
Logical, if |
show.comm |
Logical, if |
alternate.rows |
Logical, if |
digits |
Amount of decimals for estimates |
CSS |
A |
encoding |
Character vector, indicating the charset encoding used
for variable and value labels. Default is |
file |
Destination file, if the output should be saved as file.
If |
use.viewer |
Logical, if |
remove.spaces |
Logical, if |
Invisibly returns
the web page style sheet (page.style),
the web page content (page.content),
the complete html-output (page.complete),
the html-table with inline-css for use with knitr (knitr),
the factor.index, i.e. the column index of each variable with the highest factor loading for each factor and
the removed.items, i.e. which variables have been removed because they were outside of the fctr.load.tlrn's range.
for further use.
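The rule behind fctr.load.tlrn, factor.index and removed.items can be sketched with a made-up loadings matrix. Comparing absolute loadings is an assumption of this sketch, and the object names (loadings, gap, removed, factor_index) are hypothetical.

```r
# Made-up loadings of three items on three factors.
loadings <- matrix(
  c(0.80, 0.82, 0.84,
    0.75, 0.10, 0.05,
    0.20, 0.65, 0.60),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("item1", "item2", "item3"), c("F1", "F2", "F3"))
)
tolerance <- 0.1

# Difference between the highest and the second highest loading per item.
gap <- vapply(seq_len(nrow(loadings)), function(i) {
  s <- sort(abs(loadings[i, ]), decreasing = TRUE)
  unname(s[1] - s[2])
}, numeric(1))
names(gap) <- rownames(loadings)

# Items that do not load clearly on a single factor.
removed <- names(gap)[gap < tolerance]

# Column index of the highest loading for each remaining item.
kept <- loadings[!(rownames(loadings) %in% removed), , drop = FALSE]
factor_index <- apply(abs(kept), 1, which.max)

removed
factor_index
```

Here item1 (0.80, 0.82, 0.84) and item3 (0.65 vs. 0.60) fall below the tolerance and would be dropped, while item2 is assigned to the first factor.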
This method for factor analysis relies on the functions fa and fa.parallel from the psych package.
## Not run: 
# Data from the EUROFAMCARE sample dataset
library(sjmisc)
library(GPArotation)
data(efc)
# retrieve first item of COPE-index scale
start <- which(colnames(efc) == "c82cop1")
# retrieve last item of COPE-index scale
end <- which(colnames(efc) == "c90cop9")
# auto-detection of labels
if (interactive()) {
  tab_fa(efc[, start:end])
}
## End(Not run)
This function performs an item analysis with certain statistics that are useful for scale or index development. The resulting tables are shown in the viewer pane resp. webbrowser or can be saved as file. Following statistics are computed for each item of a data frame:
percentage of missing values
mean value
standard deviation
skew
item difficulty
item discrimination
Cronbach's Alpha if item was removed from scale
mean (or average) inter-item-correlation
Optionally, the following statistics can be computed as well:
kurtosis
Shapiro-Wilk Normality Test
If factor.groups is not NULL, the data frame df will be split into groups, assuming that factor.groups indicates those columns of the data frame that belong to a certain factor (see return value of function tab_pca as example for retrieving factor groups for a scale and see examples for more details).
tab_itemscale( df, factor.groups = NULL, factor.groups.titles = "auto", scale = FALSE, min.valid.rowmean = 2, alternate.rows = TRUE, sort.column = NULL, show.shapiro = FALSE, show.kurtosis = FALSE, show.corr.matrix = TRUE, CSS = NULL, encoding = NULL, file = NULL, use.viewer = TRUE, remove.spaces = TRUE )
sjt.itemanalysis( df, factor.groups = NULL, factor.groups.titles = "auto", scale = FALSE, min.valid.rowmean = 2, alternate.rows = TRUE, sort.column = NULL, show.shapiro = FALSE, show.kurtosis = FALSE, show.corr.matrix = TRUE, CSS = NULL, encoding = NULL, file = NULL, use.viewer = TRUE, remove.spaces = TRUE )
df |
A data frame with items. |
factor.groups |
If not |
factor.groups.titles |
Titles for each factor group that will be used as table caption for each
component-table. Must be a character vector of same length as |
scale |
Logical, if |
min.valid.rowmean |
Minimum amount of valid values to compute row means for index scores.
Default is 2, i.e. the return values |
alternate.rows |
Logical, if |
sort.column |
Numeric vector, indicating the index of the column
that should be sorted. By default, the column is sorted in ascending order.
Use negative index for descending order, for instance,
|
show.shapiro |
Logical, if |
show.kurtosis |
Logical, if |
show.corr.matrix |
Logical, if |
CSS |
A |
encoding |
Character vector, indicating the charset encoding used
for variable and value labels. Default is |
file |
Destination file, if the output should be saved as file.
If |
use.viewer |
Logical, if |
remove.spaces |
Logical, if |
Invisibly returns
df.list: List of data frames with the item analysis for each sub-group (or complete, if factor.groups was NULL)
index.scores: A data frame of standardized scale / index scores for each case (mean value of all scale items for each case) for each sub-group.
ideal.item.diff: List of vectors that indicate the ideal item difficulty for each item in each sub-group. Item difficulty only differs when items have different levels.
cronbach.values: List of Cronbach's Alpha values for the overall item scale for each sub-group.
knitr.list: List of html-tables with inline-css for use with knitr for each table (sub-group)
knitr: html-table of all complete output with inline-css for use with knitr
complete.page: Complete html-output.
If factor.groups = NULL, each list contains only one element, since just one table is printed for the complete scale indicated by df. If factor.groups is a vector of group-index-values, the lists contain elements for each sub-group.
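As an illustration of how index.scores and min.valid.rowmean relate, the following base R sketch computes a row mean only for cases that have at least a minimum number of valid (non-missing) item values. The helper name row_mean_min_valid is hypothetical and not part of sjPlot; the package's internal computation may differ (for instance, it returns standardized scores).

```r
# Hypothetical helper (not an sjPlot function): row means that are only
# computed when a case has at least `min_valid` non-missing item values.
row_mean_min_valid <- function(items, min_valid = 2) {
  items <- as.matrix(items)
  n_valid <- rowSums(!is.na(items))
  means <- rowMeans(items, na.rm = TRUE)
  means[n_valid < min_valid] <- NA
  means
}

items <- data.frame(
  i1 = c(1, 4, NA, 2),
  i2 = c(2, NA, NA, 2),
  i3 = c(3, NA, 5, NA)
)

# Cases 2 and 3 have only one valid value and therefore get NA
scores <- row_mean_min_valid(items, min_valid = 2)
scores
```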
The Shapiro-Wilk Normality Test (see column W(p)) tests if an item has a distribution that is significantly different from normal.
Item difficulty should range between 0.2 and 0.8. Ideal value is p+(1-p)/2 (which mostly is between 0.5 and 0.8).
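A short worked example of the ideal-difficulty formula. It assumes p = 1 / k, where k is the number of response levels of an item; this definition of p is an assumption (it is how related item-analysis documentation describes it, but it is not stated in this text), and ideal_difficulty is a hypothetical helper name.

```r
# Hypothetical helper: ideal item difficulty p + (1 - p) / 2,
# ASSUMING p = 1 / number of response levels of the item.
ideal_difficulty <- function(n_levels) {
  p <- 1 / n_levels
  p + (1 - p) / 2
}

# Items with 2, 4 and 5 response levels
ideal_difficulty(c(2, 4, 5))
# 0.75, 0.625, 0.6
```

Under this assumption, items with more response levels have a lower ideal difficulty, which is consistent with the statement that item difficulty only differs when items have different levels.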
For item discrimination, acceptable values are 0.20 or higher; the closer to 1.00 the better. See item_reliability for more details.
In case the total Cronbach's Alpha value is below the acceptable cut-off of 0.7 (mostly if an index has few items), the mean inter-item-correlation is an alternative measure to indicate acceptability. Satisfactory range lies between 0.2 and 0.4. See also item_intercor.
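To make the two reliability measures concrete, here is a base R sketch of Cronbach's Alpha (using the standard variance-based formula) and the mean inter-item correlation. Both helper names are hypothetical, and the functions referenced above (item_reliability, item_intercor) may handle missing values or other details differently.

```r
# Hypothetical helpers (not sjPlot functions).
# Cronbach's Alpha: k / (k - 1) * (1 - sum of item variances / variance of sum score)
cronbach_alpha <- function(items) {
  items <- stats::na.omit(as.data.frame(items))  # complete cases only
  k <- ncol(items)
  item_var <- vapply(items, stats::var, numeric(1))
  total_var <- stats::var(rowSums(items))
  (k / (k - 1)) * (1 - sum(item_var) / total_var)
}

# Mean of all pairwise correlations between items
mean_inter_item_cor <- function(items) {
  r <- stats::cor(items, use = "pairwise.complete.obs")
  mean(r[lower.tri(r)])
}

# Simulated three-item scale that shares one latent dimension
set.seed(123)
latent <- rnorm(200)
scale_items <- data.frame(
  a = latent + rnorm(200),
  b = latent + rnorm(200),
  c = latent + rnorm(200)
)

alpha <- cronbach_alpha(scale_items)
miic <- mean_inter_item_cor(scale_items)
c(alpha = alpha, mean_inter_item_cor = miic)
```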
Jorion N, Self B, James K, Schroeder L, DiBello L, Pellegrino J (2013) Classical Test Theory Analysis of the Dynamics Concept Inventory. (web)
Briggs SR, Cheek JM (1986) The role of factor analysis in the development and evaluation of personality scales. Journal of Personality, 54(1), 106-148. doi: 10.1111/j.1467-6494.1986.tb00391.x
McLean S et al. (2013) Stigmatizing attitudes and beliefs about bulimia nervosa: Gender, age, education and income variability in a community sample. International Journal of Eating Disorders. doi: 10.1002/eat.22227
Trochim WMK (2008) Types of Reliability.
# Data from the EUROFAMCARE sample dataset
library(sjmisc)
library(sjlabelled)
data(efc)
# retrieve variable and value labels
varlabs <- get_label(efc)
# retrieve first item of COPE-index scale
start <- which(colnames(efc) == "c82cop1")
# retrieve last item of COPE-index scale
end <- which(colnames(efc) == "c90cop9")
# create data frame with COPE-index scale
mydf <- data.frame(efc[, start:end])
colnames(mydf) <- varlabs[start:end]

## Not run:
if (interactive()) {
  tab_itemscale(mydf)
  # auto-detection of labels
  tab_itemscale(efc[, start:end])
  # Compute PCA on COPE-index, and perform an
  # item analysis for each extracted factor.
  indices <- tab_pca(mydf)$factor.index
  tab_itemscale(mydf, factor.groups = indices)
  # or, equivalent
  tab_itemscale(mydf, factor.groups = "auto")
}
## End(Not run)
tab_model() creates HTML tables from regression models.
tab_model(
  ...,
  transform,
  show.intercept = TRUE,
  show.est = TRUE,
  show.ci = 0.95,
  show.ci50 = FALSE,
  show.se = NULL,
  show.std = NULL,
  std.response = TRUE,
  show.p = TRUE,
  show.stat = FALSE,
  show.df = FALSE,
  show.zeroinf = TRUE,
  show.r2 = TRUE,
  show.icc = TRUE,
  show.re.var = TRUE,
  show.ngroups = TRUE,
  show.fstat = FALSE,
  show.aic = FALSE,
  show.aicc = FALSE,
  show.dev = FALSE,
  show.loglik = FALSE,
  show.obs = TRUE,
  show.reflvl = FALSE,
  terms = NULL,
  rm.terms = NULL,
  order.terms = NULL,
  keep = NULL,
  drop = NULL,
  title = NULL,
  pred.labels = NULL,
  dv.labels = NULL,
  wrap.labels = 25,
  bootstrap = FALSE,
  iterations = 1000,
  seed = NULL,
  robust = FALSE,
  vcov.fun = NULL,
  vcov.type = NULL,
  vcov.args = NULL,
  string.pred = "Predictors",
  string.est = "Estimate",
  string.std = "std. Beta",
  string.ci = "CI",
  string.se = "std. Error",
  string.std_se = "standardized std. Error",
  string.std_ci = "standardized CI",
  string.p = "p",
  string.std.p = "std. p",
  string.df = "df",
  string.stat = "Statistic",
  string.std.stat = "std. Statistic",
  string.resp = "Response",
  string.intercept = "(Intercept)",
  strings = NULL,
  ci.hyphen = " – ",
  minus.sign = "-",
  collapse.ci = FALSE,
  collapse.se = FALSE,
  linebreak = TRUE,
  col.order = c("est", "se", "std.est", "std.se", "ci", "std.ci", "ci.inner",
    "ci.outer", "stat", "std.stat", "p", "std.p", "df.error", "response.level"),
  digits = 2,
  digits.p = 3,
  digits.rsq = 3,
  digits.re = 2,
  emph.p = TRUE,
  p.val = NULL,
  df.method = NULL,
  p.style = c("numeric", "stars", "numeric_stars", "scientific", "scientific_stars"),
  p.threshold = c(0.05, 0.01, 0.001),
  p.adjust = NULL,
  case = "parsed",
  auto.label = TRUE,
  prefix.labels = c("none", "varname", "label"),
  bpe = "median",
  CSS = css_theme("regression"),
  file = NULL,
  use.viewer = TRUE,
  encoding = "UTF-8"
)
... |
One or more regression models, including glm's or mixed models.
May also be a |
transform |
A character vector, naming a function that will be applied
on estimates and confidence intervals. By default, |
show.intercept |
Logical, if |
show.est |
Logical, if |
show.ci |
Either logical, and if |
show.ci50 |
Logical, if |
show.se |
Logical, if |
show.std |
Indicates whether standardized beta-coefficients should also be printed, and if yes, which type of standardization is done. See 'Details'. |
std.response |
Logical, whether the response variable will also be
standardized if standardized coefficients are requested. Setting both
|
show.p |
Logical, if |
show.stat |
Logical, if |
show.df |
Logical, if |
show.zeroinf |
Logical, if |
show.r2 |
Logical, if |
show.icc |
Logical, if |
show.re.var |
Logical, if |
show.ngroups |
Logical, if |
show.fstat |
Logical, if |
show.aic |
Logical, if |
show.aicc |
Logical, if |
show.dev |
Logical, if |
show.loglik |
Logical, if |
show.obs |
Logical, if |
show.reflvl |
Logical, if |
terms |
Character vector with names of those terms (variables) that should
be printed in the table. All other terms are removed from the output. If
|
rm.terms |
Character vector with names that indicate which terms should
be removed from the output. Counterpart to |
order.terms |
Numeric vector, indicating in which order the coefficients should be plotted. See examples in this package-vignette. |
keep, drop |
Character containing a regular expression pattern that
describes the parameters that should be included (for |
title |
String, will be used as table caption. |
pred.labels |
Character vector with labels of predictor variables.
If not |
dv.labels |
Character vector with labels of dependent variables of all
fitted models. If |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
bootstrap |
Logical, if |
iterations |
Numeric, number of bootstrap iterations (default is 1000). |
seed |
Numeric, the number of the seed to replicate bootstrapped estimates. If |
robust |
Deprecated. Please use |
vcov.fun |
Variance-covariance matrix used to compute uncertainty
estimates (e.g., for robust standard errors). This argument accepts a
covariance matrix, a function which returns a covariance matrix, or a
string which identifies the function to be used to compute the covariance
matrix. See |
vcov.type |
Deprecated. The |
vcov.args |
List of arguments to be passed to the function identified by
the |
string.pred |
Character vector, used as headline for the predictor column.
Default is |
string.est |
Character vector, used for the column heading of coefficients.
Default is based on the response scale, e.g. for logistic regression models,
|
string.std |
Character vector, used for the column heading of standardized beta coefficients. Default is |
string.ci |
Character vector, used for the column heading of confidence interval values. Default is |
string.se |
Character vector, used for the column heading of standard error values. Default is |
string.std_se |
Character vector, used for the column heading of standard error of standardized coefficients. Default is |
string.std_ci |
Character vector, used for the column heading of confidence intervals of standardized coefficients. Default is |
string.p |
Character vector, used for the column heading of p values. Default is |
string.std.p |
Character vector, used for the column heading of p values. Default is |
string.df |
Character vector, used for the column heading of degrees of freedom. Default is |
string.stat |
Character vector, used for the test statistic. Default is |
string.std.stat |
Character vector, used for the test statistic. Default is |
string.resp |
Character vector, used for the column heading of the response level for multinomial or categorical models. Default is |
string.intercept |
Character vector, used as name for the intercept parameter. Default is |
strings |
Named character vector, as alternative to arguments like |
ci.hyphen |
Character vector, indicating the hyphen for confidence interval range. May be an HTML entity. See 'Examples'. |
minus.sign |
String, indicating the minus sign for negative numbers. May be an HTML entity. See 'Examples'. |
collapse.ci |
Logical, if |
collapse.se |
Logical, if |
linebreak |
Logical, if |
col.order |
Character vector, indicating which columns should be printed
and in which order. Column names that are excluded from |
digits |
Amount of decimals for estimates |
digits.p |
Amount of decimals for p-values |
digits.rsq |
Amount of decimals for r-squared values |
digits.re |
Amount of decimals for random effects part of the summary table. |
emph.p |
Logical, if |
df.method, p.val |
Method for computing degrees of freedom for p-values,
standard errors and confidence intervals (CI). Only applies to mixed models.
Use |
p.style |
Character, indicating if p-values should be printed as
numeric value ( |
p.threshold |
Numeric vector of length 3, indicating the threshold for
annotating p-values with asterisks. Only applies if
|
p.adjust |
Character vector, if not |
case |
Desired target case. Labels will automatically be converted into the
specified character case. See |
auto.label |
Logical, if |
prefix.labels |
Indicates whether the value labels of categorical variables
should be prefixed, e.g. with the variable name or variable label. See
argument |
bpe |
For Stan-models (fitted with the rstanarm- or
brms-package), the Bayesian point estimate is, by default, the median
of the posterior distribution. Use |
CSS |
A |
file |
Destination file, if the output should be saved as file.
If |
use.viewer |
Logical, if |
encoding |
Character vector, indicating the charset encoding used
for variable and value labels. Default is |
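The p.style and p.threshold arguments describe annotating p-values with asterisks. A minimal base R sketch of that idea follows; the helper name p_to_stars is hypothetical, and the use of a strict "less than" comparison is an assumption, so tab_model() may treat values exactly at a threshold differently.

```r
# Hypothetical helper (not an sjPlot function): one asterisk for every
# threshold that the p-value falls below (strict "<" is an assumption).
p_to_stars <- function(p, threshold = c(0.05, 0.01, 0.001)) {
  vapply(p, function(x) strrep("*", sum(x < threshold)), character(1))
}

p_values <- c(0.20, 0.03, 0.004, 0.0002)
data.frame(p = p_values, stars = p_to_stars(p_values))
```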
Default standardization is done by completely refitting the model on the standardized data. Hence, this approach is equal to standardizing the variables before fitting the model, which is particularly recommended for complex models that include interactions or transformations (e.g., polynomial or spline terms). When show.std = "std2", standardization of estimates follows Gelman's (2008) suggestion, rescaling the estimates by dividing them by two standard deviations instead of just one. Resulting coefficients are then directly comparable for untransformed binary predictors. For backward compatibility reasons, show.std also may be a logical value; if TRUE, normal standardized estimates are printed (same effect as show.std = "std"). Use show.std = NULL (default) or show.std = FALSE, if no standardization is required.
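The difference between dividing by one and by two standard deviations can be sketched in base R by refitting a model on rescaled predictors. This is an illustration only: it uses the built-in mtcars data, leaves the response on its raw scale as a simplifying assumption, and does not claim to reproduce what tab_model() does internally.

```r
d <- mtcars[, c("mpg", "wt", "hp")]

# Predictors centred and divided by one standard deviation (z-scores)
d_1sd <- transform(
  d,
  wt = (wt - mean(wt)) / sd(wt),
  hp = (hp - mean(hp)) / sd(hp)
)

# Predictors centred and divided by two standard deviations (Gelman 2008)
d_2sd <- transform(
  d,
  wt = (wt - mean(wt)) / (2 * sd(wt)),
  hp = (hp - mean(hp)) / (2 * sd(hp))
)

# Refit the same model on each version of the data
m_1sd <- lm(mpg ~ wt + hp, data = d_1sd)
m_2sd <- lm(mpg ~ wt + hp, data = d_2sd)

# Slopes under 2-SD scaling are twice the 1-SD slopes; intercepts match
cbind(one_sd = coef(m_1sd), two_sd = coef(m_2sd))
```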
How do I use the CSS-argument?
With the CSS-argument, the visual appearance of the tables can be modified. To get an overview of all style-sheet-classnames that are used in this function, see return value page.style for details. Arguments for this list have the following syntax:
the class-names with "css."-prefix as argument name and
each style-definition must end with a semicolon
You can add style information to the default styles by using a + (plus-sign) as initial character for the argument attributes. Examples:
css.table = 'border:2px solid red;' for a solid 2-pixel table border in red.
css.summary = 'font-weight:bold;' for a bold fontweight in the summary row.
css.lasttablerow = 'border-bottom: 1px dotted blue;' for a blue dotted border of the last table row.
css.colnames = '+color:green;' to add green color formatting to column names.
css.arc = 'color:blue;' for a blue text color each 2nd row.
css.caption = '+color:red;' to add red font-color to the default table caption style.
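The syntax rules above can be put together as a named list that is passed to the CSS argument. The sketch below builds such a list from the documented examples and checks it against the two syntax rules; the object name my_css and the lm() model on the built-in mtcars data are illustrative assumptions. The call to tab_model() only runs in an interactive session with sjPlot installed.

```r
# Style list following the documented syntax: "css."-prefixed names,
# each definition ending with a semicolon, "+" to extend a default style.
my_css <- list(
  css.table = "border:2px solid red;",
  css.summary = "font-weight:bold;",
  css.caption = "+color:red;"
)

# Check the two syntax rules before passing the list on
stopifnot(
  all(startsWith(names(my_css), "css.")),
  all(endsWith(unlist(my_css), ";"))
)

# Only run if sjPlot is installed and the session is interactive
if (interactive() && requireNamespace("sjPlot", quietly = TRUE)) {
  m <- lm(mpg ~ wt + hp, data = mtcars)
  sjPlot::tab_model(m, CSS = my_css)
}
```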
Invisibly returns
the web page style sheet (page.style),
the web page content (page.content),
the complete html-output (page.complete) and
the html-table with inline-css for use with knitr (knitr)
for further use.
The HTML tables can either be saved as file and manually opened (use argument file) or they can be saved as temporary files and will be displayed in the RStudio Viewer pane (if working with RStudio) or opened with the default web browser. Displaying or opening a temporary file is the default behaviour (i.e. file = NULL).
Examples are shown in these three vignettes:
Summary of Regression Models as HTML Table,
Summary of Mixed Models as HTML Table and
Summary of Bayesian Models as HTML Table.
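As a minimal sketch of a typical call, the following fits two linear models on the built-in mtcars data (an illustrative choice, not taken from the vignettes) and stores the invisibly returned object so that its knitr component can be reused. The sjPlot part is guarded and only runs when the package is installed.

```r
# Two nested linear models on a built-in dataset (illustrative only)
m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- lm(mpg ~ wt + hp, data = mtcars)

if (requireNamespace("sjPlot", quietly = TRUE)) {
  # Both models side by side, with standard errors and custom column labels
  tab <- sjPlot::tab_model(
    m1, m2,
    show.se = TRUE,
    dv.labels = c("Model 1", "Model 2")
  )
  # HTML table with inline CSS, as documented in the return value
  html <- tab$knitr
}
```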
Performs a principal component analysis on a data frame or matrix (with varimax or oblimin rotation) and displays the factor solution as HTML table, or saves it as file.
In case a data frame is used as parameter, the Cronbach's Alpha value for each factor scale will be calculated, i.e. all variables with the highest loading for a factor are taken for the reliability test. The result is an alpha value for each factor dimension.
tab_pca(
  data,
  rotation = c("varimax", "quartimax", "promax", "oblimin", "simplimax", "cluster", "none"),
  nmbr.fctr = NULL,
  fctr.load.tlrn = 0.1,
  title = "Principal Component Analysis",
  var.labels = NULL,
  wrap.labels = 40,
  show.cronb = TRUE,
  show.msa = FALSE,
  show.var = FALSE,
  alternate.rows = FALSE,
  digits = 2,
  string.pov = "Proportion of Variance",
  string.cpov = "Cumulative Proportion",
  CSS = NULL,
  encoding = NULL,
  file = NULL,
  use.viewer = TRUE,
  remove.spaces = TRUE
)
data |
A data frame that should be used to compute a PCA, or a |
rotation |
Rotation of the factor loadings. May be one of
|
nmbr.fctr |
Number of factors used for calculating the rotation. By
default, this value is |
fctr.load.tlrn |
Specifies the minimum difference a variable needs to have between factor loadings (components) in order to indicate a clear loading on just one factor and not diffusing over all factors. For instance, a variable with 0.8, 0.82 and 0.84 factor loading on 3 possible factors can not be clearly assigned to just one factor and thus would be removed from the principal component analysis. By default, the minimum difference of loading values between the highest and 2nd highest factor should be 0.1 |
title |
String, will be used as table caption. |
var.labels |
Character vector with variable names, which will be used to label variables in the output. |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
show.cronb |
Logical, if |
show.msa |
Logical, if |
show.var |
Logical, if |
alternate.rows |
Logical, if |
digits |
Amount of decimals for estimates |
string.pov |
String for the table row that contains the proportions of variances. By default, "Proportion of Variance" will be used. |
string.cpov |
String for the table row that contains the cumulative variances. By default, "Cumulative Proportion" will be used. |
CSS |
A |
encoding |
Character vector, indicating the charset encoding used
for variable and value labels. Default is |
file |
Destination file, if the output should be saved as file.
If |
use.viewer |
Logical, if |
remove.spaces |
Logical, if |
Invisibly returns
the web page style sheet (page.style),
the web page content (page.content),
the complete html-output (page.complete),
the html-table with inline-css for use with knitr (knitr),
the factor.index, i.e. the column index of each variable with the highest factor loading for each factor and
the removed.items, i.e. which variables have been removed because they were outside of the fctr.load.tlrn's range
for further use.
## Not run:
# Data from the EUROFAMCARE sample dataset
library(sjmisc)
data(efc)
# retrieve first item of COPE-index scale
start <- which(colnames(efc) == "c82cop1")
# retrieve last item of COPE-index scale
end <- which(colnames(efc) == "c90cop9")
# auto-detection of labels
if (interactive()) {
  tab_pca(efc[, start:end])
}
## End(Not run)
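The rule behind fctr.load.tlrn, factor.index and removed.items can be sketched in base R on a small hand-made loading matrix. The helper name assign_factors is hypothetical, and comparing absolute loadings is an assumption; tab_pca() may implement the rule differently.

```r
# Hypothetical helper (not an sjPlot function). `loadings` has one row per
# variable and one column per factor. ASSUMES absolute loadings are compared.
assign_factors <- function(loadings, tolerance = 0.1) {
  abs_load <- abs(as.matrix(loadings))
  # highest and second-highest absolute loading per variable
  top_two <- t(apply(abs_load, 1, function(x) sort(x, decreasing = TRUE)[1:2]))
  data.frame(
    factor.index = max.col(abs_load, ties.method = "first"),
    removed = (top_two[, 1] - top_two[, 2]) < tolerance,
    row.names = rownames(abs_load)
  )
}

loadings <- rbind(
  item_a = c(0.80, 0.82, 0.84),  # diffuse: the example from the argument description
  item_b = c(0.75, 0.10, 0.05),  # clear loading on factor 1
  item_c = c(0.15, -0.70, 0.20)  # clear (negative) loading on factor 2
)

result <- assign_factors(loadings, tolerance = 0.1)
result
```

Here item_a is flagged for removal because its two highest loadings differ by only 0.02, while item_b and item_c are assigned to factors 1 and 2.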
Shows the results of stacked frequencies (such as Likert scales) as HTML table. This function is useful when several items with identical scale/categories should be printed as table to compare their distributions (e.g. when plotting scales like SF, Barthel-Index, Quality-of-Life-scales etc.).
tab_stackfrq(
  items,
  weight.by = NULL,
  title = NULL,
  var.labels = NULL,
  value.labels = NULL,
  wrap.labels = 20,
  sort.frq = NULL,
  alternate.rows = FALSE,
  digits = 2,
  string.total = "N",
  string.na = "NA",
  show.n = FALSE,
  show.total = FALSE,
  show.na = FALSE,
  show.skew = FALSE,
  show.kurtosis = FALSE,
  digits.stats = 2,
  file = NULL,
  encoding = NULL,
  CSS = NULL,
  use.viewer = TRUE,
  remove.spaces = TRUE
)
items |
Data frame, or a grouped data frame, with each column representing one item. |
weight.by |
Vector of weights that will be applied to weight all cases.
Must be a vector of same length as the input vector. Default is
|
title |
String, will be used as table caption. |
var.labels |
Character vector with variable names, which will be used to label variables in the output. |
value.labels |
Character vector (or |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
sort.frq |
logical, indicates whether the
|
alternate.rows |
Logical, if |
digits |
Numeric, amount of digits after decimal point when rounding values. |
string.total |
label for the total N column. |
string.na |
label for the missing column/row. |
show.n |
logical, if |
show.total |
logical, if |
show.na |
logical, if |
show.skew |
logical, if |
show.kurtosis |
Logical, if |
digits.stats |
Amount of digits for rounding the skewness and kurtosis values. Default is 2, i.e. skewness and kurtosis values have 2 digits after decimal point. |
file |
Destination file, if the output should be saved as file.
If |
encoding |
Character vector, indicating the charset encoding used
for variable and value labels. Default is |
CSS |
A |
use.viewer |
Logical, if |
remove.spaces |
Logical, if |
Invisibly returns
the web page style sheet (page.style),
the web page content (page.content),
the complete html-output (page.complete) and
the html-table with inline-css for use with knitr (knitr)
for further use.
# -------------------------------
# random sample
# -------------------------------
# prepare data for 4-category likert scale, 5 items
likert_4 <- data.frame(
  as.factor(sample(1:4, 500, replace = TRUE, prob = c(0.2, 0.3, 0.1, 0.4))),
  as.factor(sample(1:4, 500, replace = TRUE, prob = c(0.5, 0.25, 0.15, 0.1))),
  as.factor(sample(1:4, 500, replace = TRUE, prob = c(0.25, 0.1, 0.4, 0.25))),
  as.factor(sample(1:4, 500, replace = TRUE, prob = c(0.1, 0.4, 0.4, 0.1))),
  as.factor(sample(1:4, 500, replace = TRUE, prob = c(0.35, 0.25, 0.15, 0.25)))
)
# create labels
levels_4 <- c("Independent", "Slightly dependent", "Dependent", "Severely dependent")
# create item labels
items <- c("Q1", "Q2", "Q3", "Q4", "Q5")

# plot stacked frequencies of 5 (ordered) item-scales
## Not run:
if (interactive()) {
  tab_stackfrq(likert_4, value.labels = levels_4, var.labels = items)

  # -------------------------------
  # Data from the EUROFAMCARE sample dataset
  # Auto-detection of labels
  # -------------------------------
  data(efc)
  # retrieve first item of COPE-index scale
  start <- which(colnames(efc) == "c82cop1")
  # retrieve last item of COPE-index scale
  end <- which(colnames(efc) == "c90cop9")

  tab_stackfrq(efc[, c(start:end)], alternate.rows = TRUE)

  tab_stackfrq(efc[, c(start:end)], alternate.rows = TRUE,
               show.n = TRUE, show.na = TRUE)

  # --------------------------------
  # User defined style sheet
  # --------------------------------
  tab_stackfrq(efc[, c(start:end)], alternate.rows = TRUE,
               show.total = TRUE, show.skew = TRUE, show.kurtosis = TRUE,
               CSS = list(css.ncol = "border-left:1px dotted black;",
                          css.summary = "font-style:italic;"))
}
## End(Not run)
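What a table of stacked frequencies contains can be approximated in base R: one row per item, one column per response category, with percentages that sum to 100 per row. The sketch below uses simulated data and only illustrates the idea; it does not reproduce the weighting, labelling or formatting that tab_stackfrq() performs.

```r
set.seed(42)
# Two simulated items sharing the same four response categories
items <- data.frame(
  Q1 = factor(sample(1:4, 200, replace = TRUE), levels = 1:4),
  Q2 = factor(sample(1:4, 200, replace = TRUE), levels = 1:4)
)

# One row per item, one column per category, values in percent
stacked <- t(vapply(
  items,
  function(x) as.numeric(100 * prop.table(table(x))),
  numeric(4)
))
colnames(stacked) <- levels(items$Q1)

round(stacked, 2)
```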
Shows contingency tables as HTML file in browser or viewer pane, or saves them as file.
tab_xtab(
  var.row,
  var.col,
  weight.by = NULL,
  title = NULL,
  var.labels = NULL,
  value.labels = NULL,
  wrap.labels = 20,
  show.obs = TRUE,
  show.cell.prc = FALSE,
  show.row.prc = FALSE,
  show.col.prc = FALSE,
  show.exp = FALSE,
  show.legend = FALSE,
  show.na = FALSE,
  show.summary = TRUE,
  drop.empty = TRUE,
  statistics = c("auto", "cramer", "phi", "spearman", "kendall", "pearson", "fisher"),
  string.total = "Total",
  digits = 1,
  tdcol.n = "black",
  tdcol.expected = "#339999",
  tdcol.cell = "#993333",
  tdcol.row = "#333399",
  tdcol.col = "#339933",
  emph.total = FALSE,
  emph.color = "#f8f8f8",
  prc.sign = " %",
  hundret = "100.0",
  CSS = NULL,
  encoding = NULL,
  file = NULL,
  use.viewer = TRUE,
  remove.spaces = TRUE,
  ...
)

sjt.xtab(
  var.row,
  var.col,
  weight.by = NULL,
  title = NULL,
  var.labels = NULL,
  value.labels = NULL,
  wrap.labels = 20,
  show.obs = TRUE,
  show.cell.prc = FALSE,
  show.row.prc = FALSE,
  show.col.prc = FALSE,
  show.exp = FALSE,
  show.legend = FALSE,
  show.na = FALSE,
  show.summary = TRUE,
  drop.empty = TRUE,
  statistics = c("auto", "cramer", "phi", "spearman", "kendall", "pearson", "fisher"),
  string.total = "Total",
  digits = 1,
  tdcol.n = "black",
  tdcol.expected = "#339999",
  tdcol.cell = "#993333",
  tdcol.row = "#333399",
  tdcol.col = "#339933",
  emph.total = FALSE,
  emph.color = "#f8f8f8",
  prc.sign = " %",
  hundret = "100.0",
  CSS = NULL,
  encoding = NULL,
  file = NULL,
  use.viewer = TRUE,
  remove.spaces = TRUE,
  ...
)
var.row |
Variable that should be displayed in the table rows. |
var.col |
Variable that should be displayed in the table columns. |
weight.by |
Vector of weights that will be applied to weight all cases.
Must be a vector of same length as the input vector. Default is
|
title |
String, will be used as table caption. |
var.labels |
Character vector with variable names, which will be used to label variables in the output. |
value.labels |
Character vector (or |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
show.obs |
Logical, if |
show.cell.prc |
Logical, if |
show.row.prc |
Logical, if |
show.col.prc |
Logical, if |
show.exp |
Logical, if |
show.legend |
logical, if |
show.na |
logical, if |
show.summary |
Logical, if |
drop.empty |
Logical, if |
statistics |
Name of measure of association that should be computed. May
be one of |
string.total |
Character label for the total column / row header |
digits |
Amount of decimals for estimates |
tdcol.n |
Color for highlighting count (observed) values in table cells. Default is black. |
tdcol.expected |
Color for highlighting expected values in table cells. Default is cyan. |
tdcol.cell |
Color for highlighting cell percentage values in table cells. Default is red. |
tdcol.row |
Color for highlighting row percentage values in table cells. Default is blue. |
tdcol.col |
Color for highlighting column percentage values in table cells. Default is green. |
emph.total |
Logical, if |
emph.color |
Logical, if |
prc.sign |
The percentage sign that is printed in the table cells, in HTML-format.
Default is |
hundret |
Default value that indicates the 100-percent column-sums (since rounding values
may lead to non-exact results). Default is |
CSS |
A |
encoding |
String, indicating the charset encoding used for variable and
value labels. Default is |
file |
Destination file, if the output should be saved as file.
If |
use.viewer |
Logical, if |
remove.spaces |
Logical, if |
... |
Other arguments, currently passed down to the test statistics functions
|
Invisibly returns
the web page style sheet (page.style),
the web page content (page.content),
the complete html-output (page.complete) and
the html-table with inline-css for use with knitr (knitr)
for further use.
# prepare sample data set data(efc) # print simple cross table with labels ## Not run: if (interactive()) { tab_xtab(efc$e16sex, efc$e42dep) # print cross table with manually set # labels and expected values tab_xtab( efc$e16sex, efc$e42dep, var.labels = c("Elder's gender", "Elder's dependency"), show.exp = TRUE ) # print minimal cross table with labels, total col/row highlighted tab_xtab(efc$e16sex, efc$e42dep, show.cell.prc = FALSE, emph.total = TRUE) # User defined style sheet tab_xtab(efc$e16sex, efc$e42dep, CSS = list(css.table = "border: 2px solid;", css.tdata = "border: 1px solid;", css.horline = "border-bottom: double blue;")) # ordinal data, use Kendall's tau tab_xtab(efc$e42dep, efc$quol_5, statistics = "kendall") # calculate Spearman's rho, with continuity correction tab_xtab( efc$e42dep, efc$quol_5, statistics = "spearman", exact = FALSE, continuity = TRUE ) } ## End(Not run)
Save (or show) content of an imported SPSS, SAS or Stata data file, or any similar labelled data.frame, as an HTML table. This quick overview shows variable ID number, name, label, type and associated value labels. The result can be considered a "codeplan" of the data frame.
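To make the idea of a "codeplan" concrete, the following base-R sketch collects the same kind of information (ID, name, variable label, type, value labels) from a small labelled data frame. It is not how view_df() is implemented; the toy data, the helper name codeplan() and the assumption that labels are stored in the "label" and "labels" attributes (the convention used by haven-style labelled data) are all illustrative.

```r
# Toy labelled data frame (hypothetical data). Variable labels are stored in
# the "label" attribute, value labels in the "labels" attribute.
dat <- data.frame(
  sex = c(1, 2, 2, 1),
  dep = c(1, 3, 2, 4)
)
attr(dat$sex, "label")  <- "Elder's gender"
attr(dat$sex, "labels") <- c(male = 1, female = 2)
attr(dat$dep, "label")  <- "Elder's dependency"
attr(dat$dep, "labels") <- c(independent = 1, slightly = 2,
                             moderately = 3, severely = 4)

# Hypothetical helper: one row per variable with ID, name, label, type
# and the value labels collapsed into a single string.
codeplan <- function(df) {
  rows <- lapply(seq_along(df), function(i) {
    x <- df[[i]]
    var_label  <- attr(x, "label", exact = TRUE)
    val_labels <- attr(x, "labels", exact = TRUE)
    data.frame(
      id     = i,
      name   = names(df)[i],
      label  = if (is.null(var_label)) "" else var_label,
      type   = class(x)[1],
      values = if (is.null(val_labels)) "" else
        paste(val_labels, names(val_labels), sep = " = ", collapse = "; "),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

cp <- codeplan(dat)
print(cp)
```

view_df() produces a comparable overview as a formatted HTML table and can optionally add frequencies and percentages.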
    view_df(
      x,
      weight.by = NULL,
      alternate.rows = TRUE,
      show.id = TRUE,
      show.type = FALSE,
      show.values = TRUE,
      show.string.values = FALSE,
      show.labels = TRUE,
      show.frq = FALSE,
      show.prc = FALSE,
      show.wtd.frq = FALSE,
      show.wtd.prc = FALSE,
      show.na = FALSE,
      max.len = 15,
      sort.by.name = FALSE,
      wrap.labels = 50,
      verbose = FALSE,
      CSS = NULL,
      encoding = NULL,
      file = NULL,
      use.viewer = TRUE,
      remove.spaces = TRUE
    )
x |
A (labelled) data frame, imported by |
weight.by |
Name of variable in |
alternate.rows |
Logical, if |
show.id |
Logical, if |
show.type |
Logical, if |
show.values |
Logical, if |
show.string.values |
Logical, if |
show.labels |
Logical, if |
show.frq |
Logical, if |
show.prc |
Logical, if |
show.wtd.frq |
Logical, if |
show.wtd.prc |
Logical, if |
show.na |
Logical, if |
max.len |
Numeric, indicates how many values and value labels per variable are shown. Useful for variables with many different values, where the output can be truncated. |
sort.by.name |
Logical, if |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
verbose |
Logical, if |
CSS |
A |
encoding |
Character vector, indicating the charset encoding used for variable and value labels. Default is |
file |
Destination file, if the output should be saved as file. If |
use.viewer |
Logical, if |
remove.spaces |
Logical, if |
Invisibly returns the web page style sheet (page.style), the web page content (page.content), the complete html-output (page.complete) and the html-table with inline-css for use with knitr (knitr) for further use.
    ## Not run:
    # init dataset
    data(efc)

    # view variables
    view_df(efc)

    # view variables w/o values and value labels
    view_df(efc, show.values = FALSE, show.labels = FALSE)

    # view variables including variable type, ordered by name
    view_df(efc, sort.by.name = TRUE, show.type = TRUE)

    # User defined style sheet
    view_df(efc,
            CSS = list(css.table = "border: 2px solid;",
                       css.tdata = "border: 1px solid;",
                       css.arc = "color:blue;"))
    ## End(Not run)