table_categorical() builds publication-ready categorical
tables suitable for APA-style reporting in social science and data
science research. With by, it produces grouped
cross-tabulation tables with chi-squared \(p\)-values, effect sizes, confidence
intervals, and multi-level headers. Without by, it produces
one-way frequency-style tables for the selected variables. Export to gt,
tinytable, flextable, Excel, or Word. This vignette walks through the
main features.
For grouped tables, provide a data frame, one or more selected variables, and a grouping variable:
table_categorical(
sochealth,
select = c(smoking, physical_activity, dentist_12m),
by = education
)
#> Categorical table by education
#>
#> Variable │ Lower secondary n Lower secondary % Upper secondary n
#> ───────────────────┼─────────────────────────────────────────────────────────
#> smoking │
#> No │ 179 69.6 415
#> Yes │ 78 30.4 112
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> physical_activity │
#> No │ 177 67.8 310
#> Yes │ 84 32.2 229
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> dentist_12m │
#> No │ 113 43.3 174
#> Yes │ 148 56.7 365
#>
#> Variable │ Upper secondary % Tertiary n Tertiary % Total n
#> ───────────────────┼────────────────────────────────────────────────────
#> smoking │
#> No │ 78.7 332 84.9 926
#> Yes │ 21.3 59 15.1 249
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> physical_activity │
#> No │ 57.5 163 40.8 650
#> Yes │ 42.5 237 59.2 550
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> dentist_12m │
#> No │ 32.3 67 16.8 354
#> Yes │ 67.7 333 83.2 846
#>
#> Variable │ Total % p Cramer's V
#> ───────────────────┼─────────────────────────────
#> smoking │ < .001 .14
#> No │ 78.8
#> Yes │ 21.2
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> physical_activity │ < .001 .21
#> No │ 54.2
#> Yes │ 45.8
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> dentist_12m │ < .001 .22
#> No │ 29.5
#> Yes │ 70.5The default output is "default", which prints a styled
ASCII table to the console. Use output = "data.frame" to
get a plain numeric data frame suitable for further processing.
Omit by to build a frequency-style table for the
selected variables:
table_categorical(
sochealth,
select = c(smoking, physical_activity),
output = "default"
)
#> Categorical table
#>
#> Variable │ n %
#> ────────────────────────┼─────────────────────
#> smoking │
#> No │ 926 78.8
#> Yes │ 249 21.2
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> physical_activity │
#> No │ 650 54.2
#> Yes │ 550 45.8table_categorical() supports several output formats. The
table below summarizes the options:
| Format | Description |
|---|---|
"default" |
Styled ASCII table in the console (default) |
"data.frame" |
Wide data frame, one row per modality |
"long" |
Long data frame, one row per modality x group |
"gt" |
Formatted gt table |
"tinytable" |
Formatted tinytable |
"flextable" |
Formatted flextable |
"excel" |
Excel file (requires excel_path) |
"clipboard" |
Copy to clipboard |
"word" |
Word document (requires word_path) |
The "gt" format produces a table with APA-style borders,
column spanners, and proper alignment:
Use output = "data.frame" for a wide numeric data frame
(one row per modality), or output = "long" for a long
format (one row per modality x group):
table_categorical(
sochealth,
select = smoking,
by = education,
output = "data.frame"
)
#> Variable Level Lower secondary n Lower secondary % Upper secondary n
#> 1 smoking No 179 69.6 415
#> 2 smoking Yes 78 30.4 112
#> Upper secondary % Tertiary n Tertiary % Total n Total % p
#> 1 78.7 332 84.9 926 78.8 2.012877e-05
#> 2 21.3 59 15.1 249 21.2 2.012877e-05
#> Cramer's V
#> 1 0.1356677
#> 2 0.1356677By default, table_categorical() uses variable names as
row headers. Use the labels argument to provide
human-readable labels:
By default, table_categorical() reports Cramer’s V for
nominal variables and automatically switches to Kendall’s Tau-b when
both variables are ordered factors. Override with
assoc_measure:
table_categorical(
sochealth,
select = smoking,
by = education,
assoc_measure = "lambda",
output = "tinytable"
)Add confidence intervals with assoc_ci = TRUE. In
rendered formats (gt, tinytable, flextable), the CI is shown inline:
pkgdown_dark_gt(
table_categorical(
sochealth,
select = c(smoking, physical_activity),
by = education,
assoc_ci = TRUE,
output = "gt"
)
)In data formats ("data.frame", "long",
"excel", "clipboard"), separate
CI lower and CI upper columns are added:
table_categorical(
sochealth,
select = smoking,
by = education,
assoc_ci = TRUE,
output = "data.frame"
)
#> Variable Level Lower secondary n Lower secondary % Upper secondary n
#> 1 smoking No 179 69.6 415
#> 2 smoking Yes 78 30.4 112
#> Upper secondary % Tertiary n Tertiary % Total n Total % p
#> 1 78.7 332 84.9 926 78.8 2.012877e-05
#> 2 21.3 59 15.1 249 21.2 2.012877e-05
#> Cramer's V CI lower CI upper
#> 1 0.1356677 0.07909264 0.1913716
#> 2 0.1356677 0.07909264 0.1913716Pass survey weights with the weights argument. Use
rescale = TRUE so the total weighted N matches the
unweighted N:
By default, rows with missing values are dropped
(drop_na = TRUE). Set drop_na = FALSE to
display them as a “(Missing)” category:
Use levels_keep to display only specific modalities. The
order you specify controls the display order, which is useful for
placing “(Missing)” first to highlight missingness:
Control the number of digits for percentages, p-values, and the association measure:
For Excel export, provide a file path:
table_categorical(
sochealth,
select = c(smoking, physical_activity, dentist_12m),
by = education,
output = "excel",
excel_path = "my_table.xlsx"
)For Word, use output = "word":
table_categorical(
sochealth,
select = c(smoking, physical_activity, dentist_12m),
by = education,
output = "word",
word_path = "my_table.docx"
)You can also copy directly to the clipboard for pasting into a spreadsheet or a text editor: