| Title: | Parallel GLM |
| Version: | 0.1.9 |
| Description: | Provides a parallel estimation method for generalized linear models without compiling with a multithreaded LAPACK or BLAS. |
| License: | GPL-2 |
| URL: | https://github.com/remlapmot/parglm, https://remlapmot.github.io/parglm/ |
| BugReports: | https://github.com/remlapmot/parglm/issues |
| Imports: | Matrix, parallelly, Rcpp |
| Suggests: | biglm, broom, broom.helpers, fastglm, glm2, gtsummary, knitr, lmtest, microbenchmark, rmarkdown, sandwich, speedglm, SuppDists, testthat |
| LinkingTo: | Rcpp, RcppArmadillo |
| VignetteBuilder: | knitr |
| Config/roxygen2/version: | 8.0.0 |
| Encoding: | UTF-8 |
| NeedsCompilation: | yes |
| Packaged: | 2026-05-12 13:00:29 UTC; eptmp |
| Author: | Benjamin Christoffersen
|
| Maintainer: | Tom Palmer <remlapmot@hotmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-05-12 13:20:02 UTC |
parglm: Parallel GLM
Description
Provides a parallel estimation method for generalized linear models without compiling with a multithreaded LAPACK or BLAS.
Author(s)
Maintainer: Tom Palmer remlapmot@hotmail.com (ORCID)
Authors:
Tom Palmer remlapmot@hotmail.com (ORCID)
Benjamin Christoffersen boennecd@gmail.com (ORCID)
Other contributors:
Anthony Williams [copyright holder]
Boost developers [copyright holder]
See Also
Useful links:
Report bugs at https://github.com/remlapmot/parglm/issues
Fitting Generalized Linear Models in Parallel
Description
Function like glm which can make the computation
in parallel. The function supports most families listed in family.
See "vignette("parglm", "parglm")" for run time examples.
Usage
parglm(
formula,
family = gaussian,
data,
weights,
subset,
na.action,
start = NULL,
offset,
control = list(...),
contrasts = NULL,
model = TRUE,
x = FALSE,
y = TRUE,
...
)
parglm.fit(
x,
y,
weights = rep(1, NROW(x)),
start = NULL,
etastart = NULL,
mustart = NULL,
offset = rep(0, NROW(x)),
family = gaussian(),
control = list(),
intercept = TRUE,
...
)
Arguments
formula |
an object of class |
family |
a |
data |
an optional data frame, list or environment containing the variables in the model. |
weights |
an optional vector of 'prior weights' to be used in the fitting process. Should
be |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
a function which indicates what should happen when the data contain |
start |
starting values for the parameters in the linear predictor. |
offset |
this can be used to specify an a priori known component to be included in the linear predictor during fitting. |
control |
a list of parameters for controlling the fitting process.
For parglm.fit this is passed to |
contrasts |
an optional list. See the |
model |
a logical value indicating whether model frame should be included as a component of the returned value. |
x, y |
For For |
... |
For For |
etastart |
starting values for the linear predictor. Not supported. |
mustart |
starting values for the vector of means. Not supported. |
intercept |
logical. Should an intercept be included in the null model? |
Details
The current implementation uses min(as.integer(n / p), nthreads)
threads where n is the number of observations, p is the
number of covariates, and nthreads is the nthreads element of
the list
returned by parglm.control. Thus, there is likely little (if
any) reduction in computation time if p is almost equal to n.
The current implementation cannot handle p > n.
Since parglm returns a standard glm object, it is
compatible with the sandwich package for heteroskedasticity-consistent
(HC) and cluster-robust standard errors via vcovHC
and vcovCL. This requires model = TRUE
(the default). See vignette("sandwich", "parglm") for examples.
Value
glm object as returned by glm but differs mainly by the qr
element. The qr element in the object returned by parglm(.fit) only has the R
matrix from the QR decomposition.
Examples
# mtcars has 32 rows, sufficient for 2 threads (>= 16 rows per thread)
f1 <- glm (mpg ~ wt + hp, data = mtcars, family = Gamma(link = "log"))
f2 <- parglm(mpg ~ wt + hp, data = mtcars, family = Gamma(link = "log"),
control = parglm.control(nthreads = 2L))
all.equal(coef(f1), coef(f2))
Auxiliary for Controlling GLM Fitting in Parallel
Description
Auxiliary function for parglm fitting.
Usage
parglm.control(
epsilon = 1e-08,
maxit = 25,
trace = FALSE,
nthreads = parallelly::availableCores(omit = 1L),
block_size = NULL,
method = "LINPACK",
nthreads_auto = missing(nthreads)
)
Arguments
epsilon |
positive convergence tolerance. |
maxit |
integer giving the maximal number of IWLS iterations. |
trace |
logical indicating if output should be produced doing estimation. |
nthreads |
number of cores to use. Defaults to
|
block_size |
number of observations to include in each parallel block. |
method |
string specifying which method to use. Either |
nthreads_auto |
logical; for internal use only. Records whether
|
Details
The LINPACK method uses the same QR method as glm.fit for the final QR decomposition.
This is the dqrdc2 method described in qr. All other QR
decompositions but the last are made with DGEQP3 from LAPACK.
See Wood, Goude, and Shaw (2015) for details on the QR method.
The FAST method computes the Fisher information and then solves the normal
equation. This is faster but less numerically stable.
Value
A list with components named as the arguments.
References
Wood, S.N., Goude, Y. & Shaw S. (2015) Generalized additive models for large datasets. Journal of the Royal Statistical Society, Series C 64(1): 139-155.
Examples
# use one core
f1 <- parglm(mpg ~ wt + hp, data = mtcars, family = Gamma(link = "log"),
control = parglm.control(nthreads = 1L))
# use two cores (mtcars has 32 rows, sufficient for 2 threads)
f2 <- parglm(mpg ~ wt + hp, data = mtcars, family = Gamma(link = "log"),
control = parglm.control(nthreads = 2L))
all.equal(coef(f1), coef(f2))
Tidy a parglm model with robust standard errors
Description
A drop-in tidy_fun for tbl_regression that
computes heteroskedasticity-consistent (HC) or cluster-robust confidence
intervals via sandwich and lmtest.
Usage
tidy_parglm_robust(
x,
vcov. = "HC3",
conf.int = TRUE,
conf.level = 0.95,
exponentiate = FALSE,
...
)
Arguments
x |
a |
vcov. |
the robust variance-covariance estimator. A string is passed
as the |
conf.int |
logical; whether to include confidence intervals. |
conf.level |
confidence level for the intervals. |
exponentiate |
logical; whether to exponentiate the estimate and confidence interval limits. |
... |
unused; present for compatibility with the |
Details
Pass this function as tidy_fun to
tbl_regression:
# HC3 (default) tbl_regression(fit, tidy_fun = tidy_parglm_robust) # HC1 tbl_regression(fit, tidy_fun = \(x, ...) tidy_parglm_robust(x, vcov. = "HC1", ...)) # Cluster-robust tbl_regression(fit, tidy_fun = \(x, ...) tidy_parglm_robust( x, vcov. = \(m) sandwich::vcovCL(m, cluster = ~ cluster_var), ...))
Value
a data.frame with columns term, estimate,
std.error, statistic, p.value, and (when
conf.int = TRUE) conf.low and conf.high.
Examples
fp <- parglm(mpg ~ wt + hp, data = mtcars,
control = parglm.control(nthreads = 1L))
if (requireNamespace("sandwich", quietly = TRUE) &&
requireNamespace("lmtest", quietly = TRUE)) {
tidy_parglm_robust(fp)
}