A series system consists of \(m\) independent components, each with lifetime \(T_j\) and survival function \(S_j(t) = P(T_j > t)\). The system fails when any component fails, so the system lifetime is:
\[T_{sys} = \min(T_1, \ldots, T_m)\]
Since the components are independent:
\[S_{sys}(t) = P(T_{sys} > t) = P(T_1 > t, \ldots, T_m > t) = \prod_{j=1}^{m} S_j(t)\]
Taking the negative logarithm gives the cumulative hazard:
\[H_{sys}(t) = -\log S_{sys}(t) = \sum_{j=1}^{m} H_j(t)\]
Differentiating yields the hazard rate:
\[h_{sys}(t) = \frac{d}{dt} H_{sys}(t) = \sum_{j=1}^{m} h_j(t)\]
This is the fundamental result: the system hazard is the sum of component hazards.
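Before turning to the package, the factorization can be sanity-checked by simulation in base R. This sketch uses one exponential and one Weibull component (parameters chosen for illustration) and compares the empirical survival of the minimum against the analytical product:

```r
# Monte Carlo sanity check of S_sys(t) = prod_j S_j(t): simulate two
# independent component lifetimes, take the minimum, and compare the
# empirical survival at t = 50 with the analytical product.
set.seed(7)
n  <- 1e5
t1 <- rexp(n, rate = 0.05)                 # exponential component
t2 <- rweibull(n, shape = 2, scale = 100)  # Weibull component
tsys <- pmin(t1, t2)                       # series system fails at the min
mean(tsys > 50)                            # empirical S_sys(50)
exp(-0.05 * 50) * exp(-(50 / 100)^2)       # analytical product, ~0.0639
```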
Let’s verify these relationships numerically.
library(serieshaz)
sys <- dfr_dist_series(list(
dfr_weibull(shape = 2, scale = 100),
dfr_exponential(0.01),
dfr_gompertz(a = 0.001, b = 0.05)
))
h_sys <- hazard(sys)
h1 <- component_hazard(sys, 1)
h2 <- component_hazard(sys, 2)
h3 <- component_hazard(sys, 3)
t_vals <- c(10, 50, 100, 200)
for (t0 in t_vals) {
lhs <- h_sys(t0)
rhs <- h1(t0) + h2(t0) + h3(t0)
cat(sprintf("t=%3d: h_sys=%.6f, sum h_j=%.6f, diff=%.2e\n",
t0, lhs, rhs, abs(lhs - rhs)))
}
#> t= 10: h_sys=0.013649, sum h_j=0.013649, diff=0.00e+00
#> t= 50: h_sys=0.032182, sum h_j=0.032182, diff=0.00e+00
#> t=100: h_sys=0.178413, sum h_j=0.178413, diff=0.00e+00
#> t=200: h_sys=22.076466, sum h_j=22.076466, diff=0.00e+00

# The system survival is the product of component survival functions
S_sys <- surv(sys)
comp1 <- component(sys, 1)
comp2 <- component(sys, 2)
comp3 <- component(sys, 3)
S1 <- surv(comp1)
S2 <- surv(comp2)
S3 <- surv(comp3)
for (t0 in t_vals) {
lhs <- S_sys(t0)
rhs <- S1(t0) * S2(t0) * S3(t0)
cat(sprintf("t=%3d: S_sys=%.6f, prod S_j=%.6f, diff=%.2e\n",
t0, lhs, rhs, abs(lhs - rhs)))
}
#> t= 10: S_sys=0.884286, prod S_j=0.884286, diff=1.11e-16
#> t= 50: S_sys=0.377702, prod S_j=0.377702, diff=0.00e+00
#> t=100: S_sys=0.007096, prod S_j=0.007096, diff=0.00e+00
#> t=200: S_sys=0.000000, prod S_j=0.000000, diff=1.52e-210

# The system's cumulative hazard is the sum of component cumulative hazards
H_sys <- cum_haz(sys)
H1 <- cum_haz(comp1)
H2 <- cum_haz(comp2)
H3 <- cum_haz(comp3)
for (t0 in t_vals) {
lhs <- H_sys(t0)
rhs <- H1(t0) + H2(t0) + H3(t0)
cat(sprintf("t=%3d: H_sys=%.6f, sum H_j=%.6f, diff=%.2e\n",
t0, lhs, rhs, abs(lhs - rhs)))
}
#> t= 10: H_sys=0.122974, sum H_j=0.122974, diff=0.00e+00
#> t= 50: H_sys=0.973650, sum H_j=0.973650, diff=0.00e+00
#> t=100: H_sys=4.948263, sum H_j=4.948263, diff=0.00e+00
#> t=200: H_sys=446.509316, sum H_j=446.509316, diff=0.00e+00

The system density is:
\[f_{sys}(t) = h_{sys}(t) \cdot S_{sys}(t) = \left[\sum_{j=1}^m h_j(t)\right] \exp\left[-\sum_{j=1}^m H_j(t)\right]\]
f_sys <- density(sys)
for (t0 in t_vals) {
from_density <- f_sys(t0)
from_hS <- h_sys(t0) * S_sys(t0)
cat(sprintf("t=%3d: f(t)=%.8f, h(t)*S(t)=%.8f, diff=%.2e\n",
t0, from_density, from_hS, abs(from_density - from_hS)))
}
#> t= 10: f(t)=0.01206938, h(t)*S(t)=0.01206938, diff=0.00e+00
#> t= 50: f(t)=0.01215539, h(t)*S(t)=0.01215539, diff=0.00e+00
#> t=100: f(t)=0.00126597, h(t)*S(t)=0.00126597, diff=0.00e+00
#> t=200: f(t)=0.00000000, h(t)*S(t)=0.00000000, diff=0.00e+00

The density() method and the manual \(h(t) \cdot S(t)\) calculation agree to
machine precision. Note that at \(t =
200\), both the density and survival are effectively zero — by
that time, the combined Weibull + exponential + Gompertz hazard has
driven the cumulative hazard so high that virtually all systems have
already failed.
The series system stores parameters from all components as a single
flat vector. This is critical for optimization — MLE fitting via
optim() requires a single parameter vector.
sys <- dfr_dist_series(list(
dfr_weibull(shape = 2, scale = 100), # 2 params
dfr_exponential(0.05), # 1 param
dfr_gompertz(a = 0.01, b = 0.1) # 2 params
))
# Full parameter vector (5 elements)
params(sys)
#> [1] 2e+00 1e+02 5e-02 1e-02 1e-01
# Layout maps global indices to components
param_layout(sys)
#> [[1]]
#> [1] 1 2
#>
#> [[2]]
#> [1] 3
#>
#> [[3]]
#> [1] 4 5
# Component 1 uses par[1:2], component 2 uses par[3], component 3 uses par[4:5]

When an optimizer proposes a new parameter vector
par = c(p1, p2, p3, p4, p5), the series system internally
slices it: component 1 gets par[1:2], component 2 gets
par[3], and component 3 gets par[4:5].
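The slicing step itself is simple enough to sketch in a few lines of base R, independent of serieshaz (the layout list and parameter values mirror the output above):

```r
# Minimal sketch of the slicing step: a layout list of global index
# vectors splits one flat parameter vector into per-component pieces.
layout <- list(1:2, 3, 4:5)            # Weibull, exponential, Gompertz
par    <- c(2, 100, 0.05, 0.01, 0.1)   # flat vector, as in params(sys)
pieces <- lapply(layout, function(idx) par[idx])
pieces[[1]]  # c(2, 100): shape and scale for component 1
```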
The cumulative hazard \(H(t) = \int_0^t
h(u)\,du\) is needed for survival, CDF, and log-likelihood
computations. When all components provide an analytical
cum_haz_rate, the series system computes \(H_{sys}(t) = \sum_j H_j(t)\) exactly.
Otherwise, it falls back to numerical integration.
# All standard distributions provide analytical cumulative hazard
sys_analytical <- dfr_dist_series(list(
dfr_weibull(shape = 2, scale = 100),
dfr_exponential(0.05)
))
cat("Has analytical cum_haz:", !is.null(sys_analytical$cum_haz_rate), "\n")
#> Has analytical cum_haz: TRUE
# A custom dfr_dist without cum_haz_rate forces numerical fallback
custom <- dfr_dist(
rate = function(t, par, ...) par[1] * t^(par[1] - 1),
par = c(1.5)
)
sys_numerical <- dfr_dist_series(list(
custom,
dfr_exponential(0.05)
))
cat("Has analytical cum_haz:", !is.null(sys_numerical$cum_haz_rate), "\n")
#> Has analytical cum_haz: FALSE

The analytical path is faster and more precise, but the numerical fallback works correctly for any valid hazard function.
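The fallback amounts to quadrature on the hazard. A base-R sketch (illustrative only, not serieshaz internals) using the custom power-law rate above plus the exponential's constant rate, with the closed form as a check:

```r
# Sketch of the numerical fallback: recover H(t) as the integral of the
# hazard when no analytical cumulative hazard is available.
h <- function(t) 1.5 * t^0.5 + 0.05  # custom power-law rate + constant rate
H_num   <- integrate(h, 0, 10)$value
H_exact <- 10^1.5 + 0.05 * 10        # closed form: t^1.5 + 0.05 * t
abs(H_num - H_exact)                 # quadrature error is negligible
```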
The log-likelihood for a series system with observed data \((t_i, \delta_i)\), where \(\delta_i = 1\) marks an exact observation and \(\delta_i = 0\) a right-censored one, is:
\[\ell(\boldsymbol{\theta}) = \sum_{i} \left[\delta_i \log h_{sys}(t_i; \boldsymbol{\theta}) - H_{sys}(t_i; \boldsymbol{\theta})\right]\]
Left-censored observations (\(\delta_i = -1\)) instead contribute \(\log\left[1 - \exp\left(-H_{sys}(t_i; \boldsymbol{\theta})\right)\right]\), the log of the CDF.
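As a concrete check of the formula, here is a base-R sketch for an all-exponential series, where \(h_{sys}(t) = \sum_j \lambda_j\) is constant and \(H_{sys}(t) = t \sum_j \lambda_j\), so with exact observations the log-likelihood reduces to the usual exponential log-density:

```r
# The log-likelihood formula, specialized to an all-exponential series:
# h_sys is the constant sum(lambda) and H_sys(t) = sum(lambda) * t.
loglik_series_exp <- function(t, delta, lambda) {
  rate <- sum(lambda)
  sum(delta * log(rate) - rate * t)
}
set.seed(1)
t  <- rexp(5, rate = 0.6)
ll <- loglik_series_exp(t, delta = rep(1, 5), lambda = c(0.1, 0.2, 0.3))
# with all delta = 1 this equals sum(dexp(t, rate = 0.6, log = TRUE))
```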
A generic gradient computation would apply
numDeriv::grad to the full log-likelihood, perturbing the
entire parameter vector \(\boldsymbol{\theta}\) of length \(P\). Each perturbation re-evaluates all
\(m\) component hazards at all \(n\) observations, for a cost of \(O(P \cdot n \cdot m)\).
serieshaz instead decomposes the gradient. The series structure admits a per-component decomposition: because \(H_{sys} = \sum_j H_j\) and \(h_{sys} = \sum_j h_j\), the gradient with respect to component \(k\)’s parameters \(\boldsymbol{\theta}_k\) separates into two terms:
\[\frac{\partial \ell}{\partial \boldsymbol{\theta}_k} = \underbrace{ \sum_{i:\,\delta_i=1} \frac{1}{h_{sys}(t_i)} \frac{\partial h_k(t_i)}{\partial \boldsymbol{\theta}_k} }_{\text{hazard derivative term}} \;-\; \underbrace{ \sum_{i} \frac{\partial H_k(t_i)}{\partial \boldsymbol{\theta}_k} }_{\text{cumulative hazard derivative term}}\]
The two terms are computed differently:
| Term | Method | Cost |
|---|---|---|
| Hazard derivative | numDeriv::jacobian on \(h_k(t, \boldsymbol{\theta}_k)\) | \(O(p_k \cdot n_{\text{exact}})\) per component |
| Cumulative hazard derivative | Analytical via all-censored trick (when the component provides score_fn) | \(O(n)\) per component |
| Cumulative hazard derivative (fallback) | numDeriv::grad on \(\sum_i H_k(t_i)\) | \(O(p_k \cdot n)\) per component |
Total cost: \(O(P \cdot n)\), versus \(O(P \cdot n \cdot m)\) for the naive approach — roughly an \(m\)-fold speedup.
The key insight for the cumulative hazard derivative is that a
component’s own score_fn computes:
\[\text{score}_k(\boldsymbol{\theta}_k) = \sum_i \delta_i \frac{\partial \log h_k(t_i)}{\partial \boldsymbol{\theta}_k} - \sum_i \frac{\partial H_k(t_i)}{\partial \boldsymbol{\theta}_k}\]
If we call score_fn with all observations marked as
right-censored (\(\delta_i = 0\)), the
first sum vanishes and we get:
\[\text{score}_k\big|_{\delta=0} = -\sum_i \frac{\partial H_k(t_i)}{\partial \boldsymbol{\theta}_k}\]
This extracts the cumulative hazard derivative analytically — no
numerical differentiation needed. All built-in distributions
(exponential, Weibull, Gompertz, log-logistic) provide
score_fn, so this analytical path is the common case.
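The trick is easy to see for a single exponential component, whose score is \(\sum_i \delta_i / \lambda - \sum_i t_i\). A base-R sketch (the function is a stand-in for a component's score_fn, not serieshaz code):

```r
# All-censored trick for an exponential component: the score is
# sum(delta) / lambda - sum(t); setting every delta to 0 kills the
# log-hazard term and leaves -sum(t), exactly -d/dlambda of sum_i H(t_i).
score_exp <- function(t, delta, lambda) sum(delta) / lambda - sum(t)
t <- c(1.2, 3.4, 0.7)
score_exp(t, delta = rep(0, 3), lambda = 0.5)  # -sum(t) = -5.3
```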
sys <- dfr_dist_series(list(
dfr_weibull(shape = 2, scale = 100),
dfr_exponential(0.01),
dfr_gompertz(a = 0.001, b = 0.05)
))
set.seed(42)
times <- sampler(sys)(200)
df <- data.frame(t = times, delta = rep(1L, 200))
par0 <- params(sys)
sc_fn <- score(sys)
ll_fn <- loglik(sys)
# Decomposed score (what serieshaz computes internally)
score_val <- sc_fn(df, par = par0)
# Independent finite-difference verification
eps <- 1e-6
fd_score <- numeric(length(par0))
for (k in seq_along(par0)) {
par_p <- par_m <- par0
par_p[k] <- par0[k] + eps
par_m[k] <- par0[k] - eps
fd_score[k] <- (ll_fn(df, par = par_p) - ll_fn(df, par = par_m)) / (2 * eps)
}
result <- data.frame(
Parameter = c("shape", "scale", "rate", "a", "b"),
Decomposed = round(score_val, 6),
FiniteDiff = round(fd_score, 6),
AbsDiff = formatC(abs(score_val - fd_score), format = "e", digits = 1)
)
print(result, row.names = FALSE)
#> Parameter Decomposed FiniteDiff AbsDiff
#> shape 3.794196 3.794196 3.6e-08
#> scale 0.058314 0.058314 3.5e-08
#> rate -649.403393 -649.403384 9.0e-06
#> a 1793.760220 1793.766367 6.1e-03
#> b 328.806995 328.806992 2.9e-06

The decomposed score and the finite-difference approximation agree to high precision across all five parameters, spanning three different distribution families.
At the maximum likelihood estimate, the score should be approximately zero (the first-order optimality condition). We compare the score magnitude at the true parameters (generally nonzero with finite \(n\)) versus at the MLE:
sys2 <- dfr_dist_series(list(
dfr_weibull(shape = 2, scale = 100),
dfr_exponential(0.01)
))
set.seed(42)
df2 <- data.frame(t = sampler(sys2)(300), delta = rep(1L, 300))
sc2 <- score(sys2)
cat("Score at true params:", round(sc2(df2), 4), "\n")
#> Score at true params: 2.7316 -0.0614 378.5898
solver <- fit(sys2)
result <- suppressWarnings(solver(df2, par = c(1.5, 80, 0.02)))
cat("MLE: ", round(coef(result), 4), "\n")
#> MLE: 2.4743 109.1372 0.0121
cat("Score at MLE: ", round(sc2(df2, par = coef(result)), 6), "\n")
#> Score at MLE: 0.054226 -0.001491 0.375314

The score at the MLE is orders of magnitude smaller than at the true parameters, confirming the optimizer has found a stationary point. (The residual is due to finite optimizer tolerance, not a problem with the score computation.)
The Hessian also exploits the series structure. It has a natural block decomposition into within-component and cross-component terms:
Within-component block (\(k = l\)): \[\frac{\partial^2 \ell}{\partial \boldsymbol{\theta}_k \partial \boldsymbol{\theta}_k^\top} = \sum_{i:\,\delta_i=1} \left[ \frac{1}{h_{sys}(t_i)} \frac{\partial^2 h_k}{\partial \boldsymbol{\theta}_k \partial \boldsymbol{\theta}_k^\top} - \frac{1}{h_{sys}(t_i)^2} \frac{\partial h_k}{\partial \boldsymbol{\theta}_k} \frac{\partial h_k^\top}{\partial \boldsymbol{\theta}_k} \right] - \sum_i \frac{\partial^2 H_k}{\partial \boldsymbol{\theta}_k \partial \boldsymbol{\theta}_k^\top}\]
Cross-component block (\(k \neq l\)): \[\frac{\partial^2 \ell}{\partial \boldsymbol{\theta}_k \partial \boldsymbol{\theta}_l^\top} = -\sum_{i:\,\delta_i=1} \frac{1}{h_{sys}(t_i)^2} \frac{\partial h_k}{\partial \boldsymbol{\theta}_k} \frac{\partial h_l^\top}{\partial \boldsymbol{\theta}_l}\]
The cross-component blocks are free — they reuse the rate Jacobians
already computed for the score. The cumulative hazard Hessian \(\partial^2 H_k / \partial
\boldsymbol{\theta}_k^2\) is extracted analytically via the
all-censored trick on hess_fn when available. The rate
Hessian \(\partial^2 h_k / \partial
\boldsymbol{\theta}_k^2\) uses numDeriv::hessian
per-component per observation.
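Under the two block formulas above, the simplest case, two exponential components with only exact observations, collapses to a rank-one Hessian. A base-R sketch (illustrative only, not serieshaz internals):

```r
# Hessian blocks for two exponential components with all observations
# exact: dh_k/dl_k = 1 and all second derivatives of h_k and H_k vanish,
# so every entry (within- and cross-component) is -n / (l1 + l2)^2 and
# only the total rate l1 + l2 is identifiable.
l1 <- 0.1; l2 <- 0.2
n  <- 3                                   # number of exact observations
Hmat <- matrix(-n / (l1 + l2)^2, 2, 2)
eigen(Hmat, symmetric = TRUE)$values      # 0 and -2 * n / (l1 + l2)^2
```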
H_fn <- hess_loglik(sys2)
hess_val <- H_fn(df2, par = coef(result))
evals <- round(eigen(hess_val, symmetric = TRUE)$values, 2)
cat("Hessian eigenvalues at MLE:", evals, "\n")
#> Hessian eigenvalues at MLE: -0.01 -8.85 -966824.3
cat("Hessian is symmetric:", all.equal(hess_val, t(hess_val)), "\n")
#> Hessian is symmetric: TRUE

All eigenvalues are negative, confirming the MLE is a local maximum. (For a non-identifiable model, such as a series of exponentials where only the total rate \(\sum_j \lambda_j\) is identifiable, some eigenvalues would be near zero.)
When all components are exponential with rates \(\lambda_1, \ldots, \lambda_m\), the system hazard is constant: \(h_{sys}(t) = \sum_j \lambda_j\). This is equivalent to a single exponential with rate \(\lambda = \sum_j \lambda_j\).
rates <- c(0.1, 0.2, 0.3)
sys <- dfr_dist_series(lapply(rates, dfr_exponential))
equiv <- dfr_exponential(sum(rates))
h_sys <- hazard(sys)
h_eq <- hazard(equiv)
S_sys <- surv(sys)
S_eq <- surv(equiv)
for (t0 in c(1, 5, 10, 50)) {
cat(sprintf("t=%2d: h_sys=%.4f h_eq=%.4f | S_sys=%.6f S_eq=%.6f\n",
t0, h_sys(t0), h_eq(t0), S_sys(t0), S_eq(t0)))
}
#> t= 1: h_sys=0.6000 h_eq=0.6000 | S_sys=0.548812 S_eq=0.548812
#> t= 5: h_sys=0.6000 h_eq=0.6000 | S_sys=0.049787 S_eq=0.049787
#> t=10: h_sys=0.6000 h_eq=0.6000 | S_sys=0.002479 S_eq=0.002479
#> t=50: h_sys=0.6000 h_eq=0.6000 | S_sys=0.000000 S_eq=0.000000

When all \(m\) components are Weibull with the same shape \(k\) and scale \(\lambda\), the system is also Weibull with shape \(k\) and scale \(\lambda / m^{1/k}\), since \(S_{sys}(t) = \exp[-m(t/\lambda)^k] = \exp[-(t/(\lambda m^{-1/k}))^k]\):
m <- 3
shape <- 2
scale <- 100
sys <- dfr_dist_series(replicate(
m, dfr_weibull(shape = shape, scale = scale), simplify = FALSE
))
equiv <- dfr_weibull(shape = shape, scale = scale / m^(1/shape))
S_sys <- surv(sys)
S_eq <- surv(equiv)
for (t0 in c(10, 30, 50)) {
cat(sprintf("t=%2d: S_sys=%.6f S_equiv=%.6f diff=%.2e\n",
t0, S_sys(t0), S_eq(t0), abs(S_sys(t0) - S_eq(t0))))
}
#> t=10: S_sys=0.970446 S_equiv=0.970446 diff=0.00e+00
#> t=30: S_sys=0.763379 S_equiv=0.763379 diff=1.11e-16
#> t=50: S_sys=0.472367 S_equiv=0.472367 diff=5.55e-17

A series system with one component is equivalent to that component:
single <- dfr_dist_series(list(dfr_weibull(shape = 2, scale = 100)))
direct <- dfr_weibull(shape = 2, scale = 100)
h1 <- hazard(single)
h2 <- hazard(direct)
S1 <- surv(single)
S2 <- surv(direct)
for (t0 in c(10, 50, 100)) {
cat(sprintf("t=%3d: h=%.6f/%.6f S=%.6f/%.6f\n",
t0, h1(t0), h2(t0), S1(t0), S2(t0)))
}
#> t= 10: h=0.002000/0.002000 S=0.990050/0.990050
#> t= 50: h=0.010000/0.010000 S=0.778801/0.778801
#> t=100: h=0.020000/0.020000 S=0.367879/0.367879