Implementing a dist_structure subclass

library(dist.structure)

dist.structure is a protocol package: the closed-form constructors (exp_series, exp_kofn, …) and topology shortcuts (series_dist, kofn_dist, bridge_dist, …) are reference implementations of an S3 contract that anyone can extend. Downstream packages do exactly that: serieshaz adds a dfr_dist_series class for arbitrary dynamic-failure-rate series systems; kofn builds inference machinery on the closed-form k-out-of-n classes; future packages will add their own.

This vignette walks through the protocol from the implementor’s side: what to provide, what you get for free, when to override defaults for performance, and how to validate your subclass.

The contract: four steps

Every dist_structure subclass provides:

  1. A class chain that includes "dist_structure", "univariate_dist", and "dist". Almost always also a sub-chain like "coherent_dist" if the system is coherent.
  2. ncomponents(x): number of components m.
  3. component(x, j, ...): the j-th component returned as an algebraic.dist::dist object with parameters baked in. The returned dist must be independently evaluable: surv(component(x, j))(t), sampler(component(x, j))(n), etc., must all work.
  4. At least one of:
    • phi(x, state): the structure function, returning 0 or 1 for any binary state vector of length m.
    • min_paths(x): a list of integer vectors enumerating minimal functioning component subsets.
    phi and min_paths are dual: each has a default that derives from the other. If you provide both, they must be consistent (phi(x, state) == 1 iff state covers some path in min_paths(x)). Inconsistent implementations produce silently inconsistent results, because reliability / critical_states / is_coherent route through phi while min_cuts / system_signature route through min_paths.

Everything else has a default that derives from those four primitives. Defaults exist for min_cuts, system_signature, critical_states, reliability, is_coherent, structural_importance, system_lifetime, system_censoring, dual, surv, cdf, and sampler.

A worked example: an alarm system

Consider a fire alarm with m redundant smoke sensors and one master controller. The system functions if (i) at least k of the m sensors function AND (ii) the master controller functions. This is not exactly a k-out-of-n system (the master is special) and not exactly a series-of-(parallel-of-sensors-with-master) (the master gates everything). We can capture it directly as a custom dist_structure.

Step 1: pick a representation

Components are indexed 1..m for the sensors and m+1 for the master.

Step 2: write the constructor

alarm_dist <- function(k, sensor_components, master_component) {
  m_sensors <- length(sensor_components)
  stopifnot(k >= 1L, k <= m_sensors,
            inherits(master_component, "dist"),
            all(vapply(sensor_components,
                       function(d) inherits(d, "dist"),
                       logical(1L))))
  structure(
    list(
      k = as.integer(k),
      m_sensors = as.integer(m_sensors),
      m = as.integer(m_sensors + 1L),
      components = c(sensor_components, list(master_component))
    ),
    class = c("alarm_dist", "dist_structure",
              "univariate_dist", "continuous_dist", "dist")
  )
}

The class chain declares membership in dist_structure and the algebraic.dist ancestors. We bundle the master into the components list so the j-th component is uniformly accessible.

Step 3: implement the three required generics

ncomponents.alarm_dist <- function(x) x$m

component.alarm_dist <- function(x, j, ...) {
  stopifnot(j >= 1L, j <= x$m)
  x$components[[j]]
}

phi.alarm_dist <- function(x, state) {
  stopifnot(length(state) == x$m)
  sensor_state <- state[seq_len(x$m_sensors)]
  master_state <- state[x$m]
  as.integer(sum(sensor_state) >= x$k && master_state == 1L)
}

That’s all that’s strictly required by the protocol.

Step 4: validate

sensors <- replicate(4, algebraic.dist::exponential(0.5), simplify = FALSE)
master <- algebraic.dist::exponential(0.1)  # master is most reliable
alarm <- alarm_dist(k = 2L, sensors, master)

validate_dist_structure(alarm)

If we had forgotten phi.alarm_dist, validate_dist_structure() would have stopped with a clear error pointing at the missing primitive instead of letting the failure surface much later in some default method’s dispatch.

What you get for free

Every default below is now available on alarm:

# Number of components
ncomponents(alarm)
#> [1] 5

# Structure function evaluation
phi(alarm, c(1, 1, 0, 0, 1))   # 2 sensors + master alive: alarms
#> [1] 1
phi(alarm, c(1, 1, 1, 1, 0))   # all sensors but no master: silent
#> [1] 0

# Minimal cut sets (derived from phi via the Berge transversal)
min_cuts(alarm)
#> [[1]]
#> [1] 1 2 3
#> 
#> [[2]]
#> [1] 1 2 4
#> 
#> [[3]]
#> [1] 1 3 4
#> 
#> [[4]]
#> [1] 2 3 4
#> 
#> [[5]]
#> [1] 5

# Critical states for component j
critical_states(alarm, 5L)  # the master: critical in every state where >= k sensors are alive
#>       [,1] [,2] [,3] [,4]
#>  [1,]    1    1    0    0
#>  [2,]    1    0    1    0
#>  [3,]    0    1    1    0
#>  [4,]    1    1    1    0
#>  [5,]    1    0    0    1
#>  [6,]    0    1    0    1
#>  [7,]    1    1    0    1
#>  [8,]    0    0    1    1
#>  [9,]    1    0    1    1
#> [10,]    0    1    1    1
#> [11,]    1    1    1    1

# Reliability at uniform component reliability p
reliability(alarm, 0.9)
#> [1] 0.89667

# Distribution algebra: system survival via component composition
S <- algebraic.dist::surv(alarm)
S(c(1, 5, 10))
#> [1] 0.7494236430 0.0219195370 0.0000993122

# Sampling: m independent component lifetimes, combined via system_lifetime
withr::with_seed(1, {
  x <- algebraic.dist::sampler(alarm)(1000)
  mean(x)
})
#> [1] 1.878051

Notice that min_cuts, critical_states, reliability, surv, and sampler all “just work” without being implemented for alarm_dist. They composed from ncomponents, component, phi, and the component-level distribution algebra.

When to override a default

Defaults are designed for correctness across all topologies. They are not always the fastest path. The two most common reasons to override:

1. Closed-form distributional shortcuts.

The default surv.dist_structure enumerates over component states via the reliability polynomial, which is O(2^m). If your subclass has an analytical formula for S_sys(t), override surv directly:

surv.alarm_dist <- function(x, ...) {
  k <- x$k
  m_sens <- x$m_sensors
  sensor_surv <- lapply(seq_len(m_sens),
                        function(j) algebraic.dist::surv(component(x, j)))
  master_surv <- algebraic.dist::surv(component(x, x$m))
  function(t, ...) {
    vapply(t, function(ti) {
      # P(>= k sensors alive at ti) * P(master alive at ti).
      ps <- vapply(sensor_surv, function(S) S(ti), numeric(1L))
      sensor_p <- sum(vapply(seq.int(k, m_sens), function(sz) {
        sum(vapply(utils::combn(m_sens, sz, simplify = FALSE),
                   function(A) {
                     prod(ps[A]) * prod(1 - ps[setdiff(seq_len(m_sens), A)])
                   }, numeric(1L)))
      }, numeric(1L)))
      sensor_p * master_surv(ti)
    }, numeric(1L))
  }
}

2. Topology-specific shortcuts.

Closed-form min_paths saves the Berge transversal from rederiving them. For our alarm system, every minimal path is a (k-of-m-sensors) subset unioned with the master:

min_paths.alarm_dist <- function(x) {
  k <- x$k
  m_sens <- x$m_sensors
  master_idx <- x$m
  lapply(utils::combn(m_sens, k, simplify = FALSE),
         function(P) sort(c(as.integer(P), master_idx)))
}

With min_paths.alarm_dist defined, min_cuts(alarm) becomes the default Berge transversal but starts from a small explicit list rather than re-deriving paths from phi (which it would do via the default chain through binary_grid).

The general rule: provide what you know in closed form; rely on defaults for the rest. Do not pre-emptively override a default unless you have a reason; correctness across all dispatch chains is much easier when only the primitives are subclass-specific.

Inheriting from coherent_dist

If your subclass is a coherent system, you can inherit from coherent_dist and skip step 3 entirely:

parallel_of_master_and_sensors <- function(sensor_components, master_component) {
  components <- c(sensor_components, list(master_component))
  m <- length(components)
  obj <- coherent_dist(
    min_paths = list(seq_len(m - 1L), m),  # all sensors as one path; master as another
    components = components,
    m = m
  )
  class(obj) <- c("master_or_sensors_dist", class(obj))
  obj
}

coherent_dist already implements ncomponents, component, and min_paths for you; you supply only the topology-specific data (min_paths and components). The class chain automatically includes "coherent_dist", so all the usual dispatch (including dual.coherent_dist, which produces a real dual coherent_dist instead of a lazy wrapper) takes over.

What the default dual does

dual(x) produces the dual structure phi_dual(state) = 1 - phi(x, 1 - state). When x inherits from coherent_dist, the dual.coherent_dist method swaps minimal paths and cuts to produce a fresh coherent_dist. When x is a generic dist_structure (your own subclass), the default returns a lazy wrapper of class "dual_of_system". The lazy wrapper implements ncomponents, component, and phi by deferring to the original; everything else (signatures, importance, surv) flows through default methods exactly as for any other dist_structure.

If you want dual() on your subclass to return something more specific, override:

dual.alarm_dist <- function(x) {
  # ... return a transformed alarm_dist ...
}

Cross-references

For more detail on any specific generic, see the ?phi, ?min_paths, ?dual, ?reliability, and ?dist_structure help pages.