This vignette allows to reproduce the results for the case study
discussed in Malsiner-Walli, Frühwirth-Schnatter,
and Grün (2026) for the diabetes data set from
package mclust. In particular, we illustrate the CliPS
approach proposed in Malsiner-Walli,
Frühwirth-Schnatter, and Grün (2026) for a Bayesian multivariate
Gaussian mixture with a prior on the number of components \(K\). First, we fit the dynamic mixture of
multivariate Gaussian mixtures to the data using the telescoping
sampler. Then, based on the posterior draws, we identify the mixture by
clustering the component means in the point process representation. The
numbering of the following steps is aligned with the numbering of the
CLiPS procedure in Malsiner-Walli,
Frühwirth-Schnatter, and Grün (2026).
We start by loading the package, the data and plotting the data.
library("telescope")
data("diabetes", package = "mclust")
y <- diabetes[, c("glucose", "insulin", "sspg")]
z <- diabetes[, "class"]
pairs(y, col = c("darkred", "blue", "darkgreen")[z], pch = 19)We use the prior specification and the telescoping sampler for MCMC sampling as proposed in Frühwirth-Schnatter, Malsiner-Walli, and Grün (2021). In particular, the prior on the number of components \(K\) is selected as a beta-negative-binomial distribution with parameters \((1,4,3)\). The mixture weights a-priori have a Dirichlet distribution with parameter \(\alpha/K\) where \(\alpha = 0.5\).
priorOnK <- priorOnK_spec("BNB_143")
priorOnWeights <- priorOnAlpha_spec("alpha_const", alpha = 0.5)We specify the prior on the component parameters as in Frühwirth-Schnatter, Malsiner-Walli, and Grün (2021).
r <- ncol(y)
R <- apply(y, 2, function(x) diff(range(x)))
b0 <- apply(y, 2, median)
B_0 <- rep(1, r)
B0 <- diag((R^2) * B_0)
c0 <- 2.5 + (r-1)/2
g0 <- 0.5 + (r-1)/2
G0 <- 100 * g0/c0 * diag((1/R^2), nrow = r)
C0 <- g0 * chol2inv(chol(G0))For the sampling, we set the number of burn-in iterations
burnin to be discarded and the number of recorded
iterations M after thinning with thin.
\(k\)-means clustering into 3 groups is used to get an initial partition which is used to determine initial values for the component specific means and covariances. The component weights are initialized with equal weights.
Using this prior specification as well as initialization and MCMC
settings, we draw samples from the posterior using the telescoping
sampler. For computational ease, we set a maximum number of components
to be considered using Kmax.
Kmax <- 50
samples <-
sampleMultNormMixture(
y, z0, mu, Sigma, eta,
c0, g0, G0, C0, b0, B0,
M, burnin, thin,
Kmax = Kmax, G = "MixDynamic",
priorOnK, priorOnWeights = priorOnWeights)The sampling function returns a named list where the sampled
parameters and latent variables are contained. The list includes the
component means Mu, the weights Eta, the
allocations S, the number of observations Nk
assigned to components, the number of components K, the
number of filled components Kplus, and the parameter values
corresponding to the mode of the nonnormalized posterior
nonnormpost_mode_list. These values are extracted for
further post-processing.
Mu <- samples$Mu
Eta <- samples$Eta
S <- samples$S
Nk <- samples$Nk
K <- samples$K
Kplus <- samples$Kplus
nonnormpost_mode_list <- samples$nonnormpost_modeDiagnostic plots of the run show the sampled \(K\) and \(K_+\) and the sampled weights \(\eta_k\), see Figure 5 of Malsiner-Walli, Frühwirth-Schnatter, and Grün (2026).
par(mfrow = c(1, 2))
with(samples,
matplot(burnin + 1:M, cbind(K, Kplus), type = "l", lty = 1,
col = c("grey", "black"),
xlab = "iteration",
ylab = expression(`K/` ~K["+"]),
ylim = c(0, Kmax)))
matplot(burnin + 1:M, samples$Eta, type = "l", lty = 1, col = 1,
xlab = "iteration", ylim = 0:1,
ylab = expression(eta["k"]))The following plot shows the pairwise scatter plot of all sampled component means (within suitable ranges), see Figure 6 in Malsiner-Walli, Frühwirth-Schnatter, and Grün (2026).
Mu_ <- do.call("rbind", lapply(1:Kmax, function(k) samples$Mu[,,k])) |>
as.data.frame() |> na.omit()
colnames(Mu_) <- colnames(y)
Mu_ <- subset(Mu_,
(glucose >= 50 & glucose <= 320) &
(sspg >= 0 & sspg <= 500) &
(insulin >= 300 & insulin <= 1300))
pairs(Mu_, col = rgb(0, 0, 0, 0.2), pch = 19)We determine the posterior of the number of filled components.
## [1] 0.000 0.000 0.948 0.052
The number of clusters \(\hat{K}_+\) is estimated by taking the mode of the posterior of \(K_+\).
## [1] 3
The number of draws \(M_0\) where \(K_+ = \hat{K}_+\) is determined.
## [1] 948
We determine the indices of those iterations which have exactly
Kplus_hat filled components. For each parameter, we extract
those draws with exactly \(\hat{K}_+\)
filled components and eliminate the draws of empty components.
index <- Kplus == Kplus_hat
Nk[is.na(Nk)] <- 0
Nk_Kplus <- (Nk[index, ] > 0)
Mu_inter <- Mu[index, , , drop = FALSE]
Mu_Kplus <- array(0, dim = c(M0, r, Kplus_hat))
for (j in 1:r) {
Mu_Kplus[, j, ] <- Mu_inter[, j, ][Nk_Kplus]
}
Eta_inter <- Eta[index, ]
Eta_Kplus <- matrix(Eta_inter[Nk_Kplus], ncol = Kplus_hat)
w <- which(index)
S_Kplus <- matrix(0, M0, nrow(y))
for (i in seq_along(w)) {
m <- w[i]
perm_S <- rep(0, Kmax)
perm_S[Nk[m, ] != 0] <- 1:Kplus_hat
S_Kplus[i, ] <- perm_S[S[m, ]]
}We call the function identifyMixture() of the package
telescope to cluster the draws. The argument
Func contains the array of the functional values \(\phi(\theta_k)\) with dimension (\(M_0 \times K_+
\times d\)), where \(d=
dim(\phi(\theta_k))\). This array contains the draws for
clustering. Func_init contains the centers of the clusters
used for initializing \(k\)-means. The
draws in Mu_Kplus, Eta_Kplus,
S_Kplus are reordered according to the classification
sequence obtained with the \(k\)-means
algorithm.
Func_init <- t(nonnormpost_mode_list[[Kplus_hat]]$mu)
identified_Kplus <- identifyMixture(
Func = Mu_Kplus, Mu_Kplus, Eta_Kplus, S_Kplus, Func_init)identifyMixture() returns a named list where
S, Mu, and Eta contain the
relabed draws after having discarded draws which are not permutations,
non_perm_rate gives the non-permutation rate, and
class contains the labels of the functionals obtained with
the \(k\)-means algorithm.
## [1] 0
The non-permutation rate is \(0\).
We visualize the relabeled draws of the component means.
Mu_ <- do.call("rbind", lapply(1:Kplus_hat, function(k) identified_Kplus$Mu[,,k])) |>
as.data.frame()
colnames(Mu_) <- colnames(y)
z_ <- identified_Kplus$class
COLS <- apply(rbind(col2rgb(c("darkred", "blue", "darkgreen")),
alpha = 0.2 * 255) / 255,
2, function(x) do.call("rgb", as.list(x)))
pairs(Mu_, col = COLS[z_], pch = 19)We use the relabeled draws to characterize the cluster distribution.
We estimate the cluster specific parameters (e.g., posterior means and
cluster sizes) and determine the final partition by assigning each
observation to the cluster where it was assigned most frequently. The
final partition is stored in z_sp. Finally, the estimated
clustering solution is visualized.
## [,1] [,2] [,3]
## [1,] 105.1293 91.34298 228.83973
## [2,] 500.9749 362.36677 1095.12236
## [3,] 320.5894 166.57387 82.77191
## [1] 0.2385851 0.5640373 0.1967115
## z_sp
## 1 2 3
## 33 84 28
## z_sp
## z 1 2 3
## Chemical 24 11 1
## Normal 3 73 0
## Overt 6 0 27
## [1] 0.8551724
## [1] 0.6531253