The stage package implements the STAGE model (State Transition Analysis via Generative Estimator), a Bayesian generative classifier for estimating transition points (m50) from two groups—at what value of does become probable. An example is length-at-maturity problems, where is binary maturity status, and is a continuous measurement, such as length.
Existing methods for these sorts of applications have important limitations (see below), which the STAGE model overcomes.
Features of the STAGE model are:
- Each class is fit with a mixture distribution that combines a uniform plateau with a Gaussian tail
- Estimation of the transition point (m50) is based on the relative densities (Bayes’ rule) of the classes given , not discriminative regression
- A shared transition region centred at m50 with width d.
The goal is a model transition-point estimation that focuses inference on data in and around the transition point while being robust to unbalanced sample sizes.
![]()
The process of testing the STAGE model against other methods is underway.
Package features
-
fit_stage()— fit STAGE model (single or multi-population) -
transition_point()— summarise posteriors for m50 -
predict.stage_fit()— predict class probabilities or labels -
stage_priors()— prior helper -
get_cmdstan_model()— internal CmdStan compiler helper
Planned additions:
- Reparameterised logistic/HOF model
- Bayesian LDA
- Posterior predictive classification
- Diagnostic + visualisation tools
- Full vignette
Installation
You need cmdstanr + CmdStan installed:
install.packages("cmdstanr",
repos = c("https://mc-stan.org/r-packages/", getOption("repos"))
)
cmdstanr::install_cmdstan() # run onceThen install from GitHub:
devtools::install_github("anhsmith/stage")Example
library(stage)
set.seed(123)
N <- 200
L <- 1000
U <- 1500
true_m50 <- 1250
true_d <- 100
# simulate data
x <- runif(N, L, U)
p <- plogis((x - true_m50) / (true_d / 4))
y <- rbinom(N, 1, p)
# fit STAGE model
fit <- fit_stage(x, y, L = L, U = U, chains = 2, iter = 1000)
fitEstimated transition point:
transition_point(fit)Prediction:
x_grid <- seq(L, U, length.out = 100)
p_hat <- predict(fit, x_grid, type = "prob")
plot(x_grid, p_hat, type = "l",
xlab = "x", ylab = "P(mature)", las = 1)Hierarchical model
Multiple populations:
group <- rep(c("A","B"), each = N/2)
fit2 <- fit_stage(x, y, group = group, L = L, U = U)
transition_point(fit2)This returns:
- global transition point
m50 - population-specific transition points
m50_pop[j]
