Skip to contents

This vignette shows how to:

  • fit the STAGE model to a simple simulated dataset,
  • extract posterior transition points, and
  • extend to a hierarchical (multi-population) model.

Simulating data

To demonstrate the workflow, we simulate a simple dataset of lengths in [L,U][L, U] with a smooth transition around a true m50m_{50}.

N <- 200
L <- 1000
U <- 1500
true_m50 <- 1250
true_d   <- 100

# Simulate covariate
x <- runif(N, L, U)

# Smooth transition in P(mature)
p <- plogis((x - true_m50) / (true_d / 4))

# Binary maturity indicator
y <- rbinom(N, 1, p)

head(tibble(x = x, y = y))
#> # A tibble: 6 × 2
#>       x     y
#>   <dbl> <int>
#> 1 1133.     0
#> 2 1186.     0
#> 3 1286.     1
#> 4 1454.     1
#> 5 1101.     0
#> 6 1449.     1

Fitting the STAGE model

We now fit the STAGE model using fit_stage(). Because STAGE is a Bayesian model fit via Stan, we keep chains and iter small in this vignette to keep runtime acceptable; for real applications you should increase these.

fit <- fit_stage(x, y, L = L, U = U, chains = 2, iter = 1000)
fit

You can inspect the posterior with standard tools such as summary(fit$fit), bayesplot, or posterior utilities if you want more diagnostics.

Extracting the transition point

The helper transition_point() returns posterior summaries of the global transition point m50m_{50}.

For a single-population fit, this returns a list with a single element global: a named numeric vector with posterior mean, median, q2.5, and q97.5.

Predicting maturity probabilities

We can also predict Pr(maturex)\Pr(\text{mature} \mid x) over a grid of lengths, and plot the posterior mean transition curve.

x_grid <- seq(L, U, length.out = 200)

p_hat <- predict(fit, x_grid, type = "prob")

plot(
  x_grid, p_hat, type = "l",
  xlab = "Length (x)",
  ylab = "P(mature)",
  las = 1
)

Hierarchical (multi-population) model

In many applications we have multiple populations (e.g. regions, stocks, or sexes) and we expect:

  • a shared global transition point m50m_{50}, and
  • population-specific deviations.

STAGE implements this using a hierarchical structure 𝚖𝟻𝟶_𝚙𝚘𝚙[j]=m50+zjσα, \texttt{m50\_pop}[j] = m_{50} + z_j \,\sigma_\alpha, with a non-centred parameterisation for efficient sampling.

A simple two-group example:

group <- rep(c("A", "B"), each = N/2)

fit2 <- fit_stage(
  x, y,
  group = group,
  L = L, U = U,
  chains = 2, iter = 1000
)

transition_point(fit2)

transition_point(fit2) returns a list with:

  • global: posterior summary of the global m50m_{50} (mean, median, q2.5, q97.5), and
  • pop1, pop2, …: population-specific entries with summaries of m50_pop[j].

Summary

In this vignette we:

  • introduced the plateau–Gaussian likelihood underlying STAGE,
  • visualised the class densities and resulting transition probabilities,
  • simulated a simple dataset,
  • fit the STAGE model to estimate the transition point, and
  • extended the model to a hierarchical multi-population setting.

In practice, you will want to:

  • choose [L,U][L, U] to capture the transition zone plus some plateau on each side,
  • check posterior diagnostics (divergences, R-hat, effective sample sizes),
  • and interpret m50m_{50} and its uncertainty in the biological context.

Future vignettes will expand on:

  • comparing STAGE to logistic regression and Gaussian LDA, and
  • sensitivity to the choice of [L,U][L, U] and prior settings.