X ~ normal(mu, sigma) with target += normal_lcdf(X | mu, sigma) when \(X\) is in fact the upper bound of a left-censored observation, that is \(\mathcal{N}(\mu, \sigma^2)\) distributed. For a right-censored
observation you have to use the complementary cumulative distribution function normal_lccdf(X | mu, sigma).
A way to encode censoring in a Stan model such that all types of data can be treated equally is by using a custom probability distribution function
censored_normal_lpdf that takes an additional argument cc that determines the type of censoring. Below we use the following codes:
| Uncensored | 0 |
| Left-censored | 1 |
| Right-censored | 2 |
| Missing data | 3 |
The function
censored_normal_lpdf is defined in the functions of the Stan model below.
In addition to N Observations, we add N CensorCodes that
must be either 0, 1, 2, or 3 in the data block.
In the model block, we now can use the
sampling statement Observations[n] ~ censored_normal(mu, sigma, CensorCodes[n]),
because this is just syntactic sugar for
target += censored_normal_lpdf(Observations[n] | mu, sigma, CensorCodes[n]).
Vectorization unfortunately does not work with this method.
To demonstrate the
censored_normal distribution,
I added the following python script that generates some random data
and compiles and runs the Stan model.
The output of the script will be something like
mu: -0.018, 95% CrI: [-0.176, 0.133] sigma: 1.071, 95% CrI: [0.953, 1.203]
No comments:
Post a Comment