Survival Models
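Survival models belong to the generalized linear model (GLM) family and are widely used to model censored data, for example in user churn prediction. This post introduces the most basic survival model and maximum likelihood estimation (MLE) on censored data.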
Survival Function
Let $T$ be a continuous random variable denoting the time at which death (the event) occurs, with probability density function $f(t)$. Its cumulative distribution function is:
$$ F(t) = P\lbrace T < t\rbrace = \int_0^t f(x) dx \tag{1.1} $$
The survival function is then the probability of surviving beyond $t$:
$$ S(t) = P\lbrace T > t\rbrace = 1 - F(t) = \int_t^\infty f(x) dx \tag{1.2} $$
Hazard Function
An alternative way to characterize the distribution of $T$ is the hazard function, the instantaneous rate of occurrence of the event:
$$ \begin{align} \lambda(t) &= \lim_{dt \to 0} \frac{P\lbrace t \le T < t + dt | T \ge t\rbrace}{dt} \\ &= \lim_{dt \to 0} \frac{P\lbrace t \le T < t + dt \rbrace}{P \lbrace T \ge t\rbrace dt} \\ &= \lim_{dt \to 0} \frac{f(t)dt}{S(t) dt} \\ &= \frac{f(t)}{S(t)} \end{align} \tag{2.1} $$
Given $(1.2)$ we have $\frac{d}{dt} S(t) = -f(t)$, so $(2.1)$ has another form
$$ \lambda(t) = -\frac{d}{dt} \log S(t) \tag{2.2} $$
Conversely, integrating $(2.2)$ from $0$ to $t$ with $S(0) = 1$ recovers the survival function from the hazard function:
$$ S(t) = \exp\lbrace - \int_0^t \lambda(x)dx \rbrace = \exp\lbrace -\Lambda(t) \rbrace \tag{2.3} $$
where $\Lambda(t) = \int_0^t \lambda(x)dx$ is called the cumulative hazard.
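As a minimal sketch of $(2.3)$, the snippet below approximates the cumulative hazard with the trapezoidal rule and exponentiates it to obtain the survival function. The function names and the linearly increasing hazard $\lambda(t) = 0.5t$ are illustrative assumptions, not part of the original text.

```python
import math

def cumulative_hazard(hazard, t, steps=10_000):
    """Approximate Lambda(t) = integral of hazard(x) over [0, t] (trapezoidal rule)."""
    if t == 0:
        return 0.0
    h = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for k in range(1, steps):
        total += hazard(k * h)
    return total * h

def survival_from_hazard(hazard, t, steps=10_000):
    """S(t) = exp(-Lambda(t)), as in equation (2.3)."""
    return math.exp(-cumulative_hazard(hazard, t, steps))

# Hypothetical hazard growing linearly in time: lambda(t) = 0.5 * t,
# so Lambda(t) = 0.25 * t^2 and S(t) = exp(-0.25 * t^2) in closed form.
def linear_hazard(t):
    return 0.5 * t

s_numeric = survival_from_hazard(linear_hazard, 2.0)
s_closed_form = math.exp(-0.25 * 2.0 ** 2)
print(s_numeric, s_closed_form)
```

The two printed values should agree closely, since the trapezoidal rule is exact for a linear integrand up to floating-point rounding.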
Example 2.1
Consider a constant risk over time: $$ \lambda(t) = \lambda $$ From $(2.3)$ and $f(t) = \lambda(t)S(t)$, the corresponding survival function and pdf are $$ \begin{align} S(t) &= \exp\lbrace - \int_0^t \lambda(x)dx \rbrace = e^{-\lambda t} \\ f(t) &= \lambda e^{-\lambda t} \end{align} $$ This is exactly the exponential distribution with rate $\lambda$.
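The constant-hazard example can be checked numerically. The sketch below evaluates the hazard both as $f(t)/S(t)$ from $(2.1)$ and as a finite-difference approximation of $-\frac{d}{dt}\log S(t)$ from $(2.2)$. The rate `lam = 0.3` and the step size `eps` are arbitrary assumptions chosen for illustration.

```python
import math

lam = 0.3  # hypothetical constant hazard rate

def S(t):
    """Survival function of the exponential distribution."""
    return math.exp(-lam * t)

def f(t):
    """Density of the exponential distribution."""
    return lam * math.exp(-lam * t)

def hazard_ratio(t):
    """Hazard as f(t) / S(t), equation (2.1)."""
    return f(t) / S(t)

def hazard_log_derivative(t, eps=1e-6):
    """Hazard as -d/dt log S(t), equation (2.2), via a central difference."""
    return -(math.log(S(t + eps)) - math.log(S(t - eps))) / (2 * eps)

for t in (0.5, 2.0, 10.0):
    print(t, hazard_ratio(t), hazard_log_derivative(t))
```

Both columns should stay close to `lam` at every time point, which is what "constant risk" means.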
Expectation of Life
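Given $S(t)$ or $\lambda(t)$, the expected value of $T$ can be written as $$ \mu = \int_0^\infty tf(t)dt =\int_0^\infty S(t)dt $$ The second equality follows from integration by parts, using $f(t) = -\frac{d}{dt}S(t)$ and assuming $tS(t) \to 0$ as $t \to \infty$.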
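A quick numerical check of the two expressions for $\mu$, assuming an exponential lifetime with rate `lam = 0.5` (so the mean should be close to $1/\lambda = 2$). The `integrate` helper and the truncation point `upper` are assumptions of this sketch; the infinite upper limit is replaced by a value where $S(t)$ is negligible.

```python
import math

def integrate(g, a, b, steps=100_000):
    """Trapezoidal approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (g(a) + g(b))
    for k in range(1, steps):
        total += g(a + k * h)
    return total * h

lam = 0.5  # hypothetical rate of an exponential lifetime

def S(t):
    return math.exp(-lam * t)

def f(t):
    return lam * math.exp(-lam * t)

upper = 60.0  # truncation point: S(60) = exp(-30) is negligible

mean_from_density = integrate(lambda t: t * f(t), 0.0, upper)
mean_from_survival = integrate(S, 0.0, upper)
print(mean_from_density, mean_from_survival, 1 / lam)
```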
Censoring and the likelihood function
Censoring Types
- Type I
  Two typical observation schemes:
  - A sample of $n$ units is followed for a fixed time $\tau$
  - Generalization, fixed censoring: each unit $i$ has its own fixed observation time $\tau_i$

  In both cases, the number of deaths is a random variable.
- Type II
  - A sample of $n$ units is followed for as long as necessary until $d$ units have experienced the event
  - Generalization, random censoring: each unit has:
    - Censoring time $C_i$
    - Potential lifetime $T_i$
    - Observed time $Y_i = \min\lbrace C_i, T_i\rbrace$
    - Indicator $d_i$ (also written $\delta_i$), equal to 1 if the observation ended in death and 0 if it ended in censoring
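The random censoring scheme above can be sketched as a small simulation. The exponential distributions for both the lifetime and the censoring time, the rates, and the function name `simulate_censored` are all assumptions made for illustration; the scheme itself does not prescribe any particular distribution.

```python
import random

def simulate_censored(n, lam, censor_rate, seed=0):
    """Simulate n randomly censored observations (y_i, d_i).

    Lifetimes T_i ~ Exp(lam) and censoring times C_i ~ Exp(censor_rate)
    are assumptions of this sketch.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        t = rng.expovariate(lam)          # potential lifetime T_i
        c = rng.expovariate(censor_rate)  # censoring time C_i
        y = min(t, c)                     # observed time Y_i
        d = 1 if t <= c else 0            # 1 = death observed, 0 = censored
        data.append((y, d))
    return data

sample = simulate_censored(n=5, lam=0.2, censor_rate=0.1)
for y, d in sample:
    print(f"observed time = {y:.3f}, death indicator = {d}")
```

Only `y` and `d` would be available to the analyst; the unobserved `t` for censored units is exactly the information that censoring hides.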
Likelihood of censoring model
- Unit $i$ died at $t_i$. We know it survived until $t_i$ and then died, so its contribution is the density at $t_i$: $$ L_i = f(t_i) = S(t_i)\lambda(t_i) \tag{3.1} $$
- Unit $i$ is still alive at $t_i$. We only know that it survived until $t_i$, so its contribution is the survival probability: $$ L_i = S(t_i) \tag{3.2} $$
Combining the two cases with the death indicator $d_i$, we have: $$ L = \prod\limits_{i=1}^{n}L_i = \prod\limits_{i=1}^{n} \lambda(t_i)^{d_i}S(t_i) \tag{3.3} $$ Taking logs and using $(2.3)$, that is $\log S(t_i) = -\Lambda(t_i)$, we have: $$ \log L = \sum\limits_{i=1}^{n} \lbrace d_i\log\lambda(t_i) - \Lambda(t_i) \rbrace \tag{3.4} $$
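Equation $(3.4)$ translates almost directly into code. The sketch below takes the hazard and cumulative hazard as callables; the function name, the three made-up observations, and the constant hazard of 0.5 are hypothetical and only serve to exercise the formula.

```python
import math

def censored_log_likelihood(times, events, hazard, cumulative_hazard):
    """Log-likelihood (3.4): sum over i of d_i * log(lambda(t_i)) - Lambda(t_i)."""
    total = 0.0
    for t, d in zip(times, events):
        if d:  # only observed deaths contribute the log-hazard term
            total += math.log(hazard(t))
        total -= cumulative_hazard(t)  # every unit contributes log S(t_i)
    return total

# Hypothetical data: observed times and death indicators (1 = death, 0 = censored).
times = [2.0, 3.0, 1.5]
events = [1, 0, 1]

# Hypothetical constant hazard of 0.5, so Lambda(t) = 0.5 * t.
log_lik = censored_log_likelihood(
    times,
    events,
    hazard=lambda t: 0.5,
    cumulative_hazard=lambda t: 0.5 * t,
)
print(log_lik)
```

Passing the hazard as a function keeps the likelihood independent of any particular parametric family.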
Example 3.1
Consider the exponential distribution with $\lambda(t) = \lambda$, so that $\Lambda(t_i) = \lambda t_i$. From $(3.4)$, we have $$ \log L = \sum\limits_{i=1}^{n} \lbrace d_i\log\lambda - \lambda t_i \rbrace $$
We can estimate $\lambda$ by MLE.
Let $D=\sum d_i$ denote the total number of deaths and $T = \sum t_i$ the total observation time (here $T$ is a sum, not the random variable defined earlier):
$$ \begin{align} \log L &= D\log\lambda - T\lambda \\ \frac{\partial}{\partial \lambda} \log L &= \frac{D}{\lambda} - T \end{align} $$
Setting $\frac{\partial}{\partial \lambda} \log L = 0$ gives the estimate of $\lambda$:
$$ \hat \lambda = \frac{D}{T} $$
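To see the estimator at work, the sketch below simulates randomly censored exponential lifetimes and computes the number of deaths divided by the total observation time. The true rate, the censoring rate, the sample size, and the seed are arbitrary assumptions for illustration.

```python
import random

def exponential_mle(times, events):
    """MLE of a constant hazard under censoring: deaths / total observation time."""
    deaths = sum(events)             # D
    total_exposure = sum(times)      # T
    return deaths / total_exposure

# Simulated data under assumed rates (not taken from the original text).
rng = random.Random(42)
true_lam = 0.2
censor_rate = 0.1

times, events = [], []
for _ in range(20_000):
    t = rng.expovariate(true_lam)     # potential lifetime
    c = rng.expovariate(censor_rate)  # censoring time
    times.append(min(t, c))
    events.append(1 if t <= c else 0)

lam_hat = exponential_mle(times, events)
print(f"true rate = {true_lam}, estimate = {lam_hat:.4f}")
```

Note that dividing the number of deaths by the number of units, or averaging only the uncensored lifetimes, would ignore the exposure contributed by censored units; the estimator uses every unit's observation time in the denominator.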