---
title: "Defining and using objects of class SURVIVAL"
author: John Aponte
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Defining and using objects of class SURVIVAL}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options:
  markdown:
    wrap: 72
bibliography: references.bib
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Introduction

Here we present examples of how to construct and use objects of the
class SURVIVAL.

The function `s_factory(s_family, ...)` calls the constructor of the
given family of distributions. Each family has its own set of
parameters. Because the factories implement polymorphic behavior
according to the parameters given, partial matching of parameter names
is not possible and the names must be spelled correctly. If an error
occurs while processing the parameters, the factory returns a message
with the set of parameters accepted by that factory.

Once an object of class SURVIVAL is instantiated, it has access to the
following set of methods:

-   `sfx(SURVIVAL, t)` for the survival (proportion of the population
    free of events) at time `t`
-   `hfx(SURVIVAL, t)` for the hazard at time `t`
-   `Cum_Hfx(SURVIVAL, t)` for the cumulative hazard at time `t`
-   `invCum_Hfx(SURVIVAL, H)` for the inverse of the cumulative hazard
    `H`
-   `rsurv(SURVIVAL, n)` for the generation of `n` random survival
    times from the distribution
-   `rsurvhr(SURVIVAL, hr)` for the generation of random survival times
    with a hazard ratio `hr`

Instead of using the helper functions to call these methods, the
methods can be called directly from the object:

-   `obj <- s_factory(s_family, ...)` or `obj <- s_family(...)`
-   `obj$sfx(t)`
-   `obj$hfx(t)`
-   `obj$Cum_Hfx(t)`
-   `obj$invCum_Hfx(H)`
-   `obj$rsurv(n)`
-   `obj$rsurvhr(hr)`

In addition, the following functions help to plot the distributions:

-   `plot(SURVIVAL)`, a generic S3 method that calls the
    `plot_survival()` function
-   `plot_survival(SURVIVAL, timeto, main)`, which plots the survival,
    hazard, cumulative hazard and inverse cumulative hazard functions
    from 0 to `timeto`. An optional title can be specified with the
    `main` parameter
-   `compare_survival(SURVIVAL1, SURVIVAL2, timeto)`, which produces a
    comparison of the functions of two SURVIVAL objects
-   `ggplot_survival_random(SURVIVAL, timeto, subjects, nsim, alpha)`,
    which produces a ggplot of the Kaplan-Meier curve and the
    cumulative hazard for `nsim` simulations of a study with `subjects`
    subjects, censored at time `timeto`. The optional parameter `alpha`
    defines the transparency of each simulation in the graph. In
    addition to the simulations, the graph also presents the calculated
    survival and cumulative hazard functions of the distribution, to
    evaluate how close the simulations are to the true values

Functions to plot simulated proportional hazards, accelerated failure
time and accelerated hazard models:

-   `ggplot_survival_hr(SURVIVAL, hr, timeto, subjects, nsim, alpha = 0.1)`
-   `ggplot_survival_aft(SURVIVAL, aft, timeto, subjects, nsim, alpha = 0.1)`
-   `ggplot_survival_ah(SURVIVAL, aft, hr, timeto, subjects, nsim, alpha = 0.1)`

These functions produce Kaplan-Meier curves and cumulative hazard
curves for `nsim` simulations of the baseline distribution and of the
corresponding proportional hazard, accelerated failure time or
accelerated hazard model, censored at time `timeto`.

The simulation of survival times and of survival times with hazard
ratios follows the methods described by @bender2003 and @leemis1987.

```{r setup}
library(survobj)
library(survival)
library(ggplot2)
```

## Exponential Distribution

The canonical parameter of the exponential distribution is called
`lambda` and represents a constant hazard over time. The units of
`lambda` define the units of time for the distribution. For example, if
`lambda = 3` represents a rate of 3 events per year, the survival
function `sfx(SURVIVAL, 1)` calculates the proportion of the population
free of events at 1 year.
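
As a rough illustration of what these quantities mean for a constant
hazard, the closed forms can be written in base R. This is only a
sketch under the standard parameterisation $S(t) = \exp(-\lambda t)$;
the helper names below are hypothetical and are not part of the
package.

```r
# Closed-form exponential quantities for a constant hazard lambda.
# Helper names are hypothetical, for illustration only.
lambda <- 3

exp_sfx        <- function(t) exp(-lambda * t)        # survival S(t)
exp_hfx        <- function(t) rep(lambda, length(t))  # constant hazard
exp_cum_hfx    <- function(t) lambda * t              # cumulative hazard H(t)
exp_invcum_hfx <- function(H) H / lambda              # inverse of H

exp_sfx(1)           # proportion free of events at 1 year
exp_invcum_hfx(0.6)  # time at which the cumulative hazard reaches 0.6
```

The survival is always `exp(-H(t))`, so `-log(exp_sfx(t))` and
`exp_cum_hfx(t)` agree for any `t`.
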
The distribution can also be defined with the proportion of the
population free of events (`surv`) at time `t`, or the proportion of
the population with events (`fail`) at time `t`.

```{r exponential, fig.height=6, fig.width=7, fig.align='center'}
# Instantiate an object of class SURVIVAL with the Exponential distribution
obj1 <- s_factory(s_exponential, lambda = 3)
obj1

# Survival at time 1
sfx(obj1, 1)

# Hazard at time 1
hfx(obj1, 1)

# Cumulative hazard at time 1
Cum_Hfx(obj1, 1)

# Inverse of the cumulative hazard 0.6
invCum_Hfx(obj1, 0.6)

# Plot of the distribution
plot(obj1)
```

The next set of examples shows how to define an exponential
distribution based on the surviving or failing proportion at time `t`.

```{r exponential2}
obj2 <- s_exponential(surv = 0.8, t = 1)
obj2

obj3 <- s_exponential(fail = 0.2, t = 1)
obj3
```

The following code shows how to make 100 simulations of 1000 subjects
with an object of the SURVIVAL class. The red line is the value from
the distribution.

```{r exponential3, fig.height=4, fig.width=7, fig.align='center'}
obj4 <- s_exponential(surv = 0.25, t = 10)
ggplot_survival_random(obj4, timeto = 10, subjects = 1000, nsim = 100, alpha = 0.1)
```

## Weibull distribution

The canonical parameters of the Weibull distribution are `scale` and
`shape`. The `scale` carries the information about the time units. The
`scale` parameter can be derived from the proportion surviving or
failing at a given time, but the `shape` needs to be provided by the
user. Both `scale` and `shape` must be numbers greater than 0.

A `shape` equal to 1 reduces the distribution to an exponential
distribution with `lambda` equal to the `scale`. If the `shape` is
greater than 1 the hazard is increasing, which means more events at the
end of follow-up. If it is between 0 and 1 the hazard is decreasing,
which translates to more events at the beginning of the time at risk.

The following code shows the effect of the shape parameter on
distributions with the same scale.

```{r weibull, fig.height=4, fig.width=7, fig.align='center'}
wobj1 <- s_weibull(scale = 3, shape = 0.5)
wobj2 <- s_weibull(scale = 3, shape = 1)
wobj3 <- s_weibull(scale = 3, shape = 1.5)

# Top row: survival functions; bottom row: hazard functions
par(mfrow = c(2, 3))
plot(
  wobj1$sfx, from = 0, to = 1,
  main = "Weibull with shape 0.5",
  xlab = "Time", ylab = "Proportion without events", ylim = c(0, 1))
plot(
  wobj2$sfx, from = 0, to = 1,
  main = "Weibull with shape 1",
  xlab = "Time", ylab = "Proportion without events", ylim = c(0, 1))
plot(
  wobj3$sfx, from = 0, to = 1,
  main = "Weibull with shape 1.5",
  xlab = "Time", ylab = "Proportion without events", ylim = c(0, 1))
plot(wobj1$hfx, from = 0, to = 1, xlab = "Time", ylab = "hazard")
plot(wobj2$hfx, from = 0, to = 1, xlab = "Time", ylab = "hazard")
plot(wobj3$hfx, from = 0, to = 1, xlab = "Time", ylab = "hazard")
par(mfrow = c(1, 1))
```

## Gompertz distribution

The Gompertz distribution has two canonical parameters, the `scale` and
the `shape`. The `scale` must be a number greater than zero and
represents the hazard at time 0. The `shape` can be any real number
other than zero. A negative `shape` produces a decreasing hazard and a
positive `shape` produces an increasing hazard. If the `shape` is zero,
the distribution reduces to an exponential distribution, but this is
not implemented in this package and an error is produced instead.

As with the other distributions, the `scale` can be derived from the
surviving or failing proportion at a given time, but the `shape`
parameter needs to be provided.

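
To make the derivation of the `scale` concrete, here is a minimal base
R sketch. It assumes the hazard has the form
`h(t) = scale * exp(shape * t)`, which matches the description of
`scale` as the hazard at time 0 but is an assumption about the
parameterisation rather than something taken from the package. The
helper names are hypothetical.

```r
# Assumed parameterisation: h(t) = scale * exp(shape * t), so that
# H(t) = (scale / shape) * (exp(shape * t) - 1) and S(t) = exp(-H(t)).
# Helper names are hypothetical, for illustration only.
gompertz_scale_from_surv <- function(surv, t, shape) {
  stopifnot(surv > 0, surv < 1, t > 0, shape != 0)
  # Solve S(t) = surv for the scale, with the shape held fixed
  -log(surv) * shape / (exp(shape * t) - 1)
}

gompertz_sfx <- function(t, sc, shape) {
  exp(-(sc / shape) * (exp(shape * t) - 1))
}

sc <- gompertz_scale_from_surv(surv = 0.8, t = 1, shape = 0.25)
gompertz_sfx(1, sc, shape = 0.25)  # recovers 0.8 up to rounding
```

The same formula works for a negative `shape`, because the numerator
and the denominator change sign together and the resulting scale stays
positive.
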
The following graph shows the effect of the `shape` parameter on the
Gompertz distribution.

```{r gompertz, fig.height=4, fig.width=7, fig.align='center'}
# Define functions to generate and plot Gompertz distributions
plot_sfx_gompertz <- function(shape, scale = 3, timeto = 1) {
  plot(
    s_gompertz(shape = shape, scale = scale)$sfx,
    from = 0, to = timeto,
    main = paste("Shape: ", shape),
    xlab = "Time", ylab = "Proportion without events", ylim = c(0, 1)
  )
}

plot_hfx_gompertz <- function(shape, scale = 3, timeto = 1) {
  plot(
    s_gompertz(shape = shape, scale = scale)$hfx,
    from = 0, to = timeto,
    xlab = "Time", ylab = "hazard", ylim = c(2, 4)
  )
}

# Top row: survival functions; bottom row: hazard functions
par(mfrow = c(2, 4))
plot_sfx_gompertz(shape = -0.25)
plot_sfx_gompertz(shape = -0.10)
plot_sfx_gompertz(shape = 0.10)
plot_sfx_gompertz(shape = 0.25)
plot_hfx_gompertz(shape = -0.25)
plot_hfx_gompertz(shape = -0.10)
plot_hfx_gompertz(shape = 0.10)
plot_hfx_gompertz(shape = 0.25)
par(mfrow = c(1, 1))
```

## Piecewise Exponential distribution

The Piecewise Exponential distribution is a very flexible distribution
where the hazard is treated as constant until a break occurs and a new
hazard value is used. The class implements two parameters: `breaks`,
which defines the break points, and `hazards`, which defines the hazard
used until each break point. The factory function gives a warning if
the last break is not `Inf`, as otherwise the distribution is not
completely defined.

The parameters `breaks = c(1,2,3,Inf), hazards = c(0.1,3,4,3)`
implement a distribution where the hazard is 0.1 until time 1, 3 from
time 1 until time 2, 4 from time 2 until time 3, and 3 again from that
point on.

The distribution can also be defined with the proportion surviving or
failing, `breaks` and `segments`. In this case the `segments` are
scaled to create hazards that result in the specified proportion
surviving or failing at the last finite break point.
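
The scaling can be sketched in base R as follows. This is an
illustration of the arithmetic only, assuming the first interval starts
at time 0 and each value applies until its break. It is not the
package's implementation, and the variable names are hypothetical.

```r
# Scale relative segments into hazards so that the survival at the
# last finite break equals `surv`. Illustration only.
surv     <- 0.2
breaks   <- c(1, 2, 3, Inf)
segments <- c(1, 2, 3, 1)

finite  <- is.finite(breaks)
widths  <- diff(c(0, breaks[finite]))   # interval lengths: 1, 1, 1

# The cumulative hazard at the last finite break must equal -log(surv)
k       <- -log(surv) / sum(widths * segments[finite])
hazards <- k * segments

# Survival at the last finite break, which equals surv by construction
exp(-sum(widths * hazards[finite]))
```

The ratios between the hazards are the same as the ratios between the
segments; only the overall level changes.
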
For example, the parameters
`surv = 0.2, breaks = c(1,2,3,Inf), segments = c(1, 2, 3, 1)` scale the
segments to hazards in such a way that at time 3 the surviving
proportion is 0.2. See the following example:

```{r piecewise, fig.height=6, fig.width=7, fig.align='center'}
pobj <- s_piecewise(surv = 0.2, breaks = c(1, 2, 3, Inf), segments = c(1, 2, 3, 1))
pobj
pobj$sfx(3)
plot_survival(pobj, timeto = 3)
```

## Log-logistic distribution

The Log-logistic distribution has two canonical parameters, the `scale`
and the `shape`.

```{r loglogistic, fig.height=6, fig.width=7, fig.align='center'}
pobj <- s_loglogistic(scale = 3, shape = 1.5)
plot_survival(pobj, timeto = 3)
```

## Log-Normal distribution

The Log-normal distribution has two canonical parameters: the `scale`
parameter, which defines the median value of the distribution, and the
`shape` parameter, which represents the standard deviation of the
distribution on the log scale.

```{r lognormal, fig.height=6, fig.width=7, fig.align='center'}
pobj <- s_lognormal(scale = 1.5, shape = 0.8)
plot_survival(pobj, timeto = 3)
```

## Comparison of SURVIVAL objects

The function `compare_survival()` produces a graphic comparison of two
SURVIVAL objects. The objects do not need to be from the same
distribution family.

```{r compare, fig.height=6, fig.width=7, fig.align='center'}
cobj1 <- s_exponential(lambda = 3)
cobj2 <- s_gompertz(scale = 3, shape = 0.4)
compare_survival(cobj1, cobj2, timeto = 2)
```

## References