Frontiers in Biostatistics Seminar
Noah Simon, Ph.D.
Department of Biostatistics
University of Washington
To build inferential or predictive survival models, it is common to assume proportional hazards and fit a model by maximizing the partial likelihood. This approach has been combined with non-parametric and high-dimensional techniques, e.g., spline expansions and penalties, to build survival models flexibly.
New challenges require extending and modifying that approach. A number of modern applications call for using complex features, such as images, to predict survival. In these cases, the partial likelihood must be connected to more modern backends, such as deep learning infrastructures based on, e.g., convolutional or recurrent neural networks. Training such models requires large numbers of observations. However, when those observations are available, the structure of the partial likelihood makes optimization difficult (if not completely intractable).
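As a rough illustration of that difficulty, the sketch below evaluates the standard (unmodified) negative log partial likelihood in NumPy. It is not the method presented in the talk. The function name, argument names, and the choice to ignore tied event times are assumptions made for the example. Note that each event's term sums over its entire risk set, so the objective does not split into independent per-observation pieces the way a typical deep learning loss does.

```python
import numpy as np


def neg_log_partial_likelihood(risk_scores, times, events):
    """Negative log Cox partial likelihood (hypothetical helper, ties ignored).

    risk_scores: model output (log relative hazard) for each subject
    times:       observed follow-up time for each subject
    events:      1 if the event was observed, 0 if right-censored
    """
    risk_scores = np.asarray(risk_scores, dtype=float)
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)

    total = 0.0
    for i in np.flatnonzero(events):
        # Risk set: every subject still under observation at the event time.
        at_risk = times >= times[i]
        scores = risk_scores[at_risk]
        # Numerically stable log-sum-exp over the whole risk set.
        m = scores.max()
        log_denom = m + np.log(np.exp(scores - m).sum())
        total -= risk_scores[i] - log_denom
    return total
```

Because the denominator for one subject depends on every subject still at risk, a naive mini-batch would need the scores of subjects outside the batch to compute that subject's term exactly.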
In this talk, we show how a simple modification of the partial likelihood makes it easy to handle large amounts of data. In particular, with this modification, stochastic gradient-based methods, commonly applied in deep learning, are simple to employ. This simplicity holds even in the presence of left truncation, right censoring, and time-varying covariates. The approach can also be applied relatively simply to data stored in a distributed manner.