Robert Levy, Dana-Farber Cancer Institute
In clinical trials of potential new cancer therapies, one statistical method, known as the log-rank/hazard ratio (log-rank/HR) approach, has become pre-eminent for measuring the survival benefit of those therapies. That’s a problem, Dana-Farber biostatistician Hajime Uno, PhD, and his colleagues write in a recent study, because other techniques can often provide clearer, more useful results.
The study, published in the journal Clinical Trials, reinforces Uno’s ongoing effort to break the monopoly that the log-rank/HR technique has gained in certain kinds of clinical research. By comparing the analytical power of log-rank/HR and several other statistical techniques, Uno and his Dana-Farber collaborators found that log-rank/HR is far from being the best choice in all situations and demonstrated that each method has individual strengths and weaknesses.
“More than 95% of recent cancer clinical trials used the log-rank/HR method to detect treatment differences between novel therapies and standard therapies,” remarks Uno, who led the study with Miki Horiguchi, PhD, and Michael Hassett, MD, MPH. “By showing that this one approach doesn’t consistently provide an advantage, we hope this study will open the door for the use of other methods.”
The log-rank/HR method owes its prominence partly to longevity, partly to simple familiarity, Uno asserts. Developed nearly 50 years ago, it was an innovative way of assessing time-to-event – how long a patient lives after treatment, or how much time passes until the disease worsens. Over the years, it became entrenched as the go-to approach, even as new and in some cases more useful methods were invented.
Despite its ubiquity, log-rank/HR has several well-recognized drawbacks, including a tendency to produce vague results. Uno explains: “If a therapy is found to have a hazard ratio of 0.8, it’s understood to mean a 20% reduction of hazard. But the concept of hazard itself can be very difficult to understand. Also, the log-rank/HR method doesn’t provide a reference number from a control group [a comparison group of patients who don’t receive the therapy being tested], so it’s difficult to determine if the 20% reduction is significant. If the hazard faced by the control group is low, a 20% reduction may not be clinically important. If the hazard in the control group is high, even a 5% reduction could be significant. However, the log-rank/HR approach doesn’t indicate the hazard faced by the control group.”
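Uno’s point about the missing reference number can be illustrated with a little arithmetic. The sketch below is not taken from the study: it assumes a constant (exponential) hazard in each group, and the two control-group hazards (0.02 and 0.50 events per patient-year) and the two-year time point are invented purely for illustration. Under those assumptions, the same hazard ratio of 0.8 corresponds to very different absolute gains in survival.

```python
import math

def survival_at(t_years, hazard_per_year):
    """Survival probability at t_years under an ASSUMED constant hazard."""
    return math.exp(-hazard_per_year * t_years)

def absolute_benefit(control_hazard, hazard_ratio, t_years):
    """Treated-minus-control difference in survival probability at t_years."""
    treated_hazard = control_hazard * hazard_ratio
    return survival_at(t_years, treated_hazard) - survival_at(t_years, control_hazard)

HAZARD_RATIO = 0.8  # the "20% reduction of hazard" from the quote above

# Hypothetical control-group hazards, chosen only to contrast low and high risk.
scenarios = [("low-risk control group", 0.02), ("high-risk control group", 0.50)]

for label, control_hazard in scenarios:
    gain = absolute_benefit(control_hazard, HAZARD_RATIO, t_years=2.0)
    print(f"{label}: 2-year survival gain = {100 * gain:.1f} percentage points")
```

With these made-up numbers, the low-risk scenario gains less than one percentage point of two-year survival, while the high-risk scenario gains about eight, even though the hazard ratio is identical in both.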
The study used data from 80 recently published phase III cancer clinical trials to explore whether the log-rank/HR method offers a meaningful advantage over five alternative techniques. The researchers found that a technique called restricted mean survival time with fixed τ provided the highest statistical power for measuring average survival. (Statistical power is the probability that a study will detect a treatment difference when one truly exists.) For progression-free survival – the length of time, after treatment, that cancer doesn’t worsen – the six tests demonstrated roughly equal power. No single test consistently outperformed any other test.
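For readers unfamiliar with restricted mean survival time (RMST), it is commonly defined as the area under a group’s survival curve from time zero up to a chosen cutoff, τ. It can be read as the average event-free time during the first τ months, and the difference between two groups is reported in units of time. The sketch below is a minimal illustration of that definition using a hand-written Kaplan-Meier estimate. It is not the study’s code or the authors’ software, and the patient data, the function names `km_curve` and `rmst`, and the choice of τ = 12 months are all hypothetical.

```python
def km_curve(times, events):
    """Kaplan-Meier estimate as a list of (event_time, survival_after_time).

    times:  follow-up time for each patient
    events: 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

def rmst(times, events, tau):
    """Area under the Kaplan-Meier step curve from 0 to tau."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in km_curve(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)
    return area

# Invented follow-up data in months (1 = event observed, 0 = censored).
control_times, control_events = [2, 4, 5, 7, 9, 12], [1, 1, 1, 0, 1, 0]
treated_times, treated_events = [3, 6, 8, 10, 12, 12], [1, 1, 0, 1, 0, 0]

TAU = 12.0  # hypothetical fixed cutoff
diff = rmst(treated_times, treated_events, TAU) - rmst(control_times, control_events, TAU)
print(f"RMST difference over {TAU:.0f} months: {diff:.2f} months")
```

Unlike a hazard ratio, this summary reports each group’s own value alongside the difference, so the control group’s figure is available as a reference point.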
Uno has made it his mission to encourage clinical investigators to move outside their “comfort zone” with log-rank/HR. He has led the development of new statistical methods and has written and published materials on issues involving statistical analysis for the clinical research community. In line with these efforts, he has provided software and user-friendly manuals on alternative techniques to investigators and biostatisticians.
The current study was prompted by an awareness that “no matter how much I explain the new methods or provide software, people may be reluctant to change without evidence that these methods are noninferior and can be more useful than log-rank/HR in real-world situations,” Uno relates.
“Different clinical trials have different goals, different clinical questions that researchers want to answer,” he continues. “Analytic techniques have their own sets of pros and cons. Log-rank/HR has been a useful approach for measuring differences in the length of time to the occurrence of a clinical event between two groups. But rather than limiting themselves to this single technique, clinical researchers and biostatisticians should use the technique that best meets their objectives.”