By David Tuller, DrPH
In 2020, the CODES study of cognitive behavior therapy (CBT) for psychogenic non-epileptic seizures, also known as dissociative seizures, reported null findings for its primary outcome—the number of seizures per month one year after the start of therapy. This was a disappointing result for an ambitious effort to seek an effective treatment for a disabling form of functional neurological disorder (FND).
In an impressive display of dishonesty, the press release from the lead university—King’s College London—touted the trial as a success. The release buried the bad news about the primary outcome and highlighted some modestly positive findings in secondary quality-of-life measures. In a follow-up paper published last year, the CODES investigators actually blamed the trial funders for requiring that seizure reduction rather than something else be selected as the primary outcome.
Unmentioned was the fact that the investigators themselves had spent years defending the choice of seizure reduction as the most appropriate measure for assessing treatment effectiveness. After all, they had previously conducted a pilot trial with seizure reduction as the primary outcome, and pursued the multi-center study of the CBT intervention as a follow-up. They appear to have changed their minds about the value of their primary outcome only after their big trial didn’t work out as planned.
The investigators of another major FND trial, this one of specialized physiotherapy for functional motor disorder (FMD), seem to be engaging in a similar strategy of post-hoc primary outcome re-positioning. The FMD trial, called Physio4FMD, had null results for its primary outcome, the physical function subscale of an instrument called the SF-36. However, one of multiple secondary outcomes, a measure called the Clinical Global Impression of Improvement (CGI-I), showed some benefit.
In a recent follow-up paper called “Cost Utility of Specialist Physiotherapy for Functional Motor Disorder (Physio4FMD),” the investigators have thrown their primary outcome under the bus. In this paper, they now declare that the SF-36 provided an “overly narrow view of the potential benefits of physiotherapy,” suggesting that it should not, in fact, have been the primary outcome. This despite the fact that, as with the CODES trial, they conducted a pilot study in which the SF-36 served as the primary outcome and organized the full trial based on those findings.
In their current view:
“The secondary outcome, the patient-reported Clinical Global Impression Improvement score, allows for a broader assessment of potential impacts that specialist physiotherapy can have across various aspects of patients’ lives, including the cost impact, thus capturing a wider range of outcomes. It is notable that consensus recommendations for outcome measures in FND (published after this trial was planned) have recommended the patient-reported Clinical Global Impression of Improvement as the primary outcome measure in trials of interventions in FND.”
This reasoning doesn’t hold up. The CGI-I asks respondents if symptoms have improved or gotten worse, on a scale of 1 to 5. How exactly does that question allow for “a broader assessment of potential impacts that specialist physiotherapy can have across various aspects of patients’ lives,” and how does that broader assessment include “the cost impact”? Is the investigators’ point that the CGI-I potentially captures a “wider range of outcomes” because it is so vague and general? I really don’t get what the investigators mean here.
A major limitation of the CGI-I is that it only provides someone’s current assessment of their condition in relation to how they previously felt—not to anything in the outside world. In contrast, the SF-36 asks respondents about how well they can perform a wide range of particular physical activities, among many other questions. That means the SF-36 physical function subscale can easily be compared to the responses from populations with a different medical status.
In the FMD trial, for example, participants in both the intervention and comparison arms were severely disabled for physical function on the SF-36 at baseline, compared to population norms, and remained severely disabled at the study endpoint. We have no idea from the CGI-I results how disabled the participants were, either before the intervention or after, or whether they reached thresholds indicating recovery. All we know is whether they thought they had improved or worsened from their baseline state.
Both the CGI-I and the SF-36 are subjective, self-reported measures, and both are susceptible to bias on that basis. However, the more detailed instrument requiring people to think about and rate their challenges has obvious advantages over one that simply asks if someone feels better or worse than before.
Beyond that, the statement about the consensus recommendations for outcome measures is simply untrue. That paper, published in 2020, did not recommend the CGI-I as the de facto “primary outcome measure” in FND intervention trials, as the Physio4FMD authors assert. Rather, the consensus recommendations listed the CGI-I (and related CGI scales) as the preferred option for the specific “outcome domain” identified as “core FND symptoms.”
But “core FND symptoms” was just one of several outcome domains of potential interest highlighted in the consensus recommendations. In the outcome domain called “life impact,” for example, the SF-36 was the main recommendation. In other words, the consensus recommendations did not identify one measure as the best choice overall for FND intervention trials, despite the claim made in the FMD cost-effectiveness paper.
By misrepresenting the message of consensus recommendations in this way, the Physio4FMD investigators have implied that the SF-36 was the wrong primary outcome choice for their specialist physiotherapy trial. It is understandable they would want to downplay this measure’s significance, since it yielded null results. But this is nonetheless a disingenuous and surprising misstatement, given that some members of the Physio4FMD team were also co-authors of the consensus recommendations.
Did any of them remember what they had previously written in the consensus recommendations? How come no one noticed the discrepancy between the consensus recommendations and the statement about the consensus recommendations in the cost-effectiveness paper? However this lapse occurred, a correction is indicated.
And a reminder seems important here: When a major trial yields null primary outcome results, rejecting the primary outcome after the fact and declaring the trial a success based on secondary outcomes is not a great look.