More CBT Research from Sir Simon and Professor Chalder

By David Tuller, DrPH

(*Thanks to the very informed discussion–and discussants–on the Science For ME forum for alerting me to this study and its many problems!)

In 2011, Professor Trudie Chalder declared at a press conference for the high-profile PACE trial that twice as many chronic fatigue syndrome patients who received cognitive behavior therapy and graded exercise therapy got “back to normal” compared to those in the two comparison arms. Although the statement was a dramatic misrepresentation of the findings just reported in The Lancet, Professor Chalder’s comments received international media attention and helped her and her co-investigators position the trial as a success.

Her longtime colleague at King’s College London, Professor Sir Simon Wessely, has made comparably questionable assertions. For one, he called the PACE trial “a thing of beauty”—even though it violated core principles of scientific inquiry. Despite Sir Simon’s bountiful appreciation of PACE’s aesthetic qualities, much of the international scientific community has rejected the study’s findings.

It bears repeating that the US Centers for Disease Control and Prevention has eliminated references to PACE and dropped the CBT and GET recommendations. More than 100 experts from Columbia, Stanford, University College London, Queen Mary University of London, Berkeley and other leading institutions signed Virology Blog’s open letter to The Lancet denouncing the trial’s “unacceptable methodological lapses” and demanding an independent investigation.

Given this history, it should not be surprising that when these two like-minded investigators join forces, the result is an unconvincing mishmash like “Cognitive Behavioural Therapy for chronic fatigue and CFS: outcomes from a specialist clinic in the UK.” This study has been accepted for publication in the Journal of the Royal Society of Medicine. While it is not yet officially published, King’s College London has posted a copy of the accepted draft. (Sir Simon is the outgoing president of the society.)

The study purports to demonstrate the effectiveness of CBT as a real-world treatment for what Sir Simon, Professor Chalder and their three co-authors still prefer to call CFS. As with much of the research from leading lights of the CBT/GET ideological brigades, close scrutiny of the paper reveals how the investigators have gussied up their disappointing results with pretty ribbons and a bow.

In fact, the paper reads like a possible effort to impact the ongoing deliberations over new guidelines for ME/CFS from the UK’s National Institute for Health and Care Excellence. The current guidelines, published in 2007, recommend CBT and GET for what was then being called CFS/ME. The revision was supposed to be done this year but the pandemic has pushed the process into 2021. The NICE decision to revisit the guidelines was one of a number of signs over the last few years that the hegemony of the CBT/GET paradigm for treatment of ME/CFS was starting to crumble under the weight of its own deficiencies and contradictions.

After the pandemic hit, NICE issued a statement that GET should not be presumed to be a treatment for post-Covid fatigue based on the 2007 guidelines. Meanwhile, Professor Chalder advised post-Covid patients in a video interview not to rest too much and to get back to their activities as quickly as possible after the acute phase of illness had passed. With potent evidence from the ME/CFS field of harm stemming from GET, along with the emerging Covid-related concerns that this exercise advice might be misapplied in the real world, it seems increasingly possible that the NICE panel will dump the recommendations for GET altogether.

Addressing CBT is likely to be trickier for the NICE panel, for multiple reasons. CBT has been an established therapeutic intervention for decades. It is routinely offered to people with major illnesses like cancer and multiple sclerosis who are also experiencing depression or have other mental health needs, so some find it hard to understand why ME/CFS patients would object to the recommendation.

The reason is that the PACE-style version of CBT is very specific to ME/CFS and was promoted as a curative treatment for the illness itself, not as adjunct support while the patient receives medical care for an underlying problem. The intervention is premised on the unproven theory that “unhelpful beliefs” about illness drive the behaviors that perpetuate the terrible symptoms. And it is designed specifically to help patients overcome their purportedly irrational fears of being disabled from an organic disease. An article like this new one–with the impressive imprimatur of the Royal Society of Medicine and its president, Sir Simon himself–could be seen as an argument in support of preserving a role in ME/CFS treatment for CBT, even if NICE ends up deciding to dump GET.


Less Than Meets the Eye

The new study investigates the outcomes for 995 patients who passed through a specialist CFS service and received a course of CBT between 2002 and 2016. At the beginning of this period, the CBT/GET paradigm was already the prevailing treatment approach. In 2003, government agencies approved funding for the PACE trial, which the investigators themselves hailed as the “definitive” trial of the interventions.

In setting out the rationale for the intervention, the investigators write: “CBT treatment is based on a model which assumes that certain triggers such as a virus and/or stress trigger symptoms of fatigue. Subsequently symptoms are perpetuated inadvertently by unhelpful cognitive and behavioural responses.” The treatment involves, among other elements, “addressing unhelpful beliefs which may be interfering with helpful changes.”

This theory is essentially the one laid out in a 1989 paper by a team that also included Sir Simon and Professor Chalder. In more than 30 years, the notion that pathophysiological processes and not just “unhelpful cognitive and behavioural responses” could be involved in perpetuation of the symptoms has not penetrated this static formulation.  

This wouldn’t matter if the results of the research justified the hype. But they don’t, no matter what the ideological brigadiers continue to argue. In the new paper, Sir Simon, Professor Chalder and their colleagues simply assert their longstanding position, cite various flawed papers to back their case, and fail to acknowledge that it has come under well-grounded and robust criticism in recent years. For example, they favorably cite the reported PACE results but do not cite the peer-reviewed papers that document the study’s flaws.  

The investigators might be unhappy that their ideations about patients’ “unhelpful beliefs” have lost much credibility, and that an entire issue of an academic journal–the Journal of Health Psychology–has been devoted to the PACE-gate controversy. But the broad challenge to the CBT/GET paradigm is part of the current clinical landscape as well as the medical literature. Like President Trump, Sir Simon and Professor Chalder appear to prefer ignoring bad news and making happy talk–even when their chatter is so easy to pick apart.


Who Were the Participants?

First, the investigators seem confused about whether they are investigating patients with chronic fatigue or patients with CFS. The title suggests the answer is both. But the paper itself refers to CFS throughout and to the participants as having met CFS criteria. The conflation of these two constructs makes some sense in light of the investigators’ apparent belief that fatigue exists on a continuum, with CFS “at the more severe end of the spectrum.” Many experts do not view CFS as just an extreme case of fatigue but rather as a clinical entity on its own, albeit one that has been challenging to define in the absence of a biomarker.

In the retrospective study, all 995 participants met the criteria outlined in the 2007 NICE guidance for what it called CFS/ME. Yet only 76% met the Oxford case definition, which requires six months of fatigue and no other symptoms, and 52% met the CDC criteria, which require six months of fatigue plus four of eight other symptoms. Hm. That’s odd. The 2007 NICE guidelines advised that a diagnosis of CFS/ME could be considered if a patient suffered fatigue for four months rather than the six required by both the Oxford and CDC criteria.

So did 24% of the sample only have fatigue for between four and six months? That seems hard to understand, given that participants had been ill for a mean duration of 6.64 years. Perhaps the numbers add up in some other way I haven’t figured out. Did peer reviewers assigned by the Journal of the Royal Society of Medicine notice or ask about these apparent discrepancies? Did they actually scrutinize the paper, unlike a recent BMJ peer reviewer who acknowledged in his review that he had not read “beyond the abstract” of the assigned study?

Nor is it clear if many or any of these participants experienced post-exertional malaise, which is considered a core symptom of the illness. Neither the Oxford nor CDC definitions require a version of this symptom—more recent and better case definitions do. NICE is ambiguous on the matter, including versions of it as part of the description of the fatigue but also as one of multiple optional symptoms. Without more specifics about the sample in this paper, it is hard to determine how many people with CFS were in the sample—as opposed to idiopathic chronic fatigue, for example, which might respond to some form of CBT.

As described in the paper, the course of CBT included up to 20 sessions on a twice-monthly basis. Patients were asked to fill out several questionnaires at the start of treatment, at the fourth and seventh sessions, at discharge, and at three months after discharge. The measures included the same questionnaires for physical function and fatigue as in the PACE trial—the SF-36 and the Chalder Fatigue Questionnaire. They also included more generic scales, such as those for work and social adjustment, depression and anxiety, and overall health.

It is important to note that all of these measures are subjective. The study includes no objective indicators—how far people could walk, whether they returned to work, whether they got off social benefits, and so on. And everyone knew they were receiving an intervention designed to help them. In fact, as described in PACE, the CBT approach includes informing participants that the intervention has already been proven to work. It should not be surprising that some people receiving such an intervention would report short-term but ephemeral benefits well within what might be expected from a placebo response. Without any objective measures, such responses are fraught with potential bias and inherently unreliable.

Showing Poor Results in the Most Flattering Light

Even when presented in the best light, the main results do not support the argument that the treatments overall are effective. For physical function, the mean score rose from 47.6 at baseline to 57.5 at discharge and 58.5 at three-month follow-up. (The SF-36 scale runs from 0 to 100, with higher scores representing better physical function.) In the PACE trial, a score of 65 or below was considered disabled enough for trial entry, so the mean scores at discharge and follow-up in this study represent serious disability. Likewise, the mean CFQ score at discharge and follow-up, while modestly improved since baseline, still represents high levels of fatigue.

On closer inspection, things look even worse. As it turns out, those highlighted results do not seem to take into account a lot of missing data. Of the 995 participants in the study, the investigators define 31% as lost-to-follow-up—that is, they provided no data at either the end of treatment or the follow-up assessment three months later.

So we have no idea at all what happened to almost a third of the participants. Maybe some got worse and became bed-bound or even killed themselves. Maybe some got bored with the psychotherapy. Maybe others felt it was a waste of time and found they got more from smoking weed or going fishing. It’s not a good thing when almost a third of your patients, for whatever reason, don’t bother to let you know what happened to them. This lost-to-follow-up rate is not mentioned in the abstract–a disturbing omission that could be inadvertent or could be an attempt to underplay information that reflects poorly on the reported findings. 

Interestingly, the drop-outs appeared to be in worse condition at baseline than those who stayed in. They reported more depression, poorer work and social adjustment, and significantly worse physical function—their mean score was 7.38 points lower on the SF-36 scale. Perhaps they were more likely to have actual ME/CFS and not idiopathic fatigue. For some of these patients, the CBT intervention–with its message that their symptoms were being perpetuated by their “unhelpful beliefs” and irrational behavior–might have led to deterioration in their health through both organic and psychological pathways.

To their credit, the investigators acknowledge this limitation. The poorer health condition of the drop-outs, they write, “suggests that there may have been some bias in the data, in that those who completed treatment may not represent all patients who access CBT treatment for CFS.” Notwithstanding this warning, they deploy their biased data to boost the impression that the intervention is effective.

And even the 31% drop-out figure isn’t a true reflection of the low data collection rates in the study. Of the 995 participants, only 581 answered the CFQ at end of treatment and only 503 at follow-up—58% and 51%, respectively. For the SF-36, only 441 responded at discharge and 404 at follow-up—44% and 41%, respectively. (For unexplained reasons, only 768 of the 995 participants provided information on the SF-36 at baseline.)
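The percentages above can be checked directly from the counts quoted from the paper. The following is a minimal sketch; the variable and function names are illustrative, and the denominator is assumed to be the full 995 participants, as in the paragraph above.

```python
# Counts as quoted above from the paper; names here are illustrative only.
TOTAL = 995
respondents = {
    ("CFQ", "discharge"): 581,
    ("CFQ", "follow-up"): 503,
    ("SF-36", "discharge"): 441,
    ("SF-36", "follow-up"): 404,
}


def response_rate(n, total=TOTAL):
    """Share of the full sample that answered, as a whole-number percentage."""
    return round(100 * n / total)


for (measure, timepoint), n in respondents.items():
    rate = response_rate(n)
    print(f"{measure} at {timepoint}: {rate}% responded, {100 - rate}% missing")
```

Run as written, this reproduces the response rates of 58%, 51%, 44% and 41%, which leave 42%, 49%, 56% and 59% of the sample unaccounted for on each measure.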

That means the loss-to-follow-up rates at discharge on the CFQ and the SF-36 were, respectively, 42% and a whopping 56%–and even worse on follow-up. When close to or more than half a sample does not provide data for a key outcome, investigators should be cautious in interpreting findings from those who managed to stick out the intervention. If participants with lower scores at baseline dropped out, as with the SF-36, that alone should raise the mean among those who remained. It seems silly to position modestly improved mean scores from a half-depleted sample as an indicator of treatment success when little or nothing is known about those who disappeared.
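A rough back-of-the-envelope sketch of how selective drop-out alone can lift a mean. It rests on assumptions the paper does not confirm: that the 47.6 baseline SF-36 mean covers stayers and drop-outs together, that drop-outs make up 31% of those baseline respondents, and that the 7.38-point gap is the difference between the two groups' baseline means. The function name is hypothetical.

```python
def stayer_baseline_mean(overall_mean, dropout_share, gap):
    """Baseline mean of those who stayed, assuming drop-outs scored `gap` points lower.

    Derived from: overall = (1 - share) * stayers + share * (stayers - gap)
    """
    return overall_mean + dropout_share * gap


# Figures quoted in the post; how they are combined here is an assumption.
overall = 47.6
stayers = stayer_baseline_mean(overall, dropout_share=0.31, gap=7.38)
print(f"Stayers' baseline mean: {stayers:.1f}")
print(f"Rise from attrition alone: {stayers - overall:.1f} points")
```

Under those assumptions, the stayers would have started at about 49.9 rather than 47.6, so roughly 2.3 points of any comparison between the full-sample baseline mean and the stayers' later scores would reflect who remained rather than any change in health. The real split could be larger or smaller.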

Claims of Causality

In the discussion section, the investigators write the following: “The CBT intervention led to significant improvements in patients [sic] self-reported fatigue, physical functioning and social adjustment.” From my understanding of the King’s English*, I would interpret that sentence as a statement of causality—and an unwarranted one at that. The investigators have not reported evidence that the CBT intervention “led to” anything. They have provided evidence only that their CBT intervention was chronologically followed by reported changes in mean results among a shrinking pool of participants.

*[I was informed this phrase should have been rendered as the Queen’s English, since there is a current queen and not a current king. But my American dictionary defines the King’s English as “standard, pure, or correct English speech or usage,” and does not mention a queen. In other words, the King’s English is a generic, to be used even when the monarch is female or non-binary. So I stand by my American usage.]

They make a similar extravagant slip when they write the following in their conclusions: “The lack of a control condition limits us from drawing any causal inferences as we can not be certain that the improvements seen are due to CBT alone and not any other extraneous variables.” This statement is self-contradictory. To state that the improvements might not be “due to CBT alone” is to posit as fact that they are due to CBT at least in part but that other factors might have contributed. In one sentence, the investigators are drawing a causal inference while denying the possibility of being able to do just that.

Let’s be clear. Given the study design, there is no evidence that the CBT intervention played any role whatsoever. Perhaps it did; perhaps not. It is unfortunate, but not surprising, that Sir Simon, Professor Chalder and the peer reviewers selected by the Journal of the Royal Society of Medicine* did not notice or care about these impactful misstatements of causality–a continuation of a time-honored tradition of sloppy argumentation and inadequate peer-reviewing in this domain of science. *[I initially wrote that the peer reviewers were selected by the Royal Society of Medicine, when I intended to write the Journal of the Royal Society of Medicine. I apologize for the error.]

Oh, one last point. In the abstract, the investigators highlight that 90% of patients “were satisfied with their treatment.” Presumably that impressive-looking figure does not include responses from the 31% who were lost-to-follow-up. Does it include the many others who showed up at discharge and follow-up but failed to provide key information on other questionnaires? Who knows? The study does not mention how many responded to this question, as far as I can see. It is not surprising that this squishy but deceptively presented data point found its way into the abstract’s conclusions.

My epidemiology colleagues at Berkeley have used the PACE trial in seminars as a case study of how not to conduct research. If their students turned in something as inadequate as this new Wessely-Chalder collaboration, they’d get slapped down pretty quickly.




17 responses to “More CBT Research from Sir Simon and Professor Chalder”

  1. CT

    Also, the % figure given for improvement in fatigue in the abstract doesn’t appear to match that given in the results section?

  2. tygrus

    To the study authors: Patients with ME, looser CFS and CF definitions should be separated and some patient groups excluded. With so many drop-outs, they should separate them into another group(s). The drop-outs can poison the baseline results to be lower making the final results to show improvements. Analyse the 380 or 500 as the main study from start to finish. Show the drop-out group stats and any indicators of early treatment and results prior to exit. Be honest about size and lack of significance regarding improvements. What scores & variance are expected for normal population of same age/sex?

  3. Alicia Butcher Ehrhardt, PhD

    Their entire house of cards needs to come crashing down.

    Each layer is standing on a previous one of lies.

  4. RT

    I went to this clinic in this time frame. I dropped out. The CBT process at the start tried to persuade me to lower my expectations of what my ‘self defined goal’ of recovery looked like. When I said, well, walking 10,000 steps a day and going for a long walk in the countryside from time to time would look like recovery, I was challenged on how ‘scientific’ the 10,000 steps a day was as a goal, and they tried to moderate that. Now I see that was so I could be seen to ‘meet my recovery’ target more easily by the end.

    The person I saw didn’t keep accurate records of our sessions. When at Session 9 he gave me a draft of his pre-assessment Report, there were so very many factual errors and misrepresentations of what I had said that I raised them as a serious matter of concern about the validity of the process. He responded that it was just drop down menus, but there were statements about diagnosis that had never been made in my case.

    Having spent many hours dutifully completing activity and other records, it turned out that he had barely read them, let alone helped me to understand connections between activity and fatigue.

    I thought a conversation about this was raising a complaint about quality of service and expected a response to that effect. Instead I was told I’d had a therapy session.

    I concluded the therapist was an unreliable witness, that the process was not responsive to me, but was trying to shape my experience to some undisclosed model of “people with your, err, condition” that was seriously discounting my loved experience.

    I spoke to my GP about the basic and fundamental errors in the Report and how the process was deeply unhelpful, a waste of my scarce time and energy and of NHS budgets and they agreed that I should withdraw.

    I didn’t complain: my therapist was a CBT trainer in the clinic, he didn’t see my concerns about basic errors as legitimate, when I raised concerns I wasn’t offered an alternative therapist or any follow up as to the quality of the service. All my concerns were spelled out in an email, and I would have expected an in-service response to concerns.

    I could say more. Happy to respond by email in confidence David.

  5. RT

    Grr! Typos.
    “Lived” experience.
    It was Pre-Treatment report ie after first session and before ‘Treatment’, but given to me months down the line as a draft.

  6. CT

    What an outfit!

  7. Lady Shambles

    RT: very interested to hear about your experience of this clinic and its study. I think your observations could do with being made more public, if that’s at all possible. I hope David takes you up on your offer to provide him with more information. I wish more people with ME would speak out in this way.

  8. Acacia

    Here, as in other CBT/GET studies this crowd consistently make themselves impervious and dumb to any meaning which might be derived from drop-outs and unsuccessful physical or other outcomes. They just show no interest in why these unsuccessful cases might occur. It just serves their purposes to fudge the definitions. It’s all just a sales piece and con job with no care for the patients.
    Thanks for the expose, David.

  9. CT

    Last time I looked, 85% patients improving for fatigue (abstract) + 16% deteriorating for fatigue (results section) does not add up to 100% but 101%. And then there’s those who stayed the same, where do they fit in exactly?

  10. N A

    The authors make no comment about how unrepresentative the treatment group was by education level: “228 (23%) were educated to school level (GCSE/O-Level) with 683 (69%) educated to university level (undergraduate).”

    This number of graduates is way above the national average which amongst the 20 – 65 year old population in England and Wales increased from below 25% to only 41% across the duration of the study period. The England & Wales graduate rate in the over 65 range is below 10%.

    With an average graduate level of no more than 33% in the potential patient population across the duration of the study, the question arises as to where exactly this CBT service was receiving patients from – and how that reception process might impact patient responses. For example were patients being referred onward to GPs via institutional occupational health processes, such as those that operate within Universities and the NHS ? This would have potentially significant impacts on the psychology of the interactions between the service providers and patients.

  11. Lene Christiansen

    I find it strange that 5 authors are getting paid (well, I guess) to write constructed articles that really are useless. I say useless because the patient group is not well defined. This is low quality pseudo-research. Embarrassing. I hope English taxpayers are not funding this.

  12. boolybooly

    Thanks David, for this justified criticism.

  13. Steve Hawkins

    “It is unfortunate, but not surprising, that Sir Simon, Professor Chalder and the peer reviewers selected by the Royal Society of Medicine did not notice or care about these impactful misstatements of causality–a continuation of a time-honored tradition of sloppy argumentation and inadequate peer-reviewing in this domain of science.”

    More than that: it continues their time-honoured tradition of setting up ‘trials’ and research exercises designed *not* to be able to give any clear results or conclusions. Why are they allowed to keep this up? Any well intentioned study would rely on objective measures of improvement that would provide unequivocal results that indicated success or failure.

    The time-honoured tradition of, either not including such measures, or of finding excuses to drop them mid study or glossing over anything inconvenient about them, indicates that this cosy little club of pulp fiction writers know when they are on to a good thing, and will not include any concrete tests that might kill their goose that lays golden eggs.

  14. Nancy Blake

    Simon Wessely is deeply embedded in a project to deprive sufferers of long-term disability benefits, originally in the pay of UNUM, a US medical insurer found guilty of fraudulent denial of payments to the insured. ME/CFS is the poster-girl for this project – the disease about which the ‘joke’ is the doctor telling the patient “You’ve got ME. The good news is that you won’t die! The bad news is that you won’t die.” The Blair government involved Wessely in New Labour’s ‘Welfare to Work’ reforms to Incapacity Allowance, and DWP’s funding of the PACE Trial was part of this project.

    The link below is to a long article describing how the biopsychosocial model of disability is taking over disability legislation. The quotation from it given below describes Wessely and Sharpe’s involvement, confirming my view that ME/CFS is deeply embedded in a more general movement to get more and more ill health moved into the mental health category.
    p. 13 ‘Rutherford argues: in the 1980s Unum, and insurance companies Provident and Paul Revere were in trouble in the U. S. They had increased profits by sharing similar policies on disability and sickness insurance and selling to professionals. A combination of falling interest rates and the growth of diagnosed illnesses which were not subject to the insurance sector’s tests appeared to be increasing, affecting the professionals who had taken out policies with the companies, and in turn affecting company profits. These illnesses included: Myalgic Encephalomyelitis (ME) or Chronic Fatigue Syndrome (CFS), Fibromyalgia, Chronic Pain, Multiple Sclerosis, Lyme disease.
    An aggressive Chronic Fatigue Syndrome plan followed, with claims being managed in a way that continued to maximise profits. The insurance industry called on the academics, Professor Simon Wessely of King’s College and Professor Michael Sharpe of Edinburgh University (both participants in the Woodstock conference) in an attempt to reclassify those conditions that were costing money, and lobby the medical profession on such conditions so they fell outside the remit of pay outs, It meant that specific illnesses were targeted in order to discredit the legitimacy of claims. This strategy was to prove useful in dealing with the UK’s welfare reform and in overriding the basis of medical opinion on a whole set of conditions. As the state joined in the denial with its set of private companies and supporting academics Unum achieved more market returns while disabled people began to see their own welfare support rapidly diminishing.

    (This apparently is from an article by Jonathan Rutherford in Soundings.)

    I may also, in other comments, mentioned Farhad Dalal’s profound and important book, CBT: The Cognitive Behavioural Tsunami: Managerialism, Politics and the Corruption of Science. I cannot stress enough the importance of Dalal’s insight and analysis.

    I’m going to add my insistence that critics of the IOM report that gave us SEID need to read the small print:


    The quotations here from the IOM Report can be used for the information of anyone who needs to be told what the authors intended the label ‘Systemic Exertion Intolerance Disease’ to convey: this illness is not psychiatric, and exertion of any kind can cause multisystem damage.

    Psychiatry has no role whatever in relation to this disease.

    Any kind of exertion can do widespread harm.

    It follows that recommending, encouraging or coercing a patient into any form of exertion is culpable medical abuse.

    From book description: ‘Beyond Myalgic Encephalomyelitis/Chronic Fatigue Syndrome stresses that SEID is a medical – not a psychiatric or psychological – illness.’

    ‘First and foremost, listening to patients and taking a careful history are key diagnostic tools.’ p 213

    ‘A new code should be assigned to this disorder in the International Classification of Diseases, Tenth Revision (ICD-10), that is not linked to “chronic fatigue” or “neurasthenia.”’ p 222

    ‘“Systemic exertion intolerance” captures the fact that exertion of any sort—physical, cognitive, emotional—can adversely affect these patients in many organ systems and in many aspects of their lives. The committee intends for this name to convey the complexity and severity of this disorder.’ pp 227/8

    “7 Recommendations.” Institute of Medicine. 2015. Beyond Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: Redefining an Illness. Washington, DC: The National Academies Press. doi: 10.17226/19012

    As I keep saying, they handed us a tool – it’s up to us to use it.


  15. Nancy Blake

    Two corrections to previous comment:

    Wessely may or may not have been directly ‘in the pay of UNUM’. The quote from Rutherford speaks of UNUM ‘Reaching out’ to Wessely and Sharpe but does not specify paying them. Although it is hard to imagine what else it could imply.
    The other correction would be to ‘Also, I may have mentioned in other comments’ I left out ‘have’. Should definitely proofread before posting, my apologies.

  16. Steve Hawkins

    Further to my observation that this group of ‘researchers’ has carefully avoided using outcome measures that might disprove their hypothesis, now comes this new study using ‘research grade’ wristband activity meters with many thousands of participants from UK Biobank.

    It’s a study of activity against fitness, rather than anything specific to the fatigue research, but, from now on, there will be no excuses for the CBT cultists to leave out objective measurements to favour their silly ambivalent questionnaires:

    “A challenge facing researchers has been that the low intensity, incidental movement that accumulates in the course of everyday activities is very hard to recall accurately, and consequently *difficult to measure using questionnaires*. Wearable devices have enabled better detection of this type of movement that makes up the majority of our daily physical activity, but until now have not been used on a large enough scale to determine if more intense activity makes a contribution to health, distinct from increasing total volume.”