Questions About the Prevalence of Functional Neurological Disorder and the Research on Hoover’s Sign for Functional Leg Weakness

By David Tuller, DrPH
(This is a long-ish post. Sorry! It covers two complicated issues. I want to thank an intrepid source for help with this.)

I have great sympathy for patients diagnosed with functional neurological disorder (FND). Their symptoms can be seriously disabling and their plight has long been neglected and dismissed by the medical establishment. When I post about FND, I like to recommend this well-written essay by a patient who goes by the moniker FNDPortal. The article provides a harrowing portrait of the experience of living with FND as well as a cogent account of the history of the construct.

I have, however, raised issues with how FND experts and investigators have made claims that do not seem to conform to the evidence cited. That includes the routine and unwarranted tripling of the reported FND prevalence rate from a 2010 study from Stone et al called “Who is referred to neurology clinics?—the diagnoses made in 3781 new patients,” published in the journal Clinical Neurology and Neurosurgery. FND, formerly called conversion disorder, was redefined in 2013 in the fifth edition of the Diagnostic and Statistical Manual (DSM-5), often referred to as the “psychiatric bible.” Among the changes in the new definition of the diagnosis was that it required the presence of a clinical sign incompatible with neurological disease.

As I’ve blogged here, here and here, Stone et al has repeatedly been referenced for the claim that FND–as re-defined in DSM-5–is the second-most common reason, after headache, for patients to see a neurologist, and/or that it has a 16% prevalence among new presentations at neurology clinics. That is simply not what the paper reported, as should be apparent to anyone reading it.

In fact, a 2016 chapter for the Handbook of Clinical Neurology–co-written by a co-author of Stone et al–cited a much lower number for FND prevalence: “The recent changes in the DSM-5 to a definition [of FND] based on positive identification of physical symptoms which are incongruent and inconsistent with neurologic disease and the lack of need for any associated psychopathology represent a significant step forward in clarifying the disorder. On this basis, FND account for approximately 6% of neurology outpatient contacts.” The chapter specifically mentioned the evidence from Stone et al and also provided a lower and more precise figure of FND prevalence: 5.4%. At that rate, FND would be much further down than #2 on Stone et al’s list of diagnoses, after conditions like epilepsy, peripheral nerve disorders, miscellaneous neurological disorders, demyelination, spinal disorders and Parkinson’s disease/movement disorders.

The site neurosymptoms.org, which is maintained by the lead author of Stone et al, makes the same point about the study data. Of the 3781 patients, the site explains, 209 of them “had clear FND.” That’s 5.5%–basically in line with the two data points, 5.4% and “approximately 6%,” from the 2016 article. (The correct percentage using the Stone et al numbers is 5.5%. The 5.4% cited in the 2016 article appears to have been either a typo or miscalculation.)
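The arithmetic is easy to check. Here is a minimal Python sketch that uses only the two counts quoted above, 209 and 3781; the function and variable names are mine, for illustration.

```python
def prevalence_percent(cases: int, total: int) -> float:
    """Return cases as a percentage of total."""
    return 100.0 * cases / total


clear_fnd = 209      # patients with "clear FND", per neurosymptoms.org
new_patients = 3781  # total new patients in Stone et al

rate = prevalence_percent(clear_fnd, new_patients)
print(f"{rate:.2f}%")  # 5.53%, which rounds to 5.5% rather than 5.4%
```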

The diagnoses given for the symptoms of these 209 patients with “clear FND” fell into the three categories most closely identified with conversion disorder, the prior name for the condition—“non-epileptic attacks,” “functional sensory,” and “functional motor.” So who were the additional 10% among the total sample of 3781 that raised the purported FND prevalence to 16%, per more recent publications? These were patients given a grab-bag of what the study identified as “psychological” diagnoses, including hyperventilation, anxiety and depression, atypical facial/temporomandibular joint pain, post-head injury symptoms, fibromyalgia, and alcohol excess, among others. Also lumped into this “psychological” group were cases assigned to categories identified as “non-organic” and “no diagnosis.”

Presumably some or many of these patients with “psychological” diagnoses might today be given some form of “functional” diagnosis, indicating that they have unexplained symptoms. Nonetheless, Stone et al offer no evidence that these additional 10% would have met DSM-5 criteria for the specific clinical entity known as FND or that they could have been ruled in as having FND through the required positive clinical signs. Perhaps some of them would have met this diagnostic burden during a current neurological exam, given advances in the field. But since FND is no longer considered a diagnosis of exclusion but one based on positive rule-in signs, the 16% prevalence claim is speculation relying on unproven assumptions–no matter how many times it is repeated as if it were a documented fact.

In a way, Stone et al set the stage for this mis-communication. In reporting the data, Stone et al combined the “approximately 6%” meeting the DSM-5 criteria for FND with the 10% given “psychological” diagnoses to create a larger category with 16% prevalence that was called “psychological/functional.” The authors did not explain exactly why they combined the two groups rather than keeping them separate. However, the decision to do so conveniently let them call this new, bigger, wildly heterogeneous category the second-most common reason to consult a neurologist, after headache at 19%. Presumably there are benefits in being able to make the argument that a category of interest is #2 rather than much lower down the line.

Unfortunately, multiple papers in recent years have confused the matter further by dropping any mention of the “psychological” group altogether and re-branding the entire 16% as having FND and/or asserting that FND is the #2 presentation. FND experts have done something similar in lectures and other public presentations. Of course, neither statement–that the FND prevalence is 16% and that it is the #2 presentation–is in accord with the 2016 chapter co-written by one of the co-authors of Stone et al; that chapter stated unequivocally that “approximately 6%” met the new DSM-5 definition of FND. Nor is the assertion of 16% prevalence in accord with the information currently provided on neurosymptoms.org, which notes that only 209 out of 3781 patients–that is, 5.5%–had “clear FND.” Hm.

Neurosymptoms.org attempts to address these inconsistencies with some serious post-hoc re-interpretation and theorizing. Besides those with “clear FND” in Stone et al, notes Neurosymptoms.org, “another 200…had additional functional disorder diagnoses including dizziness and cognitive symptoms which could also be included now within FND” and also “other patients presented with diagnoses like migraine, but the neurologists thought the main issue was an associated functional disorder.” Therefore, neurosymptoms.org concludes, “anything from 6-16% of patients could be said to have a functional disorder depending on how that was defined. The upper limit of that estimate would make it the second commonest reason to see a neurologist.”

This lengthy explanation confirms the key point. The recent papers do not make a vague and belabored argument about the “upper limit” of a broad possible range of prevalence rates for any sort of “functional disorder depending on how it was defined.” Their claim is much more precise, specific, unambiguous, and authoritative: a categorical declaration that Stone et al found 16% to have “clear FND”–that is, diagnoses that could be called FND per the criteria outlined in DSM-5.

With this sleight-of-percentage, the wave of articles mis-citing Stone et al has in effect tripled the reported prevalence of FND as defined in DSM-5. Given that the diagnosis now requires rule-in signs and no longer requires prior trauma, the only acceptable and appropriate prevalence for FND to cite from Stone et al is 5.5% (or “approximately 6%”)—without lumping in the 10% from the “psychological” camp based on subsequent re-framing of the data.

Although published 13 years ago, Stone et al remains the largest investigation of its kind. Perhaps prevalence rates in neurology clinics based on current understandings and awareness differ from those reported in Stone et al, as neurosymptoms.org suggests. Perhaps the rates found in much, much smaller and less authoritative studies differ as well. But this much is indisputable about Stone et al itself: The findings do not support the claim that FND as defined in DSM-5 is the #2 presentation at neurology clinics with a prevalence of 16%. Clarity and consistency in reporting prevalence rates are essential to the practice of public health. For this reason and others, papers that have made this untrue assertion while citing Stone et al should be corrected.

**********

On another matter…Does the research really show that Hoover’s sign is close to 100% specific?

Earlier this month I posted an interview with David Putrino, a neuroscientist and physical therapist at Mt Sinai Health System in New York, about long Covid and its relationship to functional neurological disorder, or FND. My tweet of the interview drew a response from David Perez, a neurologist and psychiatrist at Boston’s Massachusetts General Hospital and a leader in the FND field. 

Dr Perez tweeted to Dr Putrino: “as a fellow clinician & researcher – I’m concerned that you are mischaracterizing Functional Neurological Disorder. While sensorimotor & cognitive domains of impairment are found in many conditions – there are positive NEUROLOGIC EXAM SIGNS that rule-in #FND.”  

In a second tweet he included six links under the slug “articles for your consideration.” These articles provided advice and guidance on diagnosing FND using the kinds of “rule-in” NEUROLOGIC EXAM SIGNS mentioned by Dr Perez. FND is the new-ish name for what has for a century or so been called conversion disorder. Since 2013, the fifth edition of the Diagnostic and Statistical Manual [DSM-5], the so-called “psychiatric bible,” has required not just the absence of known neurological disease but also the presence of clinical signs that are incompatible with such disease. (For consistency, in this post I will generally use the term FND even when writing about research that used more archaic terms like conversion disorder.) 

But the requirement in the revised definition of FND for positive clinical signs has focused awareness on a major gap in the literature. Neurologists have for decades relied on some of these time-honored procedures in diagnosing patients; however, not much if any effort was made to investigate their accuracy. 

This issue remains a challenge for the field. As Dr Perez and colleagues pointed out in “Decade of progress in motor functional neurological disorder: continuing the momentum,” a 2021 article in the Journal of Neurology, Neurosurgery and Psychiatry: “There is a need to further test the specificity, sensitivities and inter-rater reliability of the growing range of positive functional signs compared to other neurological populations, particularly given that statistical properties for some signs have been only tested in a single cohort.” 

In fact, almost all of the signs identified to test motor FND have been tested in only a single cohort, according to an article from Dr Perez and a colleague called “Diagnosis and management of functional neurological disorder,” published the following year in The BMJ. In a table of 41 “validated positive motor signs” of the kind required to rule in the motor FND diagnoses discussed in the 2021 article, 34–or 83%–were shown as tested in only a single cohort. Five were tested in two studies, and only two signs were tested in more than two.

The quintessential and most well-known example of these clinical signs—the poster-sign, if you will—is Hoover’s sign, one of the two motor FND signs found to have been tested in more than two studies. It was first recommended more than a century ago as a means of distinguishing between cases of leg weakness or paralysis caused by neurological disease and those thought to be due to “malingering” or what might then have been called hysteria but would now be called FND. (This post is already long and explaining more about Hoover’s sign and how it’s done would take space. Here’s a video about it.) 

Just like a positive Hoover’s sign serves as a rule-in indicator for functional leg weakness, the other clinical signs are used to rule in other types of FND. Articles in the FND literature about the use of these clinical signs advise that they should be viewed with some caution, that none are perfect, and that they need to be interpreted alongside the other medical information available.

When it comes to Hoover’s sign, FND experts themselves report that some other conditions, like apraxia, can generate false positives. At the same time, the FND literature frames Hoover’s sign as the exemplar of the genre and touts its “diagnostic specificity”—meaning that a positive result is always or almost always accurate. As Dr Perez and colleagues wrote in their 2021 paper: “Establishing the diagnosis of mFND [motor FND] has been made more practicable, as physical examination findings with diagnostic specificity have been identified (e.g., Hoover’s sign with an estimated specificity of 95.7-99.9%).”

(Specificity and sensitivity are complicated. In brief, specificity is a measure of whether a true negative case is correctly identified by a negative test, and sensitivity is a measure of whether a true positive case is correctly identified by a positive test. There is often a trade-off between the two, but the best tests are those that measure close to 100% on both. I realize this mini-explanation will leave many a bit perplexed. Sorry!!)

If Hoover’s sign has high diagnostic specificity for functional leg weakness, the corollary is that other conditions would rarely generate a positive result—or never, if the specificity were 100%. But if clinicians are relying on a claim of specificity that is inflated or exaggerated, other diagnoses that might explain a positive Hoover’s sign could potentially be overlooked and missed.

These implications raise a key question: Is the research into the diagnostic reliability of Hoover’s sign robust? As it turns out, the answer is—not really, despite the sign’s venerable history. The evidence base is very thin—as I explain below. Two issues are immediately apparent. First, the few studies that have been done only included handfuls of FND patients; the most authoritative validation study of Hoover’s sign had eight FND patients. Beyond that, studies were designed in a circular fashion, with Hoover’s sign apparently serving in many or all cases as a diagnostic tool initially as well as being the object of epidemiological investigation.

My colleague John Swartzberg, a public health expert and an emeritus professor of infectious diseases at University of California, Berkeley, said Hoover’s sign could be helpful in the context of other medical tests and data. But he added that the deficiencies of the studies made it hard to draw any solid conclusions from them.

“The studies looking at the sensitivity and specificity of Hoover’s sign suffer from confirmation bias and small sample size. The sign was described over 100 years ago when there was a very different understanding of neurological disease. The idea is that there is a neurological loop for the hip flexors. If there is neurological disease on one side, that loop should be interrupted. That makes some degree of sense but it does not address other possibilities, such as neuropathies.”

Dr Putrino, whose interview with me prompted Dr Perez’ tweets, said this:

“A positive Hoover’s sign basically shows us that, for whatever reason, someone is unable to initiate a voluntary muscle contraction but that they have intact spinal reflexes. There are so many things that can go wrong with the nervous system to cause this that are easily missed during a mainstream neurological exam, especially if you have a bias towards diagnosing ‘conversion disorder.’ So to immediately and over-confidently assume that a positive Hoover’s sign means ‘functional neurological disorder’ is emblematic of the sort of thinking that we would associate with a clinician who is light on anatomical knowledge.”

Jonathan Edwards, an emeritus professor of medicine at University College London, agreed that Hoover’s sign could play a role in patient assessment but that it was unwarranted to suggest it had such high specificity:

“There is no doubt that there are people with neurological symptoms that have to be assigned to unexplained central problems. There is also no doubt that in some cases the defect seems to relate more to conscious conceptions than any neuroanatomy. Sometimes signs like Hoover’s sign are quite remarkably salient. From my perspective here the problem is not with the idea that neurological symptoms can occur as a result of conscious or unconscious mental processes. The problem is the claim that anyone understands what is going on or that any such mysterious goings on can be reliably recognised with such signs.”

(I responded to Dr Perez’s tweet to Dr Putrino, since I was also on the twitter thread. In my response, I indicated that the research on Hoover’s sign seemed underwhelming and asked if he could provide data from more studies. Dr Perez did not respond. Before posting this blog, I sent him an e-mail requesting comment and promising to post his response in full if/when I receive it. In the e-mail, I also mentioned that I was writing about the FND field’s habit of mis-citing Stone et al’s 2010 paper and tripling the reported FND prevalence rates; I suggested he might respond to that concern as well.)

********

Studies of Hoover’s sign: tiny samples and self-fulfilling prophecies

One study among the six links tweeted by Dr Perez was a 2014 article from Daum et al called “The value of ‘positive’ clinical signs for weakness, sensory and gait disorders in conversion disorder: a systematic and narrative review,” published in the Journal of Neurology, Neurosurgery, and Psychiatry. It was the only one of the six articles to offer an in-depth analysis of the accuracy of some of the signs, including Hoover’s sign. (A few years ago I wrote a post about this paper. This new post recycles a few paragraphs from the earlier one. I guess that would be self-plagiarism???) 

Daum et al’s 2014 review mentioned the DSM’s diagnostic change (at the time the paper was written, the change was proposed but had not yet been adopted), noting that the new definition depended on “the exclusion of neurological signs pointing to a lesion of the central or peripheral nervous system, together with the identification of ‘positive signs’ known to be specific for functional symptoms.” According to the review, “These positive signs are well known to all trained neurologists but their validity is still not established.”

The last sentence is interesting. It could perhaps be translated like this: “Although all trained neurologists know well that these positive signs identify people with a functional neurological disorder, we still have no actual evidence for that.”

As Daum et al recognized, that approach to medical care and treatment was no longer viable. “In the era of evidence-based medicine however, clinicians are facing a lack of proof regarding the validity of those clinical ‘positive signs,’” the authors noted. Hence, their decision to conduct a review of studies of the various signs for a range of FND presentations—functional weakness, functional sensory disorders and functional gait disorders.

After surveying the literature, the authors identified eleven studies that provided “some degree of validation” for 14 clinical signs. Ten of these studies included 23 or fewer subjects identified with FND. In ratings of study quality per the American Academy of Neurology’s classification system, nine of them were designated as Class III–the third out of four grades of quality. Only two included blinding. None included information on the key metric of inter-rater reliability, which would have assessed differences in how clinicians interpreted the various signs.

According to the review, these clinical signs overall had low sensitivity–meaning they would miss many of those who supposedly suffered from the relevant ailment, in this case FND. In contrast, the review reported, the signs had high specificity–meaning those identified by positive results were likely to have the condition and not something else instead. But the review’s account of its own limitations made clear that the findings of high specificity could not be taken at face value.

As the authors wrote: “As no gold standard exists for functional weakness, sensory and gait disturbances, precise diagnostic criteria on how a diagnosis of functional disorder has been made are not always provided [in the studies reviewed] and wrong attribution of subjects could have occurred. More importantly and more likely, this could have introduced a circular reasoning bias (self-fulfilling prophecy): if the studied sign is also used in the diagnosis process, the reported specificity is overestimated.”

That’s a significant point. If a studied sign is used in the diagnostic process, the reported specificity is essentially meaningless—to refer to it as “overestimated” would be generous. What has been proven in that case is that the sign is positive in the same people in whom it was positive the first time around. And that’s about it.
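A toy example can make the circularity concrete. The Python sketch below uses a cohort I invented purely for illustration; none of the numbers come from any study. It assumes the most extreme case, in which every positive sign leads to a functional diagnosis, and compares the specificity measured against an independent reference with the specificity measured against a reference that uses the sign itself.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    truly_functional: bool  # hypothetical "ground truth"
    sign_positive: bool     # result of the sign being studied


def measured_specificity(patients, reference):
    """Specificity of the sign against whatever `reference` labels as functional."""
    non_cases = [p for p in patients if not reference(p)]
    not_positive = [p for p in non_cases if not p.sign_positive]
    return len(not_positive) / len(non_cases)


# Hypothetical cohort, invented for illustration only.
cohort = (
    [Patient(True, True)] * 12 + [Patient(True, False)] * 8 +
    [Patient(False, True)] * 10 + [Patient(False, False)] * 90
)


def independent(p):
    """Reference diagnosis that ignores the sign."""
    return p.truly_functional


def circular(p):
    """Reference diagnosis that also rules in anyone with a positive sign."""
    return p.truly_functional or p.sign_positive


print(measured_specificity(cohort, independent))  # 0.9
print(measured_specificity(cohort, circular))     # 1.0
```

In this made-up cohort the sign is wrong in 10 of 100 people without the condition, yet the circular design reports perfect specificity, because those 10 people have been moved into the functional group by the sign itself.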

For Hoover’s sign, Daum et al included five studies and reported a pooled specificity of 100%. The earliest study, Ziv et al (1998), noted that Hoover’s sign “has several obvious limitations,” including that “it is semi-subjective, it is not quantitative, and it lacks sensitivity.” The study, which included nine FND patients, tested a computerized, quantified version of Hoover’s sign that does not seem relevant to its performance during standard use in clinical care.

The second study, from Sonoo (2004), cited clinicians “who have stated that this test [Hoover’s sign] may give variable or equivocal results” and was designed to investigate a different clinical sign for functional leg weakness; the author called this the abductor sign. The study, which included 16 patients diagnosed with FND, reported that the abductor sign provided better results overall than Hoover’s sign. The 2022 BMJ article co-authored by Dr Perez highlighted the abductor sign, along with Hoover’s sign, as the two validated signs for leg weakness; the article cited no additional studies.

Tinazzi et al (2008), a “brief report” in the journal Movement Disorders, was a study not of Hoover’s sign but of a finger abductor sign for arm paralysis. However, most of the ten FND patients in the study also had leg paralysis, and Hoover’s sign was part of the neurological examinations. The fourth paper, Stone et al (2010), was a descriptive epidemiology study of 107 patients diagnosed with functional weakness at neurology clinics. The investigators found that 60, or 56%, had a positive Hoover’s sign.

The fifth and most recent study, McWhirter et al (2011), was published in the Journal of Psychosomatic Research and was the only one actually designed to assess the diagnostic value of Hoover’s sign as used in clinical practice. In the introduction, the authors explained the rationale for the study in light of the proposed DSM changes:

“In 1908 Charles Hoover described a physical sign of functional (i.e. psychogenic) weakness of the lower extremities. Hoover’s sign is commonly used as a test for the diagnosis of functional weakness. However, no studies have tested the diagnostic performance of this sign in unselected patients with neurological symptoms. In the next revision of DSM, reference to positive physical signs of functional weakness may be incorporated within the criteria for conversion disorder itself. Data on the specificity and sensitivity of Hoover’s sign are therefore important.”

This study was part of a larger investigation of 377 patients admitted to hospital for suspected stroke. All underwent a thorough neurological exam, which included Hoover’s sign. Subsequently, an expert panel rendered a “gold standard” assessment of whether these patients had FND. The analysis of the validity of Hoover’s sign was based on the results in 124 patients who presented with leg weakness, eight of whom had been given a diagnosis of FND by the expert panel and 116 of whom received other diagnoses.

Hoover’s sign was positive in five of the eight FND patients, negative in two, and uncertain in one. Since all five who had positive Hoover’s signs had been given a gold-standard diagnosis of FND by the expert panel, and none of the additional 116 had a false positive Hoover’s sign, the specificity of the test was 100%. With three of those with gold-standard FND diagnoses not being positive for Hoover’s sign (two negative and one uncertain), the sensitivity was only 63%, or five out of eight.
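For readers who want to see the arithmetic, here is a minimal Python sketch using the counts described above. It assumes the one uncertain result is counted as not positive, which appears to be how the 63% figure was reached; the function and variable names are mine.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of people with the condition whose test is positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of people without the condition whose test is not positive."""
    return true_neg / (true_neg + false_pos)


# Counts as described for the 124 patients with leg weakness.
fnd_sign_positive = 5          # FND per panel, Hoover's sign positive
fnd_sign_not_positive = 3      # 2 negative + 1 uncertain (assumed counted as not positive)
other_sign_positive = 0        # other diagnoses, Hoover's sign positive
other_sign_not_positive = 116  # other diagnoses, Hoover's sign not positive

sens = sensitivity(fnd_sign_positive, fnd_sign_not_positive)
spec = specificity(other_sign_not_positive, other_sign_positive)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
# sensitivity = 62.5%, specificity = 100.0%
```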

One obvious point—this study included a teensy sample of people with FND. Second, the expert panel had the Hoover’s sign results at their disposal when they were making their gold-standard diagnoses. Given that all five in the sample with a positive Hoover’s sign had understandably been assigned to the FND group, the study seems mainly to have confirmed that a first positive Hoover’s sign accurately predicts a second.

As did the 2014 review in which it was referenced, McWhirter et al acknowledged the dilemma posed by a study in which participants were possibly or likely selected using the diagnostic tool being investigated—a design that would introduce what the authors called incorporation bias. They also acknowledged potential bias from lack of blinding. Here’s the relevant paragraph from the section on study limitations:

“The examining neurologist was not blinded to the diagnosis of functional disorder, in general performing both the history and the clinical examination. Hence, their interpretation of Hoover’s sign may have been influenced by the preceding history. Incorporation bias is also possible as Hoover’s sign may have been interpreted as a positive feature of a functional disorder by the adjudicating panel, and used to determine the presence or not of a functional disorder. Lastly we were limited by the small number of patients with functional symptoms presenting to the study. Therefore our estimates of diagnostic performance have wide limits of uncertainty around them.”
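To get a rough sense of how wide those limits can be with eight patients, here is a minimal Python sketch of a 95% Wilson score interval. The Wilson interval is my choice for illustration and not necessarily the method used in the paper, and it captures only sampling uncertainty, not the incorporation bias or lack of blinding that the authors describe.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a proportion (z = 1.96)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Sensitivity: 5 positive signs among 8 patients given an FND diagnosis.
lo_sens, hi_sens = wilson_interval(5, 8)
# Specificity: 116 of 116 patients with other diagnoses had no positive sign.
lo_spec, hi_spec = wilson_interval(116, 116)

print(f"sensitivity: {lo_sens:.0%} to {hi_sens:.0%}")
print(f"specificity: {lo_spec:.1%} to {hi_spec:.1%}")
```

Under these assumptions the interval for sensitivity runs from roughly 31% to 86%, which is about as wide as an estimate can get while still being informative.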

The problem is you could fly a plane through the vast space left by these limitations. They make it very difficult, if not impossible, to know how much credence, if any, can be given the reported findings. McWhirter et al concluded:

“Blinded studies with larger numbers of patients with functional weakness and several observers could provide better estimates of inter-observer reliability and diagnostic performance of this and other signs of functional weakness.” 

One more small study…

So was McWhirter et al the last word on the validity of Hoover’s sign? Not quite. In the BMJ paper Dr Perez and a colleague published last year, “Diagnosis and management of functional neurological disorder,” the list of research on Hoover’s sign included a single additional study, from 2015. (The 2022 list did not include the first study listed in the 2014 review, Ziv et al. Perhaps the authors decided that a study of a computerized, quantified Hoover’s sign was irrelevant to current practice.) In the 2022 paper, when the results from the identified studies were pooled, the specificity for Hoover’s sign was reported as 99.5% and the sensitivity as 61%.

The 2015 study included data on multiple signs tested in a group of 20 FND patients, with data for Hoover’s sign available for 17 of them. Unlike McWhirter et al, this study did not explicitly indicate whether those conducting the initial neurological assessment used Hoover’s sign. However, given the sign’s venerable history, it seems probable or likely that it would have been included in the standard work-up of patients presenting to a neurology clinic with relevant neurological complaints. Moreover, the study provided inter-rater reliability results for many other FND clinical signs, but not for Hoover’s sign—despite McWhirter et al’s explicit call for just such investigations. Nonetheless, the study categorized Hoover’s sign as “highly reliable,” citing “strong validation” in “several previous studies.”

That seems to be the extent of the data on Hoover’s sign. Perhaps it is true that a positive Hoover’s sign is highly specific and always or almost always indicates FND; I’m not a clinician and obviously can’t answer that question from personal experience. But as a journalist and public health academic I can read epidemiology papers, and the data on offer in these very few studies fails to make much of a case. The research is fraught with issues, including minuscule samples of FND patients and multiple forms of bias. It does not offer convincing or impressive support for bold assertions about the validity—and in particular the specificity—of Hoover’s sign. That doesn’t make the assertions–grounded in decades of authority from traditional practice–wrong. But it does mean there isn’t much proof to back them up.

Neurosymptoms.org, a popular site for FND patients and others maintained by a top FND expert, has sought to address the question of the reliability of the signs. The site acknowledges some issues and interpretive challenges, but also seeks to offer reassurance that the signs—and especially Hoover’s sign—have proven in studies to be sufficiently discriminating for the job. Here’s a key paragraph:

“For FND, each of the signs…has a varying degree of reliability. Studies looking at them show that they can discriminate between patients who have functional leg weakness and patients who have other neurological diseases, even when doctors don’t know what the diagnosis is, in advance. Some of them, like Hoover’s sign, perform well in these tests and for others we have less data or there needs to be more caution.”

This is obviously not the full background on Hoover’s sign. As far as I could find, neurosymptoms.org does not explain that Hoover’s sign, which is said to warrant less caution than other signs, was investigated in patients after it had definitely or likely been part of their diagnostic work-up. And it doesn’t mention that the state of data on Hoover’s sign is basically where it was in 2011 when McWhirter et al called for larger and more robust validation research—including investigations of inter-rater reliability—to supplement findings derived from that study’s sample of eight FND patients.

Where are these larger and more robust studies of Hoover’s sign? Why haven’t they been conducted in the last dozen years, if not by the authors of McWhirter et al then by others in the field? And is the research into the discriminatory value of other FND clinical signs any more persuasive than what’s available for this poster-sign?