Your Fitness News Today

GenAI chatbots can treat clinical level mental health symptoms

July 28, 2025
in Mental Health

For some, the title of this blog might look like ‘click-bait’ and be dismissed as a further example of the exaggeration that can surround discussions of Generative Artificial Intelligence (GenAI). For others, the statement may seem axiomatic and obvious given that research has already suggested that chatbots are a feasible, engaging, and effective way to deliver Cognitive Behavioural Therapy (CBT; e.g., Fitzpatrick et al., 2017).

Yet the title of this blog is neither hyperbole nor self-evident. Although chatbots have previously been shown to have benefits, these tended to be rule-based agents, “limited by their reliance on an explicitly programmed decision trees and restricted inputs” (Heinz et al., 2025, p.2). It is therefore of interest that a recent paper by Heinz and colleagues (2025) reports a randomised controlled trial (RCT) demonstrating the effectiveness of a fully GenAI chatbot for treating clinical level mental health symptoms.

Within this blog, we look at the details of this study and ask where it leaves us going forward.

Is GenAI finally on the verge of transforming the way we deliver mental health care?

Methods

The authors conducted a national RCT of adults with clinically significant symptoms of major depressive disorder (MDD), generalised anxiety disorder (GAD) or at high risk for feeding and eating disorders (FED). The 210 eligible participants were stratified into one of these three groups and randomly assigned to a 4-week chatbot intervention (n = 106) or waitlist control (n = 104).
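To make the allocation step concrete, here is a minimal sketch of stratified randomisation in Python. It is an illustration only and not the trial's actual procedure: the function name, the arm labels and the idea of alternating arms within a shuffled stratum are all assumptions.

```python
import random
from collections import defaultdict


def stratified_assign(participants, seed=0):
    """Randomly assign participants to two arms within each stratum.

    participants: list of (participant_id, stratum) pairs.
    Returns a dict mapping participant_id to "chatbot" or "waitlist".
    """
    rng = random.Random(seed)  # seeded so the allocation is reproducible

    # Group participant ids by stratum (e.g. "MDD", "GAD", "FED").
    by_stratum = defaultdict(list)
    for participant_id, stratum in participants:
        by_stratum[stratum].append(participant_id)

    allocation = {}
    for ids in by_stratum.values():
        rng.shuffle(ids)  # random order within the stratum
        for position, participant_id in enumerate(ids):
            # Alternate arms down the shuffled list so each stratum is balanced.
            allocation[participant_id] = "chatbot" if position % 2 == 0 else "waitlist"
    return allocation
```

With this approach each stratum ends up split as evenly as possible, and a stratum with an odd number of people leaves the two arms one participant apart.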

Participants in the intervention group were prompted daily to interact with a chatbot (‘Therabot’) during the treatment phase (4 weeks). During post-intervention (weeks 4-8) and follow-up, participants were not prompted, but were still permitted to use Therabot.

The chatbot was developed with over 100,000 human hours and utilises a generative large language model (LLM) “fine-tuned on expert-curated mental health dialogues” (p.3). Based on third-wave CBT, Therabot allowed users to either initiate a session directly in the chat interface or reply to notifications. A user prompt, conversation history and most recent user message were then combined and sent to the LLM. All responses from Therabot were supervised by trained personnel post-transmission. In the event of an inappropriate response from Therabot, the participant was contacted to provide correction.
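The step of combining a prompt, the conversation history and the latest message can be pictured with a short sketch. This is a generic illustration of how chat-style requests are often assembled, not Therabot's code: the `Message` class, the role labels, the `build_request` function and the `max_history` cut-off are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str     # hypothetical labels: "prompt", "user" or "assistant"
    content: str


def build_request(prompt, history, latest_user_message, max_history=20):
    """Combine the three ingredients into one ordered list of messages.

    prompt: the instruction text sent at the start of every request.
    history: earlier Message objects, oldest first.
    latest_user_message: the text the participant has just typed.
    max_history: ASSUMPTION, a cap on how many earlier turns are included.
    """
    recent = history[-max_history:] if max_history > 0 else []
    return [
        Message("prompt", prompt),
        *recent,
        Message("user", latest_user_message),
    ]
```

The resulting list would then be passed to whatever model interface is in use; that call is deliberately left out here because the blog does not describe it.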

Primary outcomes were symptom changes from baseline to post-intervention (4 weeks) and follow-up (8 weeks). Measures included the Patient Health Questionnaire (PHQ-9), the Generalised Anxiety Disorder Questionnaire (GAD-Q-IV), and the Weight Concerns Scale (WCS) within the Stanford-Washington University Eating Disorder Screen (SWED). Secondary outcomes included measures of therapeutic alliance, and satisfaction and engagement with Therabot.

Results

Participant characteristics

Of the 210 participants recruited to the study, 125 (59.5%) identified as female and 166 (79.05%) identified as heterosexual. Around half of the sample (53.3%) were Non-Hispanic White and approximately 60% had a Bachelor’s degree or above. The paper reports that, at baseline, 68% (n = 142) had clinically significant symptoms of MDD, 55% (n = 116) had clinically significant symptoms of GAD and 42% (n = 89) were at high risk for FED (CHR-FED); these counts sum to more than 210, so the categories overlap. Minimal withdrawal or attrition was seen across the 8-week period (n = 7).

Main findings

Therabot users showed significantly greater reductions in depression symptoms. The mean change in PHQ-9 score from baseline to post-intervention was -6.13 (SD = 6.12) in the intervention group and -2.63 (SD = 6.03) in the control group. Change from baseline to follow-up was -7.93 (SD = 5.97) in the intervention group and -4.22 (SD = 5.94) in the control group. As the authors note, a decrease of 5 or more has been shown to constitute clinically meaningful change.

Similar patterns were observed for anxiety symptoms. The GAD-Q-IV does not have established clinically meaningful change thresholds, so the Cohen’s d values for effect sizes are most instructive here. Both groups improved from baseline to follow-up, but the improvement was significantly larger in the intervention group (d = 0.84, 95% CI [0.38 to 1.298], p = .001 at 4 weeks; and d = 0.79, 95% CI [0.32 to 1.26], p = .003 at 8 weeks). If we take the ‘rule-of-thumb’ that a Cohen’s d of 0.8 or greater signifies a substantial difference, then the 4-week effect would be considered ‘large’ and the 8-week effect sits just below that threshold.

The WCS score ranges from 0 to 100 and also does not have established meaningful change thresholds. The effect sizes do suggest that the intervention group showed greater improvement in weight concerns than the control group (d = 0.82, 95% CI [0.26 to 1.37], p = .008 at 4 weeks; and d = 0.63, 95% CI [0.07 to 1.18], p = .027 at 8 weeks).
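For readers less familiar with Cohen’s d, the sketch below shows how a standardised mean difference is calculated and how the usual rule-of-thumb labels are applied. It uses the PHQ-9 change scores quoted earlier purely as a worked example. The function names are our own, and the group size of 50 per arm is a placeholder, because the per-arm size of the depression subgroup is not given in this blog. The effect sizes reported in the paper come from the authors’ own analysis, so this back-of-envelope figure should not be expected to match them.

```python
from math import sqrt


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardised mean difference between groups A and B, using the pooled SD."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / sqrt(pooled_var)


def label(d):
    """Cohen's conventional benchmarks: 0.2 small, 0.5 medium, 0.8 large."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# PHQ-9 change from baseline to 4 weeks (intervention vs control), as quoted above.
# ASSUMPTION: 50 per arm is a placeholder, not a figure from the paper.
d_phq9 = cohens_d(-6.13, 6.12, 50, -2.63, 6.03, 50)
print(round(d_phq9, 2), label(d_phq9))
```

The sign is negative here only because a larger drop in PHQ-9 score is the better outcome; it is the magnitude that is compared against the benchmarks. Because the two SDs are so similar, the result is barely affected by the placeholder group size.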

With respect to secondary outcomes, the mean number of messages sent by participants was 260 (min = 1, max = 1,557) and the mean number of days interacting was 24 (min = 1, max = 60). For the authors, these figures suggest that, over the space of 4 weeks, participants were able to develop a working alliance comparable to that shown in an outpatient psychotherapy sample.

Therabot users showed greater reductions in depression, generalised anxiety and feeding and eating disorder symptoms at both post-intervention and follow-up in comparison to the waitlist control.

Conclusions

The key take-home message from this paper is that a GenAI chatbot can reduce clinical symptoms across several different mental health conditions. The authors suggest that Therabot’s success may be driven by three main factors:

  1. Therabot is evidence-informed, rooted in evidence-based psychotherapies and built on what we know already works.
  2. Users had unrestricted access, meaning that they could engage at any time and place. The ability to access therapeutic support wherever and whenever most needed may be a key advantage of digital therapeutics.
  3. Unlike existing chatbots for mental health treatment, Therabot was powered by GenAI, “allowing for natural, highly personalised, open-ended dialogue” (Heinz et al. 2025, p.10).

Therabot’s success may be driven by a range of different factors, including the fact that it is based on a range of evidence-based psychotherapies.

Strengths and limitations

A key strength of this study is the robustness of the design. The authors conducted a national RCT, and statistical considerations look appropriate (e.g., a Monte Carlo simulation study was used to estimate the statistical power). Although a simulation is only ever as good as the assumptions underpinning it, such methods do work well with complex designs. Missing data were also minimal throughout, including with the user satisfaction survey. The authors also recognised that there is potential in waitlist control trials for differential contact between the intervention and control group and attempted to mitigate this by planning equivalent contact where possible.
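Simulation-based power estimation rests on a simple idea: simulate many trials under assumed conditions and count how often the test detects the effect. The sketch below shows that idea for a plain two-arm comparison. It is not the authors’ simulation, which would need to reflect their strata and repeated measurements; the function name, the arm size, the effect size and the use of a normal approximation in place of a t-test are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean, variance


def simulate_power(n_per_arm, effect_size, n_sims=2000, alpha=0.05, seed=0):
    """Estimate power for a two-arm comparison by Monte Carlo simulation.

    Outcomes are drawn from unit-variance normal distributions whose means
    differ by `effect_size` (a standardised mean difference).
    Returns the share of simulated trials with a significant result.
    """
    rng = random.Random(seed)
    # Two-sided critical value; a normal approximation to the t-test.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)

    significant = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        treated = [rng.gauss(effect_size, 1.0) for _ in range(n_per_arm)]
        # Standard error of the difference in means (equal arm sizes).
        std_err = sqrt_sum = ((variance(control) + variance(treated)) / n_per_arm) ** 0.5
        z = (mean(treated) - mean(control)) / std_err
        if abs(z) > z_crit:
            significant += 1
    return significant / n_sims
```

Calling, say, `simulate_power(50, 0.5)` gives the estimated chance of detecting a medium-sized effect with a hypothetical 50 people per arm. The caveat in the text applies directly: change the assumed effect size or variability and the estimated power changes with it.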

The authors also seem to have paid attention to some of the more general methodological challenges involved in running a study on mobile/digital therapeutics. For example, Therabot ran on both Android and iOS devices. Although the research remains a little equivocal, studies have suggested that, in comparison to Android users, iPhone users are more likely to be younger, female, and have higher levels of emotionality (Shaw et al., 2016). Restricting the sample to either Android or iOS could therefore have skewed the sample. The authors also “assumed participant identity to be truthful unless we detected irregularities in the data”, seemingly recognising some of the challenges of online recruitment as well as the increasing challenge of ‘imposter participants’ (Sharma et al., 2024), and they mention measures such as preventing duplicate sign-ups and two-factor authentication.

There are, however, limitations. The authors do note the short follow-up period and that longer studies are needed to assess the durability of Therabot’s effectiveness. They also recognise the potential self-selection and possible bias toward younger, technologically-minded participants who were open to AI.

Less is said by the authors about the fact that the study was not blinded and that other interventions were being delivered at the same time. Of those currently receiving treatment (around 27%), 17 people were receiving both medication and psychotherapy. Further to this, the authors move quite rapidly over the possible self-selection and bias noted above. There is little overt recognition of the role that socio-economic status (SES) might be playing here. The baseline characteristics show 42% of the overall sample had a Bachelor’s degree and around 17% had a Master’s degree or higher. Research continues to link academic achievement and SES and – as such – it is possible that the education profile of the sample means that it was also skewed towards those with higher SES. Further reflection by the authors on the possible implications of this would have been welcome.

Heinz et al. (2025) note the potential self-selection and possible bias toward younger, technologically-minded participants who were open to AI in this study, which could impact the generalisability of the results.

Implications for practice

So where does this leave us going forward? As I write this, BBC News is running a story with the title “NHS plans ‘unthinkable’ cuts to balance books” – with one “boss of a mental health trust” telling the BBC that waits for psychological therapies now exceed a year. It is here that we often situate our discussions of what GenAI may, or may not, be able to do. On the one hand, GenAI may provide solutions to a mental health infrastructure which is “inadequately resourced to meet the current and growing demand for care” (Heinz et al., 2025, p.2). On the other, there are concerns around privacy, data protection, biased datasets, widening inequalities and generic models being inappropriately deployed. Professor Miranda Wolpert neatly summarises these debates in a recent Wellcome blog.

We see this now familiar tension play out within this paper. The authors suggest that the paper does show that fine-tuned GenAI chatbots offer a feasible approach to delivering personalised mental health care at scale. They then add the caveat that further research with larger samples is needed to confirm their effectiveness and generalisability. Elsewhere, the authors emphasise the need to understand GenAI’s potential role and risks in mental health treatment and the need for guardrails and close human supervision whilst testing. Indeed, within their own study, post-transmission staff intervention was required 15 times for safety concerns and 13 times to correct inappropriate responses provided by Therabot.

At one level, then, the implications remain on the familiar ground of ‘potential for change’ versus the need for safeguards when testing similar models in future. The need for larger samples means that chatbots like Therabot are still a long way from implementation.

The authors also note that the inner processes of GenAI models are difficult or impossible to understand analytically. This introduces a further implication for practice in that it invites us to think about whether and how we can ever move to implementation. Can the current methods we use to conduct and evaluate research ever be made compatible with something considered “difficult or impossible to understand analytically”? Or what might need to change here?

In light of concerns related to privacy, biased datasets, and widening inequalities, should we be using GenAI in mental health treatments?

Statement of interests

Robert Meadows has recently completed a British Academy funded project titled: “Chatbots and the shaping of mental health recovery”. This work was carried out in collaboration with Professor Christine Hine.

Links

Primary paper

Heinz, M. V., Mackin, D. M., Trudeau, B. M., Bhattacharya, S., Wang, Y., Banta, H. A., … & Jacobson, N. C. (2025). Randomized trial of a generative AI chatbot for mental health treatment. NEJM AI, 2(4), AIoa2400802.

Other references

Fitzpatrick, K. K., Darcy, A., & Vierhile, M. (2017). Delivering cognitive behavior therapy to young adults with symptoms of depression and anxiety using a fully automated conversational agent (Woebot): a randomized controlled trial. JMIR Mental Health, 4(2), e7785.

Sharma, P., McPhail, S. M., Kularatna, S., Senanayake, S., & Abell, B. (2024). Navigating the challenges of imposter participants in online qualitative research: Lessons learned from a paediatric health services study. BMC Health Services Research, 24(1), 724.

Shaw, H., Ellis, D. A., Kendrick, L. R., Ziegler, F., & Wiseman, R. (2016). Predicting smartphone operating system from personality and individual differences. Cyberpsychology, Behavior, and Social Networking, 19(12), 727-732.

Wolpert, M. (2025). AI and mental health: “it could help revolutionise treatments”. Wellcome.

Copyright © 2024 Your Fitness News Today.
Your Fitness News Today is not responsible for the content of external sites.
