Summary: A new study suggests that ChatGPT’s responses in psychotherapy scenarios are often rated higher than those written by human therapists. Researchers found that participants struggled to distinguish between AI-generated and therapist-written responses in couple therapy vignettes. ChatGPT’s responses were generally longer and contained more nouns and adjectives, providing greater contextualization.
This additional detail may have contributed to higher ratings on core psychotherapy principles. The findings highlight AI’s potential role in therapeutic interventions while raising ethical and practical concerns about its integration into mental health care. Researchers emphasize the need for professionals to engage with AI developments to ensure responsible oversight.
Key Facts
- Higher Ratings: ChatGPT’s responses were rated higher on psychotherapy principles.
- Indistinguishable Responses: Participants struggled to differentiate AI-generated from human-written responses.
- Potential Integration: Findings suggest AI could play a role in future therapeutic interventions.
Source: PLOS
When it comes to comparing responses written by psychotherapists to those written by ChatGPT, the latter are generally rated higher, according to a study published February 12, 2025, in the open-access journal PLOS Mental Health by H. Dorian Hatch of The Ohio State University, co-founder of Hatch Data and Mental Health, and colleagues.
Whether machines could be therapists is a question that has received increased attention given some of the benefits of working with generative artificial intelligence (AI).
Although previous research has found that humans can struggle to tell the difference between responses from machines and humans, recent findings suggest that AI can write empathically, and that the content it generates is rated highly by both mental health professionals and voluntary service users, often to the point of being favored over content written by professionals.
In their new study involving over 800 participants, Hatch and colleagues showed that, although differences in language patterns were noticed, individuals could rarely identify whether responses were written by ChatGPT or by therapists when presented with 18 couple therapy vignettes.
This finding echoes Alan Turing’s prediction that humans would be unable to tell the difference between responses written by a machine and those written by a human. In addition, the responses written by ChatGPT were generally rated higher in core psychotherapy guiding principles.
Further analysis revealed that the responses generated by ChatGPT were generally longer than those written by the therapists. After controlling for length, ChatGPT continued to respond with more nouns and adjectives than therapists.
Considering that nouns can be used to describe people, places, and things, and adjectives can be used to provide more context, this could mean that ChatGPT contextualizes more extensively than the therapists.
More extensive contextualization may have led respondents to rate the ChatGPT responses higher on the common factors of therapy (components shared across all therapy modalities that contribute to desired outcomes).
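To make the part-of-speech comparison concrete, here is a minimal, hypothetical sketch of how noun and adjective rates might be computed per response after normalizing for length. This is not the authors’ analysis pipeline; it simply illustrates the kind of measurement described above, using spaCy’s part-of-speech tagger, and the example responses are invented for illustration.

```python
# Illustrative sketch (not the study's actual code): estimate noun and
# adjective rates per word as a rough proxy for the analysis described above.
# Assumes spaCy and its small English model are installed:
#   pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def pos_rates(text: str) -> dict:
    """Return noun and adjective counts normalized by response length."""
    doc = nlp(text)
    words = [t for t in doc if t.is_alpha]          # ignore punctuation/numbers
    nouns = sum(t.pos_ == "NOUN" for t in words)
    adjs = sum(t.pos_ == "ADJ" for t in words)
    n = max(len(words), 1)                          # avoid division by zero
    return {"nouns_per_word": nouns / n, "adjs_per_word": adjs / n}

# Hypothetical example responses, for illustration only.
therapist_reply = "It sounds like you both feel unheard when money comes up."
chatgpt_reply = (
    "It sounds like recurring financial disagreements leave both partners "
    "feeling unheard, frustrated, and disconnected in daily conversations."
)

print(pos_rates(therapist_reply))
print(pos_rates(chatgpt_reply))
```

Normalizing by word count, as in this sketch, is one simple way to compare part-of-speech usage while controlling for the fact that ChatGPT’s responses were generally longer.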
According to the authors, these results may be an early indication that ChatGPT has the potential to improve psychotherapeutic processes. In particular, this work may lead to the development of different methods of testing and creating psychotherapeutic interventions.
Given the mounting evidence that generative AI can be useful in therapeutic settings, and the likelihood that it will be integrated into those settings sooner rather than later, the authors call for mental health experts to expand their technical literacy so that AI models are carefully trained and supervised by responsible professionals, thereby improving both the quality of and access to care.
The authors add: “Since the invention of ELIZA nearly sixty years ago, researchers have debated whether AI could play the role of a therapist. Although there are still many important lingering questions, our findings indicate the answer may be ‘Yes.’
“We hope our work galvanizes both the public and mental practitioners to ask important questions about the ethics, feasibility, and utility of integrating AI and mental health treatment, before the AI train leaves the station.”
About this AI and psychotherapy research news
Author: Charlotte Bhaskar
Source: PLOS
Contact: Charlotte Bhaskar – PLOS
Image: The image is credited to Neuroscience News
Original Research: Open access.
“When ELIZA meets therapists: A Turing test for the heart and mind” by H. Dorian Hatch et al. PLOS Mental Health
Abstract
When ELIZA meets therapists: A Turing test for the heart and mind
“Can machines be therapists?” is a question receiving increased attention given the relative ease of working with generative artificial intelligence.
Although recent (and decades-old) research has found that humans struggle to tell the difference between responses from machines and humans, recent findings suggest that artificial intelligence can write empathically and the generated content is rated highly by therapists and outperforms professionals.
It is uncertain whether, in a preregistered competition where therapists and ChatGPT respond to therapeutic vignettes about couple therapy, a) a panel of participants can tell which responses are ChatGPT-generated and which are written by therapists (N = 13), b) the generated responses or the therapist-written responses fall more in line with key therapy principles, and c) linguistic differences between conditions are present.
In a large sample (N = 830), we showed that a) participants could rarely tell the difference between responses written by ChatGPT and responses written by a therapist, b) the responses written by ChatGPT were generally rated higher in key psychotherapy principles, and c) the language patterns between ChatGPT and therapists were different.
Using different measures, we then confirmed that responses written by ChatGPT were rated higher than the therapists’ responses, suggesting these differences may be explained by part-of-speech and response sentiment.
This may be an early indication that ChatGPT has the potential to improve psychotherapeutic processes. We anticipate that this work may lead to the development of different methods of testing and creating psychotherapeutic interventions.
Further, we discuss limitations (including the lack of the therapeutic context), and how continued research in this area may lead to improved efficacy of psychotherapeutic interventions allowing such interventions to be placed in the hands of individuals who need them the most.