
AI Deriving Clinically Apt Mental Health Assessments Gets Sharply Uplifted Via Dynamic Prompt Engineering


Improving AI for mental health by clever adoption of dynamic prompt engineering and weighted transformers.


In today’s column, I examine a promising approach that enhances the capabilities of generative AI and large language models (LLMs) to provide more accurate mental health assessments.

The deal is this. Making prudent and proper mental health assessments is a vital task for popular AI. Millions upon millions of people are using LLMs such as ChatGPT, Claude, Gemini, and Grok, and society expects that the AI will suitably determine whether someone is experiencing a mental health condition.

Unfortunately, the major LLMs tend to do a less-than-stellar job at this. That’s not good. An AI can fail to detect that someone is embroiled in a mental health disorder. AI can mistakenly label a severe mental health issue as being inconsequential. All sorts of problems arise if LLMs aren’t doing a reliable and accurate job at assessing the mental health of a person using the AI.

An intriguing research study seeking to overcome these failings has identified and experimented with a new method that entails dynamic prompt engineering and the use of a weighted transformer architecture. The goal is to substantially improve LLMs at diagnosing and labeling detectable mental health conditions. This is the kind of work that is sorely needed to ensure that AI is doing the right thing and avoiding doing the wrong thing when it comes to performing ad hoc mental health assessments at scale.

Let’s talk about it.

This analysis of AI breakthroughs is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining various impactful AI complexities (see the link here).

AI And Mental Health Therapy

As a quick background, I’ve been extensively covering and analyzing a myriad of facets regarding the advent of modern-era AI that produces mental health advice and performs AI-driven therapy. This rising use of AI has principally been spurred by the evolving advances and widespread adoption of generative AI. For a quick summary of some of my posted columns on this evolving topic, see the link here, which briefly recaps about forty of the over one hundred column postings that I’ve made on the subject.

There is little doubt that this is a rapidly developing field and that there are tremendous upsides to be had, but at the same time, regrettably, hidden risks and outright gotchas come into these endeavors too. I frequently speak up about these pressing matters, including in an appearance last year on an episode of CBS’s 60 Minutes, see the link here.

People Are Using AI For Mental Health Advice

The most popular use nowadays of the major LLMs is for getting mental health guidance, see my discussion at the link here. This occurs easily and can be undertaken quite simply, at a low cost or even for free, anywhere and 24/7. A person merely logs into the AI and engages in a dialogue led by the AI.

There are sobering worries that AI can readily go off the rails or otherwise dispense unsuitable or even egregiously inappropriate mental health advice. Huge banner headlines in August of this year accompanied a lawsuit filed against OpenAI for their lack of AI safeguards when it came to providing cognitive advisement. Despite claims by AI makers that they are gradually instituting AI safeguards, there are still a lot of downside risks of the AI doing untoward acts, such as insidiously helping users in co-creating delusions that can lead to self-harm.

For the details of the OpenAI lawsuit and how AI can foster delusional thinking in humans, see my analysis at the link here. I have been earnestly predicting that eventually all of the major AI makers will be taken to the woodshed for their paucity of robust AI safeguards. Lawsuits aplenty are arising. In addition, new laws about AI in mental healthcare are being enacted (see, for example, my explanation of the Illinois law, at the link here, the Nevada law at the link here, and the Utah law at the link here).

Building To Improve AI For Mental Health

Many in the AI community are hopeful that we can build our way toward AI that does a superb job in the mental health realm. In that case, society will feel comfortable using LLMs for this highly sensitive usage. Intense research is taking place to devise sufficient AI safeguards and craft LLMs that are on par with and even exceed human therapists in quality-of-care metrics (see my coverage at the link here).

One crucial focus entails constructing AI to do a top-notch job of ascertaining that a mental health issue might be at play. Right now, there are abundant false positives, namely instances where the AI falsely assesses that someone is encountering a demonstrable mental health condition when they really are not. There are also too many false negatives. A false negative is when the AI fails to detect a mental health issue that could have been ascertained.

I have previously performed an eye-opening mini-experiment using the classic DSM-5 guidebook of psychological disorders (see my analysis at the link here). In my informal analysis, I wanted to see whether ChatGPT could adequately determine mental health conditions based on the DSM-5 stated symptoms. By and large, ChatGPT seemed to only succeed when a conversation laid out the symptoms in a blatantly obvious way.

One interpretation of this result is that perhaps the AI was tuned by the AI maker to minimize the chances of false positives. The AI maker can choose to set parameters indicating that only once a high bar has been reached would the AI suggest the presence of a mental health condition. This reduces false positives. Unfortunately, it also tends to increase or possibly maximize false negatives (mental health conditions that could have been detected but weren’t).
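
To make that tradeoff concrete, here is a minimal sketch of my own (not code from any AI maker); the scores, labels, and threshold values are entirely hypothetical, but they show how raising the bar suppresses false positives while letting false negatives climb.

```python
# Hypothetical illustration: how a decision threshold trades false positives
# for false negatives. The scores and labels below are made up for demonstration.

def assess(scores, labels, threshold):
    """Flag a mental health condition only when the model's score clears the threshold."""
    fp = fn = 0
    for score, has_condition in zip(scores, labels):
        flagged = score >= threshold
        if flagged and not has_condition:
            fp += 1  # false positive: flagged, but no condition present
        elif not flagged and has_condition:
            fn += 1  # false negative: condition present, but not flagged
    return fp, fn

scores = [0.15, 0.40, 0.55, 0.62, 0.78, 0.91]    # model confidence per conversation
labels = [False, False, True, True, True, True]  # ground-truth condition present?

for threshold in (0.3, 0.6, 0.9):
    fp, fn = assess(scores, labels, threshold)
    print(f"threshold={threshold:.1f} -> false positives={fp}, false negatives={fn}")
```

In this toy run, a low bar of 0.3 produces one false positive and no false negatives, while a high bar of 0.9 eliminates false positives but misses three detectable cases, mirroring the tuning dilemma just described.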

AI makers are currently caught between a rock and a hard place. Is it sensible to minimize false positives if doing so risks increasing or even maximizing false negatives? Of course, the preference would be to minimize both. That's the desired goal.

Let’s see how that might be accomplished.

Research Toward Improving Assessments

In a notable research study entitled “DynaMentA: Dynamic Prompt Engineering and Weighted Transformer Architecture for Mental Health Classification Using Social Media Data” by Akshi Kumar, Aditi Sharma, and Saurabh Raj Sangwan, IEEE Transactions on Computational Social Systems, June 4, 2025, these salient points were made about enhancing AI in this realm (excerpts):

  • “Mental health classification is inherently challenging, requiring models to capture complex emotional and linguistic patterns.”
  • “Although large language models (LLMs) such as ChatGPT, Mental-Alpaca, and MentaLLaMA show promise, they are not trained on clinically grounded data and often overlook subtle psychological cues.”
  • “Their predictions tend to overemphasize emotional intensity, while failing to capture contextually relevant indicators that are critical for accurate mental health assessment.”
  • “This paper introduces DynaMentA (Dynamic Prompt Engineering and Weighted Transformer Architecture), a novel dual-layer transformer framework that integrates the strengths of BioGPT and DeBERTa to address these challenges.”
  • “Through dynamic prompt engineering and a weighted ensemble mechanism, DynaMentA adapts to diverse emotional and linguistic contexts, delivering robust predictions for both binary and multiclass tasks.”

As stated above, a key element of diagnosing or assessing the potential presence of a mental health condition involves making use of psychological cues. Often, a generic conventional chatbot does not home in on the contextual milieu that can be a telltale clue that a mental health condition is likely present.

The researchers sought to overcome the contextuality dilemma.

How It Works

At a 30,000-foot level, here’s what this preliminary research study devised and opted to test (for the specific details, please see the study).

When a user enters a prompt, the LLM employs dynamic prompt engineering to refine the prompt (for more about prompt engineering strategies and best practices, see my extensive coverage at the link here). The prompt is enriched by accentuating contextual cues. For example, if a user has entered a prompt that says “I feel hopeless”, the AI can augment it with two additional components, namely primary and secondary indicators of a potential condition. This provides a helpful, structured representation associated with the potential mental state of the user.
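
As a rough illustration of what such dynamic prompt refinement could look like, here is a simplified sketch of my own; the indicator keyword lists and the prompt template are assumptions for demonstration, not the study's actual lexicons or wording.

```python
# Simplified sketch of dynamic prompt refinement. The indicator keyword lists
# and the prompt template are illustrative assumptions, not the DynaMentA
# authors' actual lexicons or wording.

PRIMARY_INDICATORS = ["hopeless", "worthless", "can't go on", "no way out"]
SECONDARY_INDICATORS = ["tired all the time", "can't sleep", "alone", "numb"]

def refine_prompt(user_text: str) -> str:
    """Augment the raw user text with detected contextual cues before classification."""
    text = user_text.lower()
    primary = [cue for cue in PRIMARY_INDICATORS if cue in text]
    secondary = [cue for cue in SECONDARY_INDICATORS if cue in text]
    return (
        "Task: assess the following message for possible mental health concerns.\n"
        f"Message: {user_text}\n"
        f"Primary indicators detected: {', '.join(primary) or 'none'}\n"
        f"Secondary indicators detected: {', '.join(secondary) or 'none'}\n"
        "Consider both indicator lists and the surrounding context before classifying."
    )

print(refine_prompt("I feel hopeless and I'm tired all the time."))
```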

By extracting domain-specific contextual cues via two underlying transformer models (BioGPT and DeBERTa), a fuller contextual cue vector can be created. This is intended to capture relevant semantic and syntactic information more deeply. A weighted ensemble then combines the two models' outputs to enable a more thorough assessment. The process is iterated until a threshold is reached, at which point a final classification is produced.
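
Here is a stripped-down sketch of how a weighted ensemble over two classifiers' outputs might be blended; the class labels, probability vectors, and weights are placeholders of my own, not the learned values from the study.

```python
import numpy as np

# Toy sketch of a weighted ensemble over two transformer classifiers.
# Class labels, probabilities, and weights are illustrative placeholders.

LABELS = ["no_condition", "depression", "anxiety"]

def weighted_ensemble(probs_model_a, probs_model_b, weight_a=0.6, weight_b=0.4):
    """Blend two models' class probabilities and return the top label with its score."""
    combined = weight_a * np.asarray(probs_model_a) + weight_b * np.asarray(probs_model_b)
    combined /= combined.sum()  # renormalize to a probability distribution
    best = int(np.argmax(combined))
    return LABELS[best], float(combined[best])

# Hypothetical outputs from the two underlying models for a single post.
biogpt_like = [0.20, 0.60, 0.20]
deberta_like = [0.10, 0.45, 0.45]

label, confidence = weighted_ensemble(biogpt_like, deberta_like)
print(f"ensemble label: {label} (confidence {confidence:.2f})")
```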

This outside-the-box method was implemented and tested. The testing made use of Reddit postings, as collected into special sets known as Dep-Severity, SDCNL, and Dreaddit. Those sets each consist of several thousand Reddit posts that have been annotated or labeled regarding detected mental health conditions (e.g., depression, potential for self-harm, anxiety).
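
For readers curious how such annotated sets are typically put to use, here is a generic evaluation sketch (not the paper's code) with dummy predictions and labels, computing the sorts of metrics that studies like this commonly report.

```python
from sklearn.metrics import accuracy_score, f1_score

# Dummy predictions vs. annotated labels for a handful of posts, just to show
# how accuracy and macro-F1 are typically computed on sets like Dreaddit.
y_true = ["depression", "none", "anxiety", "depression", "none"]
y_pred = ["depression", "none", "depression", "depression", "none"]

print("accuracy:", accuracy_score(y_true, y_pred))
print("macro F1:", f1_score(y_true, y_pred, average="macro"))
```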

Tests were undertaken. And, when compared to several other AI models, the results of this new approach were quite encouraging. The DynaMentA configuration appeared to outperform the other baseline models, doing so across a wide variety of metrics. This included surpassing ChatGPT in these kinds of assessments.

Devising Architectural Innovations

The approach cited was principally a preliminary exploration. I will keep watch for further advancements to this particular approach. I’d especially like to see this tested at scale. Plus, it would be meritorious if independent third parties tried their hand at similar LLM adjustments and shared their results accordingly.

Time will tell whether this proves to be a valued pursuit.

Overall, we need lots of ongoing hard work, innovation, and creativity in trying to push ahead on making generative AI and LLMs capable of performing the revered task of mental health advisement. I applaud earnest efforts on this front.

Are We On The Right Track

Please know that highly vocal skeptics and cynics are exceedingly doubtful that we will ever make adequate advancements in AI for mental health. In their view, therapy and mental health guidance can only be undertaken on a human-to-human basis. They therefore rule out AI as getting us there, no matter what cleverness, trickery, or ingenuity is attempted.

My response to that doomy and gloomy perspective is best stated by Ralph Waldo Emerson: “Life is a series of surprises and would not be worth taking or keeping if it were not.” I vote that uplifting surprises about advancements in AI for mental health are up ahead.

Stay tuned.
