Beyond the usual suspects: The role of ML in suspect diagnosis

Written by Corinne Stroum on 03 Feb 2021


81% of Medicare-aged adults have multiple chronic conditions (RAND, 2017). The interplay of multiple conditions can increase the complexity of management, requiring accurate documentation to ensure appropriate delivery of clinical and other care services. Medicare’s complex per-member payment algorithm for Medicare Advantage Plans, known as Hierarchical Condition Categories (HCC) or Total Risk Adjustment Factor (RAF) scores, is designed to right-size funding for member care.

Medicare Advantage plans also use lists of their members who have a chronic condition diagnosis to inform care coordination and condition management efforts. This documentation is key to ensuring clinical practices align with evidence-based medicine - annual preventative care, monitoring for key changes in lab results, or treatment with desired medications.

There are many ways to ensure accurate and complete diagnosis documentation over time. The simplest are rules-based systems: if a member is taking a key prescription, he or she might have a related condition. There are also “carry-forwards”: reviewing members with a history of chronic conditions to ensure their documentation is kept up to date. Some organizations rely on randomized chart auditing by a third party to ensure annual documentation is sufficient. More sophisticated solutions use natural language processing (NLP) technology to consume and summarize unstructured documentation, such as clinician notes or pathology interpretation. Advanced solutions aim to support clinician behavior without the overhead of rules maintenance.
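The two simplest approaches above, rules-based suspecting and carry-forwards, can be sketched in a few lines. This is an illustrative sketch only: the `Member` fields, the function names, and the medication-to-condition pairings in `MEDICATION_RULES` are assumptions made for the example, not clinical guidance and not part of any described system.

```python
from dataclasses import dataclass, field
from typing import Set

# Hypothetical rule table: medication -> condition it may indicate.
# Illustrative pairings only; real rules would be authored by clinicians.
MEDICATION_RULES = {
    "metformin": "type 2 diabetes",
    "tiotropium": "COPD",
}


@dataclass
class Member:
    member_id: str
    current_medications: Set[str] = field(default_factory=set)
    prior_year_conditions: Set[str] = field(default_factory=set)
    documented_this_year: Set[str] = field(default_factory=set)


def rules_based_suspects(member: Member) -> Set[str]:
    """Conditions implied by current medications but not yet documented this year."""
    implied = {
        MEDICATION_RULES[med]
        for med in member.current_medications
        if med in MEDICATION_RULES
    }
    return implied - member.documented_this_year


def carry_forward_suspects(member: Member) -> Set[str]:
    """Conditions documented last year that are missing from this year's record."""
    return member.prior_year_conditions - member.documented_this_year
```

Both functions return only the gap between what the signal implies and what is already documented, which is the set a reviewer would need to look at.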


What It Is: Clinician-informed Machine Learning

KenSci has developed a solution that relies on an advanced technique without requesting charts and detailed notes: it uses a Recommender System. A Recommender System is the technology that underpins retail tools that suggest, “Shoppers like you also purchased…”.  These systems learn the relationship between shoppers and the products they buy through information from both entities such as shopper demographics and product attributes. By studying these relationships, the system makes recommendations through inferences of similarity.  We built a Recommender System to suggest diagnoses that may not be documented but could meet Medicare’s risk adjustment criteria.  We call these Suspect Diagnoses. 
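To make the idea concrete, here is a minimal sketch of a recommender that suggests undocumented diagnoses. The post does not describe KenSci's actual model, so this swaps in a deliberately simple co-occurrence technique: it estimates how often a diagnosis appears among members who share a feature (for example, a medication), then suggests diagnoses whose strongest association clears a threshold. The function names, the `threshold` parameter, and the toy data are all assumptions.

```python
from collections import Counter


def fit_associations(members):
    """Estimate P(diagnosis | feature) from (features, diagnoses) pairs.

    `members` is an iterable of (set_of_features, set_of_documented_diagnoses).
    """
    feature_counts = Counter()
    pair_counts = Counter()
    for features, diagnoses in members:
        for f in features:
            feature_counts[f] += 1
            for d in diagnoses:
                pair_counts[(f, d)] += 1
    return {(f, d): n / feature_counts[f] for (f, d), n in pair_counts.items()}


def suggest(features, documented, associations, threshold=0.5):
    """Return (diagnosis, score, supporting_feature) for undocumented diagnoses.

    Each diagnosis is scored by its single strongest associated feature;
    only scores at or above `threshold` are returned, strongest first.
    """
    best = {}
    for (f, d), score in associations.items():
        if f in features and d not in documented:
            if score > best.get(d, (0.0, None))[0]:
                best[d] = (score, f)
    return sorted(
        ((d, s, f) for d, (s, f) in best.items() if s >= threshold),
        key=lambda item: -item[1],
    )


# Toy history: (features such as medications, documented diagnoses).
history = [
    ({"insulin", "statin"}, {"diabetes"}),
    ({"insulin"}, {"diabetes"}),
    ({"statin"}, {"hyperlipidemia"}),
    ({"insulin", "statin"}, {"diabetes", "hyperlipidemia"}),
]
associations = fit_associations(history)
suspects = suggest({"insulin"}, documented=set(), associations=associations)
```

Returning the supporting feature alongside each suggestion keeps the output reviewable: a clinician can see which signal drove the suspect diagnosis rather than receiving a bare score.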


Why It’s Different

Current approaches in suspect diagnoses take four forms:

“Carry-forward” suspecting: If a member had a condition last year, he/she should be evaluated for it this year.

Strengths: A reliable baseline - many conditions of interest are chronic and consistent from year-to-year.

Limitations: Not useful on new members, or those members who have not been previously documented as having a condition.

Advantages of KenSci’s Recommender System: The recommender system is especially successful where the carry-forward logic leaves opportunities: those members who are undergoing treatment for a condition that a clinician has not documented using CMS criteria.

Auditing: The health plan reviews a randomized set of charts for suspect diagnoses.

Strengths: Yields extremely thorough documentation for the randomly selected members.

Limitations: The most time-intensive process, and the investment may not always yield opportunities to improve documentation.

Advantages of KenSci’s Recommender System: Random charts may not always provide opportunities to improve documentation. With a recommender system, clinicians only see the diagnoses that are suspected and missing from a chart.

Rules-based suspecting: Clinicians write rules such as, “If a member is taking medication X, suspect him/her for condition Y”.

Strengths: Rigorous and rich with clinical signal.

Limitations: These rules require maintenance and heavy investment of clinicians across specialties. They must also change with clinical practice.

Advantages of KenSci’s Recommender System: The recommender system infers the rules a clinician might author and highlights those with most significant impact for clinical review.

Natural Language Processing: Utilizes Text Analytics to parse unstructured clinician notes and infer missing diagnoses.

Strengths: Uses state-of-the-art technology and “sees” what traditional structured text may not be capturing.

Limitations: Unstructured notes are not always available to health plans.

Advantages of KenSci’s Recommender System: Despite the limitations of structured text, there is still significant information to glean - the recommender system is the best technology to do so.


How It’s Used

In a pilot deployment, we focused on 7 chronic conditions we believe to be highly prevalent in our population of interest. Our clinical team reviewed the relevance of all associations, and we filtered per-member recommendations to only the strongest clinically validated associations. We surfaced suspect diagnoses with their global explainer and supporting data: for example, prescription fill event details such as the prescribing clinician and his/her specialty, the number of days supplied, and the fill date.
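The filtering step described above could look something like the following sketch. It keeps only suspects whose source association appears in a clinician-approved set and whose score clears a cutoff, then attaches the matching prescription fill events as supporting data. The class names, the `VALIDATED` set, and the `min_score` cutoff are hypothetical; the post does not specify how association strength was measured or thresholded.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FillEvent:
    medication: str
    prescriber: str
    specialty: str
    days_supplied: int
    fill_date: date


@dataclass(frozen=True)
class Suspect:
    member_id: str
    diagnosis: str
    source_medication: str
    score: float


# Hypothetical (medication, diagnosis) associations approved by clinical review.
VALIDATED = {("insulin", "diabetes")}


def surface(suspects, fills_by_member, validated=VALIDATED, min_score=0.8):
    """Keep validated, high-scoring suspects and pair each with its fill evidence."""
    results = []
    for s in suspects:
        if (s.source_medication, s.diagnosis) not in validated:
            continue  # association was not approved by clinical review
        if s.score < min_score:
            continue  # association is too weak to surface
        evidence = [
            fill
            for fill in fills_by_member.get(s.member_id, [])
            if fill.medication == s.source_medication
        ]
        results.append((s, evidence))
    return results
```

Pairing each surfaced suspect with its fill events means the reviewing physician sees the prescriber, specialty, days supplied, and fill date next to the suggestion.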

We shared our results with medical group physicians on paper forms and integrated them with one EMR for an in-clinic pilot. We intend for the physicians to use this information to look for evidence of a diagnosis that may be incomplete or undocumented; the physician may order additional tests or request data from specialists. Our goal is to enhance the accuracy of diagnostic coding and, through this, treatment of a member’s complete clinical picture.



To keep our sample size manageable and reduce noise for our consumers, we proceeded only with those suspects that our joint clinician team deemed credible, meaning the source of the signal reliably pointed to the suspect condition. This was 37% of our suspect diagnosis events, or over 120,000 distinct member-diagnosis pairings. We are continuing to improve this recommendation rate by reviewing model outputs and expanding our techniques.

The Takeaway

Our pilot has demonstrated that the Recommender System suits suspect diagnoses if we apply it mindfully: as a prompt for evaluation by the member’s primary care physician. Our system has limited overhead when compared to current market alternatives and does not require coordination with outsourced coding experts.


(This blog was written with inputs from Vikas Kumar, PhD and Jasmine Wilkerson from KenSci)
