7 Ways to Make Clinical Measures More Relevant to Physicians

If you want doctors to change their behavior, these seven aspects must be addressed when creating metrics of performance.

The Challenge: Many measures of healthcare quality are important but get little engagement from practicing doctors. One example is the rate of mammography screening. Doctors often say, “I can order the test, but how can I be held accountable for whether they go to get it done or not?” Another example is Average Length of Stay in the hospital (ALOS). This is a hugely important metric because it drives hospital costs, which are the largest component of healthcare expenditures. But few doctors feel personally responsible for a patient’s length of stay, so the measure is met with frustration and disengagement.

The Solution: Through trial and error, we have discovered 7 characteristics of clinical measures that overcome the disengagement. By addressing these aspects of a measure, one greatly increases the chance of interest and subsequent action in response to the metric.

1. Process vs Outcome (ask yourself if the measure is ‘actionable’)

Outcome measures are more relevant than process measures, certainly when it comes to attaching financial incentives, suggests Ashish Jha. For example, in the case of a hip replacement, it would seem more important to measure what the long-term improvement in functional state is, rather than what drugs were used in the operating room. However, the opposite is true when it comes to trying to achieve individual behavior change. For that use case, a measure needs to be ‘actionable’ and our definition for that is “Does it tell me what I can change tomorrow?” To get a behavior to change, you must describe it and measure it discretely.

For major joint replacements, there is a drug, tranexamic acid, that dramatically cuts the rate of bleeding, but it is relatively new and not yet used universally. To increase adoption, we measured this behavior for each surgeon and subsequently saw the percentage of cases where tranexamic acid was used go from 20% to 90%, which cut the use of blood transfusions in half. By using a specific, actionable measure, we achieved a significant clinical improvement.

2. Personal Attribution

Assigning individual names to a measure can be challenging and require a lot of extra effort. Frequently data analysts stop short of the extra effort and say, “it was too hard to assign the measure to individuals, so we are reporting at the group level.” Even the Agency for Healthcare Research and Quality recommends that physicians be measured in groups rather than as individuals to make the measure more valid. The problem this causes is that no doctor who hears the group rate will feel any reason to change. For example, to report that a group of hospitalists has a rate of ordering telemetry (cardiac monitoring) of 50% will not be personally compelling to any of the individual hospitalists, as each individual will likely say to themselves, “That sounds too high, but I don’t think that my own rate is that bad…”

This changes completely when the individual rates are calculated and found to vary between 25% and 75%. Seeing one’s own rate displayed in front of peers causes intense personal engagement, especially from doctors at either extreme. Teasing out which hospitalist to hold responsible for each order is complicated and time-consuming, but the result is a meaningful discussion driven by that personal engagement, and the group average frequently changes because of the discussion. That shift in behavior is much less likely when merely reporting the group average.
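The contrast between a group rate and individual rates can be sketched in a few lines of Python. This is only an illustration: the clinician labels and records are made up, and the sketch assumes the hard part described above (attributing each telemetry order to one hospitalist) has already been done.

```python
from collections import defaultdict

# Hypothetical, already-attributed records: (hospitalist, telemetry ordered?)
admissions = [
    ("Dr. A", True), ("Dr. A", False), ("Dr. A", False), ("Dr. A", False),
    ("Dr. B", True), ("Dr. B", True), ("Dr. B", False), ("Dr. B", False),
    ("Dr. C", True), ("Dr. C", True), ("Dr. C", True), ("Dr. C", False),
]

def group_rate(records):
    """Single rate for the whole group; hides who contributes what."""
    return sum(ordered for _, ordered in records) / len(records)

def individual_rates(records):
    """One rate per clinician, which is what makes the data personal."""
    counts = defaultdict(lambda: [0, 0])  # clinician -> [ordered, total]
    for clinician, ordered in records:
        counts[clinician][0] += ordered
        counts[clinician][1] += 1
    return {c: ordered / total for c, (ordered, total) in counts.items()}

print(group_rate(admissions))        # 0.5
print(individual_rates(admissions))  # {'Dr. A': 0.25, 'Dr. B': 0.5, 'Dr. C': 0.75}
```

With this made-up data the group sits at 50% while the individuals range from 25% to 75%, which is the spread that the group average conceals.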

3. Clinical Relevance

A measure that points merely to a cheaper option rather than to something that benefits a patient is going to land poorly when addressing a group of clinicians. Reducing costs for a patient might possibly be viewed as compatible with a physician’s professional role, but saving money for the hospital or any large institution is certainly not part of that clinical identity. But there is a way to combine improvements in quality and affordability. A project that identifies overuse of CT scans as a means to save costs would be much better received by doctors if it were focused on avoiding harmful radiation from CT scans. With some effort, it is usually possible to find a clinical outcome that can be paired with a cost savings measure.

4. Balance Measure

Clinicians will resist a measure they perceive could cause harm. “Primum non nocere” (First, Do No Harm) is drilled into all students during medical school. When we approached a group of Urgent Care clinicians with a measure to reduce abdominal imaging studies, the immediate clinical response was, “If I order less CT scans, I might miss a case of appendicitis.” To make this project more acceptable, it needed to be paired with a “Balance Measure” that ensured no harm was done. The balance measure in this case was ‘percent of patients who returned to the ER for appendicitis within 7 days of the urgent care visit.’ This addition made the project acceptable and a drop in CTs was achieved.

Another example was cholesterol control: when doctors were asked to switch from a brand drug to a generic for the reduction of LDL cholesterol, they expressed concern about efficacy for their patients. The project met with acceptance only after we agreed to add a balance measure addressing that concern: whether there was any deterioration in the number of patients who were achieving adequate control.
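One way to keep a primary measure and its balance measure together is to always report them as a pair. The Python sketch below does this for the urgent care CT example. The `Visit` fields, the function names, and the before/after numbers are all hypothetical and chosen only to show the shape of the report; they are not results from the project described above.

```python
from dataclasses import dataclass

@dataclass
class Visit:
    ct_ordered: bool                     # primary measure: was a CT ordered?
    returned_with_appendicitis_7d: bool  # balance measure: ER return within 7 days

def rate(visits, field):
    """Share of visits where the named boolean field is True."""
    return sum(getattr(v, field) for v in visits) / len(visits)

def paired_report(before, after):
    """Report (before, after) for the primary and the balance measure together."""
    return {
        "ct_rate": (rate(before, "ct_ordered"), rate(after, "ct_ordered")),
        "return_rate_7d": (
            rate(before, "returned_with_appendicitis_7d"),
            rate(after, "returned_with_appendicitis_7d"),
        ),
    }

# Made-up illustrative data: 10 visits before the project, 10 visits after.
before = [Visit(True, False)] * 6 + [Visit(False, False)] * 3 + [Visit(False, True)]
after = [Visit(True, False)] * 3 + [Visit(False, False)] * 6 + [Visit(False, True)]

report = paired_report(before, after)
print(report)  # {'ct_rate': (0.6, 0.3), 'return_rate_7d': (0.1, 0.1)}
```

In this toy data the CT rate falls while the return rate stays flat, which is the pattern a balance measure is meant to confirm. If the second number rose, the pairing would surface that immediately.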

5. Well Defined Scenarios with Options

An example of a poorly defined scenario would be ‘overall rate of imaging tests ordered by doctor.’ Even the more specific ‘rate of CT scans’ is too broad because it fails the criterion of being actionable. Such a measure often results in the response: “What do you want me to do about it?” By making the scenario very specific, carefully defining the clinical setting and symptoms, such as ‘Patients who present with new onset of abdominal pain in the setting of urgent care clinic,’ and by simplifying the options into ‘CT scan ordered yes/no,’ the measure becomes something easily recognizable as a clinical behavior change that can be enacted the next day.

We have nicknamed this challenge “find the fork.” We strive to find a clinical scenario where there is a ‘fork in the road’ and the treatment options can be grouped into two or three well-described patterns, so that the clinicians can be easily grouped. This clear definition has allowed participants to quickly grasp the situation and place themselves in one of the groups. That sets the stage for a robust discussion about which option is preferred.
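As a rough sketch of “find the fork” in code, the Python below first narrows the data to one tightly defined scenario and then counts, per clinician, how often each branch of the fork was taken. The field names, the `SCENARIO` definition, and the encounter records are assumptions made for illustration only.

```python
from collections import Counter, defaultdict

# Hypothetical scenario definition: setting plus presenting complaint.
SCENARIO = {"setting": "urgent care", "complaint": "new-onset abdominal pain"}

# Made-up encounters; only some fall inside the scenario.
encounters = [
    {"clinician": "Dr. A", "setting": "urgent care", "complaint": "new-onset abdominal pain", "ct_ordered": True},
    {"clinician": "Dr. A", "setting": "urgent care", "complaint": "new-onset abdominal pain", "ct_ordered": True},
    {"clinician": "Dr. A", "setting": "urgent care", "complaint": "headache", "ct_ordered": True},
    {"clinician": "Dr. B", "setting": "urgent care", "complaint": "new-onset abdominal pain", "ct_ordered": False},
    {"clinician": "Dr. B", "setting": "urgent care", "complaint": "new-onset abdominal pain", "ct_ordered": True},
    {"clinician": "Dr. B", "setting": "emergency room", "complaint": "new-onset abdominal pain", "ct_ordered": True},
]

def matches_scenario(encounter):
    """True only if every field of the scenario definition matches."""
    return all(encounter[key] == value for key, value in SCENARIO.items())

def fork_counts(records):
    """Per clinician, count each branch of the fork within the scenario."""
    counts = defaultdict(Counter)
    for e in records:
        if matches_scenario(e):
            branch = "ct" if e["ct_ordered"] else "no_ct"
            counts[e["clinician"]][branch] += 1
    return {clinician: dict(c) for clinician, c in counts.items()}

print(fork_counts(encounters))
# {'Dr. A': {'ct': 2}, 'Dr. B': {'no_ct': 1, 'ct': 1}}
```

The headache visit and the emergency room visit are dropped before counting, so each clinician sees only the decision the measure is actually about.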

6. Peer Comparisons

Telling a doctor in California that his or her rate of ordering cardiac monitoring is 50%, but that the average in a Boston hospital is 20%, will neither create buy-in nor give a compelling reason to change. Even data about a hospital group 20 miles down the road is often met with scorn and the rationalization: “Maybe their patients are different or the medical environment is different.” But when the data shows the names of well-known local peers, the comparison becomes meaningful. Seeing the names of colleagues confirms that the types of patients and the clinical setting are similar, so there ought to be no differences in how they are treated.

Another significant factor that is triggered by the peer comparison is the psychological impact of social norming, but this is a large enough topic to deserve a separate discussion.

7. Group Discussion

Even with all the previously mentioned aspects meticulously followed, if the data is presented in a private setting, such as in a mailed-out individual report or even a 1:1 discussion, too many questions arise, such as “Why am I different from the others?” and “How then do those other doctors treat this condition?” Without the ability to answer those questions, the only way out is to rationalize one’s own behavior with: “My patients must be different.” But all this changes in a group setting. There, the differences can be addressed and knowledge transfer will occur from those who practice according to the preferred patterns. Hearing a respected peer explain, “This is how I do it” seems to speak louder than distant journal articles.

Summary

Attending to these seven aspects will be transformative in presenting clinical measures to groups of doctors. Each one of them may seem to be common sense, but our challenge to you is: look at the measures that are being distributed in your organization. Most will barely get 4 out of these 7 aspects. If you have several measures that address all 7, we want to hear about it!