
How to find the outlier doctor in “small data”?

Clinical Care Variation can be discovered even in small samples, revealing opportunities for cost savings in everyday care.

We propose a method for identifying potentially unnecessary care, even in small sample sizes.

There is much interest in Clinical Care Variation: the search for differences in care patterns between doctors, and specifically for ‘unwarranted variation’ in care, i.e. variation that is not caused by unique aspects of each patient. If Dr. Smith believes a certain laboratory test should be ordered on almost everyone, while his or her colleagues order that test on only 50% of their patients, that would be an example of unwarranted variation and of potentially unnecessary overuse of medical care.

Big Data methods might not be the right tool for finding patterns of unusual clinical care that point to these opportunities for cost savings. The reason is that in clinical practice the unit of comparative analysis is ‘other doctors who treated a similar patient.’ Even though the average doctor might see about 20 patients per day[1], those patients have many different conditions, so a doctor is unlikely to see hundreds of patients with identical conditions in a year. Once patients are grouped by identical condition, it is more common to find only 20-30 similar patients per doctor in a 12-month period.

To illustrate the challenge and a potential solution, let’s suppose a set of 10 doctors, #1 through #10, and let’s examine 20 patients for each of them, assuming all of these patients have the same condition. For each doctor, we show the frequency of ordering a given test X:

We see frequencies ranging from 15% to 75%. Displaying these from high to low shows the variation in care. Each column in the graph represents a different doctor.

The question before us is whether the difference in the rates is caused by unique aspects of each patient, or by a preference for ordering or not ordering the test that is unique to the doctor. The averages alone cannot tell us.

This is a sample data set that was created by assigning a “propensity for ordering the test” to each of the ten doctors. Eight of the 10 were given a propensity of 50%, one was given 80% and another was given 20%. This is not unlike typical clinical situations, where some doctors say, “I rarely order that test!”, others say, “I was taught to almost always order that test,” and many in the middle order the test depending on the patient’s unique situation.
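A sample like the one described above could be generated along the following lines. This is a minimal sketch, not the original workbook: the function name `simulate_orders`, the fixed seed, and the use of Python's `random` module are assumptions for illustration. Only the propensities (eight doctors at 50%, Dr #5 at 20%, Dr #6 at 80%) and the 20 patients per doctor come from the text.

```python
import random

# Propensities from the text: eight doctors at 50%, Dr #5 at 20%, Dr #6 at 80%.
PROPENSITIES = [0.5, 0.5, 0.5, 0.5, 0.2, 0.8, 0.5, 0.5, 0.5, 0.5]
PATIENTS_PER_DOCTOR = 20


def simulate_orders(propensities, n_patients, rng):
    """Return one list of 0/1 ordering decisions per doctor (hypothetical helper)."""
    return [
        [1 if rng.random() < p else 0 for _ in range(n_patients)]
        for p in propensities
    ]


rng = random.Random(42)  # fixed seed (an assumption) so the sketch is repeatable
orders = simulate_orders(PROPENSITIES, PATIENTS_PER_DOCTOR, rng)
rates = [sum(doctor) / len(doctor) for doctor in orders]

for number, rate in enumerate(rates, start=1):
    print(f"Dr #{number}: ordered the test on {rate:.0%} of patients")
```

Because each run draws new random patients, the observed rates will differ from the programmed propensities, just as Dr #5 shows 15% rather than 20% in the sample discussed here.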

Just by looking at the averages, we cannot say with confidence whether we are seeing

  • A: variation caused by the patients

  • B: variation caused by the doctor

But by looking at the standard deviation of each of these distributions, we might be able to gain confidence in making interpretations. The standard deviations are as follows:

We now see that the doctor who rarely orders the test (Dr #5, with a rate of 15% in this sample, who was programmed to order the test at a 20% rate) has a lower standard deviation than the doctors who had a propensity of 50% for ordering the test. Similarly, the doctor with the higher rate (Dr #6, with a rate of 75%, who was programmed to order the test at an 80% rate) also has a lower standard deviation than the rest of the doctors.
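To make the comparison concrete, the sketch below computes the standard deviation of a yes/no ordering decision from the number of patients who received the test. It assumes the sample standard deviation (n - 1 denominator), which may or may not be what the original analysis used, and the helper name `sample_std_from_counts` is hypothetical. For a yes/no outcome the result depends only on the rate and the sample size, peaking at a rate of 50%.

```python
import math


def sample_std_from_counts(orders, patients):
    """Sample standard deviation (n - 1 denominator) of a 0/1 outcome
    in which `orders` of the `patients` values are 1."""
    rate = orders / patients
    # Sum of squared deviations: `orders` ones and the remaining zeros.
    squared_dev = orders * (1 - rate) ** 2 + (patients - orders) * rate ** 2
    return math.sqrt(squared_dev / (patients - 1))


# 15% and 75% of 20 patients are the sample rates quoted in the text;
# 10 of 20 is a hypothetical peer sitting exactly at 50%.
for label, orders in [("Dr #5 (15%)", 3), ("Dr #6 (75%)", 15), ("peer at 50%", 10)]:
    print(f"{label}: std = {sample_std_from_counts(orders, 20):.3f}")
```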

In a Monte Carlo simulation[2] of 10,000 iterations of this sampling of 20 patients per doctor, the doctors with the pre-programmed rates of 20% or 80% had a standard deviation of 0.394 vs a standard deviation of 0.5 for the doctors who were at 50%.
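The original simulation was run with an Excel add-in. A rough Python equivalent might look like the sketch below. It assumes that the reported figures are per-sample standard deviations (n - 1 denominator) averaged over the 10,000 iterations. The function names and the seed are illustrative only. Results should land in the neighborhood of the figures quoted above, though not necessarily on them, since the add-in's exact settings are not known.

```python
import math
import random


def sample_std(values):
    """Sample standard deviation with the n - 1 denominator."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def mean_simulated_std(propensity, n_patients=20, iterations=10_000, seed=0):
    """Average the per-sample standard deviation over many simulated samples."""
    rng = random.Random(seed)  # seed is an assumption, for repeatability
    total = 0.0
    for _ in range(iterations):
        sample = [1 if rng.random() < propensity else 0 for _ in range(n_patients)]
        total += sample_std(sample)
    return total / iterations


for p in (0.2, 0.5, 0.8):
    print(f"propensity {p:.0%}: mean std = {mean_simulated_std(p):.3f}")
```

Averaging over many iterations smooths out the sampling noise that any single set of 20 patients carries, which is why the simulated gap between the 20%/80% doctors and the 50% doctors is more stable than the gap seen in one sample.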

Our hypothesis is that a rate with a lower standard deviation than the rates of peers points to a pattern of a doctor who consistently does something a certain way (high or low), regardless of patient factors, while the higher standard deviation suggests a pattern that is more varied, because the variation is caused by patient factors.

I am not a trained Data Scientist, and defer to my colleagues who are more versed in the various methodologies for finding patterns in big data to ascertain whether this method of using standard deviations to separate

  • A: variation caused by the patients

  • B: variation caused by the doctor

is a useful method of pursuing potential signals of ‘overuse’ or ‘underuse’ in small samples of data.


[2] Credit for the Excel add-in for doing Monte Carlo simulations goes to
