Showing posts with label illustration. Show all posts

October 14, 2014

Bonnie: An Open Source Clinical Quality Measure Testing Tool

Bonnie is an open source software tool that MITRE developed and released in April 2014 to allow Meaningful Use (MU) Clinical Quality Measure (CQM) developers to test and verify the behavior of their CQM logic.  The goal of Bonnie is to reduce the number of defects in CQMs by providing a robust and automated testing framework. Bonnie allows measure developers to independently load measures that they have constructed using the Measure Authoring Tool (MAT). Loading the measures into Bonnie converts them from their Extensible Markup Language (XML) eSpecifications into executable artifacts and measure metadata.

Bonnie Dashboard Page
The measure eSpecification format that Bonnie loads is Health Quality Measure Format (HQMF) XML. The HQMF specification provides the metadata and logic that describe the specifics of calculating a CQM. Bonnie can load the HQMF describing a measure and programmatically convert the HQMF specification into an executable format that allows calculating the measure directly from the specification. 

The measure metadata loaded into Bonnie is then used to allow developers to rapidly build a synthetic patient test deck for the measure using the clinical elements defined during the measure construction process. By using measure metadata as a basis for building synthetic patients, developers can rapidly and efficiently create a test deck for a measure. 

Once a CQM has been loaded into Bonnie, a user can inspect the measure logic and then build synthetic test records and set expectations on how those test records will calculate against a measure. This capability to build synthetic test patient records, set expectations against those records, and calculate the measures using those patient records provides an automated and efficient testing framework for CQMs. 
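The testing loop described above can be sketched in a few lines of Ruby.  To be clear, none of the names below are Bonnie's actual API, and the stand-in calculator is purely illustrative; in Bonnie, the executable logic converted from the measure's HQMF eSpecification computes the populations.

```ruby
# Hypothetical sketch of the test-expectation idea; names are illustrative.
# Each synthetic patient carries expected population membership, and the
# calculated result is compared against that expectation.
Expectation = Struct.new(:patient_id, :ipp, :denom, :numer)

# Stand-in calculator: in Bonnie, executable logic converted from the
# measure's HQMF eSpecification would compute these populations.
def calculate(patient)
  eligible = patient[:age] >= 18
  { ipp: eligible, denom: eligible, numer: eligible && patient[:screened] }
end

# A test passes when every calculated population matches the expectation.
def passes?(patient, expectation)
  calculate(patient) == { ipp: expectation.ipp,
                          denom: expectation.denom,
                          numer: expectation.numer }
end

patient = { id: 1, age: 52, screened: true }
puts passes?(patient, Expectation.new(1, true, true, true))  # prints "true"
```

A failing expectation surfaces a defect in either the measure logic or the test record, which is exactly the feedback loop described above.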

Using the Bonnie-supported CQM testing framework allows measure developers to more clearly understand the behavior of the measure logic, validate that the measure logic encodes their intent, and validate multiple iterations of measure updates against a test deck. 

Bonnie Measure Page
Additionally, the development of a test deck as part of measure development provides benefits after the measures are finalized. The test deck built during measure development can be used to demonstrate the intent of the measure through the patient examples it includes. Furthermore, the test deck provides systems that implement the measures with a means to validate their development, in the form of a base set of synthetic patient records with known expectations when calculated against the implemented measures. Finally, the test deck could serve as a basis for the test deck used in the Meaningful Use certification program. 

Bonnie has been designed to integrate with the nationally recognized data standards used by the Meaningful Use program for expressing CQM logic for machine-to-machine interoperability. This provides enormous value to the CQM program and federal policy leaders and stakeholders: this software tool verifies that the new and evolving standards for the Meaningful Use CQM program are tractable and can be implemented in software.   

Additionally, Bonnie was designed to provide an intuitive and easy-to-use interface based on feedback from the broader measure developer community. A key goal of the Bonnie application is to deliver a user experience that provides an efficient and intuitive method for constructing synthetic patient records for testing and validating CQMs. 

The Bonnie software is freely available via an Apache 2.0 open source license. The Meaningful Use program makes all or parts of the Bonnie software available for inspection, verification, and even reuse by other government programs or federal contractors. 

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. © Rob McCready, 2014.
Creative Commons License

August 5, 2014

ICD 10 vs ICD 9 Code Format Structural Differences

ICD-10 is the tenth revision of the World Health Organization's (WHO's) International Classification of Diseases and Related Health Problems (ICD), the international standard diagnostic classification system.  The ICD is the coding system that physicians and other healthcare providers use to code all diagnoses, symptoms, and procedures recorded in hospitals and physician practices.  

It is a big deal.

Today, the Department of Health and Human Services (HHS) published an updated rule for the adoption of ICD-10 code sets.  HHS is requiring that all HIPAA covered entities be ICD-10 compliant by October 1st, 2015.  This newly updated compliance date is meant to be firm and not subject to any change.  This is the third time that ICD-10 has been delayed, so I strongly suspect that this new deadline will be met by next fall.

Last year, I put together a visual analysis of the ICD-10 coding landscape.  Below is a simple primer explaining the distinction between ICD-9 and ICD-10, demonstrating the structural difference between the two coding systems.  In particular, the ICD-10 code set has been expanded from five positions (the first alphanumeric, the others numeric) to up to seven positions, and the codes use alphanumeric characters in all positions, not just the first position as in ICD-9.

ICD-9 vs ICD-10 coding
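As a rough illustration of the structural difference, the two shapes can be captured with regular expressions.  These are shape checks only, written for this post, and are NOT a substitute for validating against the actual code sets:

```ruby
# Rough structural patterns for the two diagnosis code formats; shape only.
# ICD-9-CM: three digits, or E + three digits, or V + two digits, with an
# optional decimal point and one or two more digits.
ICD9_SHAPE  = /\A(\d{3}|E\d{3}|V\d{2})(\.\d{1,2})?\z/
# ICD-10-CM: a leading letter, then alphanumerics, 3-7 characters total, with
# a decimal point after the first three characters.
ICD10_SHAPE = /\A[A-Z]\d[A-Z0-9](\.[A-Z0-9]{1,4})?\z/

puts '250.00'.match?(ICD9_SHAPE)   # a classic ICD-9 shape: prints "true"
puts 'E08.22'.match?(ICD10_SHAPE)  # the ICD-10 diabetes example: prints "true"
puts 'E08.22'.match?(ICD9_SHAPE)   # not an ICD-9 shape: prints "false"
```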

Some other interesting artifacts of ICD-9 vs ICD-10 that I discovered today:
  • As of the latest version, there are ~68,000 codes in ICD-10, as opposed to ~13,000 in ICD-9.  More specifically, there are nearly 5 times as many diagnosis codes in ICD-10 as in ICD-9, and nearly 19 times as many procedure codes.  Yikes!
  • The new code set provides a significant increase in the specificity of the reporting, allowing more information to be conveyed in a code.  To support this, the terminology has been modernized and has been made consistent throughout the code set.  There are codes that are a combination of diagnoses and symptoms, with a claim that fewer codes need to be reported to fully describe a condition.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. © Rob McCready, 2014.

November 22, 2013

Conceptual Hospital Dashboard Design

I have been neglecting the "design" aspect of this blog for some time.  To re-energize the need for more engaging content, to put some eye-candy out... and to have some fun on a late Friday afternoon… I thought I would share a shelved design for a dashboard that I developed for a COO at a notional hospital:

Notional Hospital COO Dashboard Design
I made several assumptions when coming up with this mockup.  The first is that quality and quality metrics would be of utmost importance to this user.  Maybe pay-for-performance programs have been adopted based on Clinical Quality Measures (CQMs), and quality issues have an immediate impact on the bottom line of the organization.  I included both a Kiviat visualization showing targets against measured results for CQMs, and a longitudinal trend for those CQMs over time.

With the disclaimer that I am not a clinician, I assumed that knowing the types of procedures being performed, and when those procedures deviate from expected norms, is something else that a COO would track.

Lastly, I am showing a design technique that Stephen Few will sometimes use to provide a non-white background to reduce eyestrain.  Having met him several years ago, I am a fan of Stephen's work.  I always like the muted background color that he tends to include in his works.

I welcome the feedback.

Kudos if you "get" the names of the ten physicians.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. © Rob McCready, 2013.
Creative Commons License

November 2, 2013

Visualization of ICD-10 Code Counts

This past week I have been working in the bowels of the QRDA Category 1 XML for the popHealth project that we are deploying for the Veterans Health Administration (VHA).  In the process of working with the QRDA Category 1, I had to resuscitate some of my Ruby and REXML skills that had atrophied in the past year.

This weekend, I wanted to shake out some of my technical skills in a cleaner environment, so I downloaded the XML for the full set of ICD-10 codes from the CMS site.

Why ICD-10?  It is the 10th revision of the International Statistical Classification of Diseases and Related Health Problems (ICD) by the World Health Organization (WHO).  ICD-10 provides a hierarchy of structured codes for diseases, symptoms, findings, complaints, social circumstances, and external causes of injury/diseases.  The big national issue related to ICD-10 is that it will be required for expressing claims data to the Centers for Medicare and Medicaid Services (CMS) starting in October 2014.

The current state-of-the-practice for capturing this coded data in Electronic Health Record systems is (IMHO) still ICD-9, the predecessor to ICD-10.  One of the biggest differences between ICD-9 and ICD-10 is the fidelity of data that can be captured in ICD-10.  In particular, there are over 68,000 distinct codes in ICD-10 as opposed to the roughly 13,000 in ICD-9.

Working with the XML file provided on the CMS site that details the ICD-10 code hierarchy, I wanted to see if I could convert the data into a format that would allow me to visualize the code counts in a D3.js example.  I figured it was good to exercise some XML knowledge outside of the complexity of the QRDA Category 1 XML.  Further, I wanted to learn a little more about the structure of the ICD-10 codes.

It is worth noting that the CMS ICD-10 XML is surprisingly easy to understand for the purposes of enumerating the full set of codes and the hierarchy.  The QRDA Category 1 XML… not so easy to understand.

What I did was load the ICD-10 XML hierarchy into a simple Ruby program via REXML.  I created an aggregate count in a hash table of the second-level codes in the ICD-10 hierarchy by traversing the XML file.  I had to do this at only the second level of the ICD-10 hierarchy because the sheer number of third-level ICD-10 codes broke the D3.js visualization examples.  To explain this a little more, the hierarchy of an example diabetes code down to the fourth level in ICD-10 follows:

E00-E89: Endocrine, nutritional and metabolic diseases
  |-> E08 Diabetes mellitus due to underlying condition
    |->E08.2 Diabetes mellitus due to underlying condition with kidney complications
      |->E08.22 Diabetes mellitus due to underlying condition with diabetic chronic kidney disease

So for the illustration of ICD-10 code counts, I stopped at the second level of the hierarchy and aggregated the codes from the third and fourth levels into those second-level counts.  Each tiny square in the illustration below represents the count for one second-level code within the ICD-10 space of roughly 68,000 total codes.

Once I had the counts of individual ICD-10 codes aggregated at the second level of the ICD-10 hierarchy, I exported a JSON file that could work with the D3.js example that I picked.  Below is a thumbnail (admittedly... illegible) of the ICD-10 code counts rendered with the D3.js treemap example.
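The parse-aggregate-export pipeline looked roughly like the following.  The XML here is a simplified stand-in; the element names and structure of the actual CMS file differ, but the aggregation idea is the same:

```ruby
require 'rexml/document'
require 'json'

# Simplified stand-in for the CMS ICD-10 XML (illustrative element names).
xml = <<~XML
  <chapters>
    <chapter name="E00-E89">
      <section name="E08"><diag>E08.2</diag><diag>E08.22</diag></section>
      <section name="E09"><diag>E09.0</diag></section>
    </chapter>
  </chapters>
XML

# Tally every leaf code under its second-level (section) ancestor.
counts = Hash.new(0)
REXML::Document.new(xml).elements.each('chapters/chapter/section') do |section|
  counts[section.attributes['name']] += section.elements.to_a('diag').size
end

# Emit the nested { name, children: [{ name, size }] } shape that the D3.js
# treemap example consumes.
flare = { name: 'ICD-10',
          children: counts.map { |name, size| { name: name, size: size } } }
puts JSON.generate(flare)
```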

Visualization of second level ICD-10 code counts
If you want to try to download a higher resolution image of the ICD-10 codes and actually read more of the details, click here.  HEADS UP… it is ginormous.

With the illustration, reading left-to-right and then top-to-bottom, the sections in the ICD-10 data set coincide with the colors in the illustration as follows.  The only confusing item is that the last "chapter" in ICD-10, "Factors influencing health status and contact with health services", is the gray box in the bottom left.  I think the D3.js code had to fit that section into the illustration wherever it could.
  • A00-B99: Certain infectious and parasitic diseases
  • C00-D49: Neoplasms
  • D50-D89: Diseases of the blood and blood-forming organs and certain disorders involving the immune mechanism
  • E00-E89: Endocrine, nutritional and metabolic diseases
  • F01-F99: Mental, Behavioral and Neurodevelopmental disorders
  • G00-G99: Diseases of the nervous system
  • H00-H59: Diseases of the eye and adnexa
  • H60-H95: Diseases of the ear and mastoid process
  • I00-I99: Diseases of the circulatory system
  • J00-J99: Diseases of the respiratory system
  • K00-K95: Diseases of the digestive system
  • L00-L99: Diseases of the skin and subcutaneous tissue
  • M00-M99: Diseases of the musculoskeletal system and connective tissue
  • N00-N99: Diseases of the genitourinary system
  • O00-O9A: Pregnancy, childbirth and the puerperium
  • P00-P96: Certain conditions originating in the perinatal period
  • Q00-Q99: Congenital malformations, deformations and chromosomal abnormalities
  • R00-R99: Symptoms, signs and abnormal clinical/laboratory findings, not elsewhere classified
  • S00-T88: Injury, poisoning and certain other consequences of external causes
  • V00-Y99: External causes of morbidity
  • Z00-Z99: Factors influencing health status and contact with health services
If you are interested, you can access the JSON file with the second-level code counts from the GitHub repository that I set up.

Further, you could use this JSON with several other data hierarchy examples on the D3.js site.  They use the same JSON format for representing the data, so you should be able to drop the JSON that I created into that HTML if you tweak the name of the file in the examples and set the width and height of the demo substantially larger than what is provided, since the amount of data is so large.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. © Rob McCready, 2013.

October 16, 2013

popHealth v2.4 Design for Meaningful Use Stage 2

As some of you know, I have been leading the open source popHealth project at MITRE for the past 4 years.  The MITRE popHealth team is currently working for the Veterans Health Administration (VHA).  Our work is focused on using popHealth as the software service allowing the VHA to meet Meaningful Use Stage 2 certification for Clinical Quality Measures (CQMs).

Since we hadn't worked on popHealth at any substantial level for most of 2013, we identified some problems for ongoing operations and maintenance (O&M) of the popHealth software when we started to put more engineers back on the project.  The team's concern was that the software managing the User Interface (UI) presentation layer in the browser was becoming too complex and convoluted.  This had the potential to introduce difficulty when a non-MITRE contractor eventually manages the popHealth software in an "O&M" mode for the VHA, which will happen when our team completes the development task and moves off of this project.

What the popHealth team opted to do was clean up the JavaScript and HTML so that there was an Application Programming Interface (API) that could provide JavaScript Object Notation (JSON) for the popHealth data needed by the UI directly to the browser.  This is a cleaner software design that also allows users to forgo the existing popHealth UI if they need just the raw CQM data.
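To give a feel for the idea, such an API might serialize a measure's results like this.  The method and field names below are illustrative, not popHealth's actual schema:

```ruby
require 'json'

# Hypothetical serializer for one measure's results; field names are
# illustrative, not popHealth's actual JSON schema.
def measure_result_json(measure_id, ipp, denom, numer)
  JSON.generate(
    measure_id: measure_id,
    initial_patient_population: ipp,
    denominator: denom,
    numerator: numer,
    performance_rate: (100.0 * numer / denom).round(1)
  )
end

# The browser-side UI, or any other consumer, can fetch and render this JSON
# without being coupled to server-rendered HTML.
puts measure_result_json('0038', 200, 180, 117)
```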

In the process of refactoring the popHealth web application infrastructure, the team opted to introduce some tweaks to the popHealth presentation layer while still maintaining the existing interaction model.  We decided to only slightly evolve the existing popHealth branding and look and feel, so that it is more maintainable from an engineer's perspective but does not introduce a radically different interaction model requiring re-training for all our existing users.

The popHealth v1.4 user interface that was aligned with the Meaningful Use Stage 1 program is below:

popHealth v1.4 Dashboard Design for Meaningful Use Stage 1

The latest updated designs for popHealth v2.4, which maintain the established L&F and interaction model, follow:

popHealth v2.4 Dashboard Design for Meaningful Use Stage 2

Changes in the evolved popHealth v2.4 design include:
  • Eliminated confusing links for the "parameters" of each CQM on the right; now there is just a link to access the patients
  • Reduced the number of colors and focused on a more muted and simplified color palette
  • Eliminated the single horizontal bar visualization and replaced it with two stacked bars for the fractions
  • Identified the exception and exclusion populations with muted gray bars on the stacked bar charts
  • Enhanced the performance rate fraction with a visual radial circle around the fraction value
If anyone has opinions on the latest designs, the team welcomes any and all constructive feedback.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. © Rob McCready, 2013.

May 24, 2013

Meaningful Use Stage 2 XML Standards for Clinical Quality Measure Reporting

When implementing an EHR system, or EHR module, that supports the Meaningful Use Stage 2 program for Clinical Quality Measure (CQM) support, there are three HL7 XML standards that you need to be aware of:

  • HQMF
  • QRDA Category 1
  • QRDA Category 3 

The HQMF XML standard defines the measure logic and the data elements that logic references, associating them with value sets using object identifiers (OIDs).  For those really interested in using the HQMF, you can download the HQMF files for all the Meaningful Use Stage 2 Clinical Quality Measures from the CMS site here.

The QRDA Category 1 XML standard is used for expressing patient-level data as inputs to a Clinical Quality Measure calculator, such as popHealth, as part of the Meaningful Use Stage 2 program.  This XML standard allows EHRs to express the clinical results of individual patients based on the CQM that an EHR system was queried about.  The data to include in the QRDA Category 1 XML are tightly coupled to the HQMF definition of the measure.  I have shared high-level information about the QRDA Category 1 specification, as well as an example of what the QRDA Category 1 XML should look like.

Lastly, the QRDA Category 3 XML standard is used to express the summary results of a CQM.  It is unfortunate that QRDA Category 1 and Category 3 are worded so similarly, yet have significantly different roles in this landscape.  The QRDA Category 3 is the artifact that needs to be generated for expressing summary/aggregate report numbers.  For instance, if a CQM report were created to assess the mammography screening results of women between the ages of 45 and 65, and the result was that 73% of that population met the criteria, the QRDA Category 3 XML could express the 73% performance rate, as well as counts associated with the initial patient population, the denominator, the numerator, and the exception and exclusion populations.  You can view an illustration of these different CQM logical families for the proportion-based CQMs here.
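The arithmetic behind such a summary can be sketched as follows.  This is a hedged sketch of the common proportion-measure pattern, with illustrative parameter names; actual QRDA Category 3 population handling is measure-specific:

```ruby
# Sketch of the proportion-measure arithmetic that a QRDA Category 3 report
# summarizes; parameter names are illustrative.  Exclusions and exceptions
# are removed from the denominator before the rate is computed.
def performance_rate(denom:, numer:, exclusions: 0, exceptions: 0)
  eligible = denom - exclusions - exceptions
  (100.0 * numer / eligible).round(1)
end

# e.g. a denominator of 420 with 15 exclusions and 5 exceptions leaves 400
# eligible patients; 292 in the numerator yields a 73.0% performance rate.
puts performance_rate(denom: 420, numer: 292, exclusions: 15, exceptions: 5)
```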

Below is an illustration that details the use of these various HL7 XML standards when reporting on Meaningful Use Stage 2 Clinical Quality Measures.
Landscape of Meaningful Use Stage 2 Clinical Quality Measure XML Standards

Hopefully this is helpful.  If not, let me know!

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. © Rob McCready, 2013.

May 17, 2013

Designing Logos for a Portfolio of Healthcare Projects

Over the past three years, I have been involved with numerous open source healthcare projects, ranging from large reference implementations of Clinical Quality Measures and certification and testing tools adopted by the federal government, to a small research project assessing CQM complexity/cost/tractability, this blog, and other ideas that failed to launch.

Throughout these three years, I have endorsed establishing a common "brand" that could be applied to various open source healthcare projects while still providing a unique look and feel for each individual project.  

The design approach that organically evolved was to develop logos following these rules:
  • Use only circles
  • Allow the circles to come in different sizes and arrangements
  • Limit each logo to a palette of only two colors
  • Allow color opacity to be changed
You can see the portfolio of these project logos below:
Portfolio of logos for open source healthcare projects I have worked on
Since I am a "left brain" engineer at heart, the various arrangements of the circles have been an easier task than color selection.  I have often struggled with finding the magic combination of colors for anything from these logos to the interior design of our house.

The best resource that I have been able to identify is ColorLovers.com.  For those not familiar with the site, it is an evolving resource for palettes and colors that users can share, rank, and comment on.

When I had to select two colors, I went to the filter for the "most loved" palettes of all time and tried to identify palettes that used only 2 colors instead of 5.  Hopefully this is a helpful trick.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. © Rob McCready, 2013.

September 13, 2012

Review: CardioChek PA Blood Meter

I have been working for the Office of the National Coordinator for Health Information Technology (ONC), on the popHealth and Cypress projects for several years now.  Both projects are based around Clinical Quality Measures (CQMs).

By education, I am a physicist and engineer, not a clinician.  However, I have repeatedly noticed the importance of lipid profiles within the logic of several of the Meaningful Use CQMs.  Further, in the very recent past, these metrics had been considerably high for me, a male in my late 30s.  Between my work with CQMs emphasizing the importance of measuring lipid profiles, and my own personal warning signs associated with this health metric, I thought it would be good to collect some "hands on" experience with the collection of the data for this metric.

When my department at MITRE had some additional overhead resources available at the end of our fiscal year, I purchased a portable blood testing device that could provide me with my own lipid profile information (total cholesterol, HDL cholesterol, LDL cholesterol, and triglycerides).

I eventually picked the CardioChek PA blood meter device.  Interestingly, this was not the first CardioChek blood device that I purchased.  I originally found the consumer CardioChek (no "PA") device on Amazon.com.  The consumer device lists at about $125.  That seemed reasonable, but the device requires three separate strips to measure total cholesterol, HDL cholesterol, and triglycerides (you can calculate the LDL cholesterol from those three).  While that may not sound bad, I quickly found out that it involved giving myself at least two, sometimes three, pricks with lancets to draw enough blood for the three separate tests.

Additionally, I purchased this device for my department at work, and felt that an experience getting three pokes with a needle might not go over well with my colleagues, resulting in some future retribution via office pranks.

To solve the problem of collecting a full lipid panel from one drop of blood, I purchased these PTS panels, which appeared nice in the sense that they can derive multiple readings from a single sample of blood.

Each box comes with one lipid panel MEMo chip that can be inserted
into a CardioChek PA device and 15 single use lipid profile panels

Unfortunately, I quickly discovered that these really nice 3-in-1 lipid profile panels are not compatible with the $125 consumer CardioChek device.  Supporting multiple readings from a single drop of blood requires the more expensive CardioChek PA device, which runs close to $700.

Being the end of the fiscal year, the additional money was a little easier to come by.  I went back to our finance staff and purchased the more expensive, clinical-grade device.  I also picked up some supporting medical equipment like gloves, lancets, pipettes, and band-aids.

CardioChek PA blood testing device
CardioChek PA blood testing device, several tests
with some additional medical equipment

To take your lipid profile, you need one lipid panel test chip, called a MEMo Chip by the manufacturer, and one lipid panel test strip.  The MEMo chip contains lot-specific calibration and other information needed to properly perform testing.  The lot-specific information is presumably associated with the 15 test panels that come with the package.  I would not recommend mixing and matching test panels with different MEMo chips, because of this prior calibration by the manufacturer.

There is also guidance around always storing the unused test panels at a temperature between 68-80˚F.  I could see this temperature requirement being a challenge for some home/consumer users.  Lastly, there is an expiration date on the test panels.  For all the tests I have, the expiration date is less than one year from now, a little under 8 months of viable shelf time.

You can see the relative size of the test panel and MEMo chip in the picture below.  You only need one strip for a test.  I just flipped one test strip over to show the single wide channel where you deposit your blood, and the three openings for the device sensor to read the total cholesterol, HDL cholesterol, and triglycerides.

Lipid panel test chip and two lipid panel test strips

It appears that the CardioChek PA device has the ability to independently test up to four different metrics from the single sample of blood.  While I haven't found any tests that use a full four metrics from one sample, I am still happy that the manufacturer (presumably) recognized this need for multiple tests from a single sample.

CardioChek PA sensor

Collecting and depositing your sample for the device is relatively easy for an individual non-clinician.  I suggest you get 2 paper towels and a small band-aid before you get started.  If you are unfamiliar with lancets, they are small, cheap medical implements used for capillary blood sampling (no veins or arteries involved). A lancet includes a spring-loaded needle.  When used, it pops out and makes a very small puncture in your skin, allowing a few drops of blood to appear over the next ~10 seconds.  They are single use and disposable.

You can see how they are fairly straightforward to use for blood sampling; you can then collect the sample with a pipette.

Lancet
About one drop of blood after lancet met my finger
with small pipette in background

After my first two attempts to use the device for a full lipid profile, the device would eventually display "TEST ERROR" on the screen.  Needless to say, I was disappointed to think that I had put over $1K into this exercise and had no data to show.  As it turns out, my finger was not providing enough blood to generate the full lipid profile.  There is some documentation provided with the device that makes this association with the "TEST ERROR" message.  However, I just can't understand why the manufacturer didn't make this more intuitive for users.

On my third attempt, when I applied a liberal amount of blood to the sample channel, the device worked fine.  I feel that the accuracy of the device is very high: the readings it provided were all within 10% of measurements that I had collected from my Primary Care Provider (PCP) the previous week.  For me, the time between depositing the blood on the strip and the results being displayed ranged from 40 to around 60 seconds across three test runs.

This problem that resulted in multiple "pokes" with the lancet makes for an amusing story at my expense.  However, I can't emphasize enough: the "TEST ERROR" message really means "not enough blood".  This is an opportunity for improvement in this product for first-time users.

On this topic of human-computer interaction, I was also disappointed with the CardioChek PA user interface.  For $700, the interface feels like bleeding-edge late 1980s technology.

CardioChek PA user interface

For a $700 device, I would think that the manufacturer could easily upgrade the resolution and include color at a modest increase in manufacturing costs.  Ideally, this would include some longitudinal data on changes associated with the data.  See my suggested illustration below, again based on Juhan Sonin's designs for a HealthCard for the patient (consumer).

My updated lipid profile with sparkline visualizations
On the positive side, the CardioChek PA is worth purchasing if you think you will be frequently taking your lipid profile at home.  It appears very accurate in its measurement results.  Once you learn the interface and how to correctly provide enough of a blood sample, the device works well.

My biggest issue with this product is the variance in the cost of the CardioChek PA device at $700, versus a slightly more limited consumer CardioChek device at $125.  I feel that this price point makes the CardioChek cost-prohibitive for most consumers (patients).

I would grade the CardioChek PA device a B- for the purposes of home users.

Somewhat related to where this may go in 5-10 years, I was able to identify an amazing illustration, showing various ranges for numerous blood metrics:

Reference ranges for blood tests

Knowing that a single drop of blood could theoretically yield all these metrics makes for some interesting ideas about how the consumer could have access to these metrics on a daily basis, at their home.

Another interesting opportunity for this market would be to introduce a blood sensor without the embedded interface that could communicate with an iPhone, similar to how the Withings BP Cuff works.  I have that Withings device at home, and plan on developing a review of that device later.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. © Rob McCready, 2012.


September 2, 2012

Applying Kiviat Visualizations to Clinical Quality Measures

I have been working with Clinical Quality Measures (CQMs) for several years, leading two open source healthcare projects for the Office of the National Coordinator for Health Information Technology (ONC), first popHealth and more recently Cypress.  Both of these projects cover intimate details relating to CQMs.  While working in this space, I have noticed that there is a need for better techniques for the visualization and presentation of Clinical Quality Measure results.

CQMs are reports that measure the quality of healthcare providers, calculated against patient-level data.  CQMs are designed to measure the performance and quality of care that a healthcare provider applies to a population of patients.  Many factors are included in CQMs, such as health outcomes, processes and systems in place at a facility, patient perceptions, and treatments provided.  The idea behind introducing CQMs into the healthcare provider workflow is that by continuously measuring providers against these metrics, the US healthcare system can be gradually shaped toward higher quality and improved efficiency.

CQMs are a required component of Meaningful Use requirements for the Medicare and Medicaid Electronic Health Record (EHR) Incentive Programs.  This program is a significant part of the HITECH Act.  The Meaningful Use program can supply healthcare providers with up to $44,000 in incentives if they demonstrate that they meet requirements to "Meaningfully Use" an Electronic Health Record software system in their practice.

While the notion of measuring clinical performance has been around for ages, the importance of CQMs within the Meaningful Use program has been a forcing factor motivating EHR software vendors: all are motivated to support calculation of the Meaningful Use CQM reports in their products.  Since CQMs are a feature EHR vendors must support for this federal program, vendors have developed several visualization techniques to present CQM results to their users:

NextGen Mammography Screening CQM Visualization

This NextGen design requires a good amount of looking back and forth to understand the time interval applied to the individual bar charts on the mammography report.  That alone is my biggest grievance with the design.  Worse, the legend in the upper left of the bar chart actually cuts off the top of the results for women who have had a mammography screening in the past 12 months.  The legend background is the same tone of black as the background of the bar chart, so the visual representation of that metric reads lower than its actual numeric value.

GE Hospital (Inpatient) CQM Dashboard



This GE dashboard makes liberal use of color to differentiate the various clinical families, but provides no visual indication of the actual CQM results.  The user is forced to click on each box to learn both the value and the name of the specific metric.  I found this technique to be one of the more painful to use.

Admittedly, after several years leading popHealth, a Clinical Quality Measure reference implementation software service, we have likewise struggled with visualization.  Our popHealth CQM visualization has been warmly received by both clinicians and EHR vendors, but it uses significant screen real estate to display numerous CQM results:

popHealth Practice-Level CQM Dashboard

One approach to the visualization of Clinical Quality Measures that I haven't seen widely adopted is the Kiviat chart.  Kiviat charts, sometimes referred to as radar charts, are two-dimensional, multi-metric illustrations that use a variable number of evenly distributed radii.  Each spoke is associated with one metric, and the length of the value overlaid on each radius is proportional to the magnitude of that metric relative to the maximum value of the variable across all data points.

Kiviat diagrams work well for apples-to-apples comparisons across an arbitrary number of metrics.  They can also make it easier to identify patterns in data if the radii are arranged in a consistent order from diagram to diagram.
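The geometry behind a Kiviat chart is straightforward.  As a minimal sketch (assuming each metric is normalized to a 0-1 rate; the function name and sample rates are my own, not from any CQM tool), spoke i sits at angle 2&#960;i/n and the plotted point lies at the metric's value along that spoke:

```python
import math

def radar_vertices(values, start=math.pi / 2):
    """Map metric values (0..1) onto evenly spaced spokes,
    starting at 12 o'clock and proceeding counter-clockwise.
    Returns one (x, y) vertex per metric."""
    n = len(values)
    return [
        (v * math.cos(start + 2 * math.pi * i / n),
         v * math.sin(start + 2 * math.pi * i / n))
        for i, v in enumerate(values)
    ]

# Six notional CQM rates for one provider; any 2D plotting
# library can close these vertices into the Kiviat polygon:
points = radar_vertices([0.82, 0.64, 0.91, 0.55, 0.73, 0.60])
```

Keeping the spoke order fixed across providers is what makes the resulting shapes comparable at a glance.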

After searching for Clinical Quality Measures and Kiviat visualizations, I found a good article from PIIM Research detailing numerous visualizations for clinical metrics, "Advancing Meaningful Use: Simplifying Complex Clinical Metrics Through Visual Representation".  This paper includes a Kiviat visualization for some CQMs, and also notes the value of overlaying national averages on a Kiviat chart to highlight the delta between one provider's measured results and the average:

PIIM Research Advancing Meaningful Use: Simplifying Complex Clinical Metrics Through Visual Representation

While I am unaware of target metrics or national averages for the Meaningful Use CQM results, I like the concept of presenting targets for CQMs, shown here as a black underlay.  One aspect absent from this approach is a way to show when a provider's results exceed the target metrics.  The national-average overlay above works well for showing a deficit in the results, but it does not offer a good way to visualize when CQM results exceed targets and expectations.

I do not endorse the liberal use of color in this design, either.  From my experience developing command and control systems, I have found that using red or green in any design immediately conveys goodness or badness for whatever metric is being presented.  This PIIM design uses a spectrum of colors to indicate where each CQM's result falls, in addition to having the Kiviat line as a way to present the same information.

I found the use of red and green very distracting when interpreting the notional results in the illustration.  In particular, the same green color will appear on any CQM metric at the same numeric value, yet the targets for each measure in the example vary on a measure-by-measure basis.  I could easily imagine a light green CQM result (in the mid-80s) that is actually very poor against a national average.  Similarly, an orange/red CQM result (in the mid-60s) could easily exceed the national average and expectations.

The dark/light alternating background doesn't have any real value aside from showing the upper bound of the entire space.  While this is probably helpful for some users, I found it unnecessary.

What's my suggestion?

A few years ago, I worked with Involution Studios creative director Juhan Sonin when he was at MITRE.  I have also attended and enjoyed designer Stephen Few's course on business dashboard design.  Below is my attempt at an amalgam of Juhan's minimalist, clean Kiviat visualizations with Stephen's guidance to focus on the critical information:

Juhan Sonin's Kiviat illustration, re-purposed against synthetic data for Meaningful Use Stage 1 Clinical Quality Measures

Taking this design further by overlaying best-practice targets with the individual results, users can quickly understand where the quality of care meets, exceeds, or fails expectations:


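The meets/exceeds/fails judgment behind such an overlay is just a per-spoke comparison.  A minimal sketch (the function name and labels are my own, not from any CQM specification):

```python
def grade_against_targets(results, targets):
    """Label each CQM rate by how it compares to its
    best-practice target (hypothetical labels)."""
    return [
        "exceeds" if r > t else "meets" if r == t else "below"
        for r, t in zip(results, targets)
    ]

# Three notional CQM rates vs. three notional targets:
labels = grade_against_targets([0.92, 0.50, 0.71], [0.80, 0.50, 0.90])
```

In a red/blue overlay like the one above, only the spokes labeled "below" would leave the red target underlay visible.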
The decision to use red for the best-practice overlay was made because when the red Kiviat is visible, it implies that the provider's performance is below the targets that he or she should be meeting.  The use of blue for the measured results was made primarily to select a color that would complement the red.  I couldn't find any good tones of green that would work without making the illustration look like it had a "Christmas" theme.

To my knowledge, the best practice metrics for any of the nationally recognized Clinical Quality Measures (Meaningful Use program, PQRS program, Pioneer ACO program... etc) are not yet established.  I don't know if there is work being planned to identify what these targets need to be.

A strength, and a limitation, of applying Kiviat illustrations in this way is the power to rapidly assess metrics "at a glance".  For instance, the ability to recognize patterns through shapes alone could allow an analyst to rapidly review the results of numerous physicians and identify specific providers who need to improve the care they provide to patients with specific diseases and demographics.

What are some potential downsides to using Kiviat diagrams with CQMs?

A potential problem is that Kiviats benefit from standardizing the placement of the particular metrics to allow rapid visual inspection of the CQM results.  For example, changing the placement of the hemoglobin A1c metric after an analyst has been trained on one particular layout would probably result in increased cognitive load and/or errors.  If the locations of the individual CQM metrics are standardized, users will not need to re-read the names of the metrics every time they are presented.

Additionally, there are some limitations endemic to the Kiviat visualization technique itself.  One weak spot I have noticed is that the sensitivity to changes in higher-ranging metrics tends to be exaggerated relative to the same magnitude of change in smaller-ranging metrics.  For example, a 40% increase (from 50% to 90%) in the Diabetes LDL Management CQM below yields a striking increase in the surface area of the chart, while a 40% reduction in the Diabetes Urine Screening CQM (from 50% to 10%) represents the same magnitude of change in the number of diabetic patients, but is far less pronounced.

See below for a before/after example that highlights changes of the same magnitude, where the change on the larger-valued metric is perceived as larger:


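This effect can be quantified.  Using the standard triangle-fan formula for the area of a radar polygon, the area swept out by one spoke depends on the product of that spoke with its neighbors, so a change of the same magnitude produces a much larger area delta when the neighboring spokes are high.  A sketch with made-up rates (the function name and values are my own, purely illustrative):

```python
import math

def radar_area(values):
    """Area of the polygon traced on a Kiviat chart with
    evenly spaced spokes (sum of the n triangular wedges)."""
    n = len(values)
    wedge = math.sin(2 * math.pi / n) / 2.0
    return wedge * sum(values[i] * values[(i + 1) % n] for i in range(n))

# The same +0.4 change on the second spoke, with high vs. low neighbors:
delta_high = radar_area([0.9, 0.9, 0.9, 0.5, 0.5, 0.5]) - \
             radar_area([0.9, 0.5, 0.9, 0.5, 0.5, 0.5])
delta_low  = radar_area([0.1, 0.9, 0.1, 0.5, 0.5, 0.5]) - \
             radar_area([0.1, 0.5, 0.1, 0.5, 0.5, 0.5])
# delta_high is 9x delta_low, even though the underlying
# change in the metric is identical.
```

The eye judges these charts largely by filled area, so this non-linearity is worth keeping in mind when comparing before/after Kiviat shapes.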
Another challenge is that Kiviat visualizations assume that the higher the CQM result, the better the performance and quality of care by the provider.  This isn't always the case with the 44 Meaningful Use Stage 1 Ambulatory CQMs.  For instance, NQF 0059: Diabetes HbA1c Poor Control considers patients to meet the criteria for this CQM if the hemoglobin A1c value in diabetic patients is in an undesirable range (> 9%).

This type of CQM logic is perfectly "legal".  However, the fact that the measure grades an undesirable result complicates visualizing it alongside more desirable metrics, such as NQF 0575: Diabetes HbA1c Control, which looks for a hemoglobin A1c value < 8%.
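One chart-side workaround is to flip such inverse measures at display time so that, on the Kiviat, higher always means better.  A sketch (the helper and the set of inverse measures are my own assumptions, not part of any NQF or Meaningful Use specification):

```python
# Measures where a higher rate is WORSE (e.g., NQF 0059, Diabetes
# HbA1c Poor Control).  Which measures belong in this set is an
# assumption the charting code would have to maintain by hand.
INVERSE_MEASURES = {"NQF 0059"}

def display_rate(measure_id, rate):
    """Return a rate where higher is always better for charting."""
    return 1.0 - rate if measure_id in INVERSE_MEASURES else rate
```

A provider with 30% of diabetic patients in poor control would then chart at 70%, directionally consistent with measures like NQF 0575.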

One simple (albeit crude) solution to this problem would be to require that all future Clinical Quality Measures endorsed by the National Quality Forum (NQF) associate "goodness" with meeting the numerator criteria.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. © Rob McCready, 2012.