Utilizing CMS Physician Professional Fee Data as Benchmarks with Alteryx

Each year, Medicare releases physician provider utilization and payment data that provides information on the services and procedures provided to Medicare beneficiaries in the United States. The data set includes a wealth of information about the physician, location of services, HCPCS codes to identify the type of services, utilization, average charges and average Medicare payment amounts. This rich data source has many usage possibilities for analytics and data visualization.

In this article, I’m going to demonstrate how to leverage the data to create benchmarks for reviewing physician billing practices with Alteryx Designer. Developing a benchmark is a way to compare similar services performed by physicians in the same specialty to review coding practices for payment. If your not familiar with Alteryx, you can learn more about the software on the Alteryx site.

My objective is to create benchmarks for clinical physician offices in the state of Pennsylvania. The benchmarks will allow a physician to compare their billing practice to their peers. I am focusing on the HCPCS code range of 99201 – 99205 New Office Visits, 99211-99215 Established Patients, 99221-99223 Initial Hospital Visits and 99231-99233 Subsequent Hospital Visits. These are the evaluation and management (EM) level visits codes that designate the acuity of the patient when seen in the physician clinic and/or seen during a hospital stay.

First step is to download the data directly from the CMS website.

Click on the API Docs
Click on the API Docs

In Alteryx, drag a Text Input tool on the canvas and create three columns labeled CMS, StoreFile, and CMS URL. For the column labeled CMS, enter the name the path name of the file that is being downloaded. For this demonstration the path is https://data.cms.gov/api/views/fs4p-t5eq/rows.csv for the CMS 2017 Professional Fee data set.

CMS Column of the Text Input Tool
CMS Column of the Text Input Tool

For the column labeled StoredFile, enter the location of where the downloaded file will be stored on your computer. Lastly, for the column labeled CMS URL, enter https://data.cms.gov/api/views/fs4p-t5eq/rows.csv?accessType=DOWNLOAD&api_foundry=true. The CMS Url column will be used to download the data and the Store File path will tell Alteryx where to save the file.

CMS URL Path of the Text Input Tool
CMS URL Path of the Text Input Tool

Next drag a Dynamic Download tool onto the canvas and connect it to the Text Input Tool. This tool will be use to download the data directly from the CMS Website. Configure the Basic tab with the URL field assigned as the CMS URL and the To a File filename from a field StoredFile.

Dynamic Tool Set Up
Dynamic Tool Set Up

Last step, click run. This will download the data from the CMS website to the location you specified in the StoredFile location.

Text Input and Dynamic Download Tools
Text Input and Dynamic Download Tools

Upon review of the data, it is determined the following fields are needed: state code of provider, provider type (specialty), hcpcs code, and line service count (number of services) to create a benchmark by specialty and type of service. All fields are V_string. Use a select tool to change line service count (number of services) to a double. Note the field names may change depending on when you download the CMS data.

This data set encompasses the entire United States, therefore the data must be filtered on the state of Pennsylvania to meet my objective. Additionally, the data set contains all HCPCS code utilization and must also be filtered for the HCPCS codes under evaluation.

HCPCS Codes and State Filter
Figure 1. Filter HCPCS Codes and State

Lastly, I would like to assign HCPCS code groupings and levels to use later in a Tableau dashboard. To accomplish this, I need to add a formula tool to the workflow with these new fields.

Formulas for level visits
Figure 2. Formula for EM Levels

When we run the workflow, the detailed data contains a row for each provider and specialty in the state of Pennsylvania with the HCPCS codes under evaluation. It is not summarize yet for the benchmark calculations by specialty and EM Level.

Browse of the Detailed Data Prior to Summation
Figure 3. Browse of the Detailed Data Prior to Summation

The next step is to summarize the data by EM Level, Provider Type, HCPCS Code, Level and Line Service Count. Place the summarize tool after the formula tool with these actions. Please note I renamed the output fields.

Configuration of the Summarize Tool
Figure 4. Configuration of the Summarize Tool

The results should look like Figure 5. The data is summarized by each physician specialty, EM level, and HCPCS Code.

Figure 5. Results of the Summarize Tool
Figure 5. Results of the Summarize Tool

We are almost ready to calculate the benchmark, however, first we must summarize the total by Specialty, EM Level and Line Service Count without the individual HCPCS codes. We will need this total to calculate the percentages. Place another summarize tool after the formula tool with these actions. Please note I renamed the output fields.

Configuration of summarize tool
Figure 6. Configuration of the Summarize Tool

At this point, the workflow set up is like Figure 7.

Figure 7 Workflow
Figure 7. Workflow

The results of second summation will provide us with the total line count by specialty and EM level. We can join this data with the first summation to complete the benchmark calculation.

Figure 8. Results of the Second Summarize Tool
Figure 8. Results of the Second Summarize Tool

Drag a join tool on the canvas and connect the first summarize tool to the left anchor and the second summarize tool to the right anchor. Join the two data sets by EM Level and Specialty. The workflow should look like this:

Figure 9. Workflow
Configuration of the Join
Figure 10. Configuration of the Join

Note in the join configuration, I unchecked the right specialty and EM level. On first run, I left them checked to ensure the data joined correctly. These fields are not needed in the end results. The results of the join should look like Figure 11. The field Spec EM Total is joined correctly to each specialty and EM Level. We will use this field as the denominator for the benchmark calculation.

Figure 11. Results of the Join Tool
Figure 11. Results of the Join Tool

Finally, we are ready to calculate the benchmark. This benchmark will calculate the percentage utilization by specialty for each EM Level. Drag a formula tool onto the canvas and place it after the join. Create a new field called Benchmark % setting the data type to double. Use this formula: [Line Service Count]/[Spec EM Total]*100

Drag a browse to the canvas and run the workflow. The results should be like Figure 12.

Figure 12. Results after adding the benchmark formula
Figure 12. Results after adding the benchmark formula

Now that we have the benchmarks for each specialty and EM Level, we can either merge this with our physician data set to use as a reference in our dashboards or share the information in another fashion.

Below are two examples of how the benchmarks can be utilized in Tableau for comparing a single physician to the benchmark physicians’ percentages. Both graph examples represent the percentage of Established Patient Visits. The blue bars denote a single physician’s percentage of units billed for each E&M Level Visit related to Established Patient Visits. Based on this distribution, it appears this single physician has less Level 3 visits (Blue Bar) than the benchmark physicians (Grey Bar).

Example 1 Comparing the Physician to Benchmarks
Figure 13. Example 1 Comparing the Physician to Benchmarks
Figure 14. Example 2 Comparing Physician to Benchmarks

It is important to note with any benchmarking project, that consideration be given for how the compare group is developed. A few things I like to ensure is that the sample size is good, I’m comparing apples to apples and not apples to oranges, and the demographics of the population; just to name a few. With this CMS data set, I think it is important to compare physicians that practice in the same specialty. Lastly, the acuity of the patients may factor in the results so weighted benchmarks may be needed. This really depends on your data analysis and needs.

Note: All the data used for this article is publicly available on the CMS Medicare Website. The data set used was Medicare Provider Utilization and Payment Data: Physician and Other Supplier PUF CY2017. No personal healthcare data was utilized to demonstrate this technique.

Leave a Reply