Discovering Issues with IPEDS Completions Data

The U.S. Department of Education’s Integrated Postsecondary Education Data System (IPEDS) is a vital resource in the field of higher education. While it is the foundation of much of my research, the data are self-reported by colleges and occasionally include errors or implausible values. A good illustration of these issues is a recent Wall Street Journal analysis of the finances of flagship public universities. When the Journal’s reporting team started asking questions, colleges often said that their IPEDS submissions were incorrect. That’s not good.

I received grants from Arnold Ventures over the summer to fund two new projects. One of them is examining the growth in master’s degree programs over time and the implications for students and taxpayers. (More on the other project sometime soon.) This led me to work with my sharp graduate research assistant Faith Barrett to dive into IPEDS program completions data.

As we worked to get the data ready for analysis, we noticed a surprisingly large number of master’s programs apparently being discontinued. Colleges can report zero graduates in a given year if a program still exists, so we assumed that programs with no data (instead of a reported zero) had been discontinued. But when we looked at the years immediately following an apparent discontinuation, graduates often reappeared. This suggests that a gap in the data between years with reported graduates usually reflects a reporting error (either failing to enter a positive number of graduates, or failing to report zero graduates for an active program) rather than a true discontinuation. This is not great news for IPEDS data quality.
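To make that classification concrete, here is a minimal sketch in pandas (with hypothetical column names and toy data, not our actual pipeline) of how a gap between reported years can be flagged as a likely false discontinuation, while a trailing run of missing data is treated as a likely true one:

```python
import numpy as np
import pandas as pd

# Toy long-format panel: one row per program-year. NaN means the college
# reported no data for that program in that year (as opposed to a zero).
panel = pd.DataFrame({
    "program_id": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "year":       [2013, 2014, 2015, 2016, 2013, 2014, 2015, 2016],
    "graduates":  [25, 24, np.nan, 30, 10, 12, np.nan, np.nan],
}).sort_values(["program_id", "year"])

grads = panel.groupby("program_id")["graduates"]
missing = panel["graduates"].isna()
reported_earlier = grads.ffill().notna()  # graduates reported in some prior year
reported_later = grads.bfill().notna()    # graduates reported in some later year

# Missing data bracketed by reported graduates: likely a reporting error.
panel["likely_false_disc"] = missing & reported_earlier & reported_later
# Missing data with no graduates ever reported afterward: likely a true
# discontinuation (every missing year in the trailing run is flagged here).
panel["likely_true_disc"] = missing & reported_earlier & ~reported_later
```

In this toy data, program A’s 2015 gap is flagged as a likely false discontinuation, while program B’s trailing missing years are flagged as a likely true one.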

We then took this a step further by attempting to find evidence that programs that seem to disappear and reappear actually still exist. We used the Wayback Machine (https://archive.org/web/) to look at institutional websites by year to see whether the apparently discontinued program appeared to be active in years without graduates. We found consistent evidence from websites that programs continued to exist during their hiatus in IPEDS data. To provide an example, the Mental and Social Health Services and Allied Professions master’s program at Rollins College did not report data for 2015 after reporting 25 graduates in 2013 and 24 graduates in 2014. They then reported 30 graduates in 2016, 26 graduates in 2017, 27 graduates in 2018, 26 graduates in 2019, and 22 graduates in 2020. Additionally, they had active program websites throughout the period, providing more evidence of a data error.

The table below shows the number of master’s programs (defined at the 4-digit Classification of Instructional Programs level) for each year between 2005 and 2020 after we dropped all programs that never reported any graduates during this period. The “likely true discontinuations” column consists of programs that never reported any graduates to IPEDS following a year of missing data. The “likely false discontinuations” column consists of programs that reported graduates to IPEDS in subsequent years, meaning that most of these are likely institutional reporting errors. These likely false discontinuations made up 31% of all discontinuations during the period, suggesting that data quality is not a trivial issue.

Number of active programs and discontinuations by year, 2005-2020.

Year    Number of programs    Likely true discontinuations    Likely false discontinuations
2005    20,679                195                             347
2006    21,167                213                             568
2007    21,326                567                             445
2008    21,852                436                             257
2009    22,214                861                             352
2010    22,449                716                             357
2011    22,816                634                             288
2012    23,640                302                             121
2013    24,148                368                             102
2014    24,766                311                              89
2015    25,170                410                              97
2016    25,808                361                              66
2017    26,335                344                              35
2018    26,804                384                              41
2019    27,572                581                             213
2020    27,883                 74                             223

For the purposes of our analyses, we will recode years of missing data for these likely false discontinuations to have zero graduates. This likely understates the number of graduates for some of these programs, but this conservative approach at least fixes issues with programs disappearing and reappearing when they should not be. Stay tuned for more fun findings from this project!
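That recode can be sketched in pandas as follows (again with hypothetical names and toy data mirroring the Rollins example; this is an illustration, not our actual code):

```python
import numpy as np
import pandas as pd

# Toy panel mirroring the Rollins example: a missing year (2015) sits
# between years with reported graduates, a likely false discontinuation.
panel = pd.DataFrame({
    "program_id": ["A"] * 8,
    "year": [2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020],
    "graduates": [25, 24, np.nan, 30, 26, 27, 26, 22],
}).sort_values(["program_id", "year"])

grads = panel.groupby("program_id")["graduates"]
# Recode missing years bracketed by reported graduates to zero graduates;
# trailing runs of missing data (likely true discontinuations) stay missing.
false_gap = panel["graduates"].isna() & grads.ffill().notna() & grads.bfill().notna()
panel.loc[false_gap, "graduates"] = 0
```

After the recode, the 2015 gap becomes a zero rather than a missing value, so the program no longer appears to close and reopen.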

There are two broader takeaways from this post. First, researchers relying on program-level completions data should carefully check for likely data errors such as the ones that we found and figure out how best to address them in their own analyses. Second, this is yet another reminder that IPEDS data are not audited for quality and contain quite a few errors. As IPEDS data continue to be used to make decisions for practice and policy, it is essential to improve the quality of the data.

Author: Robert

I am a professor at the University of Tennessee, Knoxville who studies higher education finance, accountability policies and practices, and student financial aid. All opinions expressed here are my own.

9 thoughts on “Discovering Issues with IPEDS Completions Data”

  1. This is some great analysis. One complicating factor is that institutions can reclassify programs from year to year. For the Rollins College example, they reported 16 conferrals under the Counseling Psychology 6-digit CIP, the first time they reported to that one. That is likely the ‘missing’ conferrals/program.

  2. You may have controlled for this, but when I used to work a lot in IPEDS completions data I found it was relatively common for an institution to switch codes for a year or two and then revert.

    This created the appearance of two programs opening/closing when in fact it was one program using two codes in alternating years.

    1. Yeah, it’s not a perfect system. We rolled up to the 4-digit CIP code to help reduce the likelihood of this happening, but that wouldn’t catch everything.

  3. I wonder to what extent this is due to folks hand-entering data making mistakes / taking shortcuts, or folks using an automated approach that doesn’t know to fill in zeros (and who don’t know to circle back and fill them in). I guess I’m just saying the process bears looking into.

  4. I was going to mention the same thing others have about programs being reported under different CIP codes. I recently helped a psychology professor with a question about IPEDS data because he experienced something similar trying to do research on existing programs. He was questioning why certain institutions did not report graduates when he knew that they had active programs. I dug in and discovered the conferrals had been reported but under new CIP codes. It is interesting that CIP codes could change so much year-to-year. That really shouldn’t be the case! At my last institution, program CIP codes were hardwired into the SIS and changing them required various levels of approvals.

  5. Do you think that the assumption of a program no longer existing due to a null value is a greater leap than assuming a null value equates to a zero? Due to the nature of IPEDS data collection, when a program doesn’t have a graduate count for a given collection year, the person entering the data could just leave it blank. That doesn’t necessarily mean the program doesn’t exist. It just means there were no completions in that year.

    More broadly, I do think errors in the IPEDS data are highly likely. It’s the nature of things when you rely on manual human entry for data collection. It’s also likely that some institutions misinterpret the questions being asked. These issues are of course not isolated to IPEDS. The same applies to any data collection that is manual, and even to a healthy share of those that are automated.

    1. Based on our look through the data and institutional websites, we’re pretty confident that a missing value in between years of non-missing values is a zero. We have made that assumption in our analysis. But it’s trickier when a value becomes missing and stays there. We assume a discontinuation, but also recognize that some of those are due to changing CIP codes.

      This is all really tricky stuff. Thanks for the thoughtful comments!

  6. Your data covers three CIP versions, so that could be another twist. Cross-mapping those version changes might eliminate a few of your mysteries, too. Thanks for the work.
