

The ability to collect, organize, interrogate and make sense of data is an essential job skill in the 21st-century marketplace. A Harvard Business Review Analytic Services report notes that nearly 90 percent of organizations say success depends on data-driven decisions made by front-line employees. To ensure students can succeed in this market, colleges and universities of all types and sizes must move rapidly to build students’ data competencies.

Yet higher ed institutions are far from ready to help students level up their data skills. Too few faculty members, instructional staff and librarians have the necessary knowledge and skills to teach students these competencies. Even those who are ready too often face barriers to accessing the tools and materials required to fully integrate data literacy into the curriculum.

At an institutional level, it is difficult to procure access to large-scale data sets, which often raise concerns about intellectual property rights and privacy and come with varied data-extraction and delivery options. All these efforts (learning, training, supporting instructors, and creating and updating curricula) also require time.

Some institutions are addressing these challenges at the local level to help students build data skills. Universities are creating data majors within existing departments, while others, such as the University of Virginia, the University of North Carolina at Charlotte and the University of Texas at San Antonio, have established entire schools of data science. This approach accelerates progress because it adds new courses and majors rather than retooling entrenched curricula.

Outside of creating a school or department, large institutions are setting up specialized institutes focused on teaching data literacy and working with research data, such as UNC at Chapel Hill’s Research Hub. At smaller institutions, existing staff are reskilling in areas such as Python, R, SQL and natural language processing so they can teach these skills as part of their courses. Still other colleges and universities are developing smaller departments and programs within their libraries, with data literacy acting as a natural extension of the information literacy courses traditionally taught by librarians.
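
To give a concrete sense of the kind of skill this reskilling covers, here is a minimal, purely illustrative Python sketch of an introductory text-analysis exercise: counting the most common words in a short passage using only the standard library. The sample passage is invented for illustration and is not drawn from any particular course, curriculum or Constellate material.

```python
# Illustrative only: a minimal word-frequency exercise of the kind an
# introductory text-analysis lesson might use. The sample passage below
# is invented; a real course would load a rights-cleared corpus instead.
import re
from collections import Counter

sample_text = """
Data literacy means being able to collect, organize, interrogate
and make sense of data. Data skills are now expected in many jobs.
"""

# Lowercase the text and split it into word tokens.
tokens = re.findall(r"[a-z']+", sample_text.lower())

# Count how often each token appears and show the five most common.
frequencies = Counter(tokens)
for word, count in frequencies.most_common(5):
    print(f"{word}: {count}")
```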

This activity, while important, is uneven. It risks leaving behind students at schools with fewer resources and missing opportunities to share services, such as the aggregation of rights-cleared data sets, that could expedite progress.

Solutions exist that address these gaps. My organization, the nonprofit Ithaka, has developed Constellate, a free platform that helps faculty members teach text analysis and data skills effectively by integrating vast repositories of scholarly content, including content from its peer services JSTOR and Portico, along with open educational resources, into a cloud-based lab.

This year, Ithaka also is continuing the Text Analysis Pedagogy (TAP) Institute, a program originally funded by the National Endowment for the Humanities, to train postsecondary educators to teach data and text analysis. The institute, open to faculty, librarians, staff and graduate students interested in teaching text analytics, trains participants in a monthlong series of courses focused on data analytics, data visualization and machine learning. The goal is to train the trainers, empowering college educators to help students acquire marketable data skills.

Overwhelming interest in the TAP Institute shows the need for such skill development. In its first two years, the institute trained more than 400 higher education faculty members, librarians, research staff and graduate students in humanities-focused text analysis. Before the 2022 institute, 79 percent of participants reported they were not ready to teach text analysis; after the institute, that number fell to 27 percent. Similar interest in Constellate is coming from institutions of all sizes and resource levels in the United States and abroad.

Constellate is one of a range of needed solutions being developed and tested to address this skill gap. The Carpentries and the Digital Humanities Research Institute both help educators and researchers build the technical skills that support teaching and research. There have also been efforts to provide rights-cleared access to big data sets with support for text and data mining, such as those offered by the recently retired CADRE Project.

Scaling affordable ways for schools to teach and for students to learn data competencies is essential. This type of educational training must be widespread to produce tomorrow’s workers and promote students’ success; it cannot be a privilege for students at the largest or most affluent institutions. And it should be part of the curriculum for all students, not just the data scientists. The future of data-driven careers is here, and higher education needs to be ready to help students succeed in this world.

Nathan Kelber is the educational manager for Constellate and director of the Text Analysis Pedagogy Institute. Before helping start Constellate, Kelber taught data literacy and digital scholarship at Wayne State University and the University of North Carolina at Chapel Hill. His work focuses on data literacy, social justice and open educational resources for the digital humanities and data science.
