The National Weather Service channels large quantities of satellite data into predictive analytic models to improve weather forecasting of storms such as last fall's Superstorm Sandy, which caused damage still evident in Ortley Beach, N.J. (Mark Wilson/Getty Images)
From weather predictions to chasing fraudsters, agencies and their partners are placing far more focus on collecting and crunching massive volumes of data to improve operations and performance.
With the launch of the Obama administration’s open data initiative two months ago, those efforts could accelerate.
In the May executive order, the White House decreed that open and computer-readable data would become the norm for government information. Agencies have six months to report on their progress toward implementation.
“We are hoping that will push federal agencies to put those large data sets up online for all to use,” said David Logsdon, a senior director at TechAmerica, an advocacy group for the communications and technology industry.
If implemented correctly, Logsdon said, the new policy could enhance the combination of large data sets across government “to address efficiencies, cost savings and innovation.”
But there are hurdles as well, including long-standing bureaucratic barriers to information-sharing and a shortage of technical expertise. One mid-to-large agency, for example, has only two data scientists on staff to develop its strategic plan for carrying out the executive order, Logsdon said. He declined to name the agency.
On the most basic level, agencies’ increasing reliance on digitized information — and ever cheaper storage — allows them to make massive amounts of information publicly available online. NASA’s Johnson Space Center in Houston, for example, has amassed one of the world’s largest imagery archives, encompassing more than 4 million still images, 9.5 million feet of motion picture film and more than 85,000 videotapes, according to a TechAmerica report last fall on how agencies can make better use of big data. That collection, along with the imagery systems needed to process it, is housed in eight buildings at the space center and is “growing exponentially,” the report said.
TechAmerica also praised the National Archives and Records Administration for its approach to managing mushrooming quantities of electronically stored documents and other “unstructured” data. As of last week, those holdings stood at 420 terabytes, or triple the amount just 18 months ago, according to an archives spokeswoman and numbers cited in the report. Each terabyte is equal to 1,000 gigabytes. In confronting that challenge, the agency has combined traditional records digitization and storage “with advanced big data capabilities for search, retrieval and presentation,” the report said.
The essence of information’s value, however, rests in how it is used.
Here, agencies and their partners are proceeding on several fronts, pushed by budgetary pressures, mission requirements and a 2011 law intended to drive more effective use of taxpayer money. The law, known as the Government Performance and Results Modernization Act, requires agencies to measure their progress toward selected performance goals with the aid of regular “data-driven” reviews.
At the Energy Department, Treasury Department and Small Business Administration, the reviews are yielding results, according to agency officials cited in a recent Government Accountability Office assessment.
The deputy secretary at Treasury, for example, said it was one such “stat session” that eventually led to a decision to stop minting $1 coins. Performance data revealed that the U.S. Mint was churning out 400 million new coins annually, even though the Federal Reserve already had 1.4 billion in storage and was mulling a costly expansion to house even more, according to GAO.
For DOE officials, data-driven reviews proved helpful in adjusting their strategy to meet goals for energy-saving building weatherization. Instead of focusing on single-family homes, they created a program to leverage resources from businesses and nonprofits to retrofit larger buildings, GAO said.
But GAO noted that officials risked undercutting one goal of the act — encouraging more agency collaboration — because of a reluctance to allow “outsiders” into the reviews.
Elsewhere, agencies are proceeding as missions dictate.
The National Weather Service channels large quantities of satellite data into predictive analytic models to improve weather forecasting, Logsdon said.
Although definitions vary, predictive analytics roughly means scouring large amounts of data to predict future trends. The Federal Aviation Administration, for example, melds millions of records from various safety databases in hopes of detecting vulnerabilities that could lead to aircraft accidents.
The technique has also been applied as a fraud prevention tool. Following passage of the $819 billion economic stimulus bill in 2009, the Recovery Accountability and Transparency Board won plaudits for using analytics to keep grants and contracts away from companies and other entities with dubious track records.
Two years ago, President Obama sought to duplicate that success on a larger scale with the creation of the Government Accountability and Transparency Board, which is charged with finding ways to root out fraud and waste governmentwide. That has proved more challenging, in part because of the task’s scale and because federal law limits agencies’ ability to share information they hold on individuals. Getting the go-ahead to “match” data “takes years to do,” Patrick O’Carroll, the Social Security Administration’s inspector general, told senators in May.
Last year, the Association of Government Accountants offered a mixed report card on agencies’ use of analytics. The Agriculture Department reported that it had cut the rate of food stamp trafficking by more than half, while Medicare administrators had developed a screening system that showed promise for reducing fraud. But of the eight agencies surveyed by AGA, only one was using analytics to measure performance, the report said.