Federal law enforcement agencies face challenges in working with large volumes of data from multiple databases, as in the Boston Marathon bombing case. (John Moore/Getty Images)
Agencies rely on vast sources of data to fight tax fraud, improve health care, manage federal buildings and improve delivery of citizen services. But obtaining and making sense of “big data” is a challenge, information technology managers say in a survey released last week.
Among the challenges:
■ Poor data quality.
■ Identifying a common, authoritative source of data.
■ Consolidating data from disparate systems into one place.
■ Creating policies and processes for accessing and defining data.
Chief information officers from more than a dozen agencies, including the Defense Department, General Services Administration and the Office of Management and Budget, participated in the annual survey by the trade organization TechAmerica and consulting firm Grant Thornton.
Seventy-eight percent rated their experience in using data analytics as three or less on a scale of one to five, with five being the most mature.
This hasn’t stopped agencies from testing big data tools and techniques to improve internal processes such as managing assets and reducing improper payments. But those efforts are not always coordinated across an agency, some CIOs noted. They suggested creating a Business Intelligence Center of Excellence to better coordinate data analytics projects and reduce duplicate investments.
CIOs also expressed concerns about the quality of data.
“Data quality matters a lot,” said David McQueeney, a vice president with IBM. “If you’re making a significant decision, you have to be cognizant of which [data] sources [you] use” and whether they are reliable for decision-making.
Agencies like the Department of Homeland Security and the intelligence community rely on large volumes of complex and varying data to make decisions that could have national security implications. While some big data tools can generate statistical information about the reliability of data sets, a system can never be a substitute for human judgment, McQueeney said. Agencies have to weigh their confidence in the quality of data against the consequences of their decisions based on that data.
Agencies’ data sets should include information such as how data were measured, what the data mean and how accurate they are, McQueeney said. This type of information becomes even more important when multiple data sets are analyzed to identify a trend.
The General Services Administration, for example, relies on real-time analysis of data sets from multiple building systems to improve energy efficiency and building management. GSA is working with IBM to develop a system that monitors building performance and streams data to a central facility, where it is analyzed and used to regulate building performance.
These building systems will eventually provide millions of data elements on lighting, electricity use, and heating and cooling, data that building managers didn’t always have and that will provide a complete picture of facility operations, said Larry Melton, a former assistant commissioner at GSA. That information will be pushed to a sustainability support center, where managers can view digital displays in real time and act on that data, said Melton, now president and CEO of consulting firm The Building People.
Another challenge for some agencies is identifying the problem they hope to solve by analyzing data. “I find that these things [projects] fail if you don’t know why you’re doing it,” said Mark Herman, an executive vice president at Booz Allen Hamilton. “Whatever you get out of it will be above or below expectations but unlikely to meet your needs.
“There has to be a strategy that says, ‘Here’s what we do for a living, [and] what is it that would be improved,’ ” Herman said, adding that agencies should ask how they could deliver that service for less money and what mission success would look like.
Answering these questions requires agencies to know what data they own and what additional data they can access to analyze trends, he said. Internally, Booz Allen ingests and stores varying types of data so they are readily available for analytics. At GSA, building management systems are being connected to a central cloud-based solution where data are stored and managed for trend analysis. ■