The amount of data produced around the world will soon reach unprecedented levels, and the federal government will have to establish data hygiene and categorization practices now to be able to handle that explosion of data, according to federal CIO Suzette Kent.
“We have to have a comprehensive hygiene and we have to have agreed-upon common-use protocols. Because what we’re building, it’s not just good discipline, it’s not just the right thing to do, it is the way that we create the power for future analytics, for machine learning, for deep learning, for artificial intelligence and to actually leverage the volumes of data that we’ll have in the future,” said Kent at a June 6, 2018, Data Coalition event.
According to Kent, who recently spoke with technology teams across Silicon Valley about the proliferation of data, by 2025 there will be 163 zettabytes of data produced per year. For context, at 4.7 gigabytes per single-layer DVD, that amount of data would fill roughly 35 trillion discs. The world currently produces only about a tenth of that amount.
Already, the government has begun to face situations that require a whole new perspective on data processing. According to Kent, after the Las Vegas mass shooting in October 2017, the FBI received many terabytes of video data from bystanders at the event.
A single terabyte of data is roughly equivalent to 400 videos, each 90 minutes long. Reviewing that much footage would have taken humans far too long, but, according to Kent, the FBI had recently transitioned to a cloud-based system that could process the data by machine in a relatively short amount of time.
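The per-terabyte comparison can be checked with some quick arithmetic. The sketch below is only a back-of-the-envelope illustration: it assumes a decimal terabyte (10^12 bytes), and the function name `implied_video_profile` is hypothetical. It works out what the 400-video figure implies for file size, average bitrate and total viewing time.

```python
# Back-of-the-envelope check of the "400 videos per terabyte" comparison.
TERABYTE_BYTES = 10**12   # assumption: decimal terabyte
VIDEO_COUNT = 400         # figure from the article
VIDEO_MINUTES = 90        # figure from the article


def implied_video_profile(total_bytes, count, minutes):
    """Return (bytes per video, average Mbit/s, total viewing hours)."""
    bytes_per_video = total_bytes / count
    seconds_per_video = minutes * 60
    mbit_per_second = bytes_per_video * 8 / seconds_per_video / 1e6
    viewing_hours = count * minutes / 60
    return bytes_per_video, mbit_per_second, viewing_hours


size, bitrate, hours = implied_video_profile(
    TERABYTE_BYTES, VIDEO_COUNT, VIDEO_MINUTES
)
print(f"{size / 1e9:.1f} GB per video, about {bitrate:.1f} Mbit/s")
print(f"{hours:.0f} hours of viewing per terabyte")
```

Under those assumptions, each video comes to about 2.5 gigabytes, and one terabyte represents 600 hours of footage, or 25 days of nonstop viewing for a single person.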
Law enforcement is not the only area of government that could see massive returns from the ability to process large amounts of organized data.
Under the data-specific cross-agency priority goal of the President’s Management Agenda, agency leadership will soon be looking at ways to optimize and leverage data in health, homeland security, state and local governments, and financial management, according to Kent.
“We’re looking at very specific problems that agencies or the public are trying to solve right now and using that as a starting point to take a small piece of the data and apply very specific hygiene standards, sharing standards and our agency agreements if we have to do that,” said Kent. “Those areas we feel fairly confident will help us prove those guiding principles and give us outcomes.”
Kent explained that government-held data is some of the most comprehensive and valuable in the world, with many private sector companies already established just to take advantage of it. The cleaner and more organized that data is, the more value it can provide.
In addition to providing monetary value, Kent said that data hygiene also complements the growing need for stronger cybersecurity, as organization and security of data go hand in hand.
“We are purposefully involving a much wider net of individuals across agencies, across our academic partners and private sector,” said Kent. “The team that is working on the data CAP goal wants to listen. We want your input, we want your feedback, but most importantly we want your participation.”