Looking for the needle in the big data haystack

Jun. 2, 2014 - 04:12PM
By KEVIN G. COLEMAN
Kevin Coleman is a senior fellow at SilverRhino and former chief strategist at Netscape.

When you consider all the computers, smartphones, tablets and other devices connected to the Internet, you begin to get a feel for the size of the data set from which cyber intelligence is gathered. This subject has drawn a significant amount of interest ever since the Edward Snowden leaks, and that interest is sure to continue for the foreseeable future.

One unclassified report estimates that over 100 million cyber intelligence sources are used in order to gather the cyber threat intelligence needed to protect and defend our systems and infrastructures from hacktivists, terrorists, cyber criminals and rogue nation states.

In April 2014, the Intelligence and National Security Alliance (INSA) published a whitepaper titled “Strategic Cyber Intelligence.” If you read the report, you get a glimpse of some of the many sources of publicly available intelligence and the importance of cyber intelligence to our national security. Cyber intelligence is a rapidly expanding field of interest. Just about everyone now recognizes that cyber intelligence is the fundamental ingredient needed to move to a much more proactive approach to cyber security. Knowing about a digital threat or attack beforehand allows security personnel to take specific defensive actions to thwart such an attack.

What hasn’t drawn nearly as much attention is all the cyber intelligence that is collected and sold for security and defense use as well as for commercial business applications. When you pull all the information together from public sources and begin to think about all the information available on the classified side, you begin to get an idea of just how huge the data set actually is. One estimate places the rate of digital data growth (classified data not included) at nearly 1.2 zettabytes per year. A zettabyte is 10 to the 21st power, or 1,000,000,000,000,000,000,000, bytes. This goes well beyond big data to huge data. Recently the Huffington Post ran a piece titled “Suffering From Information Overload?” In that piece, Scott Anderson writes, “We are as a business community toiling in a world of information overload,” and most would have to agree with that statement.

Now consider all the new and emerging data sources from the Internet of Things, Wearable Computing, Smart Cities and the movement to make vehicles a computing and communications platform, and the size of the data set grows substantially. If information overload existed before all those initiatives, what will it be like after their data is integrated? It will surely tax, if not overwhelm, our current data analysis tools and techniques, and it will fall outside the boundaries of cost-effective data storage.

During a discussion on this topic, one individual wondered if we will begin to miss important patterns and clues, or critical pieces of data that are highly dispersed among such a large data set. One person called this “data overload on steroids.” The final point of discussion was whether there would be as much outrage about the private sector collecting all this information for commercial use. The answer was a nearly unanimous, big fat “no way!”

What does it say about our mindset when we don’t mind a business collecting significant amounts of information (our location to within a few feet, time spent there, what is at that location, how often we visit that location, and on and on) so it can market to us, while we are outraged about the collection of a lesser amount of data at a higher level in efforts to protect our homeland and improve our security?

Somehow the intelligence community, and the government as a whole, must earn back the trust of the American people. Given the current indicators, it is highly likely that new regulations driven by public demand will restrict collection of data that could be used to defend against cyber attacks that could cripple our economy or parts of our critical infrastructure. What is clear is that a balance needs to be struck: privacy on one side and national security on the other. If history is any indicator, that will not happen, at least not anytime soon.
