Pamela Dyson is in the midst of a heavy lift.

As chief information officer for the Securities and Exchange Commission, Dyson has been tasked with modernizing the information technology of one of the federal government's biggest producers of financial data while shrinking the footprint of the agency's data centers.

Dyson sat down to talk with Senior Staff Reporter Carten Cordell about the SEC's move to the cloud, updates to EDGAR and cybersecurity.

The SEC, much like the rest of the federal government, is in the middle of a data transformation. Can you talk about how the SEC is ushering in this transformation and how it's benefiting the agency?
So one of my goals as the CIO is to build the data platforms that will support Big Data and data analytics. We also, of course, have the important mission of protecting and securing the integrity of the data that comes into the commission. One of the things the agency undertook was a [Business Process Reengineering] effort across the enterprise to identify workflows and processes that could be automated. So I think that's step one: identifying the work processes that are candidates for digital transformation. I think the second component is ensuring that the infrastructure is in place to ingest the data, and that tools are in place that will allow the agency to get the Big Data and data analysis insights that will assist our examination and investigative programs. So it's twofold: one is the actual BPR effort to identify and automate the areas that can be automated, and the other is building out the infrastructure to support it.

Another aspect of that is data consolidation. How do you try to deal with the scalability of reducing your data center footprint at the same time as you are providing more analysis?

So we had two large data centers. We've been able to bring our footprint down from somewhere over 30,000 square feet to about 4,000 square feet. We've done that in a number of ways, primarily through server virtualization. We've been able to eliminate a number of physical servers sitting in data centers, virtualize them and consolidate. In addition to saving space, we've benefited from reduced power consumption and other maintenance costs as well. So we have virtualized our data centers; I think we are about 97 percent virtualized. We've also modernized our storage technology. Our backup and retrieval technology used to be tape-based and server-based; that now all runs on our networks. So we've been able not only to consolidate, but also to realize cost savings across the overall infrastructure.

Now in terms of cybersecurity, one of the issues you have looked at is incorporating third-party systems. Can you talk about the control the SEC has over those systems, the collaboration efforts and how you address cyber threats with them?

There are a couple of things we do. First and foremost, we perform our own independent security assessments. So even if it's a third-party system or an externally hosted system that we are considering, we will look at the security documentation from those organizations and perform our own independent assessment. The other is continuous monitoring: we monitor our network, and even if we have an externally hosted system, we monitor our connections to those systems, our direct connects to those systems.

But in addition to that, we're able to leverage the work of other government agencies, such as FedRAMP certifications. We work with both private and public sector entities, such as Homeland Security and the federal services organizations, that allow us to ingest the data they collect and do assessments on that data as well. So it's really a layered approach, but a lot of work has already been done in this area that we can leverage.

We do full assessments, of course, of any systems that we build internally, but we also access and evaluate the information security documentation for any third-party or hosted system as well.

In talking about the use of structured data, what advice would you provide other federal CIOs that are trying to make this Big Data adjustment to improve or evolve their use of data?

One, I would always say that if you don't have the systems and infrastructure in place to analyze and mine the data, it's probably not a good idea to ingest it. So there are a couple of things you need to do. First, I'll go back again to security: you need to make sure that you have security controls in place as well as secure systems to host the data. Second, you need to build the infrastructure and platforms for storing that data and running analytical tools against it. For the SEC, during my tenure here, we spent the first couple of years just building the infrastructure required to do Big Data management and analytics. So we have a number of enterprise data platforms. We have an enterprise data warehouse. We have a number of data repository systems that multiple applications can leverage, and we manage those in a secure manner so that any data we ingest is stored for rapid retrieval, and even data in transit is secure.

One of the things, as you know, is we're looking into the cloud and how we can move some of those processes into more agile and scalable environments. We continue to put the same requirements in place for the cloud that we do for our own infrastructure: we're building a cloud framework that is secure, but that also allows the scalability and agility we need to manage data, get Big Data insights and deliver fast turnarounds to support our analytical programs.

You mentioned the cloud. You've moved a number of operations onto the cloud and made a lot of public information more accessible. Can you talk about what that move has been like?

We continue to take a very pragmatic approach to the cloud. Two years ago, the discussion was all around security: is data as secure in the cloud as it is on prem? I think we've overcome that across the federal government. FedRAMP has certainly helped put us in a place where we understand that the data is secure. For us it's more about the business use of the data and the business use of the cloud. Can we use the same tools and get the same performance from a cloud instance that we can on prem?

For a lot of third-party or hosted systems, it's less of a choice for us than with our legacy applications. We've made a lot of investment in building many of our mission-critical applications on prem, so the questions become what it would take to forklift those to the cloud, and whether it makes sense to do that for systems so critical to our operations that we use them on a daily basis. Will the retrieval of information, the ability to access information, yield the same performance in the cloud as it does on prem?

So those are some of the questions and categorization efforts that we would go through as a part of the cloud strategy and the cloud governance that we're building. That's one piece of advice that I would lend to other federal agencies: you have to make sure that the governance and strategy piece is well thought out in collaboration with the business, to ensure that as you move instances to the cloud, you know it will not have an adverse impact on business operations.

In talking about Electronic Data Gathering, Analysis and Retrieval, or EDGAR, you mentioned before that you want to move it in line with the XBRL [Extensible Business Reporting Language]. Can you talk about what those updates have been like?

So we are moving in that direction, toward XBRL standards. We also continue to enhance our existing EDGAR applications. There is a modernization effort underway for EDGAR, but in the meantime we continue to enhance security, enhance accessibility, and integrate XBRL into the current EDGAR system as well.

I think the biggest push around the new EDGAR will be to make it a platform that is more user-friendly to stakeholders, to the public and to our internal stakeholders as well, and to move it toward the kind of financial disclosure digital transformation we're pursuing with a number of other platforms. We want it to be mobile, we want it to be agile and, again, we want to be able to bring all of the innovation around our other data platforms and data collection platforms at the commission to EDGAR as well.

So we have a requirements-gathering effort underway; it's a huge effort. We're sitting down with our internal and external stakeholders, talking about what the new, modernized EDGAR should look like. But in the meanwhile, we continue to enhance the current EDGAR to make sure we're incorporating things like XBRL and advanced security into the system as we move forward.
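For readers unfamiliar with the standard, XBRL tags each financial fact as machine-readable XML, attaching a reporting period and a unit of measure to every number a filer reports. The short Python sketch below builds a deliberately simplified fragment; the element and attribute names are illustrative stand-ins, not the actual XBRL taxonomy used in EDGAR filings.

```python
import xml.etree.ElementTree as ET

# Simplified sketch of an XBRL-style tagged fact. "Revenues",
# "FY2016" and "USD" are illustrative placeholders, not real
# taxonomy references.
root = ET.Element("xbrl")
revenue = ET.SubElement(root, "Revenues", {
    "contextRef": "FY2016",  # ties the fact to a reporting period
    "unitRef": "USD",        # ties the fact to a unit of measure
    "decimals": "0",
})
revenue.text = "1000000"

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Because every fact carries its own context and unit, software can pull a figure like annual revenue out of thousands of filings without parsing free-form documents, which is the machine-readability the commission is after.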

So what would you say the top priorities for SEC analytics are, and what does future modeling look like?

I think the top priority for analytics is, again, we ingest a lot of data; we're a data-centric organization. The top priority is working with the quantitative analysts and analysts in other areas to build out innovative computing environments. A lot of what is being done today on our standard desktop systems and so forth will not yield the results necessary to turn around these analytics in the timeframes of an investigation or an examination. So one of the things we're doing is partnering with the divisions to sit down and develop the requirements for innovation around these computing environments, to give us the scalability that's necessary so that we can scale up for large investigations, get the storage and compute necessary in short timeframes and provision those environments pretty much on the fly, enabling the divisions to meet that mission. Of course, that is going to take us to the cloud in terms of scalability. So while we've virtualized our data centers (certainly we've compressed the footprint, and the speed of our data centers and their network capability has been enhanced), we still don't have the scalability and quick provisioning that the cloud can provide for these environments. So our next step, what's on the critical path for these Big Data systems and data analytics, is the computing environments required to support them.
