
How CMS managed the repair project

IT response to insurance site problems ongoing

Nov. 18, 2013 - 02:09PM
By ANDY MEDICI
Todd Park, the federal chief technology officer, is leading the effort to repair (staff)

On Oct. 2, when it was evident the website was unresponsive, top officials at the Centers for Medicare and Medicaid Services (CMS) and their contractors huddled to decide how to respond. Their plan: divide project members into four teams and tackle the highest-priority of the hundreds of problems identified.

The biggest priorities were insufficient server capacity, debilitating load times and poorly written code that was prone to error messages.

Leading the effort were Todd Park, the federal chief technology officer, who helped coordinate work among the teams and resorted to sleeping on the floor of his office that first week, and Henry Chao, CMS’ deputy CIO. Contractor QSSI created a dashboard to track progress on each problem and to manage employee and contractor staff, according to agency documents.

The project’s leaders decided to focus on improving two key metrics: the error rate (how often the system failed to perform as it should have), which stood at 6 percent, and the response time (how long it took on average for a user to load a page), which was an incredibly long eight seconds.
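
For readers who want to see how two metrics like these could be derived from raw request logs, here is a minimal Python sketch. The record layout, field names and sample values are hypothetical and are not drawn from CMS data or from the contractors' dashboard.

```python
from dataclasses import dataclass


@dataclass
class Request:
    succeeded: bool  # hypothetical flag: did the system perform as it should?
    seconds: float   # hypothetical page-load time, in seconds


def error_rate(requests):
    """Share of requests that failed, as a fraction between 0 and 1."""
    failures = sum(1 for r in requests if not r.succeeded)
    return failures / len(requests)


def average_response_time(requests):
    """Mean page-load time in seconds across all requests."""
    return sum(r.seconds for r in requests) / len(requests)


# Made-up sample: 100 requests, the first 6 fail, every page takes 8 seconds.
sample = [Request(succeeded=(i >= 6), seconds=8.0) for i in range(100)]

print(f"error rate: {error_rate(sample):.0%}")
print(f"average response time: {average_response_time(sample):.1f} s")
```

With this invented sample, the two functions reproduce the starting figures cited above: a 6 percent error rate and an eight-second average load time.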

One of the first steps CMS officials took was to replace a series of virtual servers with physical servers in order to bring down the response time, according to the agency. Combined with some additional software fixes, that change brought the response time to less than one second.

The agency also installed two large-scale data storage units to help ensure stability within the system by balancing the volume and data across servers, said Julie Bataille, CMS spokeswoman. The response teams also made good progress identifying and fixing code problems.

“In addition to these recent fixes, our ongoing monitoring shows that the system is stable with users moving more quickly through it with fewer errors,” she said.

CMS also continued to add servers, wrote and installed new lines of code, crafted shorter and simpler database queries and continued to test overall system performance, which helped lower response times.

The agency gathered highly detailed data on how quickly web pages loaded for the millions of different visitors and which parts of the application process took the most time. That helped the agency prioritize what it should fix next.
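
One way such timing data could be turned into a fix-it list is to average the load time for each step of the application and rank the steps from slowest to fastest. The Python sketch below illustrates the idea only; the step names and timings are invented, and the article does not describe the agency's actual method.

```python
from collections import defaultdict


def slowest_steps(timings, top=3):
    """Return (step, average seconds) pairs, slowest first."""
    totals = defaultdict(lambda: [0.0, 0])  # step -> [total seconds, count]
    for step, seconds in timings:
        totals[step][0] += seconds
        totals[step][1] += 1
    averages = {step: total / count for step, (total, count) in totals.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:top]


# Hypothetical step names and load times, not real site data.
timings = [
    ("create_account", 9.0),
    ("create_account", 7.0),
    ("verify_identity", 4.0),
    ("browse_plans", 1.0),
    ("browse_plans", 3.0),
]

for step, avg in slowest_steps(timings, top=2):
    print(f"{step}: {avg:.1f} s on average")
```

In this made-up sample the account-creation step averages eight seconds and would be the first candidate for repair.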

The efforts helped.

The average time to load pages and forms on the website dropped from eight seconds to one second, according to HHS. And the error rate, once at 6 percent, is now less than 1 percent, according to the agency.

But many problems persist. Among the website’s continuing shortcomings:

■It is unable to support 60,000 simultaneous users. The system crashed Oct. 1 with only 1,100 users, but the agency says it can now handle about 20,000 users at the same time.

■It is unable to successfully register 500,000 visitors for insurance. It was able to handle only 106,000 because of the many log-in issues facing users, according to HHS.

Experts note the intensive repair activity is ratcheting up the website’s price tag. The website and associated IT systems have cost more than $600 million, according to the Government Accountability Office.

David Powner, director of IT management issues at GAO, said the repair costs are unknown. The contracts are cost-reimbursable, so contractors can charge for the work they perform on an ongoing basis.

“I think that’s a key question, how much that will end up being,” Powner said.

CMS says it has solved dozens of problems associated with the site and is continually testing the systems to identify and fix new problems.

President Obama said in a press conference Nov. 14 there would still be technical glitches with the site even when it is working for “the vast majority of people.” He also blamed some of the roll-out problems on the “cumbersome” federal IT procurement process.

Former Department of Homeland Security CIO Richard Spires said at a Nov. 13 hearing that fundamental management weaknesses were to blame for the poor roll-out of the site, and that the administration should work on reforming the IT procurement system and passing legislation to enhance CIO authority to manage projects and control costs.

“These changes could have helped to address some of the critical failings of the program management of [the site],” he said.
