The White House’s technology office wants industry input on how the federal government should share and manage data that stems from government-funded research projects, according to a request for comment published in the Federal Register Jan. 17.
The White House Office of Science and Technology Policy notice, titled “Request for Public Comment on Draft Desirable Characteristics of Repositories for Managing and Sharing Data Resulting From Federally Funded Research,” aims to reduce the burden on researchers by setting data repository standards for federal agencies that provide “optimization and improved consistency” across the federal government’s repositories.
The draft characteristics are intended to “improve the consistency of guidelines and best practices that agencies provide about the long-term preservation of data from federally funded research, including suitable repositories for preserving and providing access to such data, considering agency missions, best practices, and relevant standards.”
The draft characteristics are listed in two different sections. The first section details several attributes that should be in all data repositories, while the second is specific to data repositories storing human data. Here’s what the desired characteristics are:
Section one: Desirable characteristics for all data repositories
- Persistent unique identifiers: Each dataset needs a citable, unique identifier.
- Long-term sustainability: A long-term plan for managing data.
- Metadata: Datasets should have sufficient metadata to support discovery, reuse and citation.
- Curation & quality assurance: OSTP wants expert curation and quality assurance to increase dataset accuracy and integrity.
- Access: Repositories should provide equitable access to their datasets.
- Free & easy to access and reuse: Datasets should be free and have broad terms of reuse.
- Reuse: The repository includes the ability to track reuse.
- Secure: Repositories should provide documentation that security requirements are met to prevent unauthorized access.
- Privacy: Repositories should also have documentation that “administrative, technical, and physical safeguards are employed in compliance with applicable privacy, risk management, and continuous monitoring requirements.”
- Common format: Datasets should be downloadable in a standard format, not in a proprietary one.
- Provenance: Repositories should maintain a detailed file documenting changes made to datasets and their metadata.
Section two: Additional considerations for repositories storing human data (even if de-identified)
- Fidelity to consent: Repositories should restrict access to uses that pertain to the research to which the original human subjects consented.
- Restricted use compliant: Repositories should make sure that submitters’ data is not reidentified or redistributed.
- Privacy: Ensure that documentation is provided of security measures in place to protect human subjects.
- Plan for breach: A repository should have a data breach response plan.
- Download control: Dataset downloads should be controlled and audited.
- Clear use guidance: Provides information that details data use restrictions.
- Retention guidelines: Provides information on guidelines for retaining data.
- Request review: A repository should have an established data access review or oversight board that rules on data requests.
The characteristics laid out are meant to be a tool agencies can use when developing a repository for federally funded research, identifying repositories that an agency designates for particular research data, “informing” non-government repositories that feds want to store data in, or evaluating data management plans that propose storing data outside the federal government.
The request for comments comes just weeks after the Office of Management and Budget released its federal data strategy year-one action plan, meant to push government to use data as a strategic asset and strengthen policymaking through data.
Comments will be accepted through March 6.
Andrew Eversden covers all things defense technology for C4ISRNET. He previously reported on federal IT and cybersecurity for Federal Times and Fifth Domain, and worked as a congressional reporting fellow for the Texas Tribune. He was also a Washington intern for the Durango Herald. Andrew is a graduate of American University.