Framework for Information Disclosure with Ethical Security (FIDES)

FIDES, funded by DHS’s IMPACT program, aims to enable controlled disclosure and analysis of sensitive data while keeping that data private from researchers and adversaries.

FIDES aims to resolve longstanding conflicts between the desire to share recorded network datasets and the legitimate risk concerns held by potential providers of such data. In 2016, the Federal Cyber Security Research and Development Strategic Plan clarified this desire: the science of cyber security needs data with which to produce and reproduce research results, and operational cyber defense needs to share data for purposes such as defending networks and protecting critical information systems. The legitimate risk concerns are also clear: events recorded in such datasets could harm a provider's reputation or business standing; network traffic may reveal personally identifiable information such as political preferences, social contacts, or health information; and network flows may aid adversaries in planning attacks. Reluctance to share such data was reported as early as 1997, prompting many proposals for sharing data safely, each met with corresponding re-identification attacks. This conflict continues today. For example, the current IMPACT database imposes restrictive contractual terms on users of some IMPACT datasets, yet providers still do not contribute some kinds of datasets, such as real-world NetFlow data. A key goal of the new IMPACT BAA is to encourage potential data providers to share data willingly and without anonymization (which diminishes the data’s utility) by addressing provider privacy concerns while retaining researcher utility.

The solution

Figure: Existing IMPACT workflow (left) and IMPACT workflow with FIDES (right).

Today, a researcher (1) searches the IMPACT database catalog for existing datasets and eventually (2) requests access to one or more of them. An IMPACT administrator then (3) negotiates a legal contract with the researcher covering how the dataset will be used, how the researcher will handle it, and so on. Finally, the researcher (4) contacts the provider by e-mail to (5) obtain the desired data.

FIDES streamlines this process while assuring the privacy of the selected datasets. Dataset browsing (1) and sharing requests (2) are integrated into a single web portal, rather than being separate steps that require e-mail interactions with a human administrator. We give the administrator tools that ease the cognitive load of decision-making, and we integrate those tools with the automatic creation of the technical control specifications that must be applied and their distribution to the researcher (3). Because FIDES replaces contractual controls with automatically enforced technical controls, and because data is available to researchers only in encrypted form, FIDES reduces the time from researcher request to data sharing.

Once the researcher has received the required technical controls from the IMPACT administrator, they then request the dataset directly from the IMPACT data provider (4), who provides it in encrypted form only (5). With FIDES, we replace the human e-mail interaction between researcher and provider with a web service that allows for requesting and securely retrieving datasets. Once received by the researcher, datasets can only be decrypted inside our cryptographically secure enclave, where research analyses are performed. The researcher never has datasets in unencrypted form and only learns the result of queries as allowed by agreed-on technical controls.
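To make the idea of "only learning the result of queries as allowed by agreed-on technical controls" concrete, here is a minimal Python sketch. It is a toy model, not the FIDES implementation: it contains no encryption and no real enclave, and every name in it (TechnicalControls, ToyEnclave, flows_per_port, min_group_size) is a hypothetical stand-in chosen for illustration. It only shows the general pattern of keeping records behind an interface that answers permitted aggregate queries and refuses everything else.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TechnicalControls:
    """Hypothetical control specification issued by the administrator."""
    allowed_queries: frozenset  # names of queries the researcher may run
    min_group_size: int         # groups smaller than this are suppressed


class ToyEnclave:
    """Toy stand-in for the secure enclave (assumption: no real crypto here).

    Records are held privately; callers only ever see query results.
    """

    def __init__(self, records, controls):
        self._records = list(records)
        self._controls = controls

    def query(self, name):
        # Refuse anything the agreed-on controls do not explicitly allow.
        if name not in self._controls.allowed_queries:
            raise PermissionError(f"query {name!r} is not permitted")
        if name == "flows_per_port":
            counts = Counter(r["dst_port"] for r in self._records)
            # Suppress small groups so rare events are not revealed.
            return {
                port: n
                for port, n in counts.items()
                if n >= self._controls.min_group_size
            }
        raise ValueError(f"unknown query {name!r}")


# Made-up flow records, for illustration only.
records = [
    {"src": "10.0.0.1", "dst_port": 443},
    {"src": "10.0.0.2", "dst_port": 443},
    {"src": "10.0.0.3", "dst_port": 443},
    {"src": "10.0.0.4", "dst_port": 22},
]
controls = TechnicalControls(
    allowed_queries=frozenset({"flows_per_port"}),
    min_group_size=2,
)
enclave = ToyEnclave(records, controls)
result = enclave.query("flows_per_port")
print(result)  # {443: 3}
```

In this sketch the single flow to port 22 is suppressed by the group-size threshold, and a request for any query outside the allowed set (for example a hypothetical "raw_records") raises PermissionError instead of returning data.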

FIDES will be an open-source project with iteratively increasing functionality.

This material is based on work supported by DHS and the United States Air Force under Contract No. FA8750-18-C-0051. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DHS or the United States Air Force.