How to share data while preserving privacy

December 1, 2018

Governments and businesses now have a way to more safely share data, thanks to a new technique developed by ACS. The technique is detailed in the new report, Privacy in Data Sharing: A Guide for Business and Government, which contains a guide for sharing data between entities without compromising the privacy of individuals.

The report, released today at the Akolade Australian Data Summit, is the culmination of more than two years’ work by the ACS Data Sharing Committee, led by NSW Chief Data Scientist Dr Ian Oppermann. The committee has members from state and federal governments as well as businesses, including Standards Australia, the CSIRO, Microsoft, Clayton Utz, and many others.

“One of the key issues both governments and business face is how to share data safely, without compromising the privacy of individuals,” said ACS President Yohan Ramasundara.

“Data sharing between businesses and governments offers tremendous potential for new smart services, for creating value. We’re already seeing many organisations exploring the potential of shared data, and it’s predicted that open data is going to be worth $25 billion per year to the Australian economy.

“But we have to be incredibly careful. We have to make sure we keep the social contract with our citizens to maintain their privacy.”

According to editor Oppermann, the report details a usable technique for anonymising data in a way that is designed to prevent reidentification and preserve privacy. It looks at a variety of factors – the people and organisations who have access to the data, the number of people contained within the dataset, handling processes and more – and provides a defined guide for a ‘reasonable’ level of data privacy.

“Anonymising data is one of the most challenging issues we face today. You can’t just strip names out. In many cases when data sets are combined, it becomes possible to re-identify individuals by cross-referencing,” said Oppermann.

“Governments understand the benefit of releasing deidentified data to support research and to help drive industry. However, when important data sets are being considered for release, the concern is always the thought of what other data sets are out there and whether they could be combined.

“When the medical benefits data set was released in 2016, with individual records deidentified, all was fine until that data met other data. For example, if you knew that a particular individual broke their arm in Canterbury in 2014, then you could cross-reference that with the medical dataset and be able to link a deidentified medical record with an individual.”
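The cross-referencing Oppermann describes can be illustrated with a toy example. The sketch below is not taken from the report and does not use real data: the records, field names and the `link` helper are all hypothetical, chosen only to show how one piece of outside knowledge can single out a record that carries no name.

```python
# Toy, entirely made-up records: no real people and no real data.
# The field names are assumptions chosen for illustration only.
deidentified_records = [
    {"record_id": "r1", "year": 2014, "suburb": "Canterbury", "treatment": "fractured arm"},
    {"record_id": "r2", "year": 2014, "suburb": "Canterbury", "treatment": "flu vaccination"},
    {"record_id": "r3", "year": 2014, "suburb": "Parramatta", "treatment": "fractured arm"},
    {"record_id": "r4", "year": 2015, "suburb": "Canterbury", "treatment": "fractured arm"},
]

# What an outside party already knows about one hypothetical person.
known_fact = {
    "name": "Person A",
    "year": 2014,
    "suburb": "Canterbury",
    "treatment": "fractured arm",
}


def link(records, fact, keys=("year", "suburb", "treatment")):
    """Return the records whose values match the known fact on every key."""
    return [r for r in records if all(r[k] == fact[k] for k in keys)]


matches = link(deidentified_records, known_fact)

if len(matches) == 1:
    # A single match means the nameless record now points to a named person.
    print(f"{known_fact['name']} is linked to record {matches[0]['record_id']}")
else:
    print(f"{len(matches)} candidate records; no unique link")
```

In this toy dataset only one record matches all three details, so the named person is tied to a single deidentified record. With more records sharing the same combination of details, the match would no longer be unique.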

According to Oppermann, the sheer number of datasets and variables available today makes it incredibly hard to share data that is safely deidentified. In some cases, hundreds of datasets may be combined, which could make the risk of re-identifying individuals very high. The report proposes a framework to address this risk, along with a standardised technique to ensure that the data is kept as safe as needed for its intended uses.

“What we have developed is a framework for addressing the complexity of data sharing. We have suggested a standard way to safely share data between organisations,” said Oppermann. “We think getting this right is crucial for the future of Australian ICT. The benefits of data sharing are immense, but the risks are high. This is a way to enjoy the benefits while managing the risks.”

The report is available online at