## A Summary on Noise Addition for Data Privacy

By Kato Mivule

• Noise addition: A stochastic value is added to confidential numeric attributes.

• The stochastic value is drawn from a normal distribution with zero mean and a small standard deviation. The method was first published by Jay Kim (1986) [1], with the expression:

• Z = X + ɛ

• Where Z is the transformed data point

• X is the original data point.

• ɛ (epsilon) is the random variable (noise), with distribution ɛ ∼ N(0, σ²).

• In the data set to be published, each original value X is replaced with its perturbed value Z = X + ɛ.
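
The perturbation step above can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy is available; the function name `add_gaussian_noise`, the sample values, and the choice of `sigma` are hypothetical and not taken from Kim's paper.

```python
import numpy as np

def add_gaussian_noise(x, sigma, seed=None):
    """Return Z = X + eps, where eps is drawn i.i.d. from N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # One independent noise value per confidential data point.
    eps = rng.normal(loc=0.0, scale=sigma, size=x.shape)
    return x + eps

# Hypothetical confidential numeric attribute (illustrative values only).
salaries = np.array([52000.0, 61000.0, 58500.0, 75000.0, 49000.0])

# sigma is an assumed setting; a larger sigma gives more privacy but less accuracy.
published = add_gaussian_noise(salaries, sigma=500.0, seed=0)
print(published)
```

Passing a fixed `seed` makes the sketch reproducible; in a real release the seed would normally be kept secret or omitted, since anyone who can regenerate the noise can subtract it.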

Gaussian Noise Distribution

• The normal distribution (Gaussian distribution) is a bell-shaped probability distribution describing real-valued random variables that cluster around a single mean…

• μ (mu) is the mean

• σ² (sigma squared) is the variance; σ is the standard deviation

• N(μ, σ²) is the normal distribution with mean μ and variance σ²
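
For reference, the density of N(μ, σ²) in standard textbook notation (it is not written out in the original summary) is:

```latex
f(x \mid \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}}
  \exp\!\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)
```

For the noise term in Z = X + ɛ the mean is μ = 0, so roughly 68% of the noise values fall within ±σ of zero and roughly 95% within ±2σ.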

• The transformed data should preserve, as closely as possible, the statistical properties of the original data.
• Covariance, Cov(X, Y), measures how the deviations of X and Y from their respective means vary together.

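• If Cov(X, Y) is positive, X and Y tend to increase together; if it is negative, one tends to decrease as the other increases.
• If Cov(X, Y) is zero, X and Y are uncorrelated (they have no linear relationship); this alone does not guarantee that they are independent.
• Correlation rxy measures the strength and direction of the linear relation between two attributes x and y.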

• If rxy = -1, there is a perfect negative linear relation between x and y,
• if rxy = 0, there is no linear relation,
• if rxy = +1, there is a perfect positive linear relation.
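
To see how noise addition affects these statistics, the sketch below compares covariance and correlation before and after perturbation. It is only an illustration on synthetic data, assuming NumPy and assuming the noise added to each attribute is independent of the data and of the other attribute's noise; all variable names and parameter values are hypothetical. Under that assumption, Cov(X + ɛ₁, Y + ɛ₂) = Cov(X, Y), while each variance grows by σ², so the correlation is pulled slightly toward zero.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50_000

# Synthetic, positively related attributes (not real data).
x = rng.normal(50.0, 10.0, size=n)
y = 0.8 * x + rng.normal(0.0, 5.0, size=n)

# Assumed noise level; independent noise is added to each attribute.
sigma = 3.0
zx = x + rng.normal(0.0, sigma, size=n)
zy = y + rng.normal(0.0, sigma, size=n)

# Off-diagonal entries hold the pairwise covariance / correlation.
cov_before = np.cov(x, y)[0, 1]
cov_after = np.cov(zx, zy)[0, 1]
r_before = np.corrcoef(x, y)[0, 1]
r_after = np.corrcoef(zx, zy)[0, 1]

print(f"covariance:  before={cov_before:.2f}  after={cov_after:.2f}")
print(f"correlation: before={r_before:.3f}  after={r_after:.3f}")
```

The covariance should stay close to its original value, while the correlation drops somewhat because the added variance inflates the denominator of rxy. Keeping σ small limits this distortion, at the cost of weaker protection.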

Notes

[1] J. Kim, “A method for limiting disclosure in microdata based on random noise and transformation,” in Proceedings of the Section on Survey Research Methods, American Statistical Association, 1986, pp. 370-374.

[2] J. Domingo-Ferrer, F. Sebé, and J. Castellà-Roca, “On the security of noise addition for privacy in statistical databases,” in Privacy in Statistical Databases 2004, 2004, pp. 149-161. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.4575