By Kato Mivule
- Noise addition: a stochastic (random) value is added to confidential numeric attributes.
- The stochastic value is drawn from a normal distribution with zero mean and a small standard deviation. The method was first published by Jay Kim (1986) [1], with the expression
- Z = X + ε
- where Z is the transformed data point,
- X is the original data point,
- ε (epsilon) is the random variable (noise) with distribution ε ∼ N(0, σ²).
- X is then replaced with Z in the data set to be published.
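As an illustration only, the expression Z = X + ε can be sketched in Python with NumPy. The function name `add_gaussian_noise`, the `sigma` and `seed` parameters, and the sample values are assumptions made for this example; they do not come from Kim's paper.

```python
import numpy as np

def add_gaussian_noise(x, sigma, seed=None):
    """Return Z = X + eps, where eps is drawn i.i.d. from N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # One independent noise value per confidential data point.
    eps = rng.normal(loc=0.0, scale=sigma, size=x.shape)
    return x + eps

# Hypothetical confidential numeric attribute; the values are made up.
salaries = np.array([52000.0, 61000.0, 58500.0, 47000.0, 75000.0])

# The published data set carries Z in place of X.
published = add_gaussian_noise(salaries, sigma=500.0, seed=0)
print(published)
```

The choice of `sigma` is the trade-off to tune: a larger value hides the original values better but distorts the published data more.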
Statistical Considerations in Noise Addition
Gaussian Noise Distribution
- The normal distribution (Gaussian distribution) is a bell-shaped probability distribution describing real-valued random variables clustered around a single mean…
- μ (mu) is the mean
- σ² (sigma squared) is the variance
- N(μ, σ²) is the normal distribution with mean μ and variance σ²
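For reference, the textbook probability density of N(μ, σ²) in the notation above is given here. This is the standard formula, supplied as an aid to the reader, and not necessarily the exact form shown in the original presentation.

```latex
f(x \mid \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)
```

For the noise term ε ∼ N(0, σ²) used in noise addition, μ = 0, so the density is centred on zero and its spread is set by σ alone.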
- Transformed data has to keep the same statistical properties as the original data.
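- Covariance, Cov(X, Y): measures how the deviations of X and Y from their respective means vary together.

- If Cov(X, Y) is positive, X and Y tend to increase together; if it is negative, one tends to decrease as the other increases.
- If Cov(X, Y) is zero, X and Y are uncorrelated (no linear relation), which does not by itself mean they are independent.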
- Correlation r_xy measures the strength and direction of the linear relation between two variables x and y.

- If r_xy = -1, there is a perfect negative linear relation between x and y,
- if r_xy = 0, there is no linear relation,
- if r_xy = +1, there is a perfect positive linear relation.
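The requirement that the transformed data keep the statistical properties of the original can be checked empirically. The sketch below uses synthetic data and compares the mean, covariance and correlation before and after noise addition. The data-generating parameters, the helper names `cov` and `corr`, and the noise level `sigma` are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical synthetic data: y depends linearly on x, plus some scatter.
n = 10_000
x = rng.normal(50.0, 10.0, size=n)
y = 2.0 * x + rng.normal(0.0, 5.0, size=n)

# Perturb x with small zero-mean Gaussian noise: Z = X + eps.
sigma = 1.0
z = x + rng.normal(0.0, sigma, size=n)

def cov(a, b):
    """Sample covariance Cov(a, b)."""
    return np.cov(a, b)[0, 1]

def corr(a, b):
    """Pearson correlation coefficient r_ab."""
    return np.corrcoef(a, b)[0, 1]

print("mean  original vs noisy:", x.mean(), z.mean())
print("cov   original vs noisy:", cov(x, y), cov(z, y))
print("corr  original vs noisy:", corr(x, y), corr(z, y))
```

With a small `sigma` relative to the spread of the data, the three pairs of values should be close. When the noise is independent of X, it raises the variance of the perturbed attribute by about σ², so the correlation is slightly weakened as `sigma` grows.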
Notes
[1] J. Kim, “A Method for Limiting Disclosure in Microdata Based on Random Noise and Transformation,” in Proceedings of the Section on Survey Research Methods, American Statistical Association, 1986, pp. 370-374.
[2] J. Domingo-Ferrer, F. Sebé, and J. Castellà-Roca, “On the security of noise addition for privacy in statistical databases,” in Privacy in Statistical Databases 2004, 2004, pp. 149-161. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.4575