Gap in DP-SGD privacy analysis
Most practical implementations of DP-SGD shuffle the training examples and divide them into fixed-size mini-batches, but directly analyzing the privacy of this process is challenging. Because the mini-batches have a fixed size, if we know that a certain example is in a mini-batch, then other examples have a smaller probability of being in the same mini-batch. Thus, it becomes possible for training examples to leak information about each other.
Consequently, it has become common practice to use privacy analyses that assume that the batches were generated using Poisson subsampling, wherein each example is included in each mini-batch independently with some probability. This allows for viewing the training process as a series of independent steps, making it easier to analyze the overall privacy cost using composition theorems, an approach widely used in various open-source privacy accounting libraries, including those developed by Google and Microsoft. But a natural question arises: is the aforementioned assumption a reasonable one?
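The difference between the two batching schemes described above can be sketched in a few lines of Python. This is an illustrative sketch only, not code from any privacy library: the function names `shuffled_batches` and `poisson_batches`, the dataset size, and the sampling probability `q` are all assumptions chosen for the example.

```python
import random


def shuffled_batches(n, batch_size, rng):
    """Shuffle indices 0..n-1 once, then cut them into fixed-size batches.

    Every example lands in exactly one batch, so batch memberships are
    correlated: knowing one example is in a batch lowers the chance
    that any other given example is in it too.
    """
    indices = list(range(n))
    rng.shuffle(indices)
    return [indices[i:i + batch_size] for i in range(0, n, batch_size)]


def poisson_batches(n, num_batches, q, rng):
    """Include each example in each batch independently with probability q.

    Batch sizes vary, and an example may appear in several batches or in none.
    """
    return [[i for i in range(n) if rng.random() < q] for _ in range(num_batches)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n, batch_size = 12, 4   # hypothetical toy values

shuffled = shuffled_batches(n, batch_size, rng)
poisson = poisson_batches(n, num_batches=n // batch_size, q=batch_size / n, rng=rng)

print("shuffled:", shuffled)
print("poisson: ", poisson)
```

With `q = batch_size / n`, the Poisson batches have the same expected size as the shuffled ones, but only the shuffled batches are guaranteed to be disjoint and of exactly equal size.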
The guarantee of differential privacy is quantified in terms of two parameters (ε, δ), which together represent the “privacy cost” of the algorithm. The smaller ε and δ are, the more private the algorithm is. We establish a technique to prove lower bounds on the privacy cost when using shuffling, which means that the algorithm is no more private (that is, the ε, δ values are no smaller) than the bounds that we compute.
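For reference, the standard definition behind the (ε, δ) parameters can be written as follows. The symbols M (the randomized algorithm), D and D′ (two datasets differing in a single example), and S (a set of possible outputs) are notation introduced here for illustration and do not appear in the text above.

```latex
% M is (\varepsilon, \delta)-differentially private if, for every pair of
% datasets D, D' differing in one example and every set S of outputs:
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S] + \delta
```

A lower bound on the privacy cost then amounts to exhibiting a specific pair D, D′ and a set S for which this inequality fails for any smaller ε at the given δ.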
In the figure below, we plot the trade-off between the privacy parameter ε and the scale σ of noise used in DP-SGD, for a fixed number of training steps (10,000 in this case) and a fixed parameter δ (10⁻⁶ in this case). The curve ε𝒟 corresponds to creating the batches without any shuffling or sampling, and the curve ε𝒫 corresponds to DP-SGD with batches using Poisson subsampling. The curve ε𝒮 is obtained using our lower bound technique, showing that for small σ, the actual privacy cost of using DP-SGD with shuffling (orange line, below) can be significantly higher than that of Poisson subsampling (green line).

