Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


tensorflow2.0 - Track privacy guarantees in a federated learning process with DP-query

I'm fairly new to TFF. I have checked the GitHub repository and followed the EMNIST example to train a differentially private federated model with the DP-FedAvg algorithm. This is mainly done by attaching a DP query to the aggregation process and then training the federated model.

I have one question:

Given that attaching a DP query to the aggregation process results in participant-level central DP, how would I track the privacy guarantee (epsilon, delta) during training?

Below is a code snippet that sets up a differentially private federated model with 100 participants, which is why both expected_total_weight and expected_clients_per_round are set to 100:

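import tensorflow as tf
import tensorflow_federated as tff

# create_keras_model and preprocessed_first_client_dataset are assumed to be
# defined as in the EMNIST example.


def model_fn():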
    keras_model = create_keras_model()
    return tff.learning.from_keras_model(
        keras_model=keras_model,
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        input_spec=preprocessed_first_client_dataset.element_spec,
        metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])


dp_query = tff.utils.build_dp_query(
    clip=1.0,
    noise_multiplier=0.3,
    expected_total_weight=100,
    adaptive_clip_learning_rate=0,
    target_unclipped_quantile=0.5,
    clipped_count_budget_allocation=0.1,
    expected_clients_per_round=100
)


weights_type = tff.learning.framework.weights_type_from_model(model_fn)

aggregation_process = tff.utils.build_dp_aggregate_process(weights_type.trainable, dp_query)

iterative_process = tff.learning.build_federated_averaging_process(
    model_fn=model_fn,
    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(0.1),
    server_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=1.0),
    aggregation_process=aggregation_process
)

I came across several methods in TF Privacy for computing epsilon and delta, but they seem to be meant for tracking the privacy guarantee of the traditional DP-SGD algorithm, and they expect parameters such as steps, n and batch_size.

Thanks a lot in advance.

Question from: https://stackoverflow.com/questions/65943395/track-privacy-guarantees-in-a-federated-learning-process-with-dp-query


1 Reply


There are a few ways to perform this calculation. We will discuss two options below.

Repurposing DP-SGD analysis tools

You are correct that these tools accept parameters which are named for the DP-SGD setting; however, their arguments can be remapped to the federated setting in a fairly straightforward manner.
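The remapping described above can be sketched as follows. This is a minimal illustration, not part of either library: the helper name dp_sgd_args_for_federated and the example numbers are hypothetical, and it assumes (as I understand compute_dp_sgd_privacy) that the DP-SGD tools derive the sampling ratio as batch_size / n and the step count as ceil(epochs * n / batch_size).

```python
import math


def dp_sgd_args_for_federated(num_rounds, num_users, users_per_round):
    """Express a federated run in the (n, batch_size, epochs) vocabulary of DP-SGD tools.

    Hypothetical helper: a "record" becomes a user, a "batch" becomes the set of
    users sampled in one round, and a "step" becomes one federated round.
    """
    return {
        "n": num_users,                 # population size
        "batch_size": users_per_round,  # users sampled per round
        "epochs": num_rounds * users_per_round / num_users,
    }


# Example values chosen only for illustration.
args = dp_sgd_args_for_federated(num_rounds=200, num_users=1000, users_per_round=50)

# Quantities a DP-SGD accountant would derive from those arguments
# (assumed formulas, see the paragraph above):
q = args["batch_size"] / args["n"]
steps = math.ceil(args["epochs"] * args["n"] / args["batch_size"])
```

Going through epochs involves a floating-point round trip, so for awkward ratios the recovered step count could be off by one; passing the sampling ratio and the number of rounds directly, as the function in this answer does, avoids that.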

Suppose we have apply_dp_sgd_analysis from TensorFlow Privacy's analysis library in scope. We can write a simple function that adapts the body of compute_dp_sgd_privacy to the federated setting:


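# Assumed import; the module path has changed between TensorFlow Privacy
# releases, so adjust it to your installed version.
from tensorflow_privacy.privacy.analysis.compute_dp_sgd_privacy_lib import apply_dp_sgd_analysis


def compute_fl_privacy(num_rounds, noise_multiplier, num_users, users_per_round):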
  # This actually assumes Poisson subsampling, which may not be *quite*
  # right in your setting, but the approximation should be close either way.
  q = users_per_round / num_users  # q - the sampling ratio.

  # These orders are inlined from the body of compute_dp_sgd_privacy
  orders = ([1.25, 1.5, 1.75, 2., 2.25, 2.5, 3., 3.5, 4., 4.5] +
            list(range(5, 64)) + [128, 256, 512])
  
  # Depending on whether your TFF code by default uses adaptive clipping or not,
  # you may need to rescale your noise_multiplier argument.

  return apply_dp_sgd_analysis(
    q, sigma=noise_multiplier, steps=num_rounds, orders=orders, delta=num_users ** (-1))
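In the setup from the question, all 100 participants take part in every round, so the sampling ratio q is 1 and there is no amplification from subsampling. For that special case the accounting can be sketched in plain Python: the Renyi DP of the Gaussian mechanism at order alpha is alpha / (2 * sigma^2) per round, it adds up across rounds, and the classic conversion to (epsilon, delta) adds log(1/delta) / (alpha - 1). The function name full_participation_epsilon is hypothetical, and TensorFlow Privacy may use a tighter conversion, so treat this as a sanity check rather than a replacement for the library.

```python
import math

# Same Renyi orders as in compute_fl_privacy above.
ORDERS = ([1.25, 1.5, 1.75, 2., 2.25, 2.5, 3., 3.5, 4., 4.5] +
          list(range(5, 64)) + [128, 256, 512])


def full_participation_epsilon(num_rounds, noise_multiplier, delta, orders=ORDERS):
    """Epsilon for num_rounds Gaussian-mechanism rounds with q = 1 (hypothetical helper).

    Assumes unit sensitivity after clipping, so noise_multiplier is the ratio
    of the noise standard deviation to the clipping norm.
    """
    best = math.inf
    for alpha in orders:
        # RDP of the Gaussian mechanism, composed additively over the rounds.
        rdp = num_rounds * alpha / (2.0 * noise_multiplier ** 2)
        # Classic RDP -> (epsilon, delta) conversion.
        eps = rdp + math.log(1.0 / delta) / (alpha - 1.0)
        best = min(best, eps)
    return best


# Values from the question, with delta = 1 / num_users as in compute_fl_privacy.
eps_one_round = full_participation_epsilon(num_rounds=1, noise_multiplier=0.3, delta=1 / 100)
```

Under these assumptions, a single round with noise_multiplier=0.3 already gives an epsilon of about 15.7, and the bound grows with every further round. This is before any rescaling of the noise multiplier that adaptive clipping may require.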

Using the TensorFlow Privacy PrivacyLedger

If you're using the relatively new tff.aggregators.DifferentiallyPrivateFactory (which I would suggest over the DP process used above), you can pass it an already-constructed DPQuery, which can be decorated with a PrivacyLedger. The ledger can then be passed directly to a function like compute_rdp_from_ledger, and it should have tracked the privacy spent by each of the query calls.
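To illustrate the idea behind a ledger (record what each query call did, then compose the privacy cost whenever you want a reading), here is a toy version in plain Python. It is not the TensorFlow Privacy PrivacyLedger and does not use its API: the class ToyPrivacyLedger and its methods are hypothetical, and it assumes every user participates in every round, so it ignores subsampling.

```python
import math


class ToyPrivacyLedger:
    """Records the noise multiplier of each round and composes Renyi DP.

    Hypothetical helper for illustration only; assumes full participation
    (sampling ratio 1) and unit sensitivity after clipping.
    """

    ORDERS = ([1.25, 1.5, 1.75, 2., 2.25, 2.5, 3., 3.5, 4., 4.5] +
              list(range(5, 64)) + [128, 256, 512])

    def __init__(self):
        self.noise_multipliers = []

    def record_round(self, noise_multiplier):
        """Log one noised aggregation (one federated round)."""
        self.noise_multipliers.append(noise_multiplier)

    def epsilon(self, delta):
        """Epsilon spent so far, via the classic RDP -> (epsilon, delta) conversion."""
        if not self.noise_multipliers:
            return 0.0
        # Per-round Gaussian RDP at order alpha is alpha / (2 * sigma^2); it adds up.
        scale = sum(1.0 / (2.0 * s ** 2) for s in self.noise_multipliers)
        return min(a * scale + math.log(1.0 / delta) / (a - 1.0) for a in self.ORDERS)


ledger = ToyPrivacyLedger()
history = []
for round_num in range(3):
    # ... one federated training round would run here ...
    ledger.record_round(noise_multiplier=1.0)
    history.append(ledger.epsilon(delta=1e-5))  # running guarantee after this round
```

Reading the ledger after every round gives a running epsilon during training, which is the kind of tracking the question asks about; the real ledger additionally records the sampling probability so that subsampling can be accounted for.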

