MSFLP Paper Summary

Source paper: Mitigating Sybils in Federated Learning Poisoning

MSFLP Overview (From Abstract)

Machine learning (ML) over distributed, multi-party data is needed in a variety of domains. Existing approaches, such as federated learning, train a globally shared model by running an iterative algorithm at a central aggregator that collects outputs computed by a group of devices. However, federated learning approaches are vulnerable to a variety of attacks, including model poisoning, especially in the presence of sybils.

The paper first evaluates the vulnerability of federated learning to sybil-based poisoning attacks. It then proposes FoolsGold, a new defense that identifies sybils based on the diversity of client updates during the distributed learning process. Unlike prior approaches, FoolsGold does not bound the expected number of attackers, requires no auxiliary information outside of the learning process, and makes fewer assumptions about clients and their data.

The evaluation shows that FoolsGold exceeds the capabilities of existing state-of-the-art approaches against sybil-based label-flipping and backdoor poisoning attacks. The results hold across different distributions of client data, varied poisoning targets, and various sybil strategies.

Prior Knowledge

Federated Learning

  • federated learning

    • train a centralized model on decentralized data
    • all clients have equal influence on the system
    • the server only observes model parameters
      • conventional defense methods can’t be applied to federated learning
        • anomaly detection, robust loss functions -> these require access to data that the central aggregator does not have (clients’ data)
    • can’t easily verify which client parameter updates are honest
      • the server is unable to view client training data and does not have a validation dataset
    • two forms
      • FEDSGD
        • each client sends every SGD update to the server
      • FEDAVG
        • clients locally batch multiple iterations of SGD before sending the result to the server
  • label-flipping attack

    • makes the model misclassify the true label as a target label chosen by the attacker
  • backdoor poisoning attack

    • during training, the attacker inserts a backdoor into the model so that inputs containing a specific trigger are classified as the target class
  • sybil attack

    • happens when clients can easily join and leave the system; an adversary can gain influence by operating under multiple colluding aliases
  • SGD

    • selects a batch of training examples and uses it to compute gradients with respect to the current model parameters
    • uses only a subset of the data to compute the gradient, so the gradient direction varies across iterations, unlike full-batch gradient descent
    • less expensive for large datasets and theoretically scales to datasets of infinite size
    • training can stop early when the gradient becomes sufficiently small
  • SGD Challenges and defenses (Under Federated Learning)

    • difficult for the aggregator to tell whether the gradient points towards a malicious objective or not
      • each client holds only a partition of the data, which may not reflect the global learning objective
    • the aggregator cannot assume that updates pointing in sporadic directions are malicious
      • only a small subset of data is used in each iteration, which makes the gradient direction vary widely
    • adversaries can send arbitrary updates to the model; they need not conform to a specified batch size configuration
      • smaller batch sizes increase the variance of the updates contributed by clients
  • Multi-Krum

    • the top f contributions to the model that are furthest from the mean client contribution are removed from the aggregated gradient
    • uses the Euclidean distance to determine which gradient contributions are removed
    • requires parameterization of the number of expected adversaries
      • but in practice the number of adversaries is unknown (a minimal sketch is shown right after this list)
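
To make the aggregation step concrete, here is a minimal NumPy sketch of one FEDSGD-style round with the simplified Multi-Krum-style filter described above (drop the f client updates furthest from the mean contribution, then average the rest). The function name, the toy updates, and the choice of f are illustrative, not taken from the paper's implementation.

```python
import numpy as np

def multikrum_filter(updates, f):
    """Simplified Multi-Krum-style filter, as described in the notes above:
    drop the f client updates furthest (in Euclidean distance) from the mean
    contribution, then average the remaining updates into one aggregate."""
    updates = np.asarray(updates)                  # shape: (n_clients, n_params)
    mean_update = updates.mean(axis=0)
    dists = np.linalg.norm(updates - mean_update, axis=1)
    keep = np.argsort(dists)[: len(updates) - f]   # the n - f closest clients
    return updates[keep].mean(axis=0)

# FEDSGD-style round: every client sends one SGD update; the server filters, then averages.
rng = np.random.default_rng(0)
honest = [rng.normal(0.0, 0.1, size=10) for _ in range(8)]
sybils = [np.full(10, 1.0) + rng.normal(0.0, 0.01, size=10) for _ in range(2)]
aggregated = multikrum_filter(honest + sybils, f=2)   # f must be guessed in advance
```

Note that f has to be chosen up front, which is exactly the parameterization problem mentioned above.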

Core Knowledge

FoolsGold

  • new defense against federated learning sybil attacks that adapts the learning rate of clients based on contribution similarity
    • because sybils share the same malicious objective, their gradient directions look more similar to each other than expected
    • does not assume a specific number of attackers
    • evaluate FoolsGold on 4 diverse data sets (MNIST, VGGFace2, KDDCup99, Amazon Reviews) and 3 model types (1-layer Soft-max, Squeezenet, VGGNet)

FoolsGold Design

  • honest clients can be separated from sybils by the diversity of their gradient updates
  • sybils share a common objective and will contribute updates that appear more similar to each other than honest clients
  • maintains the learning rate of clients whose gradient updates are unique, while reducing the learning rate of clients whose gradient updates are similar

Goal 1. When the system is not attacked, FoolsGold should preserve the performance of federated learning.
Goal 2. FoolsGold should devalue contributions from clients that point in similar directions.
Goal 3. FoolsGold should be robust to an increasing number of sybils in a poisoning attack.
Goal 4. FoolsGold should distinguish between honest updates that mistakenly appear malicious due to the variance of SGD and sybil updates that operate on a common malicious objective.
Goal 5. FoolsGold should not rely on external assumptions about the clients or require parameterization about the number of attackers

  • FoolsGold adapts the learning rate based on “the update similarity” and “historical information”

    • the update similarity
      • Cosine Similarity [-1,1] -> measures angular distance and is not affected by vector magnitude (unlike Euclidean distance)
        • but cosine similarity alone can’t reliably single out malicious clients (an innocent client may be punished)
      • Feature Importance -> only focus on the features that are relevant to the model’s correctness or to the attack
        • feature importance can be estimated from the model parameters in the output layer of the global model and used to filter the updates
          • normalized across all classes to avoid biasing one class over another
    • historical information
      • Update History -> the server maintains a history of updates from each client
        • the similarity is computed over each client’s aggregated historical updates rather than the current iteration alone
  • Some tricks to modify the FoolsGold

    • Pardoning
      • avoids penalizing honest clients whose updates happen to resemble sybil updates by re-weighting the cosine similarity (Goal 4)
      • each similarity is divided by the maximum similarity -> re-scales the scores into the [0,1] range
        • ensures that at least one client will have an unmodified update (Goal 1)
    • Logit
      • even for very similar updates, the cosine similarity may be less than one -> the logit function is applied to amplify the divergence between weights
        • values exceeding the [0,1] range are clipped to the respective boundary value
    • Augmenting FoolsGold with other method
      • FoolsGold can’t handle an attack from a single adversary (there are no colluding, similar updates for cosine similarity to catch), so FoolsGold is combined with Multi-Krum to handle that case (a minimal sketch of the FoolsGold weighting logic follows this list)
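
Below is a minimal sketch of the weighting logic summarized in this list: cosine similarity over each client's aggregated update history, pardoning by the ratio of maximum similarities, rescaling so that at least one client keeps a weight of 1, and the logit transform. It follows these notes rather than the authors' released code; `foolsgold_weights`, `kappa`, and `eps` are illustrative names and parameters.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def foolsgold_weights(history, kappa=1.0, eps=1e-5):
    """Per-client learning-rate weights in [0, 1].

    history: (n_clients, n_params) array holding each client's aggregated
             (summed) historical updates.
    """
    n = history.shape[0]
    cs = cosine_similarity(history)        # pairwise cosine similarity
    np.fill_diagonal(cs, 0.0)              # ignore self-similarity
    v = cs.max(axis=1)                     # each client's maximum similarity

    # Pardoning: re-weight similarity by the ratio of maximum similarities so
    # honest clients that merely resemble a sybil are not over-penalized.
    for i in range(n):
        for j in range(n):
            if v[j] > v[i] and v[j] > 0:
                cs[i, j] *= v[i] / v[j]

    wv = 1.0 - cs.max(axis=1)              # highly similar clients get low weight
    wv = np.clip(wv, 0.0, 1.0)
    if wv.max() > 0:
        wv = wv / wv.max()                 # at least one client keeps weight 1

    # Logit transform pushes weights toward 0 or 1; clip back into [0, 1].
    wv = np.clip(wv, eps, 1.0 - eps)
    wv = kappa * (np.log(wv / (1.0 - wv)) + 0.5)
    return np.clip(wv, 0.0, 1.0)

# The aggregated update is then the weighted average of the clients' current updates.
```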

Contributions

  • We consider sybil attacks on federated learning architectures and show that existing defenses against malicious adversaries in ML (Multi-Krum [10] and RONI [6] (shown in Appendix B)) are inadequate.
  • We design, implement, and evaluate a novel defense against sybil-based poisoning attacks for the federated learning setting that uses an adaptive learning rate per client based on inter-client contribution similarity.
  • In this context, we discuss optimal and intelligent attacks that adversaries can perform, while suggesting possible directions for further mitigation.

Background

  • assume SGD as the optimization algorithm in this paper
  • assume that adversaries leverage sybils to mount more powerful poisoning attacks on federated learning

Assumption

  • assume that server side (aggregator) is not malicious
  • assume there is some number of honest clients (at least 1) who possess training data that opposes the attacker’s poisoning goal -> nothing can be learned without at least one honest client
  • assume that secure aggregation (which obfuscates clients’ updates) is not used, so the aggregator can observe every client’s update
  • assume that the adversary’s goal is to have one class classified as a target class without influencing any other class
  • assume two kinds of attack strategies: label-flipping and backdoor attacks

Evaluation

framework

  • federated learning prototype -> 600 lines of Python (150 for FoolsGold)
  • scikit-learn to compute cosine similarity
  • partition data into disjoint non-IID datasets
  • evaluation datasets -> MNIST (baseline), VGGFace2 (more complex, deep learning), KDDCup (class imbalance), Amazon (text data)
    • MNIST (10 honest), KDDCup (23 honest), and Amazon (50 honest) -> single-layer fully-connected softmax
    • VGGFace2 (10 honest) -> SqueezeNet1.1 (727,000 parameters), VGGNet11 (128 million parameters)
    • FoolsGold only uses the important features to compute client similarity (see the sketch after this list)
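
As a small illustration of the scikit-learn call used by the prototype, the snippet below builds a pairwise client similarity matrix from aggregate update histories; the arrays are randomly generated, purely for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# One row per client: the aggregate of that client's historical updates,
# restricted to the important features as described above.
client_histories = np.random.default_rng(0).normal(size=(5, 100))
sim = cosine_similarity(client_histories)   # 5 x 5 matrix, entries in [-1, 1]
print(sim.round(2))
```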

Canonical attack scenarios

Comparison to prior work (6.2)

Varying client data distributions(6.3)

in the real world, each client may hold overlapping data

What if the attacker knows FoolsGold?(6.4)

attackers who are aware of the FoolsGold algorithm will try to make their updates less similar

They can do so in the four ways below:

  • mixing malicious and correct data (Appendix B)
  • change sybils’ training batch size (Appendix B)
  • perturbing contributed updates with noise
  • infrequently and adaptively sending poisoned updates

1. mixing malicious and correct data

attackers may attempt to mix honest and poisoned data in each sybil's local dataset, giving them the opportunity to appear honest while making their updates less similar to each other (a minimal sketch of such a mixed dataset follows)
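
A minimal sketch of what such a mixed local dataset could look like, assuming the canonical MNIST 1 -> 7 label-flipping attack used in the paper; the function name and the `poison_frac` parameter are illustrative.

```python
import numpy as np

def build_sybil_dataset(x, y, source=1, target=7, poison_frac=0.7, seed=0):
    """Relabel a fraction of the source-class examples to the target class
    and leave the remaining (honest) data untouched."""
    rng = np.random.default_rng(seed)
    y = y.copy()
    src_idx = np.where(y == source)[0]
    n_poison = int(poison_frac * len(src_idx))
    flipped = rng.choice(src_idx, size=n_poison, replace=False)
    y[flipped] = target      # label-flipping applied to only part of the data
    return x, y
```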

2. change sybils’ training batch size

Amazon: the resulting attack rate was 4.76%, attributed to the curse of dimensionality (10,000 features but only 1,500 examples)

3. perturbing contributed updates with noise

FoolsGold can use feature importance to filter the noisy, irrelevant features out of the cosine similarity computation (sketched below)
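
A sketch of this filtering idea, assuming feature importance is taken from the magnitudes of the global model's output-layer weights (normalized per class, as in the notes above) and that only a hard top-k subset of parameters enters the similarity computation; the hard top-k cut is an illustrative simplification.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def important_feature_similarity(histories, output_layer_weights, top_k=100):
    """Client similarity computed only over the most indicative parameters.

    histories:            (n_clients, n_params) flattened aggregate client updates
    output_layer_weights: (n_classes, n_features) output-layer weights of the
                          global model, assumed to flatten to the same layout
    """
    importance = np.abs(output_layer_weights)
    importance = importance / (importance.sum(axis=1, keepdims=True) + 1e-12)  # per-class normalization
    importance = importance.ravel()            # align with the flattened update layout
    keep = np.argsort(importance)[-top_k:]     # indices of the top-k indicative parameters
    # Noise injected into unimportant parameters no longer affects the similarity.
    return cosine_similarity(histories[:, keep])
```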

4. infrequently and adaptively sending poisoned updates

FoolsGold computes cosine similarity over the update history, so intelligent attackers can locally compute the cosine similarity among their sybils and use a threshold M to decide whether to add noise to a gradient before sending it (a minimal sketch of this strategy follows)
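
A minimal sketch of that adaptive strategy, following the description in these notes: each sybil checks its similarity against the other sybils' planned updates and adds Gaussian noise only when the maximum similarity exceeds M. The `noise_scale` parameter is illustrative.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def adaptive_sybil_update(my_update, other_sybil_updates, M=0.27, noise_scale=0.1, seed=None):
    """Perturb this sybil's poisoned update with Gaussian noise if it is too
    similar (max cosine similarity > M) to the other sybils' planned updates,
    so a similarity-based defense is less likely to penalize it."""
    rng = np.random.default_rng(seed)
    sims = cosine_similarity(my_update.reshape(1, -1), np.asarray(other_sybil_updates))
    if sims.max() > M:
        return my_update + rng.normal(0.0, noise_scale, size=my_update.shape)
    return my_update
```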

green region (M > 0.27) -> the threshold is too high and any poisoning attack is detected
blue region -> the attack is not detected, but the number of sybils is insufficient to overpower the honest clients
red region -> the attack succeeds, but uses more sybils than required

(star -> expected number of sybils for the optimal attack)

But it is hard for attackers to choose the threshold M, because they do not know:

  1. the number of honest clients
  2. FoolsGold’s feature importance
  3. original data distribution

Effects of design elements(6.5)

FoolsGold performance overhead(6.6)

Limitations

  • FoolsGold is not successful at mitigating attacks from a single poisoning client.

  • FoolsGold lacks randomness when computing the cosine similarity, which a knowledgeable adversary could exploit.