Byzantine-Robust Aggregation for Decentralized Federated Learning

Abstract

Decentralized federated learning is vulnerable to Byzantine attacks, in which compromised nodes inject adversarial gradients to derail training. Existing Byzantine-resilient methods either rely on global information, which prevents decentralization, or scale poorly under non-IID data. We propose GradTrust, a Byzantine fault-tolerant aggregation algorithm that dynamically assigns trust scores by assessing gradient similarity along three dimensions: directional alignment, magnitude consistency, and temporal stability, without requiring auxiliary data. An information-theoretic analysis shows that this three-dimensional similarity captures 99% of distinguishable Byzantine patterns in O(nd) time. For strongly convex objectives, GradTrust achieves an O(1/T) convergence rate with a Byzantine error bounded by O(α²σ²/n). By transmitting only 10% of gradient components via importance-weighted sparsification, the algorithm reduces communication by 80.7% while preserving detection capability. Experiments on MNIST and CIFAR-10 with 100 nodes show that the algorithm achieves 89% accuracy under 30% Byzantine corruption, improving over baselines by 20% while converging 34% faster. Its aggregation and communication efficiency make it practical to deploy in bandwidth-limited edge networks.
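The abstract does not spell out the scoring formulas, so the following is a minimal sketch of how three-dimensional trust scoring could look, assuming cosine similarity for directional alignment, a norm ratio for magnitude consistency, and round-over-round cosine similarity for temporal stability. The function names, and the choice of a node's own gradient as the reference, are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def trust_scores(grads, prev_grads, ref, eps=1e-12):
    """Toy trust scoring over received gradients (assumed design).

    grads:      dict node_id -> current gradient vector (1-D array)
    prev_grads: dict node_id -> that node's gradient from the previous round
    ref:        reference gradient (here: the aggregating node's own gradient)
    """
    scores = {}
    ref_norm = np.linalg.norm(ref) + eps
    for j, g in grads.items():
        g_norm = np.linalg.norm(g) + eps
        # (1) directional alignment: cosine similarity to the reference,
        # clipped at zero so opposing gradients receive no trust
        direction = max(0.0, float(g @ ref) / (g_norm * ref_norm))
        # (2) magnitude consistency: ratio of the smaller to the larger norm
        magnitude = min(g_norm, ref_norm) / max(g_norm, ref_norm)
        # (3) temporal stability: cosine similarity to the same node's
        # previous gradient (1.0 when no history exists yet)
        p = prev_grads.get(j)
        if p is None:
            stability = 1.0
        else:
            p_norm = np.linalg.norm(p) + eps
            stability = max(0.0, float(g @ p) / (g_norm * p_norm))
        scores[j] = direction * magnitude * stability
    return scores

def aggregate(grads, scores, eps=1e-12):
    """Trust-weighted average of the received gradients."""
    total = sum(scores.values()) + eps
    return sum(scores[j] * grads[j] for j in grads) / total
```

A node would score its neighbors' gradients each round and update with the trust-weighted average; the multiplicative combination means a gradient must look plausible on all three dimensions to receive meaningful weight.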

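The communication saving comes from importance-weighted sparsification. The abstract gives no detail beyond the 10% budget, so the sketch below uses component magnitude as the importance proxy and transmits (index, value) pairs; the helper names are hypothetical.

```python
import numpy as np

def sparsify(grad, keep_frac=0.10):
    """Keep the top keep_frac fraction of components by magnitude
    (assumed importance proxy); return (indices, values) for transmission."""
    k = max(1, int(keep_frac * grad.size))
    idx = np.argpartition(np.abs(grad), -k)[-k:]  # indices of the top-k entries
    return idx, grad[idx]

def densify(idx, values, dim):
    """Reconstruct a dense gradient on the receiver, zeros elsewhere."""
    g = np.zeros(dim, dtype=values.dtype)
    g[idx] = values
    return g
```

Transmitting roughly 10% of the values plus their indices costs on the order of 20% of a dense gradient, which is consistent with the reported 80.7% communication reduction.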
