Description
Most machine learning models require a large amount of data to train effectively, and this data is usually expected to reside in one central location. When enough data exists but is distributed, for example across edge devices that collected it, it must first be shared with a central server. Sharing large amounts of data introduces several issues: privacy concerns or data restrictions may prohibit it, and in other cases limited resources and communication overhead make it impractical.
Federated Learning (FL) addresses these problems. It is a machine learning paradigm that distributes a machine learning workflow across multiple clients. The participating clients collaboratively train a model by training it locally on their own data and sharing only the updated model state after training. The data held by a client is never shared with other clients, which makes the training process more privacy-preserving and more resource-efficient. Based on the architecture, FL approaches can be categorized as centralized or decentralized. In the centralized approach, a server handles administration and communication. In the decentralized approach, the clients themselves are responsible for communication, administration and aggregation of the model. Examples of decentralized FL are Swarm Learning, where in each round one client is chosen as aggregator solely to aggregate the updated model states of all clients, and Cyclic Learning, where the results are passed from client to client in sequential order rather than aggregated all together.
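The difference between aggregating client updates and passing a model from client to client can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the "model" is a single float, the helper names (`local_train`, `fedavg`, `centralized_round`, `cyclic_round`) and the client names are hypothetical, and the toy local objective stands in for real model training.

```python
from typing import Dict, List

Weights = List[float]  # model state, flattened to a list of floats for illustration


def local_train(weights: Weights, data: List[float],
                lr: float = 0.1, steps: int = 5) -> Weights:
    """Toy local training: gradient descent on mean((w - x)^2) over the client's own data."""
    w = list(weights)  # copy so the incoming state is not mutated
    mean = sum(data) / len(data)
    for _ in range(steps):
        grad = 2.0 * (w[0] - mean)  # gradient of the toy local loss
        w[0] -= lr * grad
    return w


def fedavg(states: List[Weights], sizes: List[int]) -> Weights:
    """Average model states, weighting each client by its number of samples."""
    total = sum(sizes)
    return [
        sum(state[i] * n for state, n in zip(states, sizes)) / total
        for i in range(len(states[0]))
    ]


def centralized_round(global_w: Weights, clients: Dict[str, List[float]]) -> Weights:
    """One aggregation round: every client trains locally, only model states are combined."""
    states = [local_train(global_w, data) for data in clients.values()]
    sizes = [len(data) for data in clients.values()]
    return fedavg(states, sizes)


def cyclic_round(w: Weights, clients: Dict[str, List[float]]) -> Weights:
    """One cyclic round: the model is handed from client to client, with no averaging."""
    for data in clients.values():
        w = local_train(w, data)
    return w


# Hypothetical clients; the raw data lists are only ever read inside local_train.
clients = {"city_a": [1.0, 2.0, 3.0], "city_b": [10.0]}
w = [0.0]
for _ in range(20):
    w = centralized_round(w, clients)
```

In this sketch only model states cross the client boundary. In a Swarm-style setup the same averaging step would run on whichever client is chosen as aggregator for the round, rather than on a dedicated server.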
The open-source NVIDIA Federated Learning Application Runtime Environment (NVFlare) library provides a framework for transforming an existing machine learning workflow into an FL workflow. It supports a variety of machine learning workflows, offers extensive functionality for different use cases, and includes implementations of Swarm and Cyclic Learning.
We apply FL with NVFlare to a real-world application: UAV-based thermal imaging in urban environments, the third use case of the AI4EOSC project. Specifically, we detect thermal anomalies caused by features such as cars, manholes and streetlamps. Automatically filtering false alarms can improve the efficiency of energy-related systems. Since the data originates from different locations and cities, sharing it raises data protection concerns and incurs communication costs. Therefore, the clients are defined by the geographical location where the images were taken.
A U-Net model is trained to detect thermal anomalies automatically. This workflow is then transformed into an FL workflow using NVFlare. We compare centralized FL approaches (FedAvg, FedOpt, FedProx, SCAFFOLD and Ditto) with decentralized FL approaches (Swarm and Cyclic Learning) in terms of scalability, communication, accuracy and performance. Our research demonstrates the challenges and opportunities associated with FL and highlights the effectiveness of the different approaches in a real-world scenario.
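As an illustration of how one of the listed centralized approaches differs from plain averaging, the following sketch shows the local update commonly associated with FedProx: the client's loss gets an extra proximal term (mu / 2) * ||w - w_global||^2 that pulls the local model toward the current global model. The function name `fedprox_step` and its parameters are hypothetical, and the gradient is passed in as a plain list; this is not code from NVFlare or from the study.

```python
from typing import List


def fedprox_step(w: List[float], w_global: List[float], grad: List[float],
                 mu: float, lr: float) -> List[float]:
    """One local gradient step on loss(w) + (mu / 2) * ||w - w_global||^2.

    grad is the gradient of the client's own loss at w; the proximal term
    contributes mu * (w - w_global). With mu = 0 this is an ordinary SGD step.
    """
    return [
        wi - lr * (gi + mu * (wi - wgi))
        for wi, wgi, gi in zip(w, w_global, grad)
    ]


plain = fedprox_step([1.0], [0.0], [0.5], mu=0.0, lr=0.1)    # no proximal pull
pulled = fedprox_step([1.0], [0.0], [0.5], mu=1.0, lr=0.1)   # pulled toward w_global
```

A larger `mu` keeps local models closer to the global one, which is typically motivated by clients whose data distributions differ, such as images from different cities.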
Topic: Needs and solutions in scientific computing: Artificial Intelligence