Data sets held by different hospitals can differ in size, content and quality. This heterogeneity makes model training difficult. Another challenge is data privacy, which restricts how sensitive data can be shared. Here the authors first trained local AI models to deal with the heterogeneity, and then used global federated models to aggregate what the local models had learned without moving any sensitive data.
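The aggregation step can be illustrated with a minimal sketch of weighted federated averaging. This is a generic illustration, not necessarily the authors' method: the function name `federated_average`, the parameter vectors and the hospital sizes are all hypothetical, and a real model would have many more parameters.

```python
from typing import List, Sequence


def federated_average(local_weights: Sequence[Sequence[float]],
                      sample_counts: Sequence[int]) -> List[float]:
    """Average per-hospital parameter vectors, weighted by local data set size.

    Only model parameters are passed in; the patient data never leaves
    the hospital that trained the local model.
    """
    if not local_weights or len(local_weights) != len(sample_counts):
        raise ValueError("need one sample count per local model")
    total = sum(sample_counts)
    dim = len(local_weights[0])
    global_weights = [0.0] * dim
    for weights, n in zip(local_weights, sample_counts):
        for i in range(dim):
            # Larger data sets contribute proportionally more to the global model.
            global_weights[i] += weights[i] * n / total
    return global_weights


# Hypothetical parameters from three hospitals with different data set sizes.
hospital_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
hospital_sizes = [100, 300, 600]
global_model = federated_average(hospital_weights, hospital_sizes)
```

Weighting by data set size is one common choice; with strongly heterogeneous data, other weightings may be more appropriate.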
Hilmkil, Callh, Barbieri, Listo Zec, Sütfeld, Mogren, Scaling Federated Learning for Fine-tuning of Large Language Models, 25th International Conference on Natural Language & Information Systems, NLDB 2021.
Several language models are already publicly available, but not all of them are suitable for federated learning, which is used when sensitive data cannot be moved. This publication can help you choose a model for certain text classification tasks, such as those involving health records.
Onoszko, N., Karlsson, K., Mogren, O., Listo Zec, E., Decentralized federated learning of deep neural networks on non-iid data, Workshop on Federated Learning for User Privacy and Data Confidentiality at the 38th International Conference on Machine Learning, 2021.
In this publication the authors found ways to improve federated learning (moving models between data sets instead of moving the data) by letting clients with similar data sets train models together, which reduces the performance lost when models are trained across dissimilar data. This approach could be used by hospitals collaborating to train models that predict adverse events.
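One way to picture grouping clients by data similarity is the sketch below, in which a client scores each peer's model on its own local data and merges only with the best-scoring peers. This is an illustrative assumption rather than the algorithm from the publication: the one-number "models", the clinic names and the functions `select_similar_peers` and `merge_with_peers` are hypothetical.

```python
from typing import Dict, List


def mean_squared_error(prediction: float, data: List[float]) -> float:
    """Loss of a constant-prediction model on a list of local observations."""
    return sum((x - prediction) ** 2 for x in data) / len(data)


def select_similar_peers(local_data: List[float],
                         peer_models: Dict[str, float],
                         k: int) -> List[str]:
    """Return the k peers whose models have the lowest loss on the local data.

    A low loss is used here as a proxy for "this peer's data resembles mine".
    """
    losses = {name: mean_squared_error(model, local_data)
              for name, model in peer_models.items()}
    return sorted(losses, key=lambda name: losses[name])[:k]


def merge_with_peers(own_model: float,
                     peer_models: Dict[str, float],
                     chosen: List[str]) -> float:
    """Average the client's own model with the models of the chosen peers."""
    values = [own_model] + [peer_models[name] for name in chosen]
    return sum(values) / len(values)


# Hypothetical client: local observations and models received from three peers.
local_data = [1.0, 1.2, 0.8]
peer_models = {"clinic_a": 1.1, "clinic_b": 5.0, "clinic_c": 0.95}
chosen = select_similar_peers(local_data, peer_models, k=2)
merged = merge_with_peers(1.0, peer_models, chosen)
```

In this toy setting the peer whose model fits the local data poorly (`clinic_b`) is left out of the merge, so its dissimilar data does not degrade the client's model.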