Technologies

The building blocks of privacy-preserving AI

From model selection to deployment frameworks — a breakdown of the core technologies that underpin federated learning systems in real-world environments.

ML Models: LR · NN · CNN · Decision Trees

Data Partitioning: Horizontal · Vertical · Combined

Privacy Techniques: DP · Masking · Homomorphic Enc.

Aggregation Strategies: FedAvg · FedProx

FL Frameworks: Flower · TensorFlow Federated

ML Models

Compatible ML models

FL is model-agnostic. The right model depends on your data type, task complexity and privacy requirements. Commonly used models include:

Logistic Regression

A linear classifier suited to binary and multiclass tasks. Low complexity makes it ideal for resource-constrained FL participants with tabular data.

Neural Networks

Multi-layer perceptrons capable of learning complex patterns from structured data. The de facto backbone for most FL research.

Convolutional Neural Networks

Spatial feature extractors designed for image and signal data. Widely used in federated medical imaging — training across hospital silos without sharing scans.

Decision Trees

Interpretable rule-based classifiers that partition feature space. In FL, gradient-boosted variants such as XGBoost can be federated across participants without sharing raw data.

Data Partitioning

How data is split across participants

How your data is distributed across participants determines which type of FL applies: horizontal, vertical or a combination of the two. There are three fundamental partitioning types:

Horizontal Partitioning

Participants share the same feature space but hold different samples. The most common FL scenario — e.g. multiple hospitals with the same patient schema but different patients.

Vertical Partitioning

Participants share the same sample IDs but hold different feature sets. Common when different organisations hold complementary data on the same users — e.g. a bank and a retailer.

Combined Partitioning

Participants differ in both their samples and feature sets — the most complex and realistic scenario. Requires careful alignment strategies and is common in large cross-silo deployments.
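The difference between horizontal and vertical partitioning can be seen on a toy table. The sketch below is illustrative only: the records, field names and the helper functions `horizontal_split` and `vertical_split` are made up for this example and do not come from any FL framework.

```python
# Toy dataset: every value here is invented for illustration.
records = [
    {"id": 1, "age": 34, "income": 52000, "basket": 12.5},
    {"id": 2, "age": 51, "income": 78000, "basket": 40.0},
    {"id": 3, "age": 27, "income": 31000, "basket": 7.2},
    {"id": 4, "age": 45, "income": 64000, "basket": 22.9},
]


def horizontal_split(rows, n_parties):
    """Same columns for every party; each party holds a disjoint subset of rows."""
    return [rows[i::n_parties] for i in range(n_parties)]


def vertical_split(rows, feature_groups, key="id"):
    """Same sample IDs for every party; each party holds a different set of columns."""
    return [
        [{key: r[key], **{f: r[f] for f in group}} for r in rows]
        for group in feature_groups
    ]


# Horizontal: two hospitals, same schema, different patients.
hospital_a, hospital_b = horizontal_split(records, 2)

# Vertical: a bank and a retailer, same customers, different features.
bank, retailer = vertical_split(records, [["age", "income"], ["basket"]])
```

In the vertical case the shared `id` column is what lets the parties align their rows; real deployments typically need a private record-linkage step to do this without revealing the IDs themselves.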

Privacy Techniques

How data is kept private

Privacy-preserving techniques limit how much sensitive data is exposed during the FL process, including to the aggregating server. Common techniques include:

Differential Privacy

Calibrated statistical noise is added to model updates before sharing. This places a mathematical bound, set by the privacy budget (ε), on how much the aggregated output can reveal about any individual record, at some cost to model accuracy.
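One common recipe is to clip each update to a maximum L2 norm and then add Gaussian noise scaled to that norm. The following is a minimal sketch under that assumption; the function names and the `clip_norm` and `noise_multiplier` parameters are illustrative, and the code does not track the resulting privacy budget.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale the update down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def privatize_update(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip the update, then add Gaussian noise with std = noise_multiplier * clip_norm."""
    rng = rng or random.Random()
    clipped = clip_update(update, clip_norm)
    sigma = noise_multiplier * clip_norm
    return [x + rng.gauss(0.0, sigma) for x in clipped]


# Example: an update with L2 norm 5 is clipped to norm 1 before noise is added.
update = [3.0, 4.0]
noisy_update = privatize_update(update, rng=random.Random(0))
```

Clipping is what makes the noise "calibrated": it caps the influence any single update can have, so a fixed noise level can mask it. Production systems pair this with a privacy accountant to report the cumulative ε.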

Data Masking

Sensitive fields are replaced with redacted or pseudonymised values before any data is used in training. Identifiers, diagnoses and locations are hidden while retaining data utility.
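As a rough illustration, masking can be applied record by record before training. The sketch below assumes a keyed hash (HMAC-SHA256) for pseudonymisation and a fixed placeholder for redaction; the field names, the `SECRET_KEY` value and the helper functions are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key for illustration; a real key would be stored securely, not in code.
SECRET_KEY = b"replace-with-a-secret-key"


def pseudonymise(value, key=SECRET_KEY):
    """Map a value to a stable, keyed pseudonym (same input gives same output)."""
    digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def mask_record(record, pseudonymise_fields=("patient_id",), redact_fields=("address",)):
    """Return a copy of the record with identifiers pseudonymised and sensitive fields redacted."""
    masked = dict(record)
    for field in pseudonymise_fields:
        if field in masked:
            masked[field] = pseudonymise(masked[field])
    for field in redact_fields:
        if field in masked:
            masked[field] = "[REDACTED]"
    return masked


record = {"patient_id": "P-1042", "address": "12 Example Street", "age": 57}
masked = mask_record(record)
```

Because the pseudonym is stable, records belonging to the same person can still be linked for training, while the original identifier cannot be recovered without the key.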

Homomorphic Encryption

Model updates are encrypted before leaving the client. The server performs aggregation directly on ciphertexts and never sees an individual update in plaintext; only the holder of the decryption key can decrypt the aggregated result.
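To make "aggregation on ciphertexts" concrete, here is a toy version of the Paillier scheme, an additively homomorphic cryptosystem in which multiplying two ciphertexts yields an encryption of the sum of their plaintexts. This is a teaching sketch only: the primes are far too small to be secure, the fixed-point `SCALE` is an assumption, and a real system would use a vetted cryptographic library.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy key generation with tiny primes (insecure, for illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid because we use g = n + 1
    return n, (lam, mu)


def encrypt(n, m, rng=random):
    """Encrypt integer m under public key n, using generator g = n + 1."""
    n_sq = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return ((1 + (m % n) * n) * pow(r, n, n_sq)) % n_sq


def add_encrypted(n, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    return (c1 * c2) % (n * n)


def decrypt(n, private_key, c):
    """Decrypt and map the result back to a signed integer."""
    lam, mu = private_key
    x = pow(c, lam, n * n)
    m = ((x - 1) // n * mu) % n
    return m - n if m > n // 2 else m


SCALE = 1000  # fixed-point scale for turning floats into integers (assumption)
n, private_key = keygen()

# Each client encrypts one (quantised) update value.
client_values = [0.25, -0.10, 0.40]
ciphertexts = [encrypt(n, round(v * SCALE)) for v in client_values]

# The server combines ciphertexts without ever decrypting them.
encrypted_sum = ciphertexts[0]
for c in ciphertexts[1:]:
    encrypted_sum = add_encrypted(n, encrypted_sum, c)

# Only the key holder can recover the aggregate.
total = decrypt(n, private_key, encrypted_sum) / SCALE
```

The server only ever handles `ciphertexts` and `encrypted_sum`; the individual client values stay hidden from it, and only the sum is revealed to whoever holds `private_key`.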

Aggregation Strategies

How local models are combined

After each training round, client model updates must be aggregated into a single global model. The aggregation strategy directly affects convergence speed, fairness and robustness. Two widely used strategies are:

FedAvg

Federated Averaging is the foundational FL aggregation algorithm. Each client trains locally for several steps, then sends its model weights to the server. The server computes a weighted average — proportional to each client's dataset size — to produce an updated global model.
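The server-side step can be sketched in a few lines. The example below assumes each client's model is a flat list of floats and that the server knows each client's dataset size; the function name `fedavg` and the sample numbers are illustrative.

```python
def fedavg(client_weights, client_sizes):
    """Average client weight vectors, weighting each by its dataset size."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * size for w, size in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two clients: the second holds three times as much data, so it counts three times as much.
client_weights = [[1.0, 2.0], [3.0, 4.0]]
client_sizes = [100, 300]
global_weights = fedavg(client_weights, client_sizes)
```

Here `global_weights` lands closer to the second client's weights because that client contributed more samples.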

FedProx

An extension of FedAvg designed for heterogeneous FL systems. It adds a proximal term to each client's local objective that penalises divergence from the global model, limiting how far any single client's non-IID data can pull the global model off course.
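In gradient terms, the proximal term (μ/2)·‖w − w_global‖² adds μ·(w − w_global) to each local gradient step. The sketch below illustrates this with plain gradient descent on a made-up quadratic local loss; the function names, learning rate and the `local_optimum` values are assumptions for the example.

```python
def fedprox_local_update(w_global, grad_fn, mu=0.1, lr=0.1, steps=200):
    """Run local gradient descent on: local_loss(w) + (mu / 2) * ||w - w_global||^2."""
    w = list(w_global)
    for _ in range(steps):
        grad = grad_fn(w)
        w = [
            wi - lr * (gi + mu * (wi - wgi))  # local gradient plus proximal pull
            for wi, gi, wgi in zip(w, grad, w_global)
        ]
    return w


# Hypothetical client whose local loss 0.5 * ||w - local_optimum||^2 is minimised
# far from the global model, as can happen with non-IID data.
local_optimum = [4.0, -2.0]


def local_grad(w):
    return [wi - ti for wi, ti in zip(w, local_optimum)]


w_global = [0.0, 0.0]
w_plain = fedprox_local_update(w_global, local_grad, mu=0.0)  # behaves like plain local training
w_prox = fedprox_local_update(w_global, local_grad, mu=1.0)   # pulled back toward w_global
```

With `mu=0.0` the client drifts all the way to its own optimum; with `mu=1.0` it settles roughly halfway between that optimum and the global model, which is the stabilising effect the proximal term is meant to provide.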

FL Frameworks

Tools for building FL systems

Open-source frameworks handle the communication, aggregation and orchestration infrastructure, letting teams focus on model development rather than FL plumbing. Two widely used frameworks are:

Flower (flwr)

A framework-agnostic FL library that works with any ML backend — PyTorch, TensorFlow, JAX, scikit-learn or custom code. Designed for real-world cross-silo and cross-device deployments with a simple client-server API that abstracts all communication and aggregation logic.

Python · cross-silo · cross-device · production-ready

TensorFlow Federated (TFF)

Google's FL framework built natively on TensorFlow. TFF provides a layered architecture — from low-level federated operators up to high-level simulation APIs — making it particularly well-suited for research and prototyping federated algorithms before production deployment.

Python · TensorFlow · simulation · research

Get Started

Start your journey toward privacy-preserving AI

Whether you're exploring FL or ready to deploy, we can help you assess, design and implement the right solution for your organisation.

Book a consultation