Entropy-Controlled Partition creates non-IID distributions with precise control over partition “hardness” measured by label entropy. This enables systematic benchmarking across controlled heterogeneity levels.
Label entropy for client $k$:
\[H(p_k) = -\sum_{c=1}^C p_{k,c} \log p_{k,c}\]
where $p_{k,c}$ is the proportion of class $c$ in client $k$’s data.
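To make the definition concrete, here is a minimal sketch (NumPy assumed; `label_entropy` is an illustrative helper, not part of the library) that computes $H(p_k)$ from a client's raw label vector:

```python
import numpy as np

def label_entropy(labels, num_classes):
    """H(p_k) = -sum_c p_{k,c} log p_{k,c}, natural log;
    zero-count classes contribute nothing (0 log 0 := 0)."""
    counts = np.bincount(labels, minlength=num_classes)
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

# A single-class client sits at H = 0; a uniform client at H = log C.
assert label_entropy([2, 2, 2, 2], num_classes=4) == 0.0
assert abs(label_entropy([0, 1, 2, 3], num_classes=4) - np.log(4)) < 1e-12
```

The two assertions show the endpoints of the attainable range $[0, \log C]$ used throughout this report.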
The partitioner targets a specific entropy level:
\[H_{target} \in [0, \log C]\]
where $C$ is the number of classes: $H_{target} = 0$ forces single-class clients, while $H_{target} = \log C$ yields uniform (near-IID) label distributions.
The implementation is located at src/unbitrium/partitioning/entropy_controlled.py.
Achieved entropy within specified tolerance:
\[|H(p_k) - H_{target}| \leq \epsilon\]
Verification: Entropy constraint satisfied for all clients.
Label proportions form valid distributions:
\[\sum_{c=1}^C p_{k,c} = 1, \quad p_{k,c} \geq 0\]
Verification: Proportions normalized.
Lower target entropy implies higher heterogeneity:
\[H_1 < H_2 \implies \text{EMD}(p_1, p_{uniform}) > \text{EMD}(p_2, p_{uniform})\]
Verification: EMD correlates inversely with entropy.
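This inverse relationship can be checked numerically. The sketch below uses the L1 distance to the uniform distribution as the EMD, a common heterogeneity proxy in the FL literature; whether this matches the report's exact EMD definition is an assumption:

```python
import numpy as np

def label_entropy(p):
    """Shannon entropy of a label distribution (natural log)."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def emd_to_uniform(p):
    # L1-style earth mover's distance to the uniform distribution
    # (assumption: this is the EMD variant meant by the report).
    uniform = np.full_like(p, 1.0 / len(p))
    return float(np.abs(p - uniform).sum())

# A sharper (lower-entropy) distribution sits farther from uniform.
p1 = np.array([0.85, 0.05, 0.05, 0.05])  # low entropy
p2 = np.array([0.40, 0.30, 0.20, 0.10])  # higher entropy

assert label_entropy(p1) < label_entropy(p2)
assert emd_to_uniform(p1) > emd_to_uniform(p2)
```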
Fixed seed produces identical partitions.
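The reproducibility invariant rests on seeded pseudo-randomness. A generic sketch of the underlying guarantee (not the partitioner's actual internals) is that the same seed yields the same sample-to-client assignment order:

```python
import numpy as np

# Two runs seeded identically produce identical index permutations,
# hence identical partitions for the same input dataset.
idx_a = np.random.default_rng(42).permutation(1000)
idx_b = np.random.default_rng(42).permutation(1000)
assert (idx_a == idx_b).all()
```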
Configuration:
Expected Behavior:
Configuration:
Expected Behavior:
Configuration:
Expected Behavior:
Configuration:
Expected Behavior:
| Entropy $H$ | $H / H_{max}$ | Hardness | FL Impact |
|---|---|---|---|
| 0.0 - 0.5 | 0 - 20% | Very hard | Severe degradation |
| 0.5 - 1.0 | 20 - 40% | Hard | Significant degradation |
| 1.0 - 1.5 | 40 - 65% | Medium | Moderate impact |
| 1.5 - 2.0 | 65 - 85% | Easy | Minor impact |
| 2.0 - 2.3 | 85 - 100% | Very easy | Near-IID |
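Assuming the table above was computed for $C = 10$ classes ($H_{max} = \log 10 \approx 2.303$, which matches the 2.3 upper bound), the bands can be expressed as a hypothetical helper keyed on the normalized entropy $H / H_{max}$:

```python
import math

def hardness(h, num_classes=10):
    """Map an achieved entropy to the report's hardness bands
    (assumes the table was built for C = 10, H_max = log 10)."""
    frac = h / math.log(num_classes)
    if frac < 0.20:
        return "Very hard"
    if frac < 0.40:
        return "Hard"
    if frac < 0.65:
        return "Medium"
    if frac < 0.85:
        return "Easy"
    return "Very easy"

assert hardness(0.3) == "Very hard"   # 0.3 / 2.303 ~ 13%
assert hardness(2.2) == "Very easy"   # 2.2 / 2.303 ~ 96%
```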
| Metric | Range | Notes |
|---|---|---|
| target_entropy | $[0, \log C]$ | Requested entropy |
| achieved_entropy | $[0, \log C]$ | Actual mean entropy |
| entropy_variance | $[0, \infty)$ | Variance across clients |
| entropy_tolerance | $(0, 1]$ | Acceptable deviation |
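The achieved-entropy and variance metrics follow directly from per-client label distributions; a sketch (the function name is illustrative, not the library's API):

```python
import numpy as np

def entropy_metrics(client_label_probs):
    """Mean and variance of per-client label entropies, matching the
    achieved_entropy and entropy_variance rows above (a sketch)."""
    def H(p):
        p = np.asarray(p, dtype=float)
        p = p[p > 0]
        return float(-np.sum(p * np.log(p)))
    ents = np.array([H(p) for p in client_label_probs])
    return {
        "achieved_entropy": float(ents.mean()),  # mean over clients
        "entropy_variance": float(ents.var()),   # spread across clients
    }

# Two clients: one balanced (H = log 2), one single-class (H = 0).
m = entropy_metrics([[0.5, 0.5], [1.0, 0.0]])
```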
Input: Target $H = 0$
Expected Behavior: Each client receives samples from a single class (maximal heterogeneity).
Input: Target $H = \log C$
Expected Behavior: Each client's label distribution is uniform over all $C$ classes (near-IID).
Input: Constraints cannot be satisfied
Expected Behavior:
Input: Few samples per client
Expected Behavior: Achieved entropy may deviate from the target, because integer label counts quantize the attainable entropy values.
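The few-samples effect can be seen by enumerating the attainable entropies: with $n = 4$ samples over $C = 3$ classes, only four distinct values exist, so a tight tolerance may be unsatisfiable. An illustrative sketch:

```python
import itertools
import math

def H(counts):
    """Entropy of an integer label-count vector (natural log)."""
    n = sum(counts)
    return -sum(c / n * math.log(c / n) for c in counts if c)

# Enumerate every count vector for n samples over C classes.
n, C = 4, 3
achievable = sorted({round(H(c), 4)
                     for c in itertools.product(range(n + 1), repeat=C)
                     if sum(c) == n})
# Only a handful of entropy values are reachable, so small clients
# can miss a target H by more than a tight tolerance.
```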
```python
import numpy as np
from scipy.stats import entropy

from unbitrium.partitioning import EntropyControlledPartition

num_classes = 10
target_entropy = 1.0
tolerance = 0.1

partitioner = EntropyControlledPartition(
    target_entropy=target_entropy,
    tolerance=tolerance,
    num_clients=100,
    seed=42,
)

# Verify achieved entropy; `partitions` is assumed to be the list of
# per-client (x, y) datasets produced by the partitioner.
for client_data in partitions:
    labels = [y for _, y in client_data]
    label_counts = np.bincount(labels, minlength=num_classes)
    label_probs = label_counts / label_counts.sum()
    client_entropy = entropy(label_probs)  # natural log by default
    assert abs(client_entropy - target_entropy) < tolerance
```
Entropy control enables systematic benchmarking across controlled heterogeneity levels. Conceptually, the partitioner solves, for each client $k$,
\[\min_{p_k} \; |H(p_k) - H_{target}|\]
subject to probability simplex constraints, iterating until the tolerance $\epsilon$ is met; runtime scales with $I$, the number of iterations needed to achieve the target entropy.
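One generic way to satisfy such a program, shown purely for illustration and not necessarily the method used by `entropy_controlled.py`, is bisection on a softmax temperature: softmax entropy grows monotonically with temperature, from $0$ (peaked) toward $\log C$ (uniform), so a root-finding sweep can hit any target in between.

```python
import numpy as np

def shannon_entropy(p):
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def proportions_with_entropy(h_target, num_classes, tol=1e-6, seed=0):
    """Bisect a softmax temperature until the label proportions hit
    the requested entropy. Illustrative sketch only -- not claimed
    to be the algorithm in entropy_controlled.py."""
    rng = np.random.default_rng(seed)
    logits = rng.normal(size=num_classes)
    lo, hi = 1e-3, 1e3  # low t -> peaked (H ~ 0); high t -> uniform (H ~ log C)
    while hi - lo > tol:
        t = (lo + hi) / 2
        z = logits / t
        p = np.exp(z - z.max())
        p /= p.sum()
        if shannon_entropy(p) < h_target:
            lo = t  # too peaked: raise the temperature
        else:
            hi = t
    return p

p = proportions_with_entropy(1.0, num_classes=10)
assert abs(shannon_entropy(p) - 1.0) < 1e-2
```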
FedSym Paper (2021). Entropy-based heterogeneity benchmarking in federated learning.
Wang, J., et al. (2020). Tackling the objective inconsistency problem in heterogeneous federated optimization. In NeurIPS.
Li, Q., et al. (2022). Federated learning on non-IID data silos: An experimental study. In ICDE.
| Version | Date | Changes |
|---|---|---|
| 1.0.0 | 2026-01-04 | Initial validation report |
Copyright 2026 Olaf Yunus Laitinen Imanov and Contributors. Released under EUPL 1.2.