Jensen-Shannon Divergence (JSD) is a symmetric, bounded measure of dissimilarity between probability distributions. It addresses the asymmetry of KL divergence and is always finite.

\[\text{JS}(p \| q) = \frac{1}{2} \text{KL}(p \| m) + \frac{1}{2} \text{KL}(q \| m)\]

where $m = \frac{1}{2}(p + q)$ is the mixture distribution.
Expanding the KL terms:
\[\text{JS}(p \| q) = \frac{1}{2} \sum_i p_i \log\frac{2p_i}{p_i + q_i} + \frac{1}{2} \sum_i q_i \log\frac{2q_i}{p_i + q_i}\]

The square root $\sqrt{\text{JS}(p \| q)}$ is a proper metric (it satisfies the triangle inequality).
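As a concrete illustration, the definition can be computed directly with NumPy. This is a hypothetical standalone sketch, not the library implementation (which lives in src/unbitrium/metrics/distribution.py); the function name `js_divergence` is invented here.

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence in nats (illustrative sketch).

    Computes 1/2 KL(p || m) + 1/2 KL(q || m) with m = (p + q) / 2,
    treating 0 * log(0) terms as 0.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m = 0.5 * (p + q)

    def kl_to_mixture(a):
        mask = a > 0  # entries with a_i = 0 contribute nothing
        return np.sum(a[mask] * np.log(a[mask] / m[mask]))

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)
```

The masking makes the function symmetric, zero for identical inputs, and equal to the upper bound $\ln 2$ for distributions with disjoint supports.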
The implementation is located at src/unbitrium/metrics/distribution.py.
Verification: all computed values are non-negative.
Verification: the result is order-independent (symmetric in $p$ and $q$).
Verification: identical distributions yield exactly zero.
Using the natural logarithm:

\[0 \leq \text{JS}(p \| q) \leq \ln 2\]

Using the base-2 logarithm:

\[0 \leq \text{JS}(p \| q) \leq 1\]

Verification: all values fall within these bounds.
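The two bases differ only by a constant factor: dividing a natural-log result by $\ln 2$ converts it to base-2 units. A minimal sketch (the helper name `nats_to_bits` is ours):

```python
import math

def nats_to_bits(js_nats: float) -> float:
    """Convert a JS value computed with the natural log into base-2 units."""
    return js_nats / math.log(2)

# The natural-log upper bound ln 2 maps to exactly 1.0 in base 2.
print(nats_to_bits(math.log(2)))  # → 1.0
```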
Input: $p = q$
Expected Output: JS = 0
Input: disjoint supports, e.g. $p = (1, 0)$, $q = (0, 1)$
Expected Output: JS = $\ln 2 \approx 0.693$ (natural log)
Input: non-overlapping supports over more classes, e.g. $p = (0.5, 0.5, 0, 0)$, $q = (0, 0, 0.5, 0.5)$
Expected Output: JS = $\ln 2$
Input: Small perturbation from uniform
Expected Behavior: JS close to zero
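The test cases above can be checked numerically. The sketch below uses a local helper rather than the library API, and the near-zero threshold is an arbitrary choice for illustration:

```python
import numpy as np

def js(p, q):
    # Minimal JS in nats (illustrative helper, not the library version).
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)
    half = lambda a: np.sum(a[a > 0] * np.log(a[a > 0] / m[a > 0]))
    return 0.5 * half(p) + 0.5 * half(q)

uniform = np.full(4, 0.25)
assert js(uniform, uniform) == 0.0                  # identical -> 0
assert abs(js([1, 0], [0, 1]) - np.log(2)) < 1e-12  # disjoint -> ln 2
perturbed = np.array([0.26, 0.24, 0.25, 0.25])
assert js(perturbed, uniform) < 1e-3                # near-uniform -> near 0
```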
| JS (base e) | JS (base 2) | Heterogeneity |
|---|---|---|
| 0.0 - 0.1 | 0.0 - 0.14 | Minimal |
| 0.1 - 0.3 | 0.14 - 0.43 | Low |
| 0.3 - 0.5 | 0.43 - 0.72 | Moderate |
| 0.5 - 0.693 | 0.72 - 1.0 | High |
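The banding in the table can be applied programmatically; the function name and the use of the base-e thresholds are our choices, not part of the library:

```python
def heterogeneity_level(js_nats: float) -> str:
    """Map a base-e JS value to the qualitative bands in the table above."""
    if js_nats < 0.1:
        return "Minimal"
    if js_nats < 0.3:
        return "Low"
    if js_nats < 0.5:
        return "Moderate"
    return "High"

print(heterogeneity_level(0.05))  # → Minimal
print(heterogeneity_level(0.65))  # → High
```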
| Property | KL | JS |
|---|---|---|
| Symmetric | No | Yes |
| Bounded | No | Yes |
| Metric | No | $\sqrt{\text{JS}}$ is |
| Handles zeros | Problematic | Safe |
Input: $p_i = 0$ for some $i$
Expected Behavior: the result stays finite; terms with $p_i = 0$ contribute zero, and the mixture satisfies $m_i > 0$ wherever $p_i$ or $q_i$ is positive.
Input: Most probability mass on few classes
Expected Behavior: JS relative to a uniform reference is high, approaching $\ln 2$ as the number of classes grows and mass concentrates on a single class.
Input: $C = 1$
Expected Behavior: with a single class both distributions equal $(1.0)$, so JS is exactly 0.
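The three edge cases can be demonstrated with the same masked formulation (an illustrative helper, not the library code):

```python
import numpy as np

def js(p, q):
    # JS in nats; entries with a_i = 0 are skipped, so log(0) never occurs.
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)
    half = lambda a: np.sum(a[a > 0] * np.log(a[a > 0] / m[a > 0]))
    return 0.5 * half(p) + 0.5 * half(q)

# Zero entries: result is finite, no special casing needed.
print(np.isfinite(js([0.5, 0.5, 0.0], [0.2, 0.3, 0.5])))  # → True

# Highly skewed vs. uniform: JS approaches ln 2 ≈ 0.693 for large C.
C = 1000
skewed = np.zeros(C)
skewed[0] = 1.0
print(js(skewed, np.full(C, 1 / C)))

# Single class: both distributions are (1.0,), so JS is exactly 0.
print(js([1.0], [1.0]))  # → 0.0
```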
import numpy as np

from unbitrium.metrics import JSDivergence
metric = JSDivergence(base='e') # or 'base2'
p = np.array([0.3, 0.3, 0.4])
q = np.array([0.5, 0.3, 0.2])
js = metric.compute(p, q)
print(f"JS Divergence: {js:.4f}")
# Compute JS for each client's label distribution vs. the global distribution
global_dist = compute_label_distribution(global_data)
js_values = []
for client_data in partitions:
    client_dist = compute_label_distribution(client_data)
    js = metric.compute(client_dist, global_dist)
    js_values.append(js)
JS divergence reveals how far each client's label distribution deviates from the global label distribution.
Privacy considerations are the same as for EMD: report only aggregates, optionally with differential privacy.
Computational complexity is linear in the number of classes, $O(C)$.
JS is the mutual information between a sample $X$ and a binary indicator $Z \sim \text{Bernoulli}(\tfrac{1}{2})$ that selects whether $X$ is drawn from $p$ or from $q$.
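Equivalently, JS can be written as $H(m) - \frac{1}{2}H(p) - \frac{1}{2}H(q)$: the entropy of the mixture minus the average entropy of the components, i.e. the information observing $X$ gives about $Z$. A numeric sketch of this identity (helper names are ours):

```python
import numpy as np

def entropy(d):
    """Shannon entropy in nats, skipping zero entries."""
    d = np.asarray(d, dtype=float)
    d = d[d > 0]
    return -np.sum(d * np.log(d))

def js_via_entropy(p, q):
    """JS as mutual information I(Z; X): entropy of the mixture
    minus the average entropy of the two components."""
    m = 0.5 * (np.asarray(p, float) + np.asarray(q, float))
    return entropy(m) - 0.5 * entropy(p) - 0.5 * entropy(q)

print(js_via_entropy([0.3, 0.3, 0.4], [0.5, 0.3, 0.2]))
```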
JS is also bounded above and below by functions of the total variation (TV) distance, so the two quantities vanish together.
Lin, J. (1991). Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory, 37(1), 145-151.
Endres, D. M., & Schindelin, J. E. (2003). A new metric for probability distributions. IEEE Transactions on Information Theory, 49(7), 1858-1860.
Fuglede, B., & Topsoe, F. (2004). Jensen-Shannon divergence and Hilbert space embedding. In IEEE International Symposium on Information Theory.
| Version | Date | Changes |
|---|---|---|
| 1.0.0 | 2026-01-04 | Initial validation report |
Copyright 2026 Olaf Yunus Laitinen Imanov and Contributors. Released under EUPL 1.2.