Uncovering Latent Subgroups: Spectral Clustering for Fairness Analysis in Contrastive Embeddings
Proceedings of the 39th Canadian Conference on Artificial Intelligence, PMLR 318:1076-1083, 2026.
Abstract
Contrastive learning enables scalable representation learning in computer vision and healthcare, yet embedding spaces may encode unequal geometric structure across latent subpopulations, leading to downstream performance disparities. Conventional fairness audits that rely on demographic labels often fail to detect such structural bias. This work examines whether contrastive embeddings contain fairness-relevant latent subgroups that can be identified without demographic supervision. We introduce a label-free spectral fairness audit that constructs similarity graphs over CLIP embeddings and applies eigengap-based spectral clustering. Experiments on CheXpert reveal stable latent subgroups with noticeable geometric distortions and performance gaps, exposing hidden fairness risks missed by demographic-based evaluations. The audit enables label-free discovery of hidden fairness and reliability risks in contrastive embeddings, supporting safer, more transparent deployment of foundation models in healthcare and other high-stakes domains.
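
For illustration only, the general idea of eigengap-based spectral clustering over an embedding similarity graph can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the graph construction, the Laplacian variant, or the clustering step, so the cosine k-nearest-neighbour graph, the symmetric normalized Laplacian, the farthest-first k-means, and every name and parameter here (`eigengap_spectral_clusters`, `n_neighbors`, `max_k`) are hypothetical choices. The input `X` stands in for precomputed embeddings (for example from CLIP), one row per image.

```python
import numpy as np


def farthest_first_kmeans(U, k, n_iter=50):
    """Lloyd's k-means with a deterministic farthest-first initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen center.
        dists = np.min([np.linalg.norm(U - c, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dists)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = np.linalg.norm(U[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(d, axis=1)
        # Keep the old center if a cluster happens to be empty.
        new_centers = np.array([
            U[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def eigengap_spectral_clusters(X, n_neighbors=10, max_k=10):
    """Return (k, labels, eigenvalues) for embeddings X of shape (n, d).

    Assumed pipeline (hypothetical, for illustration): cosine kNN graph,
    symmetric normalized Laplacian, k chosen at the largest eigengap.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = Xn @ Xn.T                      # cosine similarity
    np.fill_diagonal(S, -np.inf)       # exclude self-matches from the kNN search

    # Keep each point's n_neighbors most similar points; clip negative weights.
    idx = np.argsort(-S, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = idx.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.clip(S[rows, cols], 0.0, None)
    W = np.maximum(W, W.T)             # symmetrize the graph

    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    L = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    vals, vecs = np.linalg.eigh(L)     # eigenvalues in ascending order

    # Eigengap heuristic: choose k where the jump between consecutive
    # eigenvalues is largest among the first max_k candidates.
    gaps = np.diff(vals[:max_k + 1])
    k = int(np.argmax(gaps)) + 1

    # Cluster the row-normalized spectral embedding.
    U = vecs[:, :k]
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    labels = farthest_first_kmeans(U, k)
    return k, labels, vals
```

In an audit of the kind the abstract describes, the returned `labels` would serve as the latent subgroup assignment: a downstream metric such as per-class accuracy could then be computed separately within each cluster and compared across clusters, without reference to demographic labels.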