Many learning algorithms, especially nonparametric ones, use distance measures as a source of prior knowledge about the domain. This paper shows how the work of Baxter and Yianilos provides a formal equivalence between distance measures and prior probability distributions in Bayesian inference. The prior distribution applies either to the process that generated the data or to the shape of the discrimination boundary. This perspective is useful for extending distance-based algorithms to new feature spaces, and especially for learning distance measures on those spaces.
See also: Learning distance measures from labeled data -- An overview.