Find the index of the min value in a pdist condensed distance matrix


I have used scipy.spatial.distance.pdist(X) to calculate the Euclidean distance between each pair of elements of the list X below:

X = [[0, 3, 4, 2], [23, 5, 32, 1], [3, 4, 2, 1], [33, 54, 5, 12]]

This returns a condensed distance matrix:

array([ 36.30426972, 3.87298335, 61.57109712, 36.06937759, 57.88782255, 59.41380311])

For each element of X, I need to find the index of the closest other element.

Converting the condensed distance matrix to square form helps visualize the results, but I can't figure out how to programmatically identify, for each element of X, the index of its closest other element.

array([[  0.        ,  36.30426972,   3.87298335,  61.57109712],
       [ 36.30426972,   0.        ,  36.06937759,  57.88782255],
       [  3.87298335,  36.06937759,   0.        ,  59.41380311],
       [ 61.57109712,  57.88782255,  59.41380311,   0.        ]])

I believe argmin() is the function to use, but I'm lost from here. Thanks for any help in advance.


We'll operate on the square form of the results, called distances here. First, to exclude "New York is closest to New York" answers, set the diagonal to infinity:

numpy.fill_diagonal(distances, numpy.inf)

Then, it's a simple argmin along an axis:

closest_points = distances.argmin(axis=0)
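
Putting the two steps together, here is a minimal end-to-end sketch using the X from the question. It assumes the square form is built with scipy.spatial.distance.squareform; the name closest_distances is introduced here for illustration and is not part of the original answer.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

X = [[0, 3, 4, 2], [23, 5, 32, 1], [3, 4, 2, 1], [33, 54, 5, 12]]

# Expand the condensed distance vector into a symmetric n x n matrix.
distances = squareform(pdist(X))

# Replace the zero self-distances with infinity so that no point
# is reported as its own nearest neighbour.
np.fill_diagonal(distances, np.inf)

# closest_points[i] is the index of the element nearest to X[i].
closest_points = distances.argmin(axis=0)

# Illustrative extra: the distance to that nearest element.
closest_distances = distances.min(axis=0)

print(closest_points)  # [2 2 0 1]
```

Because the square matrix is symmetric, axis=0 and axis=1 give the same result here.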


