python - Find closest/similar value(vector) inside a matrix

Question

posted Oct 17, 2021 in Technique by 深蓝 (71.8m points)

Let's say I have the following NumPy matrix (simplified):

import numpy as np

matrix = np.array([[1, 1],
                   [2, 2],
                   [5, 5],
                   [6, 6]])

And now I want to get the vector from the matrix closest to a "search" vector:

search_vec = np.array([3, 3])

What I have done is the following:

min_dist = None
result_vec = None
for ref_vec in matrix:
    # Euclidean distance; a norm is never negative, so abs() is not needed
    distance = np.linalg.norm(search_vec - ref_vec)
    print(ref_vec, distance)
    if min_dist is None or distance < min_dist:
        min_dist = distance
        result_vec = ref_vec

This works, but is there a native NumPy way to do it more efficiently? My problem is that the larger the matrix becomes, the slower the whole process gets. Are there other solutions that handle this more elegantly and efficiently?

1 Reply

深蓝 · Answer 1 · 2021-10-17T03:06:45+0000

Approach #1

We can use SciPy's Cython-powered k-d tree (cKDTree) for fast nearest-neighbor lookup. It is efficient in both memory use and run time:

In [276]: from scipy.spatial import cKDTree

In [277]: matrix[cKDTree(matrix).query(search_vec, k=1)[1]]
Out[277]: array([2, 2])
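
If there are several search vectors, the tree only needs to be built once and can then be queried in a batch. A minimal sketch, assuming a hypothetical array search_vecs that is not part of the question:

```python
import numpy as np
from scipy.spatial import cKDTree

matrix = np.array([[1, 1], [2, 2], [5, 5], [6, 6]])

# Hypothetical batch of query vectors (not from the question).
search_vecs = np.array([[3, 3], [5.4, 5.4], [0, 0]])

# Build the tree once, then reuse it for every query.
tree = cKDTree(matrix)

# For k=1, query returns one distance and one row index per query vector.
dists, idx = tree.query(search_vecs, k=1)
closest_vecs = matrix[idx]
```

Building the tree is the expensive step, so this pays off mainly when the same matrix is searched repeatedly.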

Approach #2

With SciPy's cdist:

In [286]: from scipy.spatial.distance import cdist

In [287]: matrix[cdist(matrix, np.atleast_2d(search_vec)).argmin()]
Out[287]: array([2, 2])
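
The same call extends to several search vectors at once. A minimal sketch, again assuming a hypothetical search_vecs array; note that cdist materializes the full distance matrix of shape (rows of matrix, number of queries), so memory grows with both:

```python
import numpy as np
from scipy.spatial.distance import cdist

matrix = np.array([[1, 1], [2, 2], [5, 5], [6, 6]])

# Hypothetical batch of query vectors (not from the question).
search_vecs = np.array([[3, 3], [5.4, 5.4], [0, 0]])

# dist_matrix[i, j] is the Euclidean distance from matrix[i] to search_vecs[j].
dist_matrix = cdist(matrix, search_vecs)

# argmin along axis 0 picks, for each query, the closest row of matrix.
idx = dist_matrix.argmin(axis=0)
closest_vecs = matrix[idx]
```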

Approach #3

With scikit-learn's NearestNeighbors:

from sklearn.neighbors import NearestNeighbors

nbrs = NearestNeighbors(n_neighbors=1).fit(matrix)
closest_vec = matrix[nbrs.kneighbors(np.atleast_2d(search_vec))[1][0,0]]

Approach #4

With scikit-learn's KDTree:

from sklearn.neighbors import KDTree
kdt = KDTree(matrix, metric='euclidean')
cv = matrix[kdt.query(np.atleast_2d(search_vec), k=1, return_distance=False)[0,0]]

Approach #5

Following the approach described in the wiki of the eucl_dist package (disclaimer: I am its author), we can leverage matrix multiplication to compute squared Euclidean distances:

M = matrix.dot(search_vec)
d = np.einsum('ij,ij->i',matrix,matrix) + np.inner(search_vec,search_vec) -2*M
closest_vec = matrix[d.argmin()]
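
Here d holds squared distances, which is enough for argmin because the square root does not change the ordering. As a sanity check, the sketch below compares it with a plain NumPy broadcasting version (np.linalg.norm with axis=1), which is also a loop-free replacement for the code in the question:

```python
import numpy as np

matrix = np.array([[1, 1], [2, 2], [5, 5], [6, 6]])
search_vec = np.array([3, 3])

# Squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b
M = matrix.dot(search_vec)
d = np.einsum('ij,ij->i', matrix, matrix) + np.inner(search_vec, search_vec) - 2 * M

# Direct broadcasting baseline: one Euclidean distance per row of matrix.
direct = np.linalg.norm(matrix - search_vec, axis=1)

# Both formulations agree and select the same row.
assert np.allclose(d, direct ** 2)
closest_vec = matrix[direct.argmin()]
```

With floating-point inputs, the expanded form can lose some precision when the vectors are large and close together, so the direct version is the safer default unless speed matters.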
