# utils.knn_graph

utils.knn_graph(X, k, method='brute-force', leaf_size=30)

Compute the symmetric k-nearest neighbor graph for a set of points. The Euclidean distance metric is assumed.

Parameters:

- **X** : numpy array | list [numpy arrays]

  Data points, with each row as an observation.

- **k** : int

  The number of points to consider as neighbors of any given observation.

- **method** : {'brute-force', 'kd-tree', 'ball-tree'}, optional

  Computing method.

  - 'brute-force': computes the (Euclidean) distance between all O(n^2) pairs of rows in 'X', then for every point finds the k nearest. It is limited to tens of thousands of observations (depending on available RAM).
  - 'kd-tree': partitions the data into axis-aligned rectangles to avoid computing all O(n^2) pairwise distances. Much faster than 'brute-force', but only works for data in fewer than about 20 dimensions. Requires the scikit-learn library.
  - 'ball-tree': partitions the data into balls and uses the metric property of Euclidean distance to avoid computing all O(n^2) distances. Typically much faster than 'brute-force', and works with up to a few hundred dimensions. Requires the scikit-learn library.

- **leaf_size** : int, optional

  For the 'kd-tree' and 'ball-tree' methods, the number of observations in the leaf nodes. Leaves are not split further, so distance computations within leaf nodes are done by brute force. 'leaf_size' is ignored for the 'brute-force' method.

Returns:

- **neighbors** : numpy array

  Each row contains the nearest neighbors of the corresponding row in 'X', indicated by row indices.

- **radii** : list [float]

  For each row of 'X', the distance to its k'th nearest neighbor (including itself).
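To make the 'brute-force' method concrete, here is a minimal NumPy sketch of its core neighbor search: all O(n^2) pairwise distances are computed, then each point's k nearest (including itself) are selected. The helper name `brute_force_knn_graph` is hypothetical, not part of this library, and the symmetrization step of the full graph is omitted.

```python
import numpy as np

def brute_force_knn_graph(X, k):
    """Hypothetical sketch of the 'brute-force' neighbor search:
    all pairwise Euclidean distances, then the k nearest per point."""
    X = np.asarray(X, dtype=float)

    # Pairwise squared distances via ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2.
    sq_norms = (X ** 2).sum(axis=1)
    d2 = sq_norms[:, None] - 2.0 * (X @ X.T) + sq_norms[None, :]
    np.maximum(d2, 0.0, out=d2)  # guard against tiny negative round-off

    # Indices of the k smallest distances in each row; every point is its
    # own nearest neighbor at distance 0, mirroring the documented output.
    order = np.argsort(d2, axis=1)
    neighbors = order[:, :k]

    # Distance to the k'th nearest neighbor, counting the point itself.
    radii = np.sqrt(d2[np.arange(len(X)), neighbors[:, -1]])
    return neighbors, radii.tolist()
```

The quadratic memory for `d2` is what limits this method to tens of thousands of observations; the tree-based methods exist precisely to avoid materializing that matrix.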

Examples

>>> import numpy
>>> import debacl
>>> X = numpy.random.rand(100, 2)
>>> knn, radii = debacl.utils.knn_graph(X, k=8, method='kd-tree')