utils.knn_graph

utils.knn_graph(X, k, method='brute_force', leaf_size=30)

Compute the symmetric k-nearest neighbor graph for a set of points. Assumes a Euclidean distance metric.
Parameters: X : numpy array or list of numpy arrays
Data points, with each row as an observation.
k : int
The number of points to consider as neighbors of any given observation.
method : {‘brute_force’, ‘kd_tree’, ‘ball_tree’}, optional
Computing method.
- ‘brute_force’: computes the (Euclidean) distance between all O(n^2) pairs of rows in ‘X’, then finds the k nearest for every point. Limited to tens of thousands of observations (depending on available RAM).
- ‘kd_tree’: partitions the data into axis-aligned rectangles to avoid computing all O(n^2) pairwise distances. Much faster than ‘brute_force’, but only effective for data in fewer than about 20 dimensions. Requires the scikit-learn library.
- ‘ball_tree’: partitions the data into balls and uses the metric property of Euclidean distance to avoid computing all O(n^2) pairwise distances. Typically much faster than ‘brute_force’, and works with up to a few hundred dimensions. Requires the scikit-learn library.
leaf_size : int, optional
For the ‘kd_tree’ and ‘ball_tree’ methods, the number of observations in the leaf nodes. Leaves are not split further, so distance computations within leaf nodes are done by brute force. ‘leaf_size’ is ignored for the ‘brute_force’ method.
Returns: neighbors : numpy array
Each row contains the nearest neighbors of the corresponding row in ‘X’, indicated by row indices.
radii : list[float]
For each row of ‘X’, the distance to its k’th nearest neighbor (including itself).
Examples
>>> import numpy
>>> import debacl
>>> X = numpy.random.rand(100, 2)
>>> knn, radii = debacl.utils.knn_graph(X, k=8, method='kd_tree')
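The ‘brute_force’ method described above can be sketched in plain NumPy. This is a minimal illustration of the idea, not DeBaCl's actual implementation; the function name `knn_graph_brute` is hypothetical.

```python
import numpy as np

def knn_graph_brute(X, k):
    """Sketch of a brute-force k-nearest neighbor graph (Euclidean metric)."""
    # Pairwise Euclidean distances between all O(n^2) pairs of rows of X.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # For each point, the row indices of its k nearest neighbors. Each
    # point's distance to itself is zero, so every point is its own first
    # neighbor, matching the "including itself" convention for radii.
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # Distance from each point to its k'th nearest neighbor.
    radii = dist[np.arange(len(X)), neighbors[:, -1]]
    return neighbors, radii.tolist()

X = np.random.rand(100, 2)
neighbors, radii = knn_graph_brute(X, k=8)
```

The full n-by-n distance matrix is what limits this approach to tens of thousands of observations; the tree-based methods avoid materializing it.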