Function File: y = pdist (x)

Function File: y = pdist (x, metric)

Function File: y = pdist (x, metric, metricarg, ...)

Return the pairwise distances between the rows of x.

x is the nxd matrix representing n row vectors of size d.

The output is a dissimilarity matrix formatted as a row vector y, (n-1)*n/2 long, where the distances are in the order [(1, 2) (1, 3) ... (2, 3) ... (n-1, n)]. You can use the `squareform` function to display the distances between the vectors arranged into an nxn matrix.
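The ordering of the output vector can be illustrated with a small sketch. This is plain Python, not Octave, and the helper names `pdist_euclidean` and `to_square` are hypothetical; it only mimics the layout described above under the assumption of the default Euclidean metric.

```python
import math
from itertools import combinations

def pdist_euclidean(rows):
    """Condensed Euclidean distances in the order (1,2), (1,3), ..., (n-1,n)."""
    # combinations() yields row pairs in exactly that order.
    return [math.dist(a, b) for a, b in combinations(rows, 2)]

def to_square(y, n):
    """Expand a condensed vector of length (n-1)*n/2 into a symmetric n x n matrix."""
    square = [[0.0] * n for _ in range(n)]
    for (i, j), d in zip(combinations(range(n), 2), y):
        square[i][j] = square[j][i] = d
    return square

x = [[0, 0], [3, 4], [6, 8]]       # n = 3 rows, d = 2 columns
y = pdist_euclidean(x)             # pairs (1,2), (1,3), (2,3)
square = to_square(y, len(x))      # the role squareform plays in the text above
```

For n = 3 the vector has (3-1)*3/2 = 3 entries, and the expanded matrix has zeros on its diagonal.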

`metric` is an optional argument specifying how the distance is computed. It defaults to "euclidean" and can be either one of the predefined names listed below or a user-defined function. A user-defined function takes two arguments x and y plus any number of optional arguments, where x is a row vector and y is a matrix having the same number of columns as x. It must return a column vector in which row i is the distance between x and row i of y. Any additional arguments given after `metric` are passed on to the function as metric (x, y, metricarg1, metricarg2 ...).
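The contract for a user-defined metric can be sketched as follows. This is a Python illustration, not the package's implementation: `weighted_cityblock` is a hypothetical metric, and the loop in `pdist_custom` (each row compared against all later rows) is an assumption about how such a function could be driven, chosen because it reproduces the documented output order.

```python
def weighted_cityblock(x, y, weights):
    """Hypothetical user metric: x is one row, y is a list of rows.

    Returns one distance per row of y; `weights` plays the role of metricarg1.
    """
    return [sum(w * abs(a - b) for w, a, b in zip(weights, x, row)) for row in y]

def pdist_custom(rows, metric, *metric_args):
    """Assumed calling pattern: compare each row with all rows that follow it."""
    out = []
    for i in range(len(rows) - 1):
        # Extra arguments are forwarded unchanged to the metric.
        out.extend(metric(rows[i], rows[i + 1:], *metric_args))
    return out

x = [[0, 0], [1, 2], [3, 1]]
d = pdist_custom(x, weighted_cityblock, [1.0, 0.5])
```

Here `d` holds the distances for the pairs (1,2), (1,3) and (2,3), in that order.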

Predefined distance functions are:

"euclidean"
Euclidean distance (default).
"seuclidean"
Standardized Euclidean distance. Each coordinate in the sum of squares is inverse weighted by the sample variance of that coordinate.
"mahalanobis"
Mahalanobis distance: see the function mahalanobis.
"cityblock"
City Block metric, aka Manhattan distance.
"minkowski"
Minkowski metric. Accepts a numeric parameter p: with p=1 it is the same as the cityblock metric, and with p=2 (the default) it is the same as the euclidean metric.
"cosine"
One minus the cosine of the included angle between rows, seen as vectors.
"correlation"
One minus the sample correlation between points (treated as sequences of values).
"spearman"
One minus the sample Spearman's rank correlation between observations, treated as sequences of values.
"hamming"
Hamming distance: the fraction of coordinates that differ.
"jaccard"
One minus the Jaccard coefficient: the fraction of nonzero coordinates that differ.
"chebychev"
Chebychev distance: the maximum absolute coordinate difference.
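A few of the metrics above can be written out for a single pair of rows. The sketch below is in Python and uses the textbook definitions, which are assumed (not confirmed here) to match what the package computes; the function names simply mirror the metric names.

```python
import math

def cityblock(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(u - v) for u, v in zip(a, b))

def minkowski(a, b, p=2):
    # p = 1 reduces to cityblock, p = 2 to the Euclidean distance.
    return sum(abs(u - v) ** p for u, v in zip(a, b)) ** (1.0 / p)

def chebychev(a, b):
    # Largest absolute coordinate difference.
    return max(abs(u - v) for u, v in zip(a, b))

def cosine(a, b):
    # One minus the cosine of the angle between the two vectors.
    dot = sum(u * v for u, v in zip(a, b))
    norm_a = math.sqrt(sum(u * u for u in a))
    norm_b = math.sqrt(sum(v * v for v in b))
    return 1.0 - dot / (norm_a * norm_b)

def hamming(a, b):
    # Fraction of coordinates that differ.
    return sum(u != v for u, v in zip(a, b)) / len(a)

a, b = [1, 0, 2], [4, 4, 2]
results = {
    "cityblock": cityblock(a, b),
    "minkowski_p1": minkowski(a, b, 1),
    "minkowski_p2": minkowski(a, b),
    "chebychev": chebychev(a, b),
    "cosine": cosine(a, b),
    "hamming": hamming(a, b),
}
```

Comparing `minkowski_p1` with `cityblock` in `results` shows the p=1 equivalence stated in the list.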

Package: statistics