Function File: p = anderson_darling_cdf (A, n)

Return the CDF for the given Anderson-Darling coefficient A computed from n values sampled from a distribution. For a vector of random variables x of length n, first compute the CDF of each value under the distribution from which the values are assumed to be drawn, and sort the results in ascending order. You can use these values to compute A as follows, where x holds the sorted CDF values and i = (1:n)':

```
A = -n - sum( (2*i-1) .* (log(x) + log(1 - x(n:-1:1,:))) )/n;
```
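As a minimal sketch of this expression on concrete numbers, the following uses four made-up CDF values (illustrative data, not taken from the package) and checks the vectorized form against an explicit loop:

```octave
% Illustrative CDF values (made up for this sketch), sorted ascending.
x = sort ([0.35; 0.10; 0.80; 0.50]);
n = rows (x);
i = (1:n)';

% Vectorized form of the Anderson-Darling statistic.
A = -n - sum ((2*i-1) .* (log (x) + log (1 - x(n:-1:1,:)))) / n;

% Equivalent explicit loop, for comparison.
s = 0;
for k = 1:n
  s += (2*k - 1) * (log (x(k)) + log (1 - x(n+1-k)));
end
A_loop = -n - s / n;

printf ("A = %g, A_loop = %g\n", A, A_loop);
```

The reversed index `x(n:-1:1,:)` pairs the i-th smallest value with the i-th largest, which is what the term `1 - x(n+1-k)` does in the loop.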

From the value A, `anderson_darling_cdf` returns the probability that a statistic no larger than A would be obtained from n samples that really were drawn from the assumed distribution.
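To make that probability concrete without calling the package function, the sketch below estimates it by simulation. It assumes uniform samples (which are their own CDF values) and uses 2.492, the commonly tabulated asymptotic 5% critical value for a fully specified distribution; that constant is brought in for illustration and does not come from this page. The printed fraction should land near 0.95.

```octave
rand ("state", 1);            % fix the seed so the run is repeatable
n = 50; reps = 5000;
x = sort (rand (n, reps));    % uniform samples are their own CDF values
i = (1:n)';
A = -n - sum ((2*i-1) .* (log (x) + log (1 - x(n:-1:1,:)))) / n;

% Fraction of simulated statistics at or below the tabulated
% asymptotic 5% critical value (an assumption of this sketch).
a_crit = 2.492;
frac = mean (A <= a_crit);
printf ("empirical P(A <= %.3f) = %.3f\n", a_crit, frac);
```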

The algorithm given in [1] is claimed to approximate the Anderson-Darling CDF to an accuracy of 6 decimal places.

Demonstrate using:

```
n = 300; reps = 10000;
z = randn (n, reps);
x = sort ((1 + erf (z/sqrt (2)))/2);
i = [1:n]' * ones (1, size (x, 2));
A = -n - sum ((2*i-1) .* (log (x) + log (1 - x (n:-1:1, :))))/n;
p = anderson_darling_cdf (A, n);
hist (100 * p, [1:100] - 0.5);
```

You will see that the histogram is essentially flat, which is to say that the probabilities returned by the Anderson-Darling CDF are distributed uniformly when the samples really come from the assumed distribution.

You can easily determine the extreme values of p:

```
[junk, idx] = sort (p);
```
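How the sorted index is used may be easier to see on a toy vector. This sketch assumes four made-up probabilities; `lowest`, `middle`, and `highest` are illustrative names only:

```octave
p = [0.70, 0.10, 0.95, 0.40];   % made-up probabilities for four columns
[junk, idx] = sort (p);         % idx lists columns from smallest to largest p
lowest  = idx(1);               % column with the smallest p
middle  = idx(end/2);           % column near the median p
highest = idx(end);             % column with the largest p
printf ("lowest = %d, middle = %d, highest = %d\n", lowest, middle, highest);
```

Note that `idx(end/2)` is only a valid index when the number of columns is even, as it is for `reps = 10000`.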

The histograms of the samples with the smallest, median, and largest p aren't very informative:

```
histfit (z (:, idx (1)), linspace (-3, 3, 15));
histfit (z (:, idx (end/2)), linspace (-3, 3, 15));
histfit (z (:, idx (end)), linspace (-3, 3, 15));
```

More telling is the qqplot:

```
qqplot (z (:, idx (1))); hold on; plot ([-3, 3], [-3, 3], ';;'); hold off;
qqplot (z (:, idx (end/2))); hold on; plot ([-3, 3], [-3, 3], ';;'); hold off;
qqplot (z (:, idx (end))); hold on; plot ([-3, 3], [-3, 3], ';;'); hold off;
```

Try a similar analysis for z uniform:

```
z = rand (n, reps); x = sort (z);
```

and for z exponential:

```
z = rande (n, reps); x = sort (1 - exp (-z));
```
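Since only the transformation to CDF values changes between these cases, the statistic itself can be wrapped once and reused. The following is a sketch under that assumption; `ad_stat` is a hypothetical helper, not part of the statistics package. When each sample is transformed by its own true CDF, the mean of A should come out close to 1 in both cases.

```octave
% Hypothetical helper (not part of the statistics package): x is an
% n-by-reps matrix of CDF values, sorted ascending within each column.
ad_stat = @(x, n) -n - sum ((2*(1:n)' - 1) .* (log (x) + log (1 - x(n:-1:1,:)))) / n;

n = 100; reps = 2000;
A_unif = ad_stat (sort (rand (n, reps)), n);                 % uniform CDF is x itself
A_expo = ad_stat (sort (1 - exp (-rande (n, reps))), n);     % exponential CDF is 1 - exp(-z)
printf ("mean A: uniform %.3f, exponential %.3f\n", mean (A_unif), mean (A_expo));
```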

[1] Marsaglia, G.; Marsaglia, J. C. W. (2004) "Evaluating the Anderson-Darling Distribution", Journal of Statistical Software, 9(2).

Package: statistics