**By Allan Roberts**

Statistical correlation may or may not be an easy concept to grasp. A typical stats textbook might show you clouds of data points, attach numbers to them, and suggest that you should try to get the general idea; it’s either that or work through the algebra. You can also resort to wordy explanations such as this: If X and Y are positively correlated, X will tend to increase when Y increases, and *vice versa*. Note that this last statement sounds a lot like a definition based on probabilities. Wouldn’t it be nice if a statistical correlation coefficient had a simple probabilistic interpretation? Such a coefficient exists. It’s called the Kendall Tau Rank Correlation Coefficient.

Figure 1 joins every pair of data points with a line segment. Dividing the number of line segments with positive slope in Figure 1 by the total number of line segments would yield a number between 0 and 1; this proportion can be interpreted as the probability that a randomly selected line segment will have positive slope. Re-scaling this proportion, with the equation given in Figure 1, yields a number between -1 and 1 that is the correlation coefficient tau.
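The re-scaling can be written out explicitly. The sketch below assumes there are no ties (no horizontal or vertical segments), which holds for the permuted data used in the R script; the symbols U, D and p are introduced here for illustration, and the counts in the worked example are hypothetical rather than taken from any particular plot.

```latex
% U = number of upward segments, D = number of downward segments,
% n = number of data points, so U + D = n(n-1)/2 when there are no ties.
\[
  p = \frac{U}{n(n-1)/2},
  \qquad
  \tau = 2p - 1 = \frac{U - D}{n(n-1)/2} = \frac{2\,(U - D)}{n^{2} - n}.
\]
% Worked example with hypothetical counts: n = 10 gives 45 segments.
% If U = 30 and D = 15, then
\[
  p = \frac{30}{45} = \frac{2}{3},
  \qquad
  \tau = \frac{2\,(30 - 15)}{10^{2} - 10} = \frac{30}{90} = \frac{1}{3}.
\]
```

The last form of tau is the expression that the long R script prints on the plot.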

**Reference**

Kendall Tau Rank Correlation Coefficient. Wikipedia: wikipedia.org/wiki/Kendall_tau_rank_correlation_coefficient

**R Script (Short Version)**

X <- sample(20,20); Y <- sample(20,20); cor(X, Y, method = "kendall");

**R Script (Long Version)**

#written by Allan Roberts, Feb 2013.

KendallExample <- function(n=20){

X <- sample(n,n);

Y <- sample(n,n);

plot(X,Y, las=1,xlim=c(0,n+4),ylim=c(0,n+4));

A <- matrix(0,n,n);

#Record the sign of each segment's slope in A: 1 for upward, -1 for downward.

for (i in 1:n) for (j in (1:n)[-i]) A[i,j] <- sign((Y[j]-Y[i])/(X[j]-X[i]));

for (i in 1:n) for (j in (1:n)[-i]){

if (A[i,j]== 1){col= 2; lty=1};

if (A[i,j]==-1){col= 4; lty=3};

if (A[i,j] != 0) lines(c(X[i],X[j]),c(Y[i],Y[j]),col=col,lty=lty);

}

#Each segment appears twice in A, as (i,j) and as (j,i), so the counts are halved.

up <- (sum(A>0)/2);

down <- (sum(A<0)/2);

tau <- 2*(up-down)/(n*n-n);

text(n-4,n+4, paste("Upward", up ));

text(n-4,n+3, paste("Downward", down ));

text(n-4,n+1, expression( frac( 2*(up-down), n^2-n )) );

text(n-2,n+1, adj=c(0,0.5), paste("=", round(tau,digits=3)) );

}

KendallExample(n=10);
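As a rough check on the counting approach, the sketch below computes tau from the upward and downward counts without plotting and compares it with R's built-in `cor`. The function name `kendall_by_counting` is a hypothetical helper, not part of the original script, and it assumes at least two points and no ties in X or Y.

```r
# Hypothetical helper (not part of the original script): count upward and
# downward segments over each pair of points once, then rescale to tau.
# Assumes length(X) >= 2 and no tied values in X or Y.
kendall_by_counting <- function(X, Y) {
  n <- length(X)
  up <- 0
  down <- 0
  for (i in 1:(n - 1)) {
    for (j in (i + 1):n) {
      # The product has the same sign as the slope of the segment.
      s <- sign((Y[j] - Y[i]) * (X[j] - X[i]))
      if (s > 0) up <- up + 1
      if (s < 0) down <- down + 1
    }
  }
  2 * (up - down) / (n * n - n)
}

set.seed(1)  # arbitrary seed, used only to make the example reproducible
X <- sample(10, 10)
Y <- sample(10, 10)
tau_counted <- kendall_by_counting(X, Y)
tau_builtin <- cor(X, Y, method = "kendall")
print(all.equal(tau_counted, tau_builtin))
```

With permuted data like this there are no ties, so the two values should agree; with tied values `cor` applies a correction and the simple count would no longer match.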