X, Y, and Z are three random variables with mutual pairwise correlation $\rho$. What is the minimum possible value that $\rho$ can take?
Definitions and concepts:
Before moving on to the actual question, let’s establish some base concepts and terminology. If you are comfortable with the notions of correlation, covariance, and positive semi-definite matrices, you can jump right ahead.
Definition 1: The correlation coefficient between two RVs, X and Y, is equal to

$$\mathrm{corr}(X, Y) = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}$$
Using the linearity of expectation and the standard deviation formula, this can be further expanded as:

$$\mathrm{corr}(X, Y) = \frac{E\bigl[(X - E[X])(Y - E[Y])\bigr]}{\sqrt{E\bigl[(X - E[X])^2\bigr]}\,\sqrt{E\bigl[(Y - E[Y])^2\bigr]}}$$
As, from linearity, $E\bigl[(X - E[X])(Y - E[Y])\bigr] = E[XY] - E[X]E[Y]$, we get:

$$\mathrm{corr}(X, Y) = \frac{E[XY] - E[X]E[Y]}{\sqrt{E[X^2] - E[X]^2}\,\sqrt{E[Y^2] - E[Y]^2}}$$
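To make the expanded formula concrete, here is a quick numerical sanity check (a minimal sketch using NumPy; the simulated variables and the seed are our own choices): we estimate the correlation from $E[XY] - E[X]E[Y]$ and compare against np.corrcoef.

```python
import numpy as np

rng = np.random.default_rng(42)

# Two dependent RVs: Y is half of X plus independent noise.
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)

# corr(X, Y) = (E[XY] - E[X]E[Y]) / (sd(X) * sd(Y))
num = np.mean(x * y) - np.mean(x) * np.mean(y)
den = np.sqrt(np.mean(x**2) - np.mean(x)**2) * np.sqrt(np.mean(y**2) - np.mean(y)**2)
print(num / den)                # ~0.447 for this setup
print(np.corrcoef(x, y)[0, 1])  # agrees up to sampling noise
```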
Property 1: The correlation coefficient takes values from -1 to 1
Proof:
Let X and Y be two random variables. We know that $tX + Y$ is also an RV for any $t \in \mathbb{R}$, and, by definition, the variance of any RV is non-negative.
We thus now have:

$$0 \le \mathrm{Var}(tX + Y) = t^2\,\mathrm{Var}(X) + 2t\,\mathrm{cov}(X, Y) + \mathrm{Var}(Y)$$
Looking now at the RHS, as a quadratic function of $t$ that is non-negative, we can affirm that its discriminant is non-positive.
Consequently,

$$4\,\mathrm{cov}(X, Y)^2 - 4\,\mathrm{Var}(X)\,\mathrm{Var}(Y) \le 0 \iff \mathrm{cov}(X, Y)^2 \le \mathrm{Var}(X)\,\mathrm{Var}(Y)$$
If $\mathrm{Var}(X)\,\mathrm{Var}(Y) = 0$, then X or Y must be a constant; in that case the correlation is conventionally taken to be 0, and the property holds ✅.
Otherwise, we can divide by it, and get:

$$\frac{\mathrm{cov}(X, Y)^2}{\mathrm{Var}(X)\,\mathrm{Var}(Y)} \le 1$$
We further simplify:

$$\left(\frac{\mathrm{cov}(X, Y)}{\sigma_X\,\sigma_Y}\right)^2 \le 1$$
We notice that we can replace the LHS with $\mathrm{corr}(X, Y)^2$, and get:

$$\mathrm{corr}(X, Y)^2 \le 1$$
From this we get our final conclusion:

$$-1 \le \mathrm{corr}(X, Y) \le 1 \quad ✅$$
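As a sanity check on Property 1 (a small sketch; the distributions and coefficients below are arbitrary choices of ours), we can verify on simulated data that $\mathrm{cov}(X, Y)^2 \le \mathrm{Var}(X)\,\mathrm{Var}(Y)$, which is exactly the discriminant condition used in the proof:

```python
import numpy as np

rng = np.random.default_rng(3)

for beta in [-2.0, -0.5, 0.0, 0.5, 2.0]:
    x = rng.exponential(size=100_000)
    y = beta * x + rng.normal(size=100_000)
    c = np.cov(x, y)  # [[Var(X), cov(X,Y)], [cov(X,Y), Var(Y)]]
    assert c[0, 1] ** 2 <= c[0, 0] * c[1, 1] + 1e-12
    print(round(np.corrcoef(x, y)[0, 1], 3))  # always inside [-1, 1]
```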
Definition 2: The covariance (respectively, correlation) matrix of a set of RVs — here X, Y, and Z — is the square matrix giving the covariance (respectively, correlation) between each pair of elements, i.e.:

$$\Sigma = \begin{pmatrix} \mathrm{Var}(X) & \mathrm{cov}(X, Y) & \mathrm{cov}(X, Z) \\ \mathrm{cov}(Y, X) & \mathrm{Var}(Y) & \mathrm{cov}(Y, Z) \\ \mathrm{cov}(Z, X) & \mathrm{cov}(Z, Y) & \mathrm{Var}(Z) \end{pmatrix}, \qquad R = \begin{pmatrix} 1 & \mathrm{corr}(X, Y) & \mathrm{corr}(X, Z) \\ \mathrm{corr}(Y, X) & 1 & \mathrm{corr}(Y, Z) \\ \mathrm{corr}(Z, X) & \mathrm{corr}(Z, Y) & 1 \end{pmatrix}$$
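In code, these are exactly the matrices returned by np.cov and np.corrcoef (a small illustration on simulated data of our own):

```python
import numpy as np

rng = np.random.default_rng(0)

# Three dependent RVs; rows are variables, columns are samples.
x = rng.normal(size=50_000)
y = x + rng.normal(size=50_000)
z = -x + rng.normal(size=50_000)
data = np.vstack([x, y, z])

print(np.cov(data))       # 3x3 covariance matrix
print(np.corrcoef(data))  # 3x3 correlation matrix, ones on the diagonal
```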
Definition 3: A minor of a matrix is the determinant of a square submatrix, obtained by deleting some of its rows and columns
Definition 4: The leading principal minor of order k of an $n \times n$ matrix is the minor of order k obtained by deleting the last $n - k$ rows and columns from the matrix
Definition 5: An $n \times n$ symmetric matrix M is said to be positive semi-definite if $x^T M x \ge 0$ for all $x \in \mathbb{R}^n$.
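For a symmetric matrix, positive semi-definiteness is equivalent to all eigenvalues being non-negative, which gives a convenient numerical test (a sketch; the helper below is our own):

```python
import numpy as np

def is_psd(m: np.ndarray, tol: float = 1e-10) -> bool:
    """Check x^T M x >= 0 for all x via the eigenvalues of the symmetric matrix m."""
    return bool(np.all(np.linalg.eigvalsh(m) >= -tol))

print(is_psd(np.array([[1.0, -0.5], [-0.5, 1.0]])))  # True
print(is_psd(np.array([[1.0, 2.0], [2.0, 1.0]])))    # False: it has eigenvalue -1
```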
Property 2: A positive semi-definite matrix has all leading principal minors non-negative
Proof:
This proof is rather long and out of scope, but you can give it a read, for example, here: Sylvester’s Criterion (Math 484: Nonlinear Programming, University of Illinois, 2019). Do keep in mind that we only need the “$\Rightarrow$” implication of the first bullet.
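To see Property 2 in action, here is a small helper (our own, not taken from the reference) that lists the leading principal minors of a matrix:

```python
import numpy as np

def leading_principal_minors(m: np.ndarray) -> list[float]:
    """Determinants of the top-left k x k blocks, for k = 1..n."""
    return [float(np.linalg.det(m[:k, :k])) for k in range(1, m.shape[0] + 1)]

# A valid (hence positive semi-definite) correlation matrix:
# all leading principal minors come out non-negative.
r = np.array([[1.0, -0.5, -0.5],
              [-0.5, 1.0, -0.5],
              [-0.5, -0.5, 1.0]])
print(leading_principal_minors(r))  # [1.0, 0.75, 0.0]
```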
Property 3: For any set of random variables, both their covariance matrix and their correlation matrix are positive semi-definite
Proof:
We will use the same trick that we employed in our previous proof. Consider a set of random variables $X_1, \dots, X_n$; then, we know that the variance of any weighted sum is non-negative, i.e., for any weights $a_1, \dots, a_n \in \mathbb{R}$:

$$\mathrm{Var}\left(\sum_{i=1}^{n} a_i X_i\right) \ge 0$$
Thus:

$$\mathrm{Var}\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j\,\mathrm{cov}(X_i, X_j) \ge 0$$
Denoting $a = (a_1, \dots, a_n)^T$ and $\Sigma_{ij} = \mathrm{cov}(X_i, X_j)$, we get that

$$a^T \Sigma a \ge 0 \quad (1)$$
Similarly, denoting $b = (a_1 \sigma_1, \dots, a_n \sigma_n)^T$ and $R_{ij} = \mathrm{corr}(X_i, X_j)$, and using $\mathrm{cov}(X_i, X_j) = \sigma_i \sigma_j\,\mathrm{corr}(X_i, X_j)$, we get that:

$$b^T R b = a^T \Sigma a \ge 0 \quad (2)$$

Since $a$ is arbitrary (and assuming each $\sigma_i > 0$), $b$ ranges over all of $\mathbb{R}^n$ as well.
By Definition 5, (1) and (2) imply that the covariance matrix and the correlation matrix, respectively, are positive semi-definite ✅.
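The identity behind this proof, $\mathrm{Var}\!\left(\sum_i a_i X_i\right) = a^T \Sigma a$, is easy to check numerically (a sketch; the data and the weight vector are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(1)

# Three dependent RVs, 200k samples each.
x = rng.normal(size=(3, 200_000))
x[1] += 0.8 * x[0]
x[2] -= 0.5 * x[0]

sigma = np.cov(x)
a = np.array([0.3, -1.2, 0.7])  # arbitrary weights

# Both quantities are equal and non-negative.
print(np.var(a @ x, ddof=1))  # sample variance of the weighted sum
print(a @ sigma @ a)          # quadratic form with the covariance matrix
```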
Property 4: If $\mathrm{corr}(X, Y) = \mathrm{corr}(Y, Z) = -1$, then $\mathrm{corr}(X, Z) = 1$.
Proof:
The proof of this property is a direct consequence of the previous one. Let $\rho = \mathrm{corr}(X, Z)$, and write the correlation matrix of the 3 random variables:

$$R = \begin{pmatrix} 1 & -1 & \rho \\ -1 & 1 & -1 \\ \rho & -1 & 1 \end{pmatrix}$$
The determinant of this matrix is $-(\rho - 1)^2$, and it must be non-negative since the matrix is positive semi-definite (Properties 2 and 3). As $-(\rho - 1)^2 \le 0$ for every $\rho$, the only possibility is $\rho = 1$ ✅.
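The determinant computation can be double-checked symbolically (a quick SymPy verification; the symbol name is ours):

```python
import sympy as sp

rho = sp.symbols('rho')
R = sp.Matrix([[1, -1, rho],
               [-1, 1, -1],
               [rho, -1, 1]])
print(sp.factor(R.det()))  # -(rho - 1)**2, non-negative only when rho = 1
```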
Solutions:
Particular case: 3 RVs
When asked to give a minimum value, two things must be done: find a lower bound, and then prove that this bound is attainable by providing an example.
From Property 1, we know that the loosest bounds for $\rho$ are $-1$ and $1$.
We can easily see that $\rho$ can’t be equal to $-1$: if the pairs $(X, Y)$ and $(Y, Z)$ have a correlation of $-1$, then, by Property 4, the correlation between X and Z must be $1$.
We could also choose X, Y, and Z independent, in which case $\rho$ would be 0.
So, our minimal value is in the interval $(-1, 0]$.
To get a tighter inequality, we use the necessary properties of the correlation matrix, outlined in the first part of this article. Write the correlation matrix of X, Y, and Z and set the condition for it to be positive semi-definite; recursively eliminating trailing rows and columns gives the leading principal minors:

$$R = \begin{pmatrix} 1 & \rho & \rho \\ \rho & 1 & \rho \\ \rho & \rho & 1 \end{pmatrix}$$
Order 1: $\det(1) = 1 \ge 0$ ✅

Order 2: $\det\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} = 1 - \rho^2 \ge 0 \iff -1 \le \rho \le 1$ (by Property 2) ✅

Order 3: $\det(R) = 2\rho^3 - 3\rho^2 + 1 = (1 - \rho)^2(1 + 2\rho) \ge 0 \iff \rho \ge -\frac{1}{2}$ (to get to this, use, for example, the Rule of Sarrus, then factorize) ✅
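The factorization of the order-3 minor can likewise be verified symbolically (SymPy again; symbol name ours):

```python
import sympy as sp

rho = sp.symbols('rho')
R = sp.Matrix([[1, rho, rho],
               [rho, 1, rho],
               [rho, rho, 1]])
# Factors as (rho - 1)**2 * (2*rho + 1); since (rho - 1)**2 >= 0,
# the determinant is non-negative exactly when rho >= -1/2.
print(sp.factor(R.det()))
```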
Now we have reduced the interval for $\rho$ to $\left[-\frac{1}{2}, 0\right]$, and the minimal value that $\rho$ can take is at least $-\frac{1}{2}$. If we find a triplet of random variables with pairwise correlations of $-\frac{1}{2}$, we have proved that this value is attainable, hence the minimum. Unfortunately, this is the tricky part. One way to construct such variables is:
Let $U_1, U_2, U_3$ be independent identically distributed standard uniform random variables and consider $X_i = U_i - \bar{U}$, where $\bar{U} = \frac{U_1 + U_2 + U_3}{3}$. If we expand the average and compute the coefficients, we get a simplified formula for the $X_i$’s:

$$X_i = \frac{2}{3} U_i - \frac{1}{3} \sum_{j \ne i} U_j$$
We compute the variance of the $X_i$’s by using the formula for the variance of a linear combination of independent random variables (in our case, the $U_j$’s). Each $U_j$ has the same variance $\sigma^2$ ($\frac{1}{12}$ for a standard uniform — the exact value won’t matter, as correlation is scale-invariant), and the covariance of $U_j$ and $U_k$, with $j \ne k$, is 0, as they are independent:

$$\mathrm{Var}(X_i) = \left(\frac{2}{3}\right)^2 \sigma^2 + 2\left(\frac{1}{3}\right)^2 \sigma^2 = \frac{2}{3}\sigma^2$$
Thinking back to the correlation formula, we are missing the covariance between $X_i$ and $X_j$, $i \ne j$. We again linearly expand the covariance, keeping in mind that the covariance of independent random variables is 0, and the covariance of a random variable with itself is its variance:

$$\mathrm{cov}(X_i, X_j) = \frac{2}{3}\cdot\left(-\frac{1}{3}\right)\sigma^2 + \left(-\frac{1}{3}\right)\cdot\frac{2}{3}\,\sigma^2 + \left(-\frac{1}{3}\right)^2 \sigma^2 = -\frac{1}{3}\sigma^2$$
Thus,

$$\mathrm{corr}(X_i, X_j) = \frac{\mathrm{cov}(X_i, X_j)}{\sqrt{\mathrm{Var}(X_i)\,\mathrm{Var}(X_j)}} = \frac{-\frac{1}{3}\sigma^2}{\frac{2}{3}\sigma^2} = -\frac{1}{2}$$
For this construction, all the pairwise correlations are equal to $-\frac{1}{2}$. Thus, we’ve obtained the minimum possible value for $\rho$: $\rho_{\min} = -\frac{1}{2}$ ✅.
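The construction is also easy to confirm by simulation (a minimal sketch; sample size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)

# U_1, U_2, U_3 iid standard uniform; X_i = U_i - mean of the U's.
u = rng.uniform(size=(3, 500_000))
x = u - u.mean(axis=0)

# All off-diagonal correlations should be close to -1/2.
print(np.corrcoef(x).round(3))
```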
General Case
We have the correct result for the case of 3 random variables. Can we generalize it? What is the minimum value of $\rho$ when you have n random variables with all pairwise correlations equal?
Just as before, we consider the correlation matrix and its properties — now the $n \times n$ matrix $R$ with 1 on the diagonal and $\rho$ everywhere else.
Like before, its determinant must be at least 0. Computing it is not as trivial, since there is no simple rule like Sarrus’ for a general $n \times n$ matrix. However, we can use cofactor expansion along a column and induction to prove that its value is:

$$\det(R) = (1 - \rho)^{n-1}\bigl(1 + (n - 1)\rho\bigr)$$
For this to be greater than or equal to 0 (note that $(1 - \rho)^{n-1} \ge 0$, since $\rho \le 1$), we must have that:

$$\rho \ge -\frac{1}{n - 1}$$
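A quick numerical check of this bound (a sketch of ours): for each n, the equicorrelation matrix with $\rho = -\frac{1}{n-1}$ is still positive semi-definite, while anything slightly below it is not.

```python
import numpy as np

def equicorrelation(n: int, rho: float) -> np.ndarray:
    """n x n matrix with ones on the diagonal and rho everywhere else."""
    return (1 - rho) * np.eye(n) + rho * np.ones((n, n))

for n in [3, 5, 10, 100]:
    bound = -1 / (n - 1)
    at_bound = np.linalg.eigvalsh(equicorrelation(n, bound)).min()
    below = np.linalg.eigvalsh(equicorrelation(n, bound - 0.01)).min()
    print(n, round(at_bound, 12), round(below, 6))  # ~0 at the bound, < 0 below
```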
We again construct the random variables as $X_i = U_i - \bar{U}$, the difference between $U_i$ and the mean of the $U_j$’s, where $U_1, \dots, U_n$ are iid standard uniform. With similar reasoning as in the previous part, we compute the variance of $X_i$ and get $\frac{n-1}{n}\sigma^2$. The covariance of $X_i$ and $X_j$ turns out to be $-\frac{\sigma^2}{n}$. From the correlation formula, we obtain the correlation between any distinct $X_i$ and $X_j$ to be $-\frac{1}{n-1}$, just the lower bound we observed above.
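The same simulation as in the 3-variable case carries over verbatim (a sketch, shown here with n = 5):

```python
import numpy as np

rng = np.random.default_rng(7)

n = 5
u = rng.uniform(size=(n, 500_000))
x = u - u.mean(axis=0)

# Average off-diagonal correlation should be close to -1/(n-1) = -0.25.
corr = np.corrcoef(x)
print(corr[~np.eye(n, dtype=bool)].mean().round(4))
```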
This generalization is consistent with our result for n = 3. At the same time, we can see that the value of the minimal correlation converges to 0 as n goes to infinity. This supports the intuition that, as more random variables are added, it becomes impossible for all of them to be strongly negatively correlated with one another.