Computes the Chatterjee correlation coefficient, a robust measure of dependence.
Unlike classical correlation coefficients (Pearson, Spearman), Chatterjee's coefficient
measures the degree to which y is a function of x (functional dependence),
so it also captures non-linear and non-monotonic relationships.
Characteristics:
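Functional Dependence: The population value is 1 if and only if y is a measurable function of x.
Independence: The population value is 0 if and only if x and y are independent.
Range: The population coefficient lies in \([0, 1]\). The sample estimate can fall slightly below 0, particularly for small samples.
Asymmetry: The measure is asymmetric; in general \(\xi(X, Y) \neq \xi(Y, X)\). It specifically tests whether \(Y = f(X)\).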
Value
A numeric vector of length one containing the Chatterjee correlation estimate.
Details
The coefficient is calculated using the ranks of y when sorted by x.
This implementation computes the sample version of the coefficient as described by Chatterjee (2021).
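Formula: Given pairs \((X_i, Y_i)\), \(i = 1, \dots, n\), sort them such that \(X_{(1)} \le \dots \le X_{(n)}\). Let \(r_i\) be the rank of \(Y_{(i)}\), the \(Y\) value paired with \(X_{(i)}\), among all \(n\) values of \(Y\). When there are no ties, the coefficient is:
$$ \xi_n(X, Y) = 1 - \frac{3 \sum_{i=1}^{n-1} |r_{i+1} - r_i|}{n^2 - 1} $$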
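The formula above can be transcribed almost directly into base R. The following is a minimal sketch for illustration only: the name `xi_sketch` is hypothetical, it is not the implementation behind `chatterjee_correlation()`, and it assumes there are no ties in `x` or `y`.

```r
# Hypothetical helper: a direct transcription of the no-ties formula.
# Not the package's own implementation.
xi_sketch <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  n <- length(x)
  # Reorder y by increasing x, then rank those values among all y.
  r <- rank(y[order(x)])
  # One minus the scaled sum of absolute differences of consecutive ranks.
  1 - 3 * sum(abs(diff(r))) / (n^2 - 1)
}

set.seed(1)              # arbitrary seed, chosen only for reproducibility
x <- runif(50, -1, 1)
xi_sketch(x, x^2)        # y is a noiseless function of x
xi_sketch(x^2, x)        # arguments swapped: x is not a function of x^2
xi_sketch(1:50, 1:50)    # strictly monotone case
```

For a strictly monotone relationship every consecutive rank difference is 1, so the sum equals \(n - 1\) and the sample value is \(1 - 3/(n + 1)\), about 0.94 for \(n = 50\). The sample coefficient therefore approaches 1 only as \(n\) grows, even under exact functional dependence. For a noiseless \(y = x^2\) the ranks fall and then rise once, so the sum is at most about \(2n\) and the formula gives at least roughly 0.88 for \(n = 50\).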
References
Chatterjee, S. (2021). A new coefficient of correlation. Journal of the American Statistical Association, 116(536), 2009-2022.
Examples
# Functional dependence (Y = X^2)
x <- runif(50, -1, 1)
y <- x^2
chatterjee_correlation(x, y) # Should be high (near 1)
#> [1] 0.03601441
# Independence
x <- runif(50)
y <- runif(50)
chatterjee_correlation(x, y) # Should be low (near 0)
#> [1] -0.05402161
# Asymmetry check
chatterjee_correlation(x, y)
#> [1] -0.05402161
chatterjee_correlation(y, x)
#> [1] -0.03721489