Tuesday, September 24, 2019

Covariance and Correlation - Are two different attributes related to each other?

  • Covariance - measures how two variables vary together around their means.
  • The sign of the covariance shows whether two attributes tend to move in the same direction (positive) or in opposite directions (negative), but its magnitude depends on the units of the attributes, so it is not a standardized measurement. We therefore divide the covariance by the standard deviations of both attributes to get the correlation, which is a standardized measurement (-1 to 1).
  • Correlation -1 means perfect inverse correlation.
    Correlation 0 means no linear correlation.
    Correlation 1 means perfect positive correlation.
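
As a rough illustration of the points above, the sketch below builds three made-up series (the arrays and variable names are assumptions chosen for this example, not data from the post) that land on correlations of 1, -1 and 0, and then rescales one series to show that covariance changes with the units while correlation does not.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# y rises in a straight line with x: perfect correlation
r_pos = np.corrcoef(x, 2 * x + 1)[0, 1]

# y falls in a straight line as x rises: perfect inverse correlation
r_neg = np.corrcoef(x, -3 * x + 10)[0, 1]

# y is symmetric around the mean of x: no linear correlation
r_zero = np.corrcoef(x, (x - 3) ** 2)[0, 1]

print(r_pos, r_neg, r_zero)

# Rescaling x (for example metres to centimetres) multiplies the covariance
# by 100, but leaves the correlation unchanged.
y = 2 * x + 1
cov_small = np.cov(x, y)[0, 1]
cov_large = np.cov(100 * x, y)[0, 1]
r_large = np.corrcoef(100 * x, y)[0, 1]

print(cov_small, cov_large, r_large)
```

Note that a correlation of 0 here only rules out a linear relationship: the third series is fully determined by x, yet its correlation with x is 0.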

Let's calculate covariance and correlation by hand and compare the results with
the built-in functions in Python's NumPy library:
import numpy as np
import matplotlib.pyplot as plt

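def de_mean(x):
    # subtract the mean from every element, so the result has mean 0
    xmean = np.mean(x)
    return [xi - xmean for xi in x]

def covariance(x, y):
    # sample covariance: dot product of the deviations, divided by n - 1
    n = len(x)
    return np.dot(de_mean(x), de_mean(y)) / (n-1)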

def correlation(x, y):
    # use the arguments (not the globals v1, v2) and the sample standard
    # deviation (ddof=1) so it matches the n - 1 used in covariance()
    stdx = np.std(x, ddof=1)
    stdy = np.std(y, ddof=1)
    return covariance(x, y) / stdx / stdy

v1 = [1, 2, 3, 4, 5]
v2 = [1, 3, 2, 4, 5]

plt.scatter(v1, v2)
plt.show()

print(de_mean(v1))
print(de_mean(v2))

# sample standard deviations
print(np.std(v1, ddof=1))
print(np.std(v2, ddof=1))

# use our defined covariance function
covar = covariance(v1, v2)
print(covar)

# use numpy covariance function - cov
print(np.cov(v1, v2))

# use our defined correlation function
corr = correlation(v1, v2)
print(corr)

# use numpy correlation function - corrcoef
print(np.corrcoef(v1, v2))
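
Both np.cov and np.corrcoef return a 2x2 matrix rather than a single number when given two series, which makes them awkward to compare with a hand-written result. A minimal sketch of pulling out the single value for the pair, using the same v1 and v2 as above (the other variable names are only for this illustration):

```python
import numpy as np

v1 = [1, 2, 3, 4, 5]
v2 = [1, 3, 2, 4, 5]

# 2x2 matrices: the diagonal holds each series against itself,
# the off-diagonal entries hold v1 against v2
cov_matrix = np.cov(v1, v2)
corr_matrix = np.corrcoef(v1, v2)

cov_v1_v2 = cov_matrix[0, 1]
corr_v1_v2 = corr_matrix[0, 1]

print(cov_v1_v2)
print(corr_v1_v2)

# dividing the covariance by both sample standard deviations (ddof=1)
# reproduces the correlation coefficient
by_hand = cov_v1_v2 / (np.std(v1, ddof=1) * np.std(v2, ddof=1))
print(np.isclose(by_hand, corr_v1_v2))
```

For this data the covariance is 2.25 and the correlation is 0.9, up to floating-point rounding. np.cov divides by n - 1 by default, so the standard deviations need ddof=1 to be consistent with it.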
