Correlation

Definition

  • Correlation refers to any of a broad class of statistical relationships involving dependence
    • dependence is any statistical relationship between two random variables or two sets of data
    • a typical way of showing the correlation between two related variables is a scatter diagram

In other words

  • Correlation:
    • This is a statistical measure that indicates the extent to which two or more variables fluctuate together
    • A positive correlation indicates the extent to which those variables increase or decrease in parallel; a negative correlation indicates the extent to which one variable increases as the other decreases
    • For example, let’s consider the relationship between the amount of time spent studying and the grades obtained. These two variables are likely to be positively correlated because as the amount of time spent studying increases, the grades obtained also tend to increase.

Description:

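  • The correlation of two random variables $X$ and $Y$, denoted by $\rho(X, Y)$, is $\rho(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\,\operatorname{Var}(Y)}}$
    • defined as long as $\operatorname{Var}(X)\,\operatorname{Var}(Y)$ is positive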
    • It measures the degree of linearity between the two variables:
      • A value near +1 or -1 indicates a high degree of linearity
      • A value near 0 indicates that such linearity is absent
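
As a minimal sketch of the description above, the correlation can be computed directly from the standard definition (covariance divided by the product of the standard deviations). The function name `pearson_correlation` and the study-time data are hypothetical, invented here for illustration only.

```python
import math

def pearson_correlation(xs, ys):
    """Sample correlation: covariance divided by the product of standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations (the 1/n factors cancel).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        # Matches the condition that the product of the variances must be positive.
        raise ValueError("correlation is undefined when a variable has zero variance")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for illustration only.
hours_studied = [1, 2, 3, 4, 5]
grades = [52, 58, 65, 70, 79]
r = pearson_correlation(hours_studied, grades)
print(f"r = {r:.3f}")
```

For this made-up data the result is close to +1, which is what a strongly linear, increasing relationship should produce.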

Calculation:

  • $\rho(X, Y) = 1$ implies $Y = aX + b$ for some $a > 0$
  • $\rho(X, Y) = -1$ implies $Y = aX + b$ for some $a < 0$

Degree of correlation

Correlation coefficient

  • The correlation coefficient is used for:
    • measuring the degree of correlation between two variables
    • predicting the values of one variable y given the other variable x → find the line of best fit
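
The line of best fit mentioned above can be sketched with ordinary least squares, assuming a single predictor x and a fitted intercept. The helper names `fit_line` and `predict` and the data are hypothetical, chosen only to illustrate the idea.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("cannot fit a line when x is constant")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predicted y for a given x on the fitted line."""
    return intercept + slope * x

# Hypothetical data, invented for illustration only.
hours_studied = [1, 2, 3, 4, 5]
grades = [52, 58, 65, 70, 79]
slope, intercept = fit_line(hours_studied, grades)
print(f"grade = {intercept:.1f} + {slope:.1f} * hours")
print(f"predicted grade for 6 hours: {predict(6, slope, intercept):.1f}")
```

The slope has the same sign as the correlation coefficient, since both are built from the same cross-deviation sum.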

Coefficient of determination:

    • coefficient of determination = square of the correlation coefficient ($R^2 = r^2$)
    • measures the explanatory power of the regression model; in other words, how well the data fit the regression model
    • ranges from 0 to 1
      • 0: the model explains none of the variability of the response data around its mean
      • 1: the model explains all the variability of the response data around its mean
    • e.g. $R^2 = 0.85$ → 85% of the variation in the dependent variable is explained by the independent variables in the model; the remaining 15% is not explained by the model
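
A small sketch can check the relationship above numerically. It assumes simple linear regression (one predictor, intercept fitted by least squares), which is the setting where the coefficient of determination equals the squared correlation coefficient. The function name `r_squared` and the data are hypothetical.

```python
import math

def r_squared(ys, predictions):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the response is constant")
    return 1.0 - ss_res / ss_tot

# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [52, 58, 65, 70, 79]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Least-squares line, then its predictions on the observed x values.
slope = sxy / sxx
intercept = mean_y - slope * mean_x
predictions = [intercept + slope * x for x in xs]

r = sxy / math.sqrt(sxx * syy)    # correlation coefficient
r2 = r_squared(ys, predictions)   # coefficient of determination
print(f"r^2 = {r ** 2:.4f}, R^2 = {r2:.4f}")
```

The two printed values agree up to floating-point error, and for this made-up data both are roughly 0.99, meaning the fitted line accounts for about 99% of the variation in the grades.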

Spearman’s rank correlation coefficient

  • Spearman’s rank correlation coefficient: