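The correlation coefficient, r, for a set of (x, y) data pairs is calculated as follows:

$$ r = \frac{\sum (x - x^{*})(y - y^{*})}{n \, s_{x} \, s_{y}} $$

where x* and y* are the mean values of x and y, s_{x} and s_{y} are their standard deviations, and n is the number of (x, y) pairs. The calculation can be broken down into the following steps: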

- Calculate the average values for both x (x*) and y (y*).
- For each row, calculate (x - x*) and (y - y*).
- For each row, calculate (x - x*)^{2}, (y - y*)^{2}, and (x - x*)(y - y*).
- Add up the values in each column, and store the totals.
- For both x and y, calculate the standard deviations, s_{x} and s_{y}, by dividing the totals for (x - x*)^{2} and (y - y*)^{2} by the number of rows and taking the square root of each result.
- Calculate r by dividing the total for the (x - x*)(y - y*) column by (n * s_{x} * s_{y}), where n is the number of rows in the table (i.e. the number of x, y pairs).
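The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function name `correlation_coefficient` and the variable names are assumptions made for this example, and the standard deviations divide by n, as described in the steps.

```python
from math import sqrt


def correlation_coefficient(xs, ys):
    """Compute r by following the steps listed above."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and the same length")

    # Step 1: mean values x* and y*
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n

    # Step 2: per-row differences from the means
    dx = [x - x_mean for x in xs]
    dy = [y - y_mean for y in ys]

    # Steps 3 and 4: squares and products, totalled per column
    total_dx2 = sum(d * d for d in dx)
    total_dy2 = sum(d * d for d in dy)
    total_dxdy = sum(a * b for a, b in zip(dx, dy))

    # Step 5: standard deviations (dividing by n, as in the text)
    s_x = sqrt(total_dx2 / n)
    s_y = sqrt(total_dy2 / n)
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when x or y is constant")

    # Step 6: r
    return total_dxdy / (n * s_x * s_y)
```

For instance, `correlation_coefficient([1, 2, 3, 4], [2, 4, 6, 8])` should return a value equal to 1 up to floating-point rounding, since y rises exactly in step with x.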

The following table should help to illustrate the calculation:

| | x | y | (x-x*) | (x-x*)^{2} | (y-y*) | (y-y*)^{2} | (x-x*)(y-y*) |
|---|---|---|---|---|---|---|---|
| | 1 | 8.56 | -4 | 16 | 1.865556 | 3.480297531 | -7.46222222 |
| | 2 | 8.23 | -3 | 9 | 1.535556 | 2.357930864 | -4.60666667 |
| | 3 | 7.62 | -2 | 4 | 0.925556 | 0.856653086 | -1.85111111 |
| | 4 | 7.12 | -1 | 1 | 0.425556 | 0.181097531 | -0.42555556 |
| | 5 | 6.99 | 0 | 0 | 0.295556 | 0.087353086 | 0 |
| | 6 | 7.05 | 1 | 1 | 0.355556 | 0.126419753 | 0.355555556 |
| | 7 | 4.98 | 2 | 4 | -1.71444 | 2.939319753 | -3.42888889 |
| | 8 | 5.37 | 3 | 9 | -1.32444 | 1.754153086 | -3.97333333 |
| | 9 | 4.33 | 4 | 16 | -2.36444 | 5.590597531 | -9.45777778 |
| mean x* | 5 | | Total | 60 | | 17.37382222 | -30.85 |
| mean y* | | 6.694444 | std dev | 2.581989 | | 1.38939724 | |
| | | | | | | r = | -0.9555 |

In the above table, the x values represent the number of guests, and the y values represent the conversion rate as a percentage. The columns headed (x-x*)^{2} and (y-y*)^{2} are used to calculate the standard deviations of x and y, s_{x} and s_{y}. Once the last column is calculated, its values are totaled, giving the numerator (the upper value of the fraction) in the equation for r.
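As a cross-check, the totals and standard deviations in the table can be recomputed with Python's standard `statistics` module. This is only a sketch: the variable names are illustrative, and `pstdev` (the population standard deviation) is used on the assumption that it matches the text's approach of dividing by the number of rows rather than by n - 1.

```python
from statistics import fmean, pstdev

# Data from the table above
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [8.56, 8.23, 7.62, 7.12, 6.99, 7.05, 4.98, 5.37, 4.33]

n = len(xs)
x_mean = fmean(xs)
y_mean = fmean(ys)

# Column totals
total_dx2 = sum((x - x_mean) ** 2 for x in xs)
total_dy2 = sum((y - y_mean) ** 2 for y in ys)
total_dxdy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))

# Population standard deviations (divide by n, not n - 1)
s_x = pstdev(xs)
s_y = pstdev(ys)

r = total_dxdy / (n * s_x * s_y)

print(f"mean x* = {x_mean:.6f}, mean y* = {y_mean:.6f}")
print(f"totals: {total_dx2:.4f}, {total_dy2:.4f}, {total_dxdy:.4f}")
print(f"s_x = {s_x:.6f}, s_y = {s_y:.6f}")
print(f"r = {r:.4f}")
```

The printed values should agree with the table to the precision shown there.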

For the above example, r is calculated as:

$$ r = \frac{-30.85}{9 \times 2.581989 \times 1.38939724} = -0.9555 $$

The correlation coefficient helps determine whether a relationship exists between the variables. A strong correlation does not by itself indicate a causal relationship in the data, although a causal relationship will often show up as a strong correlation.

Note: the value of the correlation coefficient r can range from -1 to 1. A value towards either end of the range indicates a strong correlation between the variables; values close to 0 indicate very little or no correlation. The sign gives the direction: in the example above, r = -0.9555 means that y tends to fall as x rises.
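To make the range concrete, the sketch below evaluates r for three small, made-up data sets: one where y rises in step with x, one where it falls in step with x, and one with a symmetric pattern and no overall linear trend. The helper `pearson_r` and the data are assumptions for illustration only. The third case is a reminder that r measures linear association, so a value near 0 does not rule out some other kind of relationship.

```python
from math import sqrt


def pearson_r(xs, ys):
    """r = sum((x - x*)(y - y*)) / (n * s_x * s_y), with s computed by dividing by n."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_x = sqrt(sum((x - x_mean) ** 2 for x in xs) / n)
    s_y = sqrt(sum((y - y_mean) ** 2 for y in ys) / n)
    total = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return total / (n * s_x * s_y)


xs = [-2, -1, 0, 1, 2]

r_up = pearson_r(xs, [2 * x + 1 for x in xs])     # y rises in step with x
r_down = pearson_r(xs, [-3 * x + 4 for x in xs])  # y falls in step with x
r_none = pearson_r(xs, [x * x for x in xs])       # symmetric pattern, no linear trend

print(r_up, r_down, r_none)
```

Up to floating-point rounding, the three results should sit at the top, bottom, and middle of the range respectively.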

