Question:

Does the word correlation refer to an actual relationship between two or more things, or just a statistical relation?

I'm 99.9 percent sure that the word correlation only refers to a statistical relationship between two or more things, but I could use a little confirmation from you guys.

If, at the same time I decide to start playing more board games with my buddies in my awesome clubhouse, President Bush decides to send more troops into the Middle East, there is almost certainly no actual relation between the two things, but there is a positive correlation... right?

If I decided to increase the amount of money I spend on breadsticks for my buddies in da clubhouse every week and at the same time world hunger rates went down, again, there is almost certainly no actual relationship between these two things whatsoever but there is still a negative correlation... right?!

Thanks, you guys!

Nano

2 ANSWERS


  1. lol no, it's not just statistical relation, but from your terminology and sentence structure I get the idea that you know that and just wanted to ask what seemed to be a hard question, and you know the answer already.


  2. You are correct.  Here is a good example of how correlation is not causation.

    The shoe size of grade school students and the students' vocabulary are highly correlated.  In other words, the larger the shoe size, the larger the vocabulary the student tends to have.  Now it is easy to see that shoe size and vocabulary do not cause each other, but they are highly correlated.  The reason is that there is a confounding factor: age.  The older the grade school student, the larger the shoe size and the larger the vocabulary.
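
    To make the confounding idea concrete, here is a rough simulation sketch.  All the numbers (age range, slopes, noise levels) are made up for illustration; the only point is that age drives both variables while neither drives the other.  The helper name pearson_r is my own, not from any library.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Made-up model: age drives both variables; neither affects the other.
ages = [rng.uniform(6, 12) for _ in range(6000)]
shoe = [0.8 * a + rng.gauss(0, 0.7) for a in ages]
vocab = [1500 * a + rng.gauss(0, 1500) for a in ages]

r_all = pearson_r(shoe, vocab)

# Hold age roughly fixed by looking only at students aged 9.0 to 9.5.
band = [(s, v) for a, s, v in zip(ages, shoe, vocab) if 9.0 <= a < 9.5]
r_same_age = pearson_r([s for s, _ in band], [v for _, v in band])

print(f"all students:      r = {r_all:.2f}")
print(f"same-age students: r = {r_same_age:.2f}")
```

    Under these assumptions the overall r should come out strongly positive, while among students of nearly the same age it should be close to 0.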

    The correlation coefficient, r, is a measure of the linear relationship between two variables.  If the relationship is non-linear, r can be badly misleading: it may be near 0 even when the two variables are strongly related.
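
    A tiny example of why r only captures linear relationships: below, y is completely determined by x (y = x squared), yet r works out to 0 because the relationship is symmetric rather than linear.  The data and the helper pearson_r are my own illustration.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]  # y depends entirely on x, but not linearly

r = pearson_r(xs, ys)
print(r)
```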

    r takes on values between -1 and 1.  Negative values indicate that the relationship between the variables is inverse, i.e., on a scatter plot the data tends to have a negative slope.  Positive values for r indicate the data tends to have a positive slope.  If r = 0 we say the variables are uncorrelated.

    The closer the absolute value of r is to 1, the stronger the linear association between the two variables.

    There are several equivalent formulas for calculating the value of r.  Let xbar and ybar be the means of the two data sets, sx and sy the sample standard deviations of the data sets, and n the number of (x, y) pairs.  Then:

    r = 1/(n - 1) * Σ( ((xi - xbar)/sx) * ((yi - ybar)/sy)) with the sum going from i = 1 to n

    r = Covariance(X,Y) / [√(Var(X)) * √(Var(Y))]

    The second equation shows that the correlation coefficient is the ratio of the covariance (how the two variables vary together) to the product of the standard deviations (the spread within each variable).
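
    As a quick check that the two formulas agree, here is a small sketch using Python's standard statistics module.  The five data points are made up, and I'm assuming sample (n - 1) versions of the standard deviation, variance, and covariance throughout.

```python
from statistics import mean, stdev, variance

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(xs)

xbar, ybar = mean(xs), mean(ys)
sx, sy = stdev(xs), stdev(ys)  # sample standard deviations (n - 1)

# First formula: average product of the standardized values.
r1 = sum(((x - xbar) / sx) * ((y - ybar) / sy) for x, y in zip(xs, ys)) / (n - 1)

# Second formula: covariance divided by the product of standard deviations.
cov_xy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)
r2 = cov_xy / (variance(xs) ** 0.5 * variance(ys) ** 0.5)

print(r1, r2)
```

    For this data both come out to 6/√60, about 0.775.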

    r is unitless.

    r is not affected by multiplying each data set by a positive constant, adding a constant to each data set, or interchanging x and y.  (Multiplying just one of the data sets by a negative constant flips the sign of r.)
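
    These invariance properties are easy to check numerically.  The sketch below uses made-up data and arbitrary constants (2.54, 10, 0.5, -3), and pearson_r is my own helper.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(xs, ys)

# Rescale and shift both data sets (positive multipliers).
r_scaled = pearson_r([2.54 * x + 10 for x in xs], [0.5 * y - 3 for y in ys])

# Interchange x and y.
r_swapped = pearson_r(ys, xs)

# A negative multiplier on one variable flips the sign of r.
r_negated = pearson_r([-x for x in xs], ys)

print(r, r_scaled, r_swapped, r_negated)
```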

    r is sensitive to outliers.
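
    To see how much a single outlier can matter, here is a sketch with ten made-up points lying almost exactly on a line, plus one invented extreme point at (30, 0).  pearson_r is my own helper.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Ten points lying almost exactly on the line y = x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
ys = [1.2, 1.9, 3.1, 4.0, 5.2, 5.9, 7.1, 8.0, 9.1, 9.9]
r_clean = pearson_r(xs, ys)

# Same data plus one extreme point far from the line.
r_outlier = pearson_r(xs + [30.0], ys + [0.0])

print(f"without outlier: r = {r_clean:.3f}")
print(f"with outlier:    r = {r_outlier:.3f}")
```

    With these numbers the one extra point is enough to take r from nearly 1 to below 0.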

    r² is called the coefficient of determination.  It is a measure of the proportion of variance in y explained by regression.
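
    A small numeric check of that statement, assuming an ordinary least-squares line of y on x and made-up data: the fraction of variance explained, 1 - SS_res/SS_tot, should match r².  The variable names are my own.

```python
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(xs)

mean_x, mean_y = sum(xs) / n, sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

r = sxy / (sxx * syy) ** 0.5

# Least-squares regression line of y on x.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual sum of squares vs. total sum of squares.
ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
ss_tot = syy
explained = 1 - ss_res / ss_tot

print(r ** 2, explained)
```

    For this data both values are 0.6, i.e. the line accounts for 60% of the variance in y.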

    You cannot compare models just by comparing their r values.  This is a long discussion, a full-day lecture in the prob/stat courses I've taught.  Model comparison is a topic usually saved for upper-level undergrad courses or graduate-level courses.

    Good sites with info about correlation are:

    http://mathworld.wolfram.com/Correlation...

    http://mathworld.wolfram.com/LeastSquare...
