Tuesday, December 9, 2014

regression

You have uncovered associations between the column and row variables.


Have you wondered if there is a causal relationship between your column and row variables?


I wonder that all the time. This is beyond what you will be doing in this course, but if you take a social science statistics or methods course, you will learn about multiple regression, which estimates the influence of each variable while holding the others constant. I did a little exploring with multiple regression (using SPSS) based on a student paper that produced five tables on Latin music (race, religion, gender, age and education). My thought was that four of the table findings were probably spurious, because race was clearly the only cause. I was wrong:


I think this all stems back to race. The education, age and gender statistics came out the way they did for the same reason as religion: not only are Hispanics in our sample likely to be Catholic, they are also likely to have less education, be younger, and be female, compared to people in our survey who identified as a race other than Hispanic. I did a quick regression analysis, and it shows that race explains most of the variation in liking Latin music, but also that education and gender have an independent influence apart from race. Age and religion dropped out: they did not have an independent influence once race was taken into account. So in the end, age and religion seemed to influence liking Latin music, but that relationship was spurious, produced by their association with race. Out of the five column variables, race, education and gender are the best predictors of whether people like Latin music.
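If you want to see how a regression can expose a spurious relationship, here is a minimal sketch in Python rather than SPSS. Everything in it is an assumption made for illustration: the data are simulated, not the class survey, and the variable names, sample size and effect sizes are invented. The simulation is built so that being Hispanic and being female raise a made-up "liking" score, while being Catholic has no effect of its own and is only more common among Hispanic respondents.

```python
import numpy as np

def ols(y, *predictors):
    """Ordinary least squares; returns [intercept, b1, b2, ...]."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulated respondents (hypothetical numbers, not the survey data).
rng = np.random.default_rng(0)
n = 5000
hispanic = rng.binomial(1, 0.3, n)
# Being Catholic depends only on being Hispanic (80% vs. 20%).
catholic = rng.binomial(1, np.where(hispanic == 1, 0.8, 0.2))
female = rng.binomial(1, 0.5, n)
# The liking score depends on race and gender, but NOT on religion.
likes = 1.0 + 2.0 * hispanic + 0.5 * female + rng.normal(0, 1, n)

# One predictor at a time, like a single table: religion looks important.
b_simple = ols(likes, catholic)

# All predictors together: religion's coefficient is estimated
# while holding race and gender constant.
b_multi = ols(likes, hispanic, catholic, female)

print("religion alone:         ", round(b_simple[1], 2))
print("religion, race included:", round(b_multi[2], 2))
print("race:                   ", round(b_multi[1], 2))
print("gender:                 ", round(b_multi[3], 2))
```

In this simulation, religion shows a sizable coefficient on its own, but the coefficient falls to about zero once race is in the model, while gender keeps its own influence. That is the same pattern described above, although the real analysis may have used a different kind of regression model.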
