[This is the first in a several-part series on creating representative samples from convenience sampling data.]
Earlier Jon Haidt discussed the “problem” of representativeness of the YourMorals data and concluded that it wasn’t such a problem after all. Convenience samples drawn from the internet can produce reliable data. This is particularly true when we are more interested in taking valid measurements than in painting a representative picture of some underlying population.
But what if we would also like to know something about the underlying population? If we had data that were representative of the country as a whole, we would be able to ask a new set of questions. Does knowing where the states fall in terms of their Moral Foundations tell us anything about voting behavior? We might expect scores on the purity foundation to explain state-level attitudes about gay marriage or the fairness foundation to explain attitudes about tax policy. To answer these kinds of questions, we need representative samples (also see Jesse Graham’s comment in the above link).
In sampling theory, the gold standard is the probability sample. When all individuals in the population have a known (but not necessarily equal) probability of being included in the sample, we can construct reliable estimates of the population parameters and, given sufficient sample size, be confident that these estimates are within some distance of the true values in the population. However, the central assumptions of sampling theory are violated in convenience sampling, where those inclusion probabilities are unknown (but see this discussion of the representation problems in traditional “random” sample polls).
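To make the role of known inclusion probabilities concrete, here is a minimal sketch of inverse-probability weighting (the Hájek form of the Horvitz-Thompson idea). The responses and probabilities are made-up toy values, not YourMorals data, and the function name is only illustrative.

```python
def weighted_mean(values, inclusion_probs):
    """Weight each respondent by 1 / (probability of being sampled)."""
    weights = [1.0 / p for p in inclusion_probs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Hypothetical toy data: 1 = identifies as liberal, 0 = does not.
responses = [1, 1, 1, 0]
# Hypothetical known inclusion probabilities: the liberals here were
# four times as likely to end up in the sample as the non-liberal.
probs = [0.4, 0.4, 0.4, 0.1]

raw = sum(responses) / len(responses)      # unweighted share: 0.75
adjusted = weighted_mean(responses, probs)  # weighted share: about 0.43

print(f"raw={raw:.2f} adjusted={adjusted:.2f}")
```

The adjustment only works because the probabilities are known. In a convenience sample they are not, which is why this correction cannot be applied directly.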
First, we would like to get a sense of how the YourMorals data stacks up against other population measures. We collected data on several demographic characteristics of individuals in the YourMorals dataset. We can easily compare these against population values collected from the census or other representative samples.
One area where we can clearly see the representation problems in the YourMorals data is self-reported ideology. Considering only U.S. respondents for the time being (as all of the following analyses do), recent national samples put the proportion of people who consider themselves “liberal” at between 18 and 22 percent. In the YourMorals data, this figure is nearly 65 percent.* Given this skew, we might be hesitant to make inferences about the general population from a sample that looks so different.
The figures below show how the YourMorals data compares with the population values across a handful of demographic and attitudinal variables.
Source: Pew Center for the People and the Press, 2001-2008
This figure shows that even with a significant intercept shift (almost 50 points), the rank ordering of the states stays nearly the same. This is encouraging because it means we are not drawing the same type of individual from every state. Put differently, knowing the state that an individual resides in tells us something about the probability that he or she identifies as a liberal. What we would not want to see here is a horizontal line, which would indicate no relationship between the population values and the sample values.
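The two quantities discussed here, the intercept shift and the preservation of rank order, can be computed separately. The sketch below uses made-up percentages for five hypothetical states (not the Pew or YourMorals figures) and a plain-Python Spearman rank correlation that assumes no tied values.

```python
from statistics import mean

def ranks(xs):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(xs, ys):
    """Spearman rank correlation using the no-ties formula."""
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n**2 - 1))

# Hypothetical percent liberal in five states (illustrative values only).
population = [15.0, 18.0, 20.0, 24.0, 27.0]
sample = [58.0, 66.0, 63.0, 70.0, 75.0]

# Average gap between sample and population: the "intercept shift".
shift = mean(s - p for s, p in zip(sample, population))
# How well the sample preserves the ordering of the states.
rho = spearman(population, sample)

print(f"shift={shift:.1f} points, rank correlation={rho:.2f}")
```

A large shift combined with a rank correlation near 1 is the pattern described above. A rank correlation near 0 would correspond to the horizontal line we would not want to see.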
Source: American Community Survey, 2006-2008
Source: American Community Survey, 2006-2008
With race it is much the same story as ideology. For whites, there is a substantial intercept shift (almost 70 points), but states with larger white populations also are proportionally more white in the YourMorals data. The data for African Americans is noisier (there were fewer than 900 in the sample of over 60,000), but shows the same pattern. Here there is not a large intercept shift (as we have reached the floor of the data), but we see the same kind of increasing pattern.
Source: American Community Survey, 2006-2008
With respect to education, the data are further afield. The figure shows that the YourMorals sample is significantly more educated than the general population, but it becomes more difficult to draw a convincing trend line through the data. Individuals who came from states with higher levels of education were only marginally more likely to be highly educated themselves.
So where does all of this leave us? It is obvious from the plots that the individuals who self-selected into the YourMorals data look very different from the general population. It would clearly be inappropriate to use the raw data to make inferences about general population parameters (average levels of a particular foundation in a particular state, for example). The sample is much more liberal, more highly educated, and more white than the general population. But it is not as bad as it could be. The worst-case scenario would be a uniformly weird sample across the states. Instead, what we saw in the figures above is a picture that is more or less proportionally correct. It is encouraging that the general relationships hold up.
All of this is not to say that we should throw out the analyses presented elsewhere in this blog and in publications based on the YourMorals data. If we condition on ideology (which we saw was particularly skewed) and make statements like “Liberals generally score higher than conservatives on the Harm/Care and Fairness/Reciprocity foundations,” we are probably treading on safe ground.
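A small sketch shows why conditioning on ideology is safer than pooling. The scores below are invented for illustration (they are not YourMorals results), and the helper name is hypothetical. The within-group means do not depend on how many liberals or conservatives happened to show up, while the pooled mean does.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (ideology, Harm/Care score) pairs; values are made up.
respondents = [
    ("liberal", 4.0), ("liberal", 3.6), ("liberal", 3.8), ("liberal", 3.8),
    ("conservative", 3.0), ("conservative", 3.2),
]

def group_means(rows):
    """Average score within each ideology group."""
    groups = defaultdict(list)
    for ideology, score in rows:
        groups[ideology].append(score)
    return {name: mean(scores) for name, scores in groups.items()}

by_group = group_means(respondents)
# The pooled mean is pulled toward whichever group is over-represented.
pooled = mean(score for _, score in respondents)

print(by_group, round(pooled, 2))
```

If the sample had contained equal numbers of each group, the pooled mean would move, but the two group means (and the gap between them) would be estimated from the same kind of respondents as before.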
In the next few posts, I will be revisiting the question of how to construct a representative picture from a convenience sample.
*Beyond the obvious sampling issues, there are a few other problems with directly comparing the measure of ideology in YourMorals with that in nationally representative samples. First, there is a mode difference that could account for some of the discrepancy (although certainly not all or even a very significant portion of it). Another (and more serious) difference between nationally representative samples and the YourMorals data is the choice of a seven-point scale rather than a five-point scale. Five-point scales are used more regularly in telephone samples, with the options being “Very Conservative,” “Conservative,” “Moderate,” “Liberal,” and “Very Liberal.” The YourMorals data includes options for “Slightly Liberal” and “Slightly Conservative” as well as “Libertarian” and “Other” categories. The 65 percent figure lumps all of the “liberals” together. If you believe that the “Slightly Liberal” respondents might have self-identified as “Moderate” given fewer options, the proportion turns out to be just over 50 percent.

On Hyperpartisanship, Hypermoralism, and the Supernormal Stimuli of Modern Politics
July 23rd, 2010 by Ravi Iyer
Today’s lead story from Politico, The Age of Rage, probably summarizes a lot of what people think is wrong with politics. Rather than make good policy, politicians and media are more concerned with scoring points for their political ideology (hyperpartisanship). However, as the Politico article points out, their actions are largely driven by the general populace. Politicians and media reflect what people respond to, which happens to be hyperpartisanship, rather than causing the incivility we see.
With our attention and our money, we reward the politicians and news organizations that engage in the very incivility that makes politics so ugly. This is true on both sides of the aisle.
At the recent meeting of the International Society of Political Psychology, Linda Skitka gave a talk which puts a lot of this in perspective for me. Her lab studies the dark side of moral conviction, which I call hypermoralism in the hope that the term catches on. Roy Baumeister studies a similar concept, idealistic evil. In her talk, Skitka demonstrated in a Chinese sample that political intolerance (e.g. “people with different positions than your own about this issue should be allowed to have their phones tapped by the Chinese government”) and social intolerance (e.g. “How willing would you be to have someone who did not share your views on this issue as a close personal friend?”) were best predicted by moral conviction (e.g. “To what extent are your feelings about this issue or policy based on your fundamental beliefs about right and wrong?”). When controlling for moral conviction, the other variables (e.g. demographics, political position, attitude importance, and attitude strength) were all nonsignificant predictors of social and political intolerance. I look forward to seeing how this replicates in a U.S. sample and how political intolerance is operationalized there. Perhaps liberal consideration of censoring Fox News, or conservative publication of what many would consider private discussion, would make good operationalizations of political intolerance. Both mirror what we see in reality, where considerations of privacy, context, and free speech are treated as secondary to partisanship. Moral conviction may underlie the hyperpartisanship that Politico talks about.
Hyperpartisanship and hypermoralism may be another instance of the effects of what evolutionary psychologist Deirdre Barrett calls “Supernormal Stimuli”. As the Wall Street Journal writes about her book:
In the case of hyperpartisanship and hypermoralism, our evolved moral senses, which allow human beings to cooperate, are now subject to the supernormal stimulus of the 24-hour news cycle and the non-stop political campaign. Moral emotions are powerful forces, and they are now activated routinely rather than rarely.
If anybody has ideas on how to escape this cycle, I would love to hear them. Humanizing and getting to know the opposition, along the lines of intergroup contact theory, is one idea. Perhaps moral emotions can be activated against hyperpartisanship itself, rather than against individual ideologies. Or maybe with greater understanding, we can all learn to recognize supernormal moral stimuli and give them less power in our lives. Ideas are welcome, and I’m open to turning particularly promising ones into studies to be run on yourmorals.org.
- Ravi Iyer