<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>YourMorals.Org Moral Psychology Blog &#187; YourMorals Data</title>
	<atom:link href="http://www.yourmorals.org/blog/tag/yourmorals-data/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.yourmorals.org/blog</link>
	<description>Moral Psychology Findings and Discussion</description>
	<lastBuildDate>Wed, 25 Jan 2012 16:22:22 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Having your cake&#8230; part 2</title>
		<link>http://www.yourmorals.org/blog/2010/08/having-your-cake-part-2/</link>
		<comments>http://www.yourmorals.org/blog/2010/08/having-your-cake-part-2/#comments</comments>
		<pubDate>Tue, 03 Aug 2010 12:47:15 +0000</pubDate>
		<dc:creator>Brad</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[conservatives]]></category>
		<category><![CDATA[liberals]]></category>
		<category><![CDATA[moral foundations]]></category>
		<category><![CDATA[Representatitive]]></category>
		<category><![CDATA[YourMorals Data]]></category>

		<guid isPermaLink="false">http://www.yourmorals.org/blog/?p=186</guid>
		<description><![CDATA[[This is the second post in a series of posts dealing with the representativeness of the YourMorals data, see here to read the first post]
Last time, I gave a broad overview of the descriptive representation of the YourMorals dataset. In a nutshell, we discovered that the YourMorals respondents were much more educated, more likely to [...]]]></description>
			<content:encoded><![CDATA[<p>[This is the second post in a series of posts dealing with the representativeness of the YourMorals data, see <a href="http://www.yourmorals.org/blog/2010/07/having-your-cake-and-eating-it-too-representativeness-and-the-yourmorals-data/">here</a> to read the first post]</p>
<p>Last time, I gave a broad overview of the descriptive representation of the YourMorals dataset. In a nutshell, we discovered that the YourMorals respondents were much more educated, more likely to self-identify as liberal, and more likely to be white than the population.</p>
<p>In this post, I will explore the question of whether the YourMorals respondents are representative of the population after we condition on observable characteristics. Put another way, would we expect two individuals, one randomly chosen from the population and one drawn from the YourMorals data, who share all the same demographic characteristics (age, race, education, political ideology, place of residence) to look the same in terms of their scores on the Moral Foundations Questionnaire?</p>
<p>To conduct this kind of analysis, first we need a benchmark against which to compare the YourMorals data. As I mentioned in my previous post, the gold standard is a randomly drawn sample from the population. Luckily, we have just such a survey. Prior to the 2008 election, Knowledge Networks* fielded a version of the Moral Foundations Questionnaire to a representative sample of the U.S. population. This provides a good point of comparison for our (much larger) convenience sample.</p>
<p>The first task is to process the YourMorals data so that it looks more like the general population. I used a basic sample matching technique to match individuals from the YourMorals data and the Knowledge Networks data. This is a crude technique, but effective. For each individual in the Knowledge Networks sample (the “match target”), I found an individual (or individuals) in the YourMorals data whose demographic information matched that of the “match target.” These cases then become the comparison group. After the samples have been balanced in terms of observable characteristics, any differences we observe between the two can be ascribed to confounding factors that we cannot observe.**</p>
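<p>For readers who want to see the mechanics, here is a minimal Python sketch of exact matching on demographic categories. It is not the code used for this analysis: the column names, category labels, and rows are all made up for illustration, and I am assuming each respondent is stored as a simple dictionary.</p>

```python
from collections import defaultdict

# Hypothetical column names; the post matches on categories of age, race,
# education, ideology, and state of residence.
MATCH_KEYS = ("age_group", "race", "education", "ideology", "state")


def match_key(person):
    """Return the tuple of demographic categories used for exact matching."""
    return tuple(person[k] for k in MATCH_KEYS)


def exact_match(targets, pool):
    """For each target, collect every pool member with identical categories.

    Returns (matches, unmatched): matches is a list of (target, [pool rows])
    pairs, and unmatched lists the targets with no counterpart in the pool.
    """
    by_key = defaultdict(list)
    for person in pool:
        by_key[match_key(person)].append(person)

    matches, unmatched = [], []
    for target in targets:
        found = by_key.get(match_key(target), [])
        if found:
            matches.append((target, found))
        else:
            unmatched.append(target)
    return matches, unmatched


# Made-up rows for illustration only; not real survey data.
kn = [
    {"id": "kn1", "age_group": "30-44", "race": "white", "education": "college",
     "ideology": "liberal", "state": "CA"},
    {"id": "kn2", "age_group": "60+", "race": "black", "education": "hs",
     "ideology": "conservative", "state": "GA"},
]
ym = [
    {"id": "ym1", "age_group": "30-44", "race": "white", "education": "college",
     "ideology": "liberal", "state": "CA"},
    {"id": "ym2", "age_group": "30-44", "race": "white", "education": "college",
     "ideology": "liberal", "state": "CA"},
    {"id": "ym3", "age_group": "18-29", "race": "white", "education": "college",
     "ideology": "liberal", "state": "NY"},
]

matches, unmatched = exact_match(kn, ym)
for target, found in matches:
    print(target["id"], "matched to", [p["id"] for p in found])
print("unmatched targets:", [t["id"] for t in unmatched])
```

<p>In this toy example the first target picks up two matches and the second picks up none. With exact matching on many categories, unmatched targets are a real possibility, which is one reason a large convenience sample helps.</p>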
<p>The following figures show how the distributions of the matched YourMorals data compare with the distributions in the sample from Knowledge Networks. The dashed lines show the distributions for Knowledge Networks; the solid lines show the YourMorals data.</p>
<p><a href="http://www.yourmorals.org/blog/wp-content/uploads/2010/08/fig1.jpg"><img class="aligncenter size-full wp-image-187" src="http://www.yourmorals.org/blog/wp-content/uploads/2010/08/fig1.jpg" alt="Figure 1" width="683" height="397" /></a></p>
<p>The distributions of the foundations in the two data sources look very similar for the Fairness/Reciprocity foundation, but for all of the others, there are significant differences between the YourMorals and the Knowledge Networks respondents.</p>
<p>A little more digging reveals some interesting patterns. Splitting up the sample by ideology yields:</p>
<p>Liberals:</p>
<p><a href="http://www.yourmorals.org/blog/wp-content/uploads/2010/08/fig2.jpg"><img class="aligncenter size-full wp-image-188" src="http://www.yourmorals.org/blog/wp-content/uploads/2010/08/fig2.jpg" alt="Figure 2 - Liberals only" width="683" height="397" /></a></p>
<p>Conservatives:</p>
<p><a href="http://www.yourmorals.org/blog/wp-content/uploads/2010/08/fig3.jpg"><img class="aligncenter size-full wp-image-189" src="http://www.yourmorals.org/blog/wp-content/uploads/2010/08/fig3.jpg" alt="Figure 3 - Conservatives only" width="683" height="397" /></a></p>
<p>Two of the foundations seem to stand out in these comparisons. Liberals in the YourMorals data are particularly low on the Purity foundation (when compared against liberals in the Knowledge Networks data), and conservatives from the YourMorals sample seem to score lower on the Harm foundation. Overall, YourMorals liberals seem more like population liberals on the first two foundations (Harm and Fairness), and the conservatives in the sample seem more like population conservatives on the last two foundations (Authority and Purity). No matter how the data is cut, the YourMorals sample seems to score lower on the Ingroup foundation.</p>
<p>The comparisons between the general population sample and the convenience sample in this post raise some significant questions about the possibility of using the self-selected respondents in the YourMorals sample to make inferences about the population. These problems in the data are particularly evident in the Ingroup foundation, the Purity foundation (for liberals), and the Harm foundation (for conservatives).</p>
<p>As was the case with demographics, all is not lost. One last look at the data shows that again the foundations are more or less proportionally correct. Liberals score higher on the Harm and Fairness foundations in relation to their scores on the other three, and conservatives show more or less equal scores across each of the foundations. The bar chart below shows the average scores of the foundations broken out by survey source (KN and YM for Knowledge Networks and YourMorals respectively) and ideology:</p>
<p><a href="http://www.yourmorals.org/blog/wp-content/uploads/2010/08/fig4.jpg"><img class="aligncenter size-full wp-image-192" src="http://www.yourmorals.org/blog/wp-content/uploads/2010/08/fig4.jpg" alt="Figure 4" width="683" height="397" /></a></p>
<p>Next time, I’ll discuss how we might correct for some of these demographic and attitudinal biases in the data.</p>
<p>*For the uninitiated, Knowledge Networks is a survey research firm that has gone to great lengths to put together a panel of internet users that is nationally representative. They have recruited a large panel of individuals to take internet surveys. These individuals were generally contacted by telephone, and in cases where the respondent did not have internet access, Knowledge Networks provided access. See <a href="http://www.knowledgenetworks.com/knpanel/index.html">this link</a> for more information.</p>
<p>**For a quick primer on the theory behind sample matching see <a href="http://en.wikipedia.org/wiki/Rubin_Causal_Model">this</a> Wikipedia entry.  I am using exact matching on categories of age, race, education, ideology, and state of residence.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yourmorals.org/blog/2010/08/having-your-cake-part-2/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Having your cake and eating it too: Representativeness and the YourMorals Data</title>
		<link>http://www.yourmorals.org/blog/2010/07/having-your-cake-and-eating-it-too-representativeness-and-the-yourmorals-data/</link>
		<comments>http://www.yourmorals.org/blog/2010/07/having-your-cake-and-eating-it-too-representativeness-and-the-yourmorals-data/#comments</comments>
		<pubDate>Wed, 28 Jul 2010 11:40:24 +0000</pubDate>
		<dc:creator>Brad</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[yourmorals.org]]></category>
		<category><![CDATA[Convenience Sampling]]></category>
		<category><![CDATA[Representatitive]]></category>
		<category><![CDATA[YourMorals Data]]></category>

		<guid isPermaLink="false">http://www.yourmorals.org/blog/?p=166</guid>
		<description><![CDATA[[This is the first in a several part series on creating representative samples from convenience sampling data]
Earlier Jon Haidt discussed the “problem” of representativeness of the YourMorals data and concluded that it wasn’t such a problem after all. Convenience samples drawn from the internet can produce reliable data. This is particularly true when we are [...]]]></description>
			<content:encoded><![CDATA[<p>[This is the first in a several part series on creating representative samples from convenience sampling data]</p>
<p><a href="http://www.yourmorals.org/blog/2010/03/nationally-representative-data-is-bad-data-for-psychology/">Earlier</a> Jon Haidt discussed the “problem” of representativeness of the YourMorals data and concluded that it wasn’t such a problem after all. Convenience samples drawn from the internet can produce reliable data. This is particularly true when we are more interested in taking valid measurements than in painting a representative picture of some underlying population.</p>
<p>But what if we would also like to know something about the underlying population? If we had data that were representative of the country as a whole, we would be able to ask a new set of questions. Does knowing where the states fall in terms of their Moral Foundations tell us anything about voting behavior? We might expect scores on the Purity foundation to explain state-level attitudes about gay marriage or the Fairness foundation to explain attitudes about tax policy. To answer these kinds of questions, we need representative samples (also see Jesse Graham’s comment at the link above).</p>
<p>In sampling theory, the gold standard is the probability sample. When all individuals in the population have a known (but not necessarily equal) probability of being included in the sample, we can construct reliable estimates of the population parameters and, given sufficient sample size, be confident that these estimates are within some distance of the true values in the population. However, the central assumptions of sampling theory are violated in convenience sampling (but see <a href="http://www.pollster.com/blogs/doug_rivers.php">this</a> discussion of the representation problems in traditional &#8220;random&#8221; sample polls).</p>
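<p>To make the "known but not necessarily equal" point concrete, here is a small Python sketch of inverse-probability weighting, one standard way of turning known inclusion probabilities into a population estimate. The scores and probabilities are invented for illustration, and the function name is my own; nothing here comes from the YourMorals data.</p>

```python
def weighted_mean(values, inclusion_probs):
    """Estimate a population mean, weighting each respondent by 1 / probability."""
    weights = [1.0 / p for p in inclusion_probs]
    total = sum(w * v for w, v in zip(weights, values))
    return total / sum(weights)


# Made-up numbers: one group was three times as likely to be sampled
# as the other, and we know those probabilities in advance.
scores = [4.0, 4.0, 4.0, 2.0]      # hypothetical foundation scores
probs = [0.03, 0.03, 0.03, 0.01]   # hypothetical inclusion probabilities

raw = sum(scores) / len(scores)
adjusted = weighted_mean(scores, probs)
print("unweighted mean:", raw)
print("weighted mean:", round(adjusted, 3))
```

<p>The oversampled group pulls the unweighted mean upward, and the weights undo that. The catch for a convenience sample is that the inclusion probabilities are unknown, so this correction is not directly available.</p>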
<p>First, we would like to get a sense of how the YourMorals data stacks up against other population measures. We collected data on several demographic characteristics of individuals in the YourMorals dataset. We can easily compare these against population values collected from the census or other representative samples.</p>
<p>One area where we can clearly see the representation problems in the YourMorals data is self-reported ideology. Considering only U.S. respondents for the time being (as all of the following analyses do), recent national samples put the proportion of people who consider themselves “liberal” at between 18 and 22 percent. In the YourMorals data, this figure is nearly 65 percent.* Given this skew in the data, we might be hesitant in trying to make inferences about the general population from a sample that looks so different from it.</p>
<p>The figures below show how the YourMorals data compares with the population values across a handful of demographic and attitudinal variables.</p>
<p><a href="http://www.yourmorals.org/blog/wp-content/uploads/2010/07/fig1.jpg"><img class="aligncenter size-full wp-image-174" src="http://www.yourmorals.org/blog/wp-content/uploads/2010/07/fig1.jpg" alt="Figure 1" width="683" height="397" /></a></p>
<p>Source: Pew Center for the People and the Press, 2001-2008</p>
<p>This figure shows how even with a significant intercept shift (almost 50 points), the rank ordering of the states stays pretty close to the same. This is encouraging as it means we are not drawing the same type of individual from each state. Put differently, knowing the state that an individual resides in tells us something about the probability that he or she identifies as a liberal. What we would not want to see here would be a horizontal line (indicating no relationship).</p>
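<p>One way to put a number on "the rank ordering stays pretty close to the same" is a rank correlation, which ignores any constant shift between the two series. The Python sketch below uses invented percentages for five unnamed states, not the values plotted above, and it assumes there are no tied values.</p>

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up percentages of self-identified liberals in five states.
population = [15.0, 18.0, 20.0, 24.0, 28.0]
sample = [62.0, 70.0, 68.0, 75.0, 80.0]   # shifted up, one adjacent pair swapped

print(round(spearman(population, sample), 2))
```

<p>A value near 1 means the states line up in nearly the same order in both sources despite the large shift in level. A value near 0 would correspond to the horizontal line we do not want to see.</p>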
<p><a href="http://www.yourmorals.org/blog/wp-content/uploads/2010/07/fig21.jpg"><img class="aligncenter size-full wp-image-181" src="http://www.yourmorals.org/blog/wp-content/uploads/2010/07/fig21.jpg" alt="Figure 2" width="683" height="397" /></a></p>
<p>Source: American Community Survey, 2006-2008</p>
<p><a href="http://www.yourmorals.org/blog/wp-content/uploads/2010/07/fig3.jpg"><img class="aligncenter size-full wp-image-177" src="http://www.yourmorals.org/blog/wp-content/uploads/2010/07/fig3.jpg" alt="Figure 3" width="683" height="397" /></a></p>
<p>Source: American Community Survey, 2006-2008</p>
<p>With race it is much the same story as ideology. For whites, there is a substantial intercept shift (almost 70 points), but states with larger white populations also are proportionally more white in the YourMorals data. The data for African Americans is noisier (there were fewer than 900 in the sample of over 60,000), but shows the same pattern. Here there is not a large intercept shift (as we have reached the floor of the data), but we see the same kind of increasing pattern.</p>
<p><a href="http://www.yourmorals.org/blog/wp-content/uploads/2010/07/fig4.jpg"><img class="aligncenter size-full wp-image-178" src="http://www.yourmorals.org/blog/wp-content/uploads/2010/07/fig4.jpg" alt="Figure 4" width="683" height="397" /></a></p>
<p>Source: American Community Survey, 2006-2008</p>
<p>With respect to education, the data are further afield. The figure shows that the YourMorals sample is significantly more educated than the general population, but it becomes more difficult to draw a convincing trend line through the data. Individuals who came from states with higher levels of education were only marginally more likely to be highly educated themselves.</p>
<p>So where does all of this leave us? It is obvious from the plots that the individuals who self-selected into the YourMorals data look very different from the general population. It would clearly be inappropriate to use the raw data in trying to make inferences about the general population parameters (average levels of a particular foundation in a particular state, for example). The sample is much more liberal, highly educated, and white than the general population. But it is not <em>as</em> bad as it could be. The worst-case scenario would show a uniformly weird sample across the states. Instead, what we saw in the figures above is a picture that is more-or-less proportionally correct. It is encouraging that the general relationships hold up.</p>
<p>All of this is not to say that we should throw out the analyses presented elsewhere in this blog and in publications based on the YourMorals data. If we condition on ideology (which we saw was particularly skewed) and make statements like “Liberals generally score higher than conservatives on the Harm/Care and Fairness/Reciprocity foundations,” we are probably treading on safe ground.</p>
<p>In the next few posts, I will be revisiting the question of how to construct a representative picture from a convenience sample.</p>
<p>*Beyond the obvious sampling issues, there are a few other problems with directly comparing the measure of ideology in YourMorals with that in nationally representative samples. First, there is a mode difference that could account for some of the discrepancy (although certainly not all or even a very significant portion of it). Another (and more serious) difference between nationally representative samples and the YourMorals data is the choice of a seven point scale rather than a five point scale. Five point scales are used more regularly in telephone samples with the options being “Very Conservative,” “Conservative,” “Moderate,” “Liberal,” and “Very Liberal.” The YourMorals data includes options for “Slightly liberal” and “Slightly Conservative” as well as “Libertarian” and “other” categories. The 65 percent figure lumps all of the “liberals” together. If you believe that the “slightly liberal” respondents might have self-identified as “Moderate” given fewer options, the proportion turns out to be just over 50 percent.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yourmorals.org/blog/2010/07/having-your-cake-and-eating-it-too-representativeness-and-the-yourmorals-data/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

