<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Data Journalism Blog</title>
	<atom:link href="http://www.datajournalismblog.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.datajournalismblog.com</link>
	<description>Get the latest news on data driven journalism with interviews, reviews and news features</description>
	<lastBuildDate>Tue, 22 May 2012 09:37:20 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Measuring Facebook&#8217;s reach</title>
		<link>http://www.datajournalismblog.com/2012/05/22/facebooks-reach/</link>
		<comments>http://www.datajournalismblog.com/2012/05/22/facebooks-reach/#comments</comments>
		<pubDate>Tue, 22 May 2012 09:37:05 +0000</pubDate>
		<dc:creator>Alistair-Walker</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.datajournalismblog.com/?p=1731</guid>
		<description><![CDATA[It seems you can&#8217;t leave the house (or, indeed, stay in the house) without fear of hearing something about Facebook these days. With the ongoing public offering, the &#8216;is it worth $100bn&#8217; debate and the steady stream of stories on social media trolls, the site has never been as ubiquitous as it is today. Nielsen marked Facebook&#8217;s $100bn flotation in New York this week by releasing their latest figures on Facebook&#8217;s reach &#8211; the percentage of Internet users who visited Facebook last year &#8211; in twelve major countries. All well and good, but I thought it would be a little more interesting to present the number of unique users as a percentage of population. The population numbers are current up to 2010 and taken from the World Bank&#8217;s figures, with the exception of Taiwan, which I got from the CIA World Factbook. Of course, there is the obligatory small print &#8211; presenting unique users as a percentage of the population doesn&#8217;t account for those who have multiple accounts, nor for the official accounts run by brands and companies, so the map can&#8217;t be taken as the exact share of the population that uses Facebook.]]></description>
				<content:encoded><![CDATA[<p>It seems you can&#8217;t leave the house (or, indeed, stay in the house) without fear of hearing something about Facebook these days. With the ongoing public offering, the &#8216;is it worth $100bn&#8217; debate and the steady stream of stories on social media trolls, the site has never been as ubiquitous as it is today. Nielsen marked Facebook&#8217;s $100bn flotation in New York this week by releasing their <a href="http://blog.nielsen.com/nielsenwire/global/global-and-social-facebooks-rise-around-the-world/">latest figures on Facebook&#8217;s reach</a> &#8211; the percentage of Internet users who visited Facebook last year &#8211; in twelve major countries.</p>
<p>All well and good, but I thought it would be a little more interesting to <a href="https://www.google.com/fusiontables/embedviz?gco_region=world&amp;gco_dataMode=regions&amp;containerId=gviz_canvas&amp;q=select+gvizcountry(col0)%2C+col1%2C+col0+from+1CoNx9yik_t_3j_nh6I4xkgk-rtTrY5ITjRk8hik&amp;qrs=+where+gvizcountry(col0)+%3E%3D+&amp;qre=+and+gvizcountry(col0)+%3C%3D+&amp;qe=+limit+12&amp;viz=GVIZ&amp;t=MAP&amp;width=500&amp;height=300">present the number of unique users as a percentage of population</a>. The population numbers are current up to 2010 and taken from the <a href="http://data.worldbank.org/indicator/SP.POP.TOTL">World Bank&#8217;s figures</a>, with the exception of Taiwan, which I got from the CIA World Factbook. Of course, there is the obligatory small print &#8211; presenting unique users as a percentage of the population doesn&#8217;t account for those who have multiple accounts, nor for the official accounts run by brands and companies, so the map can&#8217;t be taken as the exact share of the population that uses Facebook.</p>
<p style="text-align: center"><a href="https://www.google.com/fusiontables/embedviz?gco_region=world&amp;gco_dataMode=regions&amp;containerId=gviz_canvas&amp;q=select+gvizcountry(col0)%2C+col1%2C+col0+from+1CoNx9yik_t_3j_nh6I4xkgk-rtTrY5ITjRk8hik&amp;qrs=+where+gvizcountry(col0)+%3E%3D+&amp;qre=+and+gvizcountry(col0)+%3C%3D+&amp;qe=+limit+12&amp;viz=GVIZ&amp;t=MAP&amp;width=500&amp;height=300"><img class="size-medium wp-image-1735 aligncenter" src="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Facebook-map-300x108.jpg" alt="" width="300" height="108" /></a></p>
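<p>For anyone who wants to repeat the calculation, the sum behind the map is simply unique users divided by population, times 100. Here is a minimal Python sketch of it. The country names and figures are made-up placeholders, not the Nielsen or World Bank numbers, and the function names are hypothetical.</p>

```python
# Hypothetical figures for illustration only.
# These are NOT the Nielsen or World Bank numbers used for the map.
SAMPLE_FIGURES = {
    "Country A": {"unique_users": 12_500_000, "population": 50_000_000},
    "Country B": {"unique_users": 3_000_000, "population": 40_000_000},
}


def share_of_population(unique_users, population):
    """Return unique users as a percentage of the total population."""
    if population <= 0:
        raise ValueError("population must be a positive number")
    return 100.0 * unique_users / population


def rank_by_share(figures):
    """Return (country, percentage) pairs, highest percentage first."""
    shares = {
        country: share_of_population(row["unique_users"], row["population"])
        for country, row in figures.items()
    }
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for country, share in rank_by_share(SAMPLE_FIGURES):
        print(f"{country}: {share:.1f}% of population")
```

<p>The same small print applies to the output: multiple accounts and brand accounts inflate the unique user count, so the percentage is an upper-end estimate rather than an exact share.</p>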
]]></content:encoded>
			<wfw:commentRss>http://www.datajournalismblog.com/2012/05/22/facebooks-reach/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A useful video for data journalism students</title>
		<link>http://www.datajournalismblog.com/2012/05/08/useful-data-video/</link>
		<comments>http://www.datajournalismblog.com/2012/05/08/useful-data-video/#comments</comments>
		<pubDate>Tue, 08 May 2012 09:31:27 +0000</pubDate>
		<dc:creator>Lauren York</dc:creator>
				<category><![CDATA[Videos]]></category>
		<category><![CDATA[BBC]]></category>
		<category><![CDATA[CoJo]]></category>
		<category><![CDATA[College of Journalism]]></category>
		<category><![CDATA[data journalism]]></category>
		<category><![CDATA[Frontline Club]]></category>
		<category><![CDATA[Michael Blastland]]></category>
		<category><![CDATA[Simon Rogers]]></category>

		<guid isPermaLink="false">http://www.datajournalismblog.com/?p=1676</guid>
		<description><![CDATA[A short post this time, but I just wanted to share with you a video that I have found particularly useful. It&#8217;s from the BBC&#8217;s College of Journalism website, and is a recording of a Frontline Club discussion on the importance of data journalism. I know it&#8217;s from 2010, but a lot of the points are still relevant and it is an interesting dissection of why all this matters, and contains some good analyses of the skills and techniques you need to make it as a data journalist. I&#8217;ve included the bios of the speakers from the site below. - Simon Rogers, news editor (data) at the Guardian and editor of Guardian.co.uk&#8217;s Datablog, played a key role in turning some of the 90,000 documents given to Wikileaks into graphics and interactive charts. Read this fascinating article by Simon on how he did it. - David McCandless, writer, designer and author of Information is Beautiful, which &#8220;explores the potential of data visualisation as a new direction for journalists and storytelling&#8221;. - Julian Burgess, programmer and editorial developer at the Times, who talked about using data in a practical newsroom environment and how journalists can add a real-time dimension to their work. - Michael Blastland, journalist and creator of BBC Radio 4&#8217;s More or Less programme. Michael spoke about how to use official sources and data and make sure you&#8217;re getting the real story behind the figures.]]></description>
				<content:encoded><![CDATA[
<p>A short post this time, but I just wanted to share with you a video that I have found particularly useful. It&#8217;s from the BBC&#8217;s College of Journalism website, and is a recording of a Frontline Club discussion on the importance of data journalism.</p>
<p>I know it&#8217;s from 2010, but a lot of the points are still relevant and it is an interesting dissection of why all this matters, and contains some good analyses of the skills and techniques you need to make it as a data journalist. I&#8217;ve included the bios of the speakers from the site below.</p>
<p><iframe width="620" height="349" src="http://viddler.com/embed/ded27591" frameborder="0" allowfullscreen></iframe></p>
<div>- Simon Rogers, news editor (data) at the <em>Guardian</em> and editor of Guardian.co.uk&#8217;s Datablog, played a key role in turning some of the 90,000 documents given to Wikileaks into graphics and interactive charts. <a href="http://www.journalism.co.uk/5/articles/540109.php">Read this fascinating article</a> by Simon on how he did it.</div>
<div>- David McCandless, writer, designer and author of <em><a href="http://www.informationisbeautiful.net/2009/the-visual-miscellaneum/">Information is Beautiful</a></em>, which &#8220;explores the potential of data visualisation as a new direction for journalists and storytelling&#8221;.</div>
<div>- Julian Burgess, programmer and editorial developer at the <em>Times</em>, who talked about using data in a practical newsroom environment and how journalists can add a real-time dimension to their work.</div>
<div>- <a href="http://www.bbc.co.uk/journalism/blog/michael-blastland/">Michael Blastland</a>, journalist and creator of BBC Radio 4&#8217;s <em>More or Less</em> programme. Michael spoke about how to use official sources and data and make sure you&#8217;re getting the real story behind the figures.</div>
]]></content:encoded>
			<wfw:commentRss>http://www.datajournalismblog.com/2012/05/08/useful-data-video/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The year in numbers: Jubilee and Olympic data for 2012</title>
		<link>http://www.datajournalismblog.com/2012/05/08/the-year-in-numbers/</link>
		<comments>http://www.datajournalismblog.com/2012/05/08/the-year-in-numbers/#comments</comments>
		<pubDate>Tue, 08 May 2012 09:06:30 +0000</pubDate>
		<dc:creator>Lauren York</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[2012]]></category>
		<category><![CDATA[jubilee]]></category>
		<category><![CDATA[london]]></category>
		<category><![CDATA[olympics]]></category>
		<category><![CDATA[queen]]></category>
		<category><![CDATA[work experience]]></category>
		<category><![CDATA[year in numbers]]></category>

		<guid isPermaLink="false">http://www.datajournalismblog.com/?p=1672</guid>
		<description><![CDATA[2012 is a big year for Britain, and particularly for London, as the city swings into gear for two enormous celebrations this summer. When exchanging notes with students from my course as we all came back from a month&#8217;s work experience, we realised that we had all been working on the same project for different papers. The occasions are, of course, the London Olympics and the Queen&#8217;s Jubilee, and the task that almost everyone was assigned was &#8216;a year in numbers&#8217;: finding interesting numbers to represent both events. In retrospect it was probably a good thing that we didn&#8217;t confer at the time, as several papers could have ended up publishing very similar statistics! The most difficult task that I have seen, and the one that made me pity the hapless work experience student the most, was to mark the 100-day countdown to the Games with an Olympic-themed fact for each number between 1 and 100. This is deceptively difficult and involves a huge amount of work. In honour of the hours spent on the projects, I thought I would catalogue some of the best examples of 2012 in numbers. Some news organisations chose to stick with straight lists of numbers, like the Telegraph&#8217;s take on the opening and closing ceremonies. Others made the article more visual by designing static infographics. This example from Time Out is one of my favourites. The Independent added another layer of interactivity by accompanying their numbers with a slideshow. The Guardian and the BBC both compiled videos; the Guardian&#8217;s in particular is very smooth. Finally the Telegraph, as you would expect, is leading the way on the Jubilee data, but I have it on good authority that quite a few similar articles will be appearing in national newspapers over the coming weeks!]]></description>
				<content:encoded><![CDATA[
<p>2012 is a big year for Britain, and particularly for London, as the city swings into gear for two enormous celebrations this summer.</p>
<p>When exchanging notes with students from my course as we all came back from a month&#8217;s work experience, we realised that we had all been working on the same project for different papers.</p>
<p>The occasions are, of course, the London Olympics and the Queen&#8217;s Jubilee, and the task that almost everyone was assigned was &#8216;a year in numbers&#8217;: finding interesting numbers to represent both events. In retrospect it was probably a good thing that we didn&#8217;t confer at the time, as several papers could have ended up publishing very similar statistics!</p>
<p>The most difficult task that I have seen, and the one that made me pity the hapless work experience student the most, was to mark the 100-day countdown to the Games with an Olympic-themed fact for each number between 1 and 100. This is deceptively difficult and involves a huge amount of work.</p>
<p>In honour of the hours spent on the projects, I thought I would catalogue some of the best examples of 2012 in numbers.</p>
<p>Some news organisations chose to stick with straight lists of numbers, like the Telegraph&#8217;s take on the opening and closing ceremonies.</p>
<p>Others made the article more visual by designing static infographics. This example from Time Out is one of my favourites.</p>
<p><a href="http://www.datajournalismblog.com/wp-content/uploads/2012/05/time-out-infographic-olympics.jpg"><img class="alignnone size-full wp-image-1714" src="http://www.datajournalismblog.com/wp-content/uploads/2012/05/time-out-infographic-olympics.jpg" alt="" width="812" height="1738" /></a></p>
<p>The Independent added another layer of interactivity by accompanying their numbers with a <a href="http://www.independent.co.uk/sport/olympics/the-london-2012-olympics-in-numbers-2326316.html?action=gallery&amp;ino=2" target="_blank">slideshow</a>.</p>
<p>The <a href="http://www.guardian.co.uk/sport/video/2012/jan/03/olympics-2012-numbers-animation" target="_blank">Guardian</a> and the <a href="http://www.bbc.co.uk/news/uk-17747643" target="_blank">BBC</a> both compiled videos; the Guardian&#8217;s in particular is very smooth.</p>
<p>Finally the Telegraph, as you would expect, is leading the way on the <a href="http://www.telegraph.co.uk/news/uknews/the_queens_diamond_jubilee/9057637/Queens-Diamond-Jubilee-The-Queen-in-Numbers-Part-1.html" target="_blank">Jubilee data</a>, but I have it on good authority that quite a few similar articles will be appearing in national newspapers over the coming weeks!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datajournalismblog.com/2012/05/08/the-year-in-numbers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Surveys: some basic rules to avoid bias</title>
		<link>http://www.datajournalismblog.com/2012/05/08/survey-rules/</link>
		<comments>http://www.datajournalismblog.com/2012/05/08/survey-rules/#comments</comments>
		<pubDate>Tue, 08 May 2012 08:42:52 +0000</pubDate>
		<dc:creator>Lauren York</dc:creator>
				<category><![CDATA[Data skills]]></category>
		<category><![CDATA[bias]]></category>
		<category><![CDATA[data gathering]]></category>
		<category><![CDATA[hints]]></category>
		<category><![CDATA[quiz]]></category>
		<category><![CDATA[rules]]></category>
		<category><![CDATA[Survey]]></category>
		<category><![CDATA[tips]]></category>

		<guid isPermaLink="false">http://www.datajournalismblog.com/?p=1659</guid>
		<description><![CDATA[&#160; Surveys are useful tools to gather information, but when done wrong they can lead to terrible datasets. The video below was played to us at one of our lectures, and is a handy reminder of why surveys need to avoid leading their respondents down a particular path… With this in mind, I thought I&#8217;d follow on from what I learnt from conducting my own survey and put together a list of some basic dos and don&#8217;ts to keep in mind when writing a poll or a survey. Rule 1: Don&#8217;t make assumptions about your respondent. Example: When you are driving, do you ever feel sleepy? Rule 2: Don&#8217;t use one question when you should use two. Example: Do you think that coffee is stimulating and delicious? Rule 3: Don&#8217;t use technical terms that won&#8217;t be understood. Example:  When you log on to the internet, do you use a fibre-optic or copper wire connection? Rule 4: Don&#8217;t use biased language Example: Do you think that footballers are greedy and stupid? Rule 5: Don&#8217;t make your questions too vague Example: Do you like nice things? Rule 6: Do make your questions clear and easy to understand Example: Did you enjoy the play? rather than Did you find that the emotional resonances of the play were agreeable? Rule 7: Do use scales rather than Yes/No when appropriate Example: How often do you use the library&#8217;s wifi? Never/Sometimes/Often Rule 8: Do give the respondents a range of options and an &#8216;other&#8217; choice with a text box as well Example: How useful did you find the talk? Very useful/quite useful/useful/not very useful/not useful Rule 9: Do make your answer options mutually exclusive Example: 20-29, 30-39 rather than 20-30, 30-40 Rule 10: Do make sure that your answers match your questions Example: Should cannabis be legalised? -Yes/No rather than -Always/Often/Sometimes/Never]]></description>
				<content:encoded><![CDATA[
<p>Surveys are useful tools to gather information, but when done wrong they can lead to terrible datasets. The video below was played to us at one of our lectures, and is a handy reminder of why surveys need to avoid leading their respondents down a particular path…</p>
<p><iframe width="500" height="375" src="http://www.youtube.com/embed/G0ZZJXw4MTA?feature=oembed" frameborder="0" allowfullscreen></iframe></p>
<p>With this in mind, I thought I&#8217;d follow on from what I learnt from conducting my own survey and put together a list of some basic dos and don&#8217;ts to keep in mind when writing a poll or a survey.</p>
<p><strong>Rule 1:</strong> Don&#8217;t make assumptions about your respondent.</p>
<p>Example: When you are driving, do you ever feel sleepy?</p>
<p><strong>Rule 2:</strong> Don&#8217;t use one question when you should use two.</p>
<p>Example: Do you think that coffee is stimulating and delicious?</p>
<p><strong>Rule 3:</strong> Don&#8217;t use technical terms that won&#8217;t be understood.</p>
<p>Example: When you log on to the internet, do you use a fibre-optic or copper wire connection?</p>
<p><strong>Rule 4:</strong> Don&#8217;t use biased language</p>
<p>Example: Do you think that footballers are greedy and stupid?</p>
<p><strong>Rule 5:</strong> Don&#8217;t make your questions too vague</p>
<p>Example: Do you like nice things?</p>
<p><strong>Rule 6:</strong> Do make your questions clear and easy to understand</p>
<p>Example: Did you enjoy the play? rather than Did you find that the emotional resonances of the play were agreeable?</p>
<p><strong>Rule 7:</strong> Do use scales rather than Yes/No when appropriate</p>
<p>Example: How often do you use the library&#8217;s wifi? Never/Sometimes/Often</p>
<p><strong>Rule 8:</strong> Do give the respondents a range of options and an &#8216;other&#8217; choice with a text box as well</p>
<p>Example: How useful did you find the talk? Very useful/quite useful/useful/not very useful/not useful</p>
<p><strong>Rule 9:</strong> Do make your answer options mutually exclusive</p>
<p>Example: 20-29, 30-39 rather than 20-30, 30-40</p>
<p><strong>Rule 10:</strong> Do make sure that your answers match your questions</p>
<p>Example: Should cannabis be legalised? -Yes/No rather than -Always/Often/Sometimes/Never</p>
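<p>Rule 9 is one that can be checked automatically before a survey goes out. The following is a minimal Python sketch, assuming each answer option is a whole-number band written as an inclusive (low, high) pair. The function name and the band format are hypothetical choices for illustration, not part of any survey tool.</p>

```python
def find_overlaps(bands):
    """Return neighbouring bands that share at least one value.

    Each band is an inclusive (low, high) pair, so (20, 30) and (30, 40)
    overlap because a respondent aged 30 fits both.
    """
    ordered = sorted(bands)
    overlaps = []
    # After sorting by lower bound, any overlap in the set shows up
    # between at least one pair of neighbouring bands.
    for first, second in zip(ordered, ordered[1:]):
        if second[0] <= first[1]:
            overlaps.append((first, second))
    return overlaps


if __name__ == "__main__":
    good = [(20, 29), (30, 39)]
    bad = [(20, 30), (30, 40)]
    print("20-29, 30-39 overlaps:", find_overlaps(good))
    print("20-30, 30-40 overlaps:", find_overlaps(bad))
```

<p>An empty result means the options are mutually exclusive. Note that this sketch does not check for gaps between bands, which is a separate problem.</p>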
]]></content:encoded>
			<wfw:commentRss>http://www.datajournalismblog.com/2012/05/08/survey-rules/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Writing a survey: the successes and failures of a first attempt</title>
		<link>http://www.datajournalismblog.com/2012/05/08/writing-a-survey/</link>
		<comments>http://www.datajournalismblog.com/2012/05/08/writing-a-survey/#comments</comments>
		<pubDate>Tue, 08 May 2012 08:40:49 +0000</pubDate>
		<dc:creator>Lauren York</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[bias]]></category>
		<category><![CDATA[colour scheme]]></category>
		<category><![CDATA[keys]]></category>
		<category><![CDATA[leading question]]></category>
		<category><![CDATA[pie charts]]></category>
		<category><![CDATA[pimp my prospects]]></category>
		<category><![CDATA[Survey]]></category>

		<guid isPermaLink="false">http://www.datajournalismblog.com/?p=1657</guid>
		<description><![CDATA[As part of our course we put together an online blog that addresses the needs of a particular community. The one that I was involved in works on the premise that if you put good career advice on the internet, it is accessible to all rather than limited to a few privileged people. We publish hints and tips from career veterans as well as graduates who are just starting out in the hope that it will be a resource for people who are ambitious, but don’t have access to people working in the professions that they are interested in. As the website is purely aimed at young people, it was important that we got their feedback on what we were doing. Our strategy for this was two-pronged; we arranged a meeting with a class of young adults and discussed the site with them, and put a survey online for people to fill out. The survey was my responsibility, and the first that I had written by myself. In retrospect, although we had a good response and we were able to learn some important things about what we had succeeded in and what we needed to include, there were some things about the survey that I was unhappy with so I thought I would blog about what I did right and what I got wrong. Example 1: Two questions that overlapped This is a classic example of what happens when you write a quiz with an agenda. I was so interested in finding out how well we were serving the community that I totally failed to notice that two of my questions covered very similar territory. Solution: These needed to either be made more distinctive, or one of them should have been removed. Example 2: Vague questions In my determination not to ask leading questions, I left some of them vague. While the upside of this is that it allowed for some diverse and surprising answers, the downside was that I felt the survey lacked a bit of structure. 
Solution: The question could probably have been clearer, and I could have kept the blank answer box so that there was still more room for original answers than a tickbox allows. Example 3: Results Part of the work I did was to analyse the results and blog about them. The positive thing that came out of the survey was that we were able to respond to the requests we received for some clearly explained advice on interviews, CVs and cover letters. This allowed us to listen to the needs of the online community and directly serve them. The one thing that I wasn’t happy about, though, was my visualisations of some of the data. Solution: If I went back to the post, I would change the colour scheme on this pie chart (the other one was fine) so that the segments stand out more. I would also get rid of the key in the corner, as it adds nothing new and simply repeats the information already...]]></description>
				<content:encoded><![CDATA[
<p>As part of our course we put together an online blog that addresses the needs of a particular community. The one that I was involved in works on the premise that if you put good career advice on the internet, it is accessible to all rather than limited to a few privileged people. We publish hints and tips from career veterans as well as graduates who are just starting out in the hope that it will be a resource for people who are ambitious, but don’t have access to people working in the professions that they are interested in.</p>
<p>As the website is purely aimed at young people, it was important that we got their feedback on what we were doing. Our strategy for this was two-pronged; we arranged a meeting with a class of young adults and discussed the site with them, and put a survey online for people to fill out.</p>
<p>The survey was my responsibility, and the first that I had written by myself. In retrospect, although we had a good response and we were able to learn some important things about what we had succeeded in and what we needed to include, there were some things about the survey that I was unhappy with so I thought I would blog about what I did right and what I got wrong.</p>
<p><a href="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Screen-Shot-2012-05-08-at-15.32.40.png"><img class="alignnone size-full wp-image-1718" src="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Screen-Shot-2012-05-08-at-15.32.40.png" alt="" width="813" height="308" /></a></p>
<p><strong>Example 1:</strong> Two questions that overlapped</p>
<p>This is a classic example of what happens when you write a quiz with an agenda. I was so interested in finding out how well we were serving the community that I totally failed to notice that two of my questions covered very similar territory.</p>
<p><strong>Solution:</strong> These needed to either be made more distinctive, or one of them should have been removed.</p>
<p><a href="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Screen-Shot-2012-05-08-at-15.35.13.png"><img class="alignnone size-full wp-image-1717" src="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Screen-Shot-2012-05-08-at-15.35.13.png" alt="" width="583" height="179" /></a></p>
<p><strong>Example 2:</strong> Vague questions</p>
<p>In my determination not to ask leading questions, I left some of them vague. While the upside of this is that it allowed for some diverse and surprising answers, the downside was that I felt the survey lacked a bit of structure.</p>
<p><strong>Solution:</strong> The question could probably have been clearer, and I could have kept the blank answer box so that there was still more room for original answers than a tickbox allows.</p>
<p><a href="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Referrals.png"><img class="alignnone size-full wp-image-1719" src="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Referrals.png" alt="" width="729" height="390" /></a></p>
<p><strong>Example 3:</strong> Results</p>
<p>Part of the work I did was to analyse the results and blog about them. The positive thing that came out of the survey was that we were able to respond to the requests we received for some clearly explained advice on interviews, CVs and cover letters. This allowed us to listen to the needs of the online community and directly serve them. The one thing that I wasn’t happy about, though, was my visualisations of some of the data.</p>
<p><strong>Solution:</strong> If I went back to the post, I would change the colour scheme on this pie chart (the other one was fine) so that the segments stand out more. I would also get rid of the key in the corner, as it adds nothing new and simply repeats the information already found on the pie chart labels.</p>
<p>For my next blog post, I thought I would follow on from this one and put together a how-to guide to surveys.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datajournalismblog.com/2012/05/08/writing-a-survey/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Sex trafficking: a story of data gone wrong</title>
		<link>http://www.datajournalismblog.com/2012/05/08/sextraffickingdata/</link>
		<comments>http://www.datajournalismblog.com/2012/05/08/sextraffickingdata/#comments</comments>
		<pubDate>Tue, 08 May 2012 08:38:00 +0000</pubDate>
		<dc:creator>Lauren York</dc:creator>
				<category><![CDATA[Infographics]]></category>
		<category><![CDATA[Dr Brooke Magnanti]]></category>
		<category><![CDATA[infographic]]></category>
		<category><![CDATA[moral panic]]></category>
		<category><![CDATA[Natalie Rothschild]]></category>
		<category><![CDATA[prostitutes]]></category>
		<category><![CDATA[sex trafficking]]></category>
		<category><![CDATA[the sex myth]]></category>

		<guid isPermaLink="false">http://www.datajournalismblog.com/?p=1655</guid>
		<description><![CDATA[One of the coursework pieces we have been set for our Journalism and Society module is about the moral panic surrounding sex trafficking. It struck me that the topic is a model example of what can happen when recycling data goes wrong. It is also the subject of one of the chapters of a new book ‘The Sex Myth’ by Dr Brooke Magnanti, otherwise known as the high-class call girl Belle de Jour. Dr Magnanti, who is a research scientist as well as a onetime call girl (although her research subject is in children’s health), has set out to dispel some ‘myths’ about the sex trade. Her book, subtitled ‘Why Everything We’re Told is Wrong’, takes a different ‘myth’ in each chapter and attempts to blow them out of the water. In chapter seven she takes issue with the idea that thousands of girls are trafficked against their will to be sex workers in the UK. She blames women’s magazines like Glamour, which in May 2010 published an article called ‘Sex Slave in Suburbia’ claiming that 500,000 women are trafficked in the EU for sex, without offering any source for the figure. Magnanti’s book is interesting, pertinent, and passionately argued. However, she lets herself down by including statistical errors. For example, she cites a Keele University study but gets the number of people participating in the study wrong. When your main gripe is people who use incorrect figures to back up spurious arguments, you have to be extra careful with your data. She is not the only one to take issue with the numbers. Nick Davies, writing for the Guardian, has also criticised the overinflated statistics surrounding the ‘moral panic’ of sex trafficking, and Spiked contributor Natalie Rothschild is positively fuming in numerous articles where she points out the dangers of policies based on inflated numbers. I thought it would be interesting to make my own infographic, taking a look at what happened in this situation, and get a sense of the danger of recycled data. 
This is a great example of what can happen when people assume other people’s figures are correct and quote them, or an exaggerated version of them, as fact. This is a story of data done badly. This shows what can happen when people use incorrect statistics. The problem is particularly acute when MPs or newspapers repeat incorrect statistics, as these then go down as matters of record. The fact is that there are victims of trafficking in the UK, and the first thing we can do to get a handle on the problem is to get our facts right.]]></description>
				<content:encoded><![CDATA[
<p>One of the coursework pieces we have been set for our Journalism and Society module is about the moral panic surrounding sex trafficking. It struck me that the topic is a model example of what can happen when recycling data goes wrong.</p>
<p>It is also the subject of one of the chapters of a new book ‘The Sex Myth’ by Dr Brooke Magnanti, otherwise known as the high-class call girl Belle de Jour.</p>
<p>Dr Magnanti, who is a research scientist as well as a onetime call girl (although her research subject is in children’s health), has set out to dispel some ‘myths’ about the sex trade. Her book, subtitled ‘Why Everything We’re Told is Wrong’, takes a different ‘myth’ in each chapter and attempts to blow them out of the water.</p>
<p>In chapter seven she takes issue with the idea that thousands of girls are trafficked against their will to be sex workers in the UK. She blames women’s magazines like Glamour, who in May 2010 published an article called ‘Sex Slave in Suburbia’ claiming that 500,000 women are trafficked in the EU for sex, without offering any source for the figure.</p>
<p>Magnanti’s book is interesting, pertinent, and passionately argued. However, she lets herself down by including statistical errors, for example she cites a Keele University study, but gets the amount of people participating in the study wrong.  When your main gripe is people who use incorrect figures to back up spurious arguments, you have to be extra careful with your data.</p>
<p>She is not the only one to take issue with the numbers, Nick Davies writing for the Guardian has also criticised the overinflated statistics surrounding the ‘moral panic’ of sex trafficking, and Spiked contributor Natalie Rothschild is positively fuming in numerous articles where she points out the dangers of policies based on inflated numbers.</p>
<p>I thought it would be interesting to make my own infographic, taking a look at what happened in this situation, and get a sense of the danger of recycled data. This is a great example of what can happen when people assume other people’s figures are correct and quote them, or an exaggerated version of them as fact. This is a story of data done badly.</p>
<p><a href="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Incorrect-statistics.jpg"><img class="alignnone  wp-image-1706" src="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Incorrect-statistics.jpg" alt="" width="607" height="911" /></a></p>
<p>This shows what can happen when people use incorrect statistics. The problem is particularly acute when MPs or newspapers repeat incorrect statistics, as these then go down as matters of record. The fact is that there are victims of trafficking in the UK, and the first thing we can do to get a handle on the problem is to get our facts right.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datajournalismblog.com/2012/05/08/sextraffickingdata/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Smartphone theft: the story behind the data</title>
		<link>http://www.datajournalismblog.com/2012/05/08/smartphonetheft/</link>
		<comments>http://www.datajournalismblog.com/2012/05/08/smartphonetheft/#comments</comments>
		<pubDate>Tue, 08 May 2012 08:34:00 +0000</pubDate>
		<dc:creator>Lauren York</dc:creator>
				<category><![CDATA[The life-cycle of a data story]]></category>
		<category><![CDATA[400%]]></category>
		<category><![CDATA[CBBC]]></category>
		<category><![CDATA[Guardian]]></category>
		<category><![CDATA[Islington]]></category>
		<category><![CDATA[mobile phone theft]]></category>
		<category><![CDATA[smartphone]]></category>
		<category><![CDATA[snatching]]></category>

		<guid isPermaLink="false">http://www.datajournalismblog.com/?p=1652</guid>
		<description><![CDATA[&#160; In this post I want to look at how data can get recycled once it is published. Specifically, I want to look at how a piece of data that I had gathered through a Freedom of Information request made its way into a national newspaper and broadcaster. Way back in November, I was working on stories about Barnsbury ward in Islington as part of my patch portfolio work for my MA at City University. The brief was to really get to know a small area of London, and use it as a place to make contacts and gather stories from your set patch. As part of this project I arranged a meeting with my local Community Support Officer, who, as someone who regularly patrolled the area, was able to give me a unique insight into the character of the ward. One of the things that stood out in the meeting was that the ward, despite having a police station bang in the middle of it, was subject to a very high level of mobile phone theft, a pattern that I was assured was repeated across the borough. With this information, I decided to submit a Freedom of Information request and received an acknowledgement of my request on November 15: Dear Miss York Freedom of Information Request Reference No: 2011110002133 I write in connection with your request for information which was received by the Metropolitan Police Service (MPS) on 14/11/2011.  I note you seek access to the following information: &#8220; -Number of mobile phone thefts in Islington borough by year from 2005. any additional breakdown of the information, eg by month or the location of the incidents would also be helpful, although please do not include this if it will go over the limit. &#8221; Your request will now be considered in accordance with the Freedom of Information Act 2000 (the Act).  You will receive a response within the statutory timescale of 20 working days as defined by the Act, subject to the information not being exempt or containing a reference to a third party. 
&#160; My instinct was that there was a story here, and sure enough within a few weeks I had heard back from the police, who sent me the relevant figures. Although I received all the figures between 2005 and 2011, I have put the 2010-2011 figures below: &#160; DECISION I have today decided to disclose the located information to you in full.  Please find attached information pursuant to your request above. 2010 ROBBERY &#8211; 370         SNATCH &#8211; 157         PICKPOCKET &#8211; 752         OTHER THEFT &#8211; 1527 2011 ROBBERY &#8211; 486         SNATCH &#8211; 786         PICKPOCKET &#8211; 689         OTHER THEFT – 1617 The number that really jumped out at me was the increase in snatches between 2010 and 2011. In order to understand this it was back to the police station, this time for...]]></description>
				<content:encoded><![CDATA[<p>&nbsp;</p>
<p>In this post I want to look at how data can get recycled once it is published. Specifically, I want to look at how a piece of data that I had gathered through a Freedom of Information request made its way into a national newspaper and broadcaster.</p>
<p>Way back in November, I was working on stories about Barnsbury ward in Islington as part of my patch portfolio work for my MA at City University. The brief was to really get to know a small area of London, and use it as a place to make contacts and gather stories from your set patch.</p>
<p>As part of this project I arranged a meeting with my local Community Support Officer, who, as someone who regularly patrolled the area, was able to give me a unique insight into the character of the ward.</p>
<p>One of the things that stood out in the meeting was that the ward, despite having a police station bang in the middle of it, was subject to a very high level of mobile phone theft, a pattern that I was assured was repeated across the borough.</p>
<p>With this information, I decided to submit a Freedom of Information request and received an acknowledgement of my request on November 15:</p>
<p>Dear Miss York</p>
<p><strong>Freedom of Information Request Reference No: 2011110002133</strong><br />
I write in connection with your request for information which was received by the Metropolitan Police Service (MPS) on 14/11/2011.  I note you seek access to the following information:</p>
<p><em>&#8220; -Number of mobile phone thefts in Islington borough by year from 2005. any additional breakdown of the information, eg by month or the location of the incidents would also be helpful, although please do not include this if it will go over the limit. &#8221;</em></p>
<p>Your request will now be considered in accordance with the Freedom of Information Act 2000 (the Act).  You will receive a response within the statutory timescale of 20 working days as defined by the Act, subject to the information not being exempt or containing a reference to a third party.</p>
<p>&nbsp;</p>
<p>My instinct was that there was a story here, and sure enough within a few weeks I had heard back from the police, who sent me the relevant figures. Although I received all the figures between 2005 and 2011, I have put the 2010-2011 figures below:</p>
<p>&nbsp;</p>
<p><strong>DECISION</strong></p>
<p>I have today decided to disclose the located information to you in full.  Please find attached information pursuant to your request above.</p>
<p><strong>2010</strong></p>
<p>ROBBERY &#8211; 370         SNATCH &#8211; 157         PICKPOCKET &#8211; 752         OTHER THEFT &#8211; 1527</p>
<p><strong>2011</strong></p>
<p>ROBBERY &#8211; 486         SNATCH &#8211; 786         PICKPOCKET &#8211; 689         OTHER THEFT – 1617</p>
<p>The number that really jumped out at me was the increase in snatches between 2010 and 2011. In order to understand this it was back to the police station, this time for an interview with a detective. He explained that the increase in snatches was down to a new technique, in which gangs on bicycles coasted down streets looking for a blissfully unaware smartphone user holding their mobile out in front of their face, following GPS instructions, checking their email or simply taking a photo. All the criminals needed to do was reach out, snatch the phone from the victim’s hand and ride off at speed, making it virtually impossible for the victim to catch up, even if they did react quickly (although more often than not they stand looking confused for several seconds until the reality of the theft sinks in).</p>
<p>The percentage change from 157 to 786 was 400.637%. So now I had my story, and published it online with the strapline ‘400% rise in mobile phone snatching’.</p>
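<p>For anyone who wants to check the sum, here is a minimal sketch of the calculation in Python. The function name <code>percentage_change</code> is my own label for illustration; the two figures are the snatch counts from the FOI response above.</p>

```python
def percentage_change(old, new):
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100


# Snatch figures for Islington, taken from the FOI response quoted above.
snatches_2010 = 157
snatches_2011 = 786

change = percentage_change(snatches_2010, snatches_2011)
print(f"{change:.3f}%")  # 400.637%
```

<p>The rise is 629 snatches on a base of 157, which is why the result is just over 400%.</p>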
<p>Several months later, a colleague from City was writing an article for a national newspaper on the subject, and needed some figures. Suddenly the Islington example was on the map again, as the 400% increase figure made it into the Guardian.</p>
<p><a href="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Guardian.png"><img class="alignnone size-full wp-image-1700" src="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Guardian.png" alt="" width="503" height="128" /></a></p>
<p>Before I knew it, the BBC had published an article on mobile phone theft. And what was the example used? Yup, you guessed it, Islington, snatching, 400%.</p>
<p><a href="http://www.datajournalismblog.com/wp-content/uploads/2012/05/CBBC.png"><img class="alignnone size-full wp-image-1701" src="http://www.datajournalismblog.com/wp-content/uploads/2012/05/CBBC.png" alt="" width="652" height="487" /></a></p>
<p>The figure was accurate and informative, so it was recycled as an example several times. There is nothing wrong with that, as my original data was accurate. But in my next post I want to look at the dangers of this phenomenon, or what happens when recycling goes wrong.</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datajournalismblog.com/2012/05/08/smartphonetheft/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to: change the icons on fusion table maps</title>
		<link>http://www.datajournalismblog.com/2012/05/08/fusionmapicons/</link>
		<comments>http://www.datajournalismblog.com/2012/05/08/fusionmapicons/#comments</comments>
		<pubDate>Tue, 08 May 2012 08:31:02 +0000</pubDate>
		<dc:creator>Lauren York</dc:creator>
				<category><![CDATA[Data skills]]></category>
		<category><![CDATA[Google Fusion]]></category>
		<category><![CDATA[google fusion maps]]></category>
		<category><![CDATA[how to guide]]></category>
		<category><![CDATA[icons]]></category>

		<guid isPermaLink="false">http://www.datajournalismblog.com/?p=1649</guid>
		<description><![CDATA[&#160; Maps made with Google Fusion are an incredibly useful tool in the newsroom, but they don’t have to all look the same. I wanted to see how you can make yours a bit more distinctive. One of the main variables that you can change is the icon that you use to pinpoint something on the map. For example, you can use traditional markers like these: Or you can experiment with slightly more obscure ones like these: Here’s a quick how to guide to changing the icons on your Google map. First, choose the symbol(s) that you want to use by following the link below and looking at your options: http://bit.ly/fusionmapicons In order to apply the icon to your map, press File, then Add New Column. Call the new column ‘icon’. In the column, copy and paste the icon name from the map marker page. You can find the icon’s individual name by clicking directly on the one that you want, and looking at the text next to where it says ‘icon name’. Then, when you are configuring the map style, choose ‘column’, and select the ‘icon’ column from the dropdown menu. This should apply the icons that you have chosen to the rows of data in your fusion table, giving you personalised entries on the map. So that’s how you do it, but when would this ever be useful? Let’s look at some examples. &#160; Name: airports Possible use: Perhaps an Icelandic volcano decides to erupt and grounds most of the flights coming in to and leaving from UK airports. Your news organisation wants to give their readers up to date information on what the situation is at the airports near them. A traditional marker would also work, but this shows instantly that the symbol represents an airport and lends the map some visual interest. Name: factory Possible use: The government decide to decommission 40% of the remaining nuclear power plants. You want to show which ones will be affected by the plans, and which will remain running unchanged. 
One of these markers could represent each power plant, allowing the reader to see whether their area will be altered. Name: placemark_circle_highlight Possible use: The dark red of this would be more appropriate to map a serious incident, for example an outbreak of rioting, than the pastelly coral red that the normal markers are made with. Name: 10_blue Possible use: It might be useful to have numbers in the actual icon rather than just in the pop-up text box, as this could give a reader an instant sense of scale, for example, by ranking something from one to ten, or showing the number of incidents. There are plenty more to experiment with, and while some can look gimmicky, others really can lend an extra dimension to your mapping work.]]></description>
				<content:encoded><![CDATA[<p>&nbsp;</p>
<p>Maps made with Google Fusion are an incredibly useful tool in the newsroom, but they don’t have to all look the same. I wanted to see how you can make yours a bit more distinctive. One of the main variables that you can change is the icon that you use to pinpoint something on the map.</p>
<p>For example, you can use traditional markers like these:</p>
<p><a href="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Screen-Shot-2012-05-08-at-15.40.20.png"><img class="alignnone size-full wp-image-1727" src="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Screen-Shot-2012-05-08-at-15.40.20.png" alt="" width="212" height="61" /></a></p>
<p>Or you can experiment with slightly more obscure ones like these:</p>
<p><a href="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Screen-Shot-2012-05-08-at-15.41.10.png"><img class="alignnone size-full wp-image-1726" src="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Screen-Shot-2012-05-08-at-15.41.10.png" alt="" width="304" height="191" /></a></p>
<p>Here’s a quick how-to guide to changing the icons on your Google map.</p>
<p>First, choose the symbol(s) that you want to use by following the link below and looking at your options:</p>
<p><a href="http://bit.ly/fusionmapicons" target="_blank">http://bit.ly/fusionmapicons</a></p>
<p>In order to apply the icon to your map, press File, then Add New Column. Call the new column ‘icon’.</p>
<p>In the column, copy and paste the icon name from the map marker page. You can find the icon’s individual name by clicking directly on the one that you want, and looking at the text next to where it says ‘icon name’.</p>
<p>Then, when you are configuring the map style, choose ‘column’, and select the ‘icon’ column from the dropdown menu.</p>
<p>This should apply the icons that you have chosen to the rows of data in your fusion table, giving you personalised entries on the map.</p>
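<p>If your table starts life as a spreadsheet or CSV file, you could also fill in the ‘icon’ column before uploading it. The sketch below is one way to do that in Python. The rows, the ‘status’ column and the status-to-icon mapping are all hypothetical, made up purely for illustration; only the icon names themselves come from the map marker page.</p>

```python
import csv
import io

# Which icon name to use for each status. The mapping is an assumption;
# the icon names are the ones shown on the map marker page.
ICON_FOR_STATUS = {
    "open": "airports",
    "closed": "placemark_circle_highlight",
}

# Hypothetical rows, for illustration only.
rows = [
    {"name": "Airport A", "location": "Example Town", "status": "open"},
    {"name": "Airport B", "location": "Example City", "status": "closed"},
]


def add_icon_column(rows):
    """Return copies of the rows with an 'icon' value chosen from 'status'."""
    return [dict(row, icon=ICON_FOR_STATUS[row["status"]]) for row in rows]


def to_csv(rows):
    """Serialise the rows as CSV text with a header line."""
    buffer = io.StringIO()
    fieldnames = ["name", "location", "status", "icon"]
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(to_csv(add_icon_column(rows)))
```

<p>The resulting file already has its ‘icon’ column filled in, so after importing it you would only need to point the map style at that column as described above.</p>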
<p>So that’s how you do it, but when would this ever be useful? Let’s look at some examples.</p>
<p>&nbsp;</p>
<p><a href="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Screen-Shot-2012-05-08-at-15.41.37.png"><img class="alignleft size-full wp-image-1725" src="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Screen-Shot-2012-05-08-at-15.41.37.png" alt="" width="115" height="77" /></a><strong>Name</strong>: airports</p>
<p><strong>Possible use</strong>: Perhaps an Icelandic volcano decides to erupt and grounds most of the flights coming into and leaving from UK airports. Your news organisation wants to give its readers up-to-date information on what the situation is at the airports near them. A traditional marker would also work, but this shows instantly that the symbol represents an airport and lends the map some visual interest.</p>
<p><a href="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Screen-Shot-2012-05-08-at-15.41.56.png"><img class="alignleft size-full wp-image-1724" src="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Screen-Shot-2012-05-08-at-15.41.56.png" alt="" width="103" height="79" /></a><strong>Name</strong>: factory</p>
<p><strong>Possible use:</strong> The government decide to decommission 40% of the remaining nuclear power plants. You want to show which ones will be affected by the plans, and which will remain running unchanged. One of these markers could represent each power plant, allowing the reader to see whether their area will be altered.</p>
<p><a href="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Screen-Shot-2012-05-08-at-15.42.30.png"><img class="alignleft size-full wp-image-1722" src="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Screen-Shot-2012-05-08-at-15.42.30.png" alt="" width="85" height="71" /></a><strong>Name</strong>: placemark_circle_highlight</p>
<p><strong>Possible use:</strong> The dark red of this would be more appropriate to map a serious incident, for example an outbreak of rioting, than the pastelly coral red that the normal markers are made with.</p>
<p><a href="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Screen-Shot-2012-05-08-at-15.43.09.png"><img class="alignleft size-full wp-image-1723" src="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Screen-Shot-2012-05-08-at-15.43.09.png" alt="" width="106" height="82" /></a><strong>Name</strong>: 10_blue</p>
<p><strong>Possible use</strong>: It might be useful to have numbers in the actual icon rather than just in the pop-up text box, as this could give a reader an instant sense of scale, for example, by ranking something from one to ten, or showing the number of incidents.</p>
<p>There are plenty more to experiment with, and while some can look gimmicky, others really can lend an extra dimension to your mapping work.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datajournalismblog.com/2012/05/08/fusionmapicons/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Mapping the English Defence League</title>
		<link>http://www.datajournalismblog.com/2012/05/08/edlmap/</link>
		<comments>http://www.datajournalismblog.com/2012/05/08/edlmap/#comments</comments>
		<pubDate>Tue, 08 May 2012 08:28:35 +0000</pubDate>
		<dc:creator>Lauren York</dc:creator>
				<category><![CDATA[Interactive]]></category>
		<category><![CDATA[EDL]]></category>
		<category><![CDATA[English Defence League]]></category>
		<category><![CDATA[Google Fusion]]></category>
		<category><![CDATA[maps]]></category>

		<guid isPermaLink="false">http://www.datajournalismblog.com/?p=1646</guid>
		<description><![CDATA[&#160; In January I was asked to do a map using Google Fusion Tables for the newspaper that I was doing work experience at. The topic of the article was the English Defence League, and my job was to map all the EDL marches that had taken place over the last few years. It was left up to me to decide what data to include in the map, and in this post I want to take you through the decision process. Step 1. I googled EDL marches. A fairly obvious first move. A Wikipedia article came up with a list of EDL marches. Three years of academic work at university have drummed into me that I must never take Wikipedia as a source in itself, but the article did usefully have a list of referenced newspaper articles about the marches. &#160; Step 2. The next part was time-consuming. It involved reading each article to look for the pertinent facts, and then cross-checking the data against other articles using Factiva. Step 3. Another Factiva trawl, this time to see if I had left any major marches off the list. Step 4. I set up the fusion table and started inputting the data. The problem was that I had much more information about some of the marches than others. I needed to decide what information I should focus on without being too vague or having such a broad spectrum that only a few entries would be complete. I decided on: Location (self-explanatory for a Google map) Date (quite important to get a sense of whether the marches are getting bigger or smaller over time) Then I tried to give, where possible, the number of people marching for the EDL and any interesting facts about whether there were arrests or clashes with the police. One of the things that struck me when I was doing the research was that whenever there was an EDL march, there was almost always a counter-march as well, so I thought it was only fair to put the numbers of counter-marchers in where I possibly could. Quite often the counter-marches were actually bigger than the EDL ones. 
The data inputting took less time than the fact checking had, and by the end of a long day the map was ready. Not totally without glitches, as the width of the page rejected the map at first, and  we had to play around with it at length to try and get the map embedded. Finally though, it was finished, and I think illustrated the article better than the factbox that accompanied it in the paper copy.]]></description>
				<content:encoded><![CDATA[<p>&nbsp;</p>
<p>In January I was asked to do a map using Google Fusion Tables for the newspaper that I was doing work experience at.</p>
<p>The topic of the article was the English Defence League, and my job was to map all the EDL marches that had taken place over the last few years.</p>
<p>It was left up to me to decide what data to include in the map, and in this post I want to take you through the decision process.</p>
<p>Step 1. I googled EDL marches. A fairly obvious first move. A Wikipedia article came up with a list of EDL marches. Three years of academic work at university have drummed into me that I must never take Wikipedia as a source in itself, but the article did usefully have a list of referenced newspaper articles about the marches.</p>
<p><a href="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Wikipedia1.png"><img class="alignnone size-full wp-image-1695" src="http://www.datajournalismblog.com/wp-content/uploads/2012/05/Wikipedia1.png" alt="" width="1004" height="146" /></a></p>
<p>&nbsp;</p>
<p>Step 2. The next part was time-consuming. It involved reading each article to look for the pertinent facts, and then cross-checking the data against other articles using Factiva.</p>
<p>Step 3. Another Factiva trawl, this time to see if I had left any major marches off the list.</p>
<p>Step 4. I set up the fusion table and started inputting the data. The problem was that I had much more information about some of the marches than others. I needed to decide what information I should focus on without being too vague or having such a broad spectrum that only a few entries would be complete. I decided on:</p>
<p>Location (self-explanatory for a Google map)</p>
<p>Date (quite important to get a sense of whether the marches are getting bigger or smaller over time)</p>
<p>Then I tried to give, where possible, the number of people marching for the EDL and any interesting facts about whether there were arrests or clashes with the police. One of the things that struck me when I was doing the research was that whenever there was an EDL march, there was almost always a counter-march as well, so I thought it was only fair to put the numbers of counter-marchers in where I possibly could. Quite often the counter-marches were actually bigger than the EDL ones.</p>
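<p>To make the structure concrete, here is a rough sketch in Python of how one row of such a table could be modelled: location and date are always required, and everything else is optional. The class name, field names and the sample entry are placeholders of my own, not the real data or the actual column names from the map.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MarchRecord:
    """One row of the map table. Only location and date are required."""
    location: str
    date: str  # ISO format, e.g. "2011-01-01"
    edl_marchers: Optional[int] = None      # left empty when no figure was reported
    counter_marchers: Optional[int] = None  # left empty when no figure was reported
    notes: str = ""                         # arrests, clashes with police and so on


def summary(record):
    """Build the pop-up text, skipping any field that was never filled in."""
    parts = [f"{record.location}, {record.date}"]
    if record.edl_marchers is not None:
        parts.append(f"EDL marchers: {record.edl_marchers}")
    if record.counter_marchers is not None:
        parts.append(f"Counter-marchers: {record.counter_marchers}")
    if record.notes:
        parts.append(record.notes)
    return "; ".join(parts)


# Placeholder entry with made-up values, for illustration only.
example = MarchRecord(location="Town A", date="2011-01-01", counter_marchers=150)
print(summary(example))  # Town A, 2011-01-01; Counter-marchers: 150
```

<p>Keeping the optional fields empty rather than guessing means an incomplete entry still appears on the map, just with less detail.</p>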
<p>The data inputting took less time than the fact checking had, and by the end of a long day the map was ready. It was not totally without glitches: the page width rejected the map at first, and we had to play around with it at length to get the map embedded.</p>
<p><a href="http://www.datajournalismblog.com/wp-content/uploads/2012/05/EDL-map.png"><img class="alignnone size-full wp-image-1696" src="http://www.datajournalismblog.com/wp-content/uploads/2012/05/EDL-map.png" alt="" width="764" height="315" /></a></p>
<p>Finally though, it was finished, and I think it illustrated the <a href="http://www.independent.co.uk/news/uk/politics/edl-to-spread-its-farright-creed-to-new-target-towns-6289162.html">article</a> better than the factbox that accompanied it in the paper copy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datajournalismblog.com/2012/05/08/edlmap/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>When names become data: what your baby&#8217;s name says about your royal aspirations</title>
		<link>http://www.datajournalismblog.com/2012/05/08/babynames/</link>
		<comments>http://www.datajournalismblog.com/2012/05/08/babynames/#comments</comments>
		<pubDate>Tue, 08 May 2012 08:26:04 +0000</pubDate>
		<dc:creator>Lauren York</dc:creator>
				<category><![CDATA[Amusing Data]]></category>
		<category><![CDATA[2011]]></category>
		<category><![CDATA[baby names]]></category>
		<category><![CDATA[Harry]]></category>
		<category><![CDATA[Isabelle]]></category>
		<category><![CDATA[royal family]]></category>
		<category><![CDATA[William]]></category>

		<guid isPermaLink="false">http://www.datajournalismblog.com/?p=1644</guid>
		<description><![CDATA[&#160; Over the Christmas holidays I was doing some work experience at a national paper. As it was December, people were throwing around ideas to try and decide how to mark the passing year. One article proposal was to try and see what names had been the most popular in 2011. The newspaper is the place where many people announce and record the births of their children. The plan was to tally all the names from the birth announcements and see if there were any particular names that were popular with the readers this year. As this was the first time that this had been attempted, there would be no previous years to compare the data with, but if the concept was successful, it was something that could be repeated every year and patterns could be established. Of course this was not an exhaustive list, but limited to the readers of that particular paper, who had chosen to register their child’s birth that year. It is possible to get data on the most popular baby names nationwide, but this was a project limited to this particular demographic. &#160; &#160; I was drafted in and given copies of all the birth announcements. Of course this was not a straightforward task. For one thing, the records were hard copy rather than digital. As well as this, there was no database or spreadsheet that contained the information. Hard copies it was. The first thing that I decided to do was to limit this to first names and ignore middle names unless they were hyphenated to the first name. Then I decided to produce two different data sets, one for boys’ names and one for girls’ names, as the two are not directly comparable. (You wouldn’t be considering the name James for a girl. Or maybe you would…I don’t know how your mind works…) Within two excel spreadsheets, I then produced a different column for each letter of the alphabet. Then it was time for the hard work to start. 
The paper had received over 1,330 birth announcements in that year alone, and each one was buried between deaths and engagements. Fortunately they did have their own clearly marked subheading, so it wasn’t too hard to spot them, but occasionally there weren’t any birth announcements, and every few weeks they had larger features on newborns, which also had to be spotted and included. The autofill function on Excel became both a blessing and a curse, filling in the most common names for me and saving me typing time, but also assuming that I meant to type things that I didn’t, meaning that I had to be constantly alert. When all the names had been inputted, I ordered the columns alphabetically, scanned for the most frequently occurring by eye (they stood out fairly easily) and then selected the cells with the names in to get a count for each high frequency name. I was fairly sure that there was a whizzy automated way of doing...]]></description>
				<content:encoded><![CDATA[<p>&nbsp;</p>
<p>Over the Christmas holidays I was doing some work experience at a national paper. As it was December, people were throwing around ideas to try and decide how to mark the passing year. One article proposal was to try and see what names had been the most popular in 2011.</p>
<p>The newspaper is the place where many people announce and record the births of their children. The plan was to tally all the names from the birth announcements and see if there were any particular names that were popular with the readers this year.</p>
<p>As this was the first time that this had been attempted, there would be no previous years to compare the data with, but if the concept was successful, it was something that could be repeated every year and patterns could be established.</p>
<p>Of course this was not an exhaustive list, but limited to the readers of that particular paper, who had chosen to register their child’s birth that year. It is possible to get data on the most popular baby names nationwide, but this was a project limited to this particular demographic.</p>
<p>&nbsp;</p>
<p><a href="http://www.datajournalismblog.com/wp-content/uploads/2012/05/baby-names.png"><img class="alignnone size-full wp-image-1711" src="http://www.datajournalismblog.com/wp-content/uploads/2012/05/baby-names.png" alt="" width="763" height="567" /></a></p>
<p>&nbsp;</p>
<p>I was drafted in and given copies of all the birth announcements. Of course this was not a straightforward task. For one thing, the records were hard copy rather than digital. As well as this, there was no database or spreadsheet that contained the information. Hard copies it was.</p>
<p>The first thing that I decided to do was to limit this to first names and ignore middle names unless they were hyphenated to the first name. Then I decided to produce two different data sets, one for boys’ names and one for girls’ names, as the two are not directly comparable. (You wouldn’t be considering the name James for a girl. Or maybe you would…I don’t know how your mind works…)</p>
<p>Within two Excel spreadsheets, I then produced a different column for each letter of the alphabet. Then it was time for the hard work to start. The paper had received over 1,330 birth announcements in that year alone, and each one was buried between deaths and engagements.</p>
<p>Fortunately they did have their own clearly marked subheading, so it wasn’t too hard to spot them, but occasionally there weren’t any birth announcements, and every few weeks they had larger features on newborns, which also had to be spotted and included.</p>
<p>The autofill function on Excel became both a blessing and a curse, filling in the most common names for me and saving me typing time, but also assuming that I meant to type things that I didn’t, meaning that I had to be constantly alert. When all the names had been inputted, I ordered the columns alphabetically, scanned for the most frequently occurring by eye (they stood out fairly easily) and then selected the cells with the names in to get a count for each high frequency name.</p>
<p>I was fairly sure that there was a whizzy automated way of doing this at the time, but just couldn’t get my head around a solution, and as time was tight before the deadline, this had to do. I would certainly be open to ideas to make the process more efficient.</p>
<p>The article turned out to be royal themed, as William and Harry topped the list. Surprisingly, Isabella made the top of the girls’ list, although we had a debate over whether to include variants like Isabelle, Isobel and Isabel along with the name. (They did get put together in the end, but with a proviso that it was ‘variants of Isabelle’. Max and Maximillian also got lumped together.)</p>
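<p>One possible automated route, sketched here in Python rather than Excel, is to type the names into a single list and let <code>collections.Counter</code> do the tallying. The variant mapping and the sample names below are assumptions for illustration, not the real announcement data.</p>

```python
from collections import Counter

# Spelling variants grouped under one headline name. This mapping is an
# assumption based on the groupings described above.
VARIANTS = {
    "Isabella": "Isabelle",
    "Isobel": "Isabelle",
    "Isabel": "Isabelle",
    "Maximillian": "Max",
}


def tally(names):
    """Count names, tidying stray spaces and capitalisation and merging variants."""
    cleaned = (name.strip().title() for name in names if name.strip())
    return Counter(VARIANTS.get(name, name) for name in cleaned)


# Hypothetical sample, for illustration only.
girls = ["Isabelle", "isabella ", "Isobel", "Olivia", "Olivia", "Isabel"]
print(tally(girls).most_common(2))  # [('Isabelle', 4), ('Olivia', 2)]
```

<p>With this approach there is no need to sort the columns or count by eye, and the decision about which variants to lump together is written down in one place.</p>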
<p>So, names as data? Well they can tell an interesting story about the hopes and aspirations of a demographic of readers. What you choose to call your children is data in its own right. And just like your life expectancy and your house price, it can say a lot about you. It certainly made an interesting study.</p>
<p>Please do comment below if you know how I could have improved my system. I might be able to pass the information on to the next work experience girl to take on the task!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datajournalismblog.com/2012/05/08/babynames/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
