Plotting public data using R graphics libraries (MURDER!!!)

@todo This is a work in progress; it's only live because I am testing the chart links...

I was looking for population datasets, and while googling I found this post on the BBC Blogs website by Mark Easton, with this lead-in paragraph:
"On an average day in Britain, two or three people will be murdered. The UK currently has a homicide rate equivalent to the mid-Victorian period.
The prevalence of murder seems a reasonable proxy for the health or sickness of a society and this deteriorating picture of our islands perhaps tells us something about the profound problems of social cohesion."
It has some interesting observations, some visualizations, and some links to the "Centre for Crime & Justice Studies", which I have made a note (@todo) to go and mine for data and reportage at a future point.

However, I noticed this chart (which pops out to a larger version) showing the crude homicide count apparently almost exactly correlated with the homicide rate. (The chart, data and discussion are for England & Wales.)

Show the BBC chart

Given that it's generally understood that the population of England and Wales has increased over the time period of the chart, and that more recently it has grown at an increased rate, I would have expected the rate trend to rise slightly less steeply than the crude trend, with the gap between the two widening along the positive X axis.

However, the two match almost exactly, within reasonable expectations of plotting accuracy, point for point (or bar for point), which is what would be expected if (god forbid!) the BBC (or the CCJS) had made a mistake.

Obviously, if the change in the population denominator is of a much lower magnitude than the change in the crude homicide count, then the chart is exactly as would be expected.
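
To make that concrete, here is a minimal sketch of the arithmetic in R. Every number in it is invented for illustration (these are not the ONS or Home Office figures), and the variable names are my own. It indexes the count and the rate to the first year so that their shapes can be compared directly.

```r
# Hypothetical figures, invented purely to illustrate the arithmetic;
# they are NOT the ONS or Home Office numbers.
year       <- c(1970, 1980, 1990, 2000)
homicides  <- c(400, 550, 600, 850)          # crude count per year
population <- c(49e6, 49.6e6, 50.6e6, 52e6)  # mid-year population

# Rate per million population
rate <- homicides / population * 1e6

# Index each series to its first year so the shapes are comparable
count_index <- homicides / homicides[1]
rate_index  <- rate / rate[1]
pop_index   <- population / population[1]

# By construction, rate_index equals count_index / pop_index
print(data.frame(year,
                 count_index = round(count_index, 2),
                 pop_index   = round(pop_index, 2),
                 rate_index  = round(rate_index, 2)))
```

With these made-up numbers the count slightly more than doubles while the population grows by about 6%, so the indexed rate ends up only a little below the indexed count. On a chart, a difference of that size could easily disappear into the plotting accuracy.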

So I thought it would be worth trying to recreate the chart from the dataset, to test whether this is an artefact of the chart decisions or whether, as the chart indicates, the crude changes dominate the population denominator in the rate.

So a starting point was to use the ONS data to generate the crude population estimates covering the time-series for the homicide data. This fucking chart actually took me about 3 hours to draw. Fucking R. Fuckity fuck. ;-)

This is the plot for the estimated population of England and Wales from 1971 through to 2010 from the ONS population estimates here... @TODO
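
For anyone wanting to draw something similar, a plot like this can be sketched in base R roughly as follows. The data frame is a hypothetical stand-in with illustrative values only; the real ONS estimates would be read from the downloaded file instead, and the column names here are my own assumption.

```r
# Hypothetical stand-in for the ONS mid-year estimates. In practice this
# would be read from the downloaded file, e.g. with read.csv().
pop <- data.frame(
  year       = c(1971, 1981, 1991, 2001, 2010),
  population = c(49.2e6, 49.6e6, 50.7e6, 52.4e6, 55.2e6)  # illustrative values only
)

# Line plot of population in millions, with the y axis starting at zero
plot(pop$year, pop$population / 1e6,
     type = "l",
     xlab = "Year",
     ylab = "Estimated population (millions)",
     main = "England and Wales population (illustrative data)",
     ylim = c(0, max(pop$population) / 1e6))
```

Starting the y axis at zero is a deliberate choice here: a truncated axis would make the population growth look far more dramatic than it is relative to the total.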

Mark Easton's blog post was published in 2008, but the chart covers 1967 through to 2002. The figures appear to come from the dataset used in the supplementary volume to the statistical report linked in the references, though I have cited the 2010/11 version; presumably (??) the chart's figures are a subset of the same dataset.

So using R, I overlaid a line plot on a bar chart, treating each row as a bar. This appears to be a simplification, as the bin start and end dates appear to change slightly over the course of the collection. Anyway, my first pass is pretty close.
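
The overlay technique itself can be sketched in base R like this. It is not the exact script behind my chart: the data frame, its column names and all of its values are hypothetical, and rescaling the rate so that its maximum lines up with the tallest bar is just one possible way to share the plot between two units.

```r
# Hypothetical data: one row per recording period (illustrative values only).
hom <- data.frame(
  period     = c("1970", "1980", "1990", "2000"),
  homicides  = c(400, 550, 600, 850),
  population = c(49e6, 49.6e6, 50.6e6, 52e6)
)
hom$rate <- hom$homicides / hom$population * 1e6  # per million population

# Leave room on the right-hand side for a second axis
par(mar = c(5, 4, 4, 5))

# barplot() returns the x positions of the bar midpoints
mids <- barplot(hom$homicides,
                names.arg = hom$period,
                ylim = c(0, max(hom$homicides) * 1.1),
                xlab = "Period", ylab = "Homicides (count)")

# Rescale the rate so its maximum lines up with the tallest bar,
# then draw it over the bars at the bar midpoints
scale_factor <- max(hom$homicides) / max(hom$rate)
lines(mids, hom$rate * scale_factor, type = "b", pch = 19)

# Label the right-hand axis in the rate's own units
rate_ticks <- pretty(c(0, max(hom$rate)))
axis(4, at = rate_ticks * scale_factor, labels = rate_ticks)
mtext("Homicides per million population", side = 4, line = 3)
```

One thing to watch: aligning the two maxima like this forces the series to coincide at the peak, so the rescaling is itself one of the chart decisions that can make the count and the rate look more alike than they are.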

Homicides, Firearm Offences and Intimate Violence
2010/11: Supplementary Volume 2 to Crime in England and Wales 2010/11
Kevin Smith (Ed.), Sarah Osborne, Ivy Lau and Andrew Britton
