statistical analysis 32
• Length – Graphs, code and a maximum of 1000 words (equivalent to 1500 words)
• Due – Friday 13th January 2017
• Submission – Electronic, via TurnItIn
• Weight – 60% of final mark
In this assignment, you will take on the role of a data scientist for the UK government who has been
asked to investigate perceptions of crime in England and Wales. The particular issue of interest for the
government is the public’s fear of crime; that is, the extent to which people in England and Wales are
worried about being a victim of crime (e.g. burglary, mugging or physical attacks). This is the subject
of much debate within the media, and the government wishes to know both the extent of the issue
and its relationships with other factors.
Your task is to create a report which offers insight into this issue for government employees involved
in crime reduction. In order to present a compelling outline of the problem, you will have to create a
report which includes rigorous statistical analysis, presented in a way which is both attractive and
accessible to non-experts. Descriptive statistics and statistical tests should be used to support the
points that you make, but should be described so that the anticipated audience – policymakers within
government – will be able to understand the key principles involved.
Since some of the consumers of the research may be particularly data-savvy, and wish to interrogate
the analysis themselves, the report should be prepared with transparency and reproducibility in mind.
In particular, the code used to produce the analysis should be included so that it can be checked and
re-run if necessary. The report should therefore be delivered in the form of a Jupyter Notebook which
records and displays the analytical workflow.
Using data from the 2013/14 Crime Survey for England & Wales (CSEW), you should describe the levels
of worry in the population in whatever way you feel is most appropriate and informative. You should
also examine the relationships between other factors and worry about crime, using statistical tests
and modelling methods. In particular, you should examine the relationship with at least one variable
of each measurement type (nominal, ordinal and interval/ratio). Your report should include at least 3
graphics and at most 5, where each image counts as a single graphic (regardless of whether it is a
multi-panel plot, for example). These should be selected on the basis of their value to the overall
narrative of the report.
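As a rough illustration of matching the test to the measurement type, the sketch below pairs a nominal variable with worry using a Pearson chi-square statistic, and an interval/ratio variable with worry using Spearman's rank correlation (which would apply equally to an ordinal predictor). It is only one possible approach, not a prescribed method. The records, the variable names and the 1 to 4 worry coding are all invented for illustration and are not taken from the CSEW. The code uses only the Python standard library and stops short of p-values; in a notebook, a statistics package would normally supply those.

```python
from collections import Counter
from statistics import mean

# Toy records, NOT real CSEW data: (sex, age in years, worry score).
# Hypothetical coding: 1 = not at all worried ... 4 = very worried.
records = [
    ("F", 23, 3), ("M", 35, 1), ("F", 61, 4), ("M", 47, 2),
    ("F", 52, 3), ("M", 29, 1), ("F", 70, 4), ("M", 66, 3),
    ("F", 41, 2), ("M", 19, 1), ("F", 33, 2), ("M", 58, 2),
]


def chi_square(pairs):
    """Pearson chi-square statistic and degrees of freedom for two categorical variables."""
    pairs = list(pairs)
    n = len(pairs)
    row_totals = Counter(a for a, _ in pairs)
    col_totals = Counter(b for _, b in pairs)
    observed = Counter(pairs)
    stat = 0.0
    for r, r_tot in row_totals.items():
        for c, c_tot in col_totals.items():
            expected = r_tot * c_tot / n  # count expected under independence
            stat += (observed[(r, c)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the tie-averaged ranks.

    Undefined (division by zero) if either variable is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


sex = [r[0] for r in records]
age = [r[1] for r in records]
worry = [r[2] for r in records]

chi2, dof = chi_square(zip(sex, worry))  # nominal vs. ordinal worry
rho = spearman(age, worry)               # interval/ratio vs. ordinal worry
print(f"chi-square = {chi2:.2f} on {dof} degrees of freedom")
print(f"Spearman rho (age, worry) = {rho:.2f}")
```

With real survey data, sparse cells in the contingency table (small expected counts) would make the chi-square approximation unreliable, so categories may need to be merged before testing.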
Minimum requirements:
• Provide a brief introduction to the issue being analysed and motivate the analysis which
follows. This does not need to be extensive or detailed – it can be assumed that consumers of
the report will be broadly familiar with the context – but should set the scene for the report.
• Open the CSEW dataset and identify the variable(s) relevant to this issue.
• Present descriptive statistics for the level of worry about crime among survey respondents.
• Identify several contextual variables which might have relevance to the issue. These should
include at least one nominal, one ordinal and one interval/ratio variable. The choice of
variables should be informed by the context of the task – that is, there should be some
rationale for your choices, which should be given.
• Examine, using appropriate statistical methods, the relationships between these variables and
worry about crime.
• Document, using comments, the code used to perform your analysis, so that a reader can
understand what each section of code is doing.
• Produce at least 3 graphics illustrating aspects of your analysis.
• Describe and interpret your analysis, and reflect on its real-world consequences and
implications.
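For the descriptive-statistics requirement above, a minimal sketch of one way to summarise an ordinal worry variable is a frequency table with percentages, cumulative percentages and the median category. The response labels, the toy answers and the helper name describe_ordinal are assumptions made for illustration; the actual CSEW variable names and codings should be checked against the survey documentation.

```python
from collections import Counter

# Hypothetical ordinal scale, ordered from least to most worried.
# The real CSEW response labels may differ.
WORRY_LEVELS = [
    "Not at all worried",
    "Not very worried",
    "Fairly worried",
    "Very worried",
]

# Toy responses, NOT real CSEW data; None marks a missing answer.
responses = [
    "Not very worried", "Fairly worried", "Not at all worried",
    "Not very worried", None, "Very worried", "Fairly worried",
    "Not very worried", "Not at all worried", "Fairly worried",
    "Not very worried",
]


def describe_ordinal(values, levels):
    """Return (table, median_level) for an ordinal variable.

    Each table row is (level, count, percent, cumulative percent).
    Missing values (None) are dropped before percentages are computed.
    """
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    n = len(valid)
    table = []
    cumulative_count = 0
    median_level = None
    for level in levels:
        cumulative_count += counts[level]
        table.append((
            level,
            counts[level],
            round(100 * counts[level] / n, 1),
            round(100 * cumulative_count / n, 1),
        ))
        # The median category is the first level at which at least
        # half of the valid responses have been accumulated.
        if median_level is None and 2 * cumulative_count >= n:
            median_level = level
    return table, median_level


table, median_level = describe_ordinal(responses, WORRY_LEVELS)
for level, count, pct, cum_pct in table:
    print(f"{level:<20} {count:>3} {pct:>6.1f}% {cum_pct:>6.1f}%")
print("Median category:", median_level)
```

Reporting the median category rather than a mean avoids treating the gaps between response levels as equal, which an ordinal scale does not guarantee. Survey weights, which a national survey would normally require, are ignored here.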
Optional suggestions:
• Perform some aspects of the analysis on a disaggregated basis; i.e. identify subgroups which
may have relevance to the issue in question, and examine these groups separately.
• Identify and examine other potential relationships in the data; that is, rela