94842 Programming R for Analytics Final Project
Section I (14 points)
Suppose the District of Columbia has data on recent traffic stops by some Metropolitan Police Department officers. The data are in two data sets, available as StopsData11 and StopsData21. Using these data, answer the following questions.
1. Which ward has the most stops?
2. Estimate a model predicting whether or not a traffic stop results in a ticket, using age and sex as predictors. Based on the estimated model, how much more or less likely is a female driver to get a ticket than a male driver of the same age who is stopped?
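One possible way to approach questions 1 and 2 is sketched below. It runs on synthetic stand-in data, because the real structure of StopsData11 and StopsData21 is not described here: the column names `ward`, `age`, `sex`, and `ticket`, and the assumption that both data sets share the same columns, are hypothetical and should be checked against the actual data.

```r
# Synthetic stand-ins for StopsData11 and StopsData21.
# The column names (ward, age, sex, ticket) are assumptions, not the real schema.
set.seed(1)
make_stops <- function(n) {
  data.frame(
    ward   = sample(1:8, n, replace = TRUE),
    age    = sample(16:80, n, replace = TRUE),
    sex    = sample(c("Male", "Female"), n, replace = TRUE),
    ticket = rbinom(n, 1, 0.4)
  )
}
stops11 <- make_stops(500)
stops21 <- make_stops(500)

# Stack the two data sets, assuming they share the same columns.
stops <- rbind(stops11, stops21)

# Question 1: count stops per ward and pick the ward with the largest count.
ward_counts  <- table(stops$ward)
busiest_ward <- names(ward_counts)[which.max(ward_counts)]

# Question 2: logistic regression of ticket on age and sex,
# with "Male" as the reference level so the coefficient describes female drivers.
stops$sex <- relevel(factor(stops$sex), ref = "Male")
fit <- glm(ticket ~ age + sex, data = stops, family = binomial)

# exp(coefficient) is the odds ratio for female vs. male drivers at the same age.
female_odds_ratio <- exp(coef(fit)[["sexFemale"]])

# Predicted ticket probabilities for a male and a female driver of the same age.
new_drivers <- data.frame(
  age = 30,
  sex = factor(c("Male", "Female"), levels = levels(stops$sex))
)
predicted_prob <- predict(fit, newdata = new_drivers, type = "response")
```

Note that the odds ratio is not the same thing as a ratio of probabilities, so comparing the two predicted probabilities at a chosen age may be the clearer way to state "how much more or less likely."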
Section II (20 points)
Suppose the DC government enacted a policy that allows people with low incomes (less than $30,000 per year) to have their ticket amount reduced by 20 to 40 percent. The goal of the new policy is to reduce the number of overdue tickets among low-income residents.
In this section, you will use the same traffic stop data as in Section I (StopsData11 and StopsData21), as well as data on the court outcomes of tickets (CourtsData1) and data on residents’ income (IncomeData). Use the four data sets to answer the following questions.
3. Is there a large difference between the ticket amounts for people with incomes less than $30,000 and those with incomes greater than or equal to $30,000? (Tickets are given out in $0.50 increments.) Explain your answer in a paragraph.
4. You have been asked to determine whether the new policy has a detectable impact on residents’ likelihood of paying on time. Explain what approach you took for your analysis. What key assumptions (one to three) did you make?
5. In a paragraph, explain to your colleague, a data scientist, whether or not the new policy had a detectable impact on residents’ likelihood of having an overdue ticket. Assume that your colleague is generally familiar with the policy and the data.
6. The Chair of the Council’s Committee on the Judiciary and Public Safety, Councilmember Allen, asks you whether the data show that the policy is working and for a recommendation on whether the policy should be expanded to include residents with incomes less than $60,000. In a paragraph or two, provide your answer. Provide some indication of how confident you are in your response.
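As a starting point for this section, the four data sets need to be linked before income groups can be compared. The sketch below shows one way that could look, again on synthetic data. The join keys `person_id` and `ticket_id` and the columns `amount`, `income`, and `overdue` are hypothetical names chosen for illustration; the real data sets may be keyed differently. It is a simple descriptive comparison, not a full evaluation of the policy.

```r
# Synthetic stand-ins; all column names and join keys below are assumptions.
set.seed(2)
n <- 400
stops <- data.frame(
  ticket_id = 1:n,
  person_id = sample(1:200, n, replace = TRUE),
  amount    = sample(seq(20, 200, by = 0.5), n, replace = TRUE)  # $0.50 increments
)
income <- data.frame(person_id = 1:200, income = round(runif(200, 5000, 120000)))
courts <- data.frame(ticket_id = 1:n, overdue = rbinom(n, 1, 0.3))

# Link stops to income by person, then to court outcomes by ticket.
merged <- merge(merge(stops, income, by = "person_id"), courts, by = "ticket_id")

# Flag residents below the $30,000 policy threshold.
merged$low_income <- merged$income < 30000

# Compare mean ticket amounts and overdue rates between the two income groups.
mean_amount  <- tapply(merged$amount, merged$low_income, mean)
overdue_rate <- tapply(merged$overdue, merged$low_income, mean)

# Welch two-sample t-test on ticket amounts by income group.
amount_test <- t.test(amount ~ low_income, data = merged)
```

Whether a difference counts as "large" depends on its size in dollars as well as its p-value, so the group means are worth reporting alongside the test result.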
Section III (6 points)
A researcher presents a study of security cameras in the city. The researcher ran 40 separate experiments, one in each of 40 Police Service Areas (PSAs). Two of the PSAs showed significant declines in crime (p = 0.05), while the others showed no statistically significant effects.
7. Assess the evidence that the cameras reduced crime in those two PSAs with significant declines.
Weak evidence
Moderate evidence
Strong evidence
8. Explain your answer to the above question in a paragraph.
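The arithmetic behind running many separate tests can be sketched as follows. This is an illustration under stated assumptions, not a model answer: it treats 0.05 as the significance threshold, assumes the 40 experiments are independent, and asks what would happen if the cameras had no effect in any PSA.

```r
n_tests <- 40    # one experiment per PSA
alpha   <- 0.05  # assumed significance threshold

# Expected number of "significant" results if no camera had any effect.
expected_false_positives <- n_tests * alpha

# Chance of at least one, and of at least two, significant results under that assumption.
p_at_least_one <- 1 - (1 - alpha)^n_tests
p_at_least_two <- 1 - pbinom(1, size = n_tests, prob = alpha)

# Bonferroni-adjusted threshold that keeps the overall false-positive rate near alpha.
bonferroni_threshold <- alpha / n_tests
```

Under these assumptions, the expected number of false positives is 2, the chance of at least one significant result is about 0.87, and the chance of at least two is about 0.60. The Bonferroni-adjusted threshold is 0.00125.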