I'm new to Pittsburgh and don't know much about the city, so I wanted to find out which areas are dangerous in order to avoid possible accidents. To answer my question, I used the Pittsburgh Police Arrest Data from the Western Pennsylvania Regional Data Center (WPRDC). The data gave me the locations where arrests happened and demographic information about the suspects.
I first plotted the data on a map to get the whole picture. As we can see, the areas with more red dots had more incidents.
Although the dots give us the precise location of each incident, and clicking on a dot shows the details of that incident, it's still not clear which area is the most dangerous. So I used a heat map to display the data. As we can see, incidents cluster in three areas: around downtown Pittsburgh, in East Liberty, and in the area between East Liberty and Wilkinsburg.
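To make the idea behind a heat map concrete, here is a minimal sketch of one way to turn individual points into density counts: divide the map into a grid and count how many points fall into each cell. This is not how the map above was produced. The function name `bin_points`, the cell size of 0.01 degrees, and the sample coordinates are all assumptions for illustration, not records from the WPRDC data.

```python
import math
from collections import Counter


def bin_points(points, cell_size=0.01):
    """Count how many (lat, lon) points fall into each square grid cell.

    cell_size is the side of a cell in degrees (an arbitrary choice here).
    """
    counts = Counter()
    for lat, lon in points:
        # Flooring the scaled coordinate gives an integer cell index.
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        counts[cell] += 1
    return counts


# Made-up coordinates for illustration only, not real arrest records.
sample_points = [
    (40.4410, -79.9960),
    (40.4415, -79.9965),
    (40.4418, -79.9962),
    (40.4610, -79.9250),
]

cell_counts = bin_points(sample_points)
hottest_cell, hottest_count = cell_counts.most_common(1)[0]
print(hottest_cell, hottest_count)
```

The cells with the highest counts are the ones a heat map would color most intensely. A smaller cell size gives finer detail but noisier counts.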
We can also use a bar chart to see the number of incidents in each area:
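The counts behind a bar chart like this can be computed by grouping the records by area. The sketch below shows one way to do that with the standard library. The field name `INCIDENTNEIGHBORHOOD` is an assumption about how the area is labeled in the export, the helper `arrests_per_area` is hypothetical, and the rows are made up for illustration.

```python
from collections import Counter


def arrests_per_area(rows, field="INCIDENTNEIGHBORHOOD"):
    """Count records per area, skipping rows where the area is missing."""
    return Counter(row[field] for row in rows if row.get(field))


# Made-up rows for illustration only, not real arrest records.
sample_rows = [
    {"INCIDENTNEIGHBORHOOD": "Area A"},
    {"INCIDENTNEIGHBORHOOD": "Area B"},
    {"INCIDENTNEIGHBORHOOD": "Area A"},
    {"INCIDENTNEIGHBORHOOD": ""},  # missing area, ignored
    {"INCIDENTNEIGHBORHOOD": "Area A"},
    {"INCIDENTNEIGHBORHOOD": "Area C"},
    {"INCIDENTNEIGHBORHOOD": "Area B"},
]

# most_common() returns (area, count) pairs sorted from highest to lowest.
ranked = arrests_per_area(sample_rows).most_common()

# Print a simple text bar chart, one row per area.
for area, n in ranked:
    print(f"{area:<10} {'#' * n} {n}")
```

Sorting by count before plotting makes the areas with the most arrests easy to spot at the top of the chart.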
It's quite possible that the area with the highest number of arrests is the most dangerous one, because the police detected and caught more criminals there.
I am also interested in who tends to be arrested. The pie charts show that the majority of arrested people are Black and that the majority are male. However, although this result is easy to read off, we cannot conclude that this group of people is more dangerous than others, because the race and gender of the suspects could be explained in two ways: 1. A race or gender category has more incidents because people in that category are more likely to commit crimes and be arrested by the police. 2. People in that category do not commit more crimes, but the police tend to see them as suspects, pay more attention to them, and therefore arrest more of them.
The data at hand can't tell us which possibility is more likely to be true. If we want to dig deeper into this question, we will need more data.
(Left: B = Black, W = White. Right: F = Female, M = Male.)
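Each slice of a pie chart is a category's share of all records. A minimal sketch of that calculation follows. The field names `RACE` and `GENDER` are assumptions about the column labels, the helper `shares` is hypothetical, and the rows are arbitrary placeholders that are deliberately balanced so the arithmetic is easy to check. They say nothing about the real data.

```python
from collections import Counter


def shares(rows, field):
    """Return each category's fraction of all rows for the given field."""
    counts = Counter(row[field] for row in rows)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


# Arbitrary placeholder rows, balanced on purpose; not real arrest records.
sample_rows = [
    {"RACE": "B", "GENDER": "M"},
    {"RACE": "B", "GENDER": "F"},
    {"RACE": "W", "GENDER": "M"},
    {"RACE": "W", "GENDER": "F"},
]

race_shares = shares(sample_rows, "RACE")
gender_shares = shares(sample_rows, "GENDER")
print(race_shares)
print(gender_shares)
```

Note that these shares describe who was arrested, not who committed crimes, so on their own they cannot separate the two explanations given above.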
As we can see above, the data has some limitations, so it tells us only half of the story. Do Black people commit more crimes, or do the police simply have a biased view of Black people? In theory, we could use a second dataset to check: there is another dataset that contains all reported incidents, whereas the arrest dataset contains only the cases that were confirmed as crimes and in which a person was arrested. We could compare the two datasets to see whether the numbers of white and Black people are the same among the reported cases while more Black people end up being arrested. However, the reported-incidents dataset contains no race data, so we still can't know which hypothesis is correct.
There is another limitation worth noting. Some might ask whether we can really use arrest data to claim that these areas are more dangerous, because some cases may go unreported and some arrests may have little to do with crime. However, because unreported incidents are by definition missing and hard to infer, the arrest data is, for now, still the most reliable source for finding the most dangerous places in the city. In addition, the cases that led to an arrest must be somewhat serious (more than, say, a car accident), since they were not only reported but also ended with a suspect being arrested. So the data is still fairly relevant to the question.