Visualizing College Campus Safety
May 2017
Group Project For Info201: Technical Foundations
About the Project:
In 2014, the Department of Education publicized its investigations into mishandled sexual assault cases in an effort to bring to light just how widespread the problem is.
Prospective and current college students, as well as their parents, should have easy access to information about their campus’s safety. To help prospective students make informed decisions about which colleges to attend, and to increase current students’ awareness of their college’s safety, we want to make it easier to understand both the reported cases and the ways in which reported data can be misleading.
The data will help viewers answer, at a glance:
- Which campuses have the most reported cases of sexual assault?
- What are the rates of reported sexual assault relative to each campus’s student body size?
- How long do various colleges tend to take to close reported cases?
- Which colleges have the worst track record of mishandling sexual assault cases?
These answers will help viewers develop a clear picture of their own safety and of how different colleges handle sexual assault cases.
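To illustrate why raw counts and size-adjusted rates can tell different stories, here is a minimal sketch. The project itself is written in R; Python is used here only to show the arithmetic. The school names, case records, enrollment figures, and the `cases_per_1000` helper are all hypothetical and are not taken from the project’s data.

```python
from collections import Counter

# Hypothetical sample data: one record per reported case, plus enrollment.
cases = [
    {"school": "College A"},
    {"school": "College A"},
    {"school": "College B"},
    {"school": "College A"},
]
enrollment = {"College A": 30000, "College B": 2000}


def cases_per_1000(cases, enrollment):
    """Return {school: (case_count, cases per 1,000 students)}."""
    counts = Counter(c["school"] for c in cases)
    return {
        school: (n, 1000 * n / enrollment[school])
        for school, n in counts.items()
        if enrollment.get(school)  # skip schools with missing or zero enrollment
    }


rates = cases_per_1000(cases, enrollment)

# Rank the same schools two ways: by raw count and by size-adjusted rate.
by_count = sorted(rates, key=lambda s: rates[s][0], reverse=True)
by_rate = sorted(rates, key=lambda s: rates[s][1], reverse=True)

print(by_count)  # ['College A', 'College B']
print(by_rate)   # ['College B', 'College A']
```

In this made-up example the larger school has more reported cases, but the smaller school has the higher rate per 1,000 students, which is the kind of difference the size-adjusted view is meant to surface.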

Process:
We’ve found three data sources that we plan to center this project on. The first is an API that contains information about mishandled, mostly unclosed, college campus sexual assault cases. This information was collected under Title IX by the Department of Education/Chronicle of Higher Education, and tells us which cases were opened, whether they were closed (and if so, when), which school each case came from, the state that school is in, and whether the school is public or private. The second is college safety data, also published by the Department of Education (in CSV format), which we will combine with the case data to get more information about each school (student population, for example). Finally, we will use the names of the schools and the Google Maps API to get the coordinates of each school in order to make accurate map plots.
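The combination step described above amounts to a join on school name. The sketch below shows one way that could look, in Python rather than the project’s R, purely as an illustration. The field names (`school`, `students`, `lat`, `lng`), the sample values, and the `normalize` and `combine` helpers are assumptions; the real API and CSV schemas are not shown in this document.

```python
def normalize(name):
    """Lowercase and collapse whitespace so names from different sources match."""
    return " ".join(name.lower().split())


def combine(cases, school_info, coordinates):
    """Left-join case records with per-school info and coordinates by school name."""
    info_by_name = {normalize(r["school"]): r for r in school_info}
    coords_by_name = {normalize(k): v for k, v in coordinates.items()}
    combined = []
    for case in cases:
        key = normalize(case["school"])
        info = info_by_name.get(key, {})
        lat, lng = coords_by_name.get(key, (None, None))
        # Keep every case; unmatched schools get None instead of being dropped.
        combined.append({
            **case,
            "students": info.get("students"),
            "lat": lat,
            "lng": lng,
        })
    return combined


# Hypothetical sample inputs standing in for the three sources.
cases = [
    {"school": "Example  University", "state": "WA", "public": True},
    {"school": "Sample College", "state": "OR", "public": False},
]
school_info = [{"school": "example university", "students": 12000}]
coordinates = {"Example University": (47.0, -122.0)}

merged = combine(cases, school_info, coordinates)
for row in merged:
    print(row)
```

Keeping unmatched cases with empty fields, rather than dropping them, makes it easy to see which school names failed to line up across the sources and need manual cleanup.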
We intend to use this information to build timeline-based and geographically organized views of the data, which we will present as an HTML page (using R Markdown and knitr). This will probably take a fair amount of data wrangling, since we need to combine data from three different sources into one cohesive dataset. Based on some preliminary data wrangling, we believe that we’ll need the following libraries in addition to dplyr:
- httr and jsonlite (to use the API data)
- lubridate (to make it easier to format Date data)
- plotly and ggplot2 (to make beautiful graphs!)
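As a rough picture of the wrangling these libraries support (parsing a JSON response, then working with dates), the following sketch computes how long each case has been open. It uses Python’s standard library instead of the R packages listed above, only for illustration. The JSON shape, the `opened` and `closed` field names, the `days_open` helper, and the `as_of` date are assumptions, not the actual API format.

```python
import json
from datetime import date

# Hypothetical API response body; the real field names may differ.
raw = """[
  {"school": "College A", "opened": "2014-05-01", "closed": "2015-05-01"},
  {"school": "College B", "opened": "2016-01-15", "closed": null}
]"""


def days_open(record, today):
    """Days from opening to closing, or to `today` if the case is still open."""
    opened = date.fromisoformat(record["opened"])
    closed = date.fromisoformat(record["closed"]) if record["closed"] else today
    return (closed - opened).days


records = json.loads(raw)
as_of = date(2017, 5, 1)  # assumed reference date for still-open cases
durations = {r["school"]: days_open(r, as_of) for r in records}
print(durations)  # {'College A': 365, 'College B': 472}
```

Treating still-open cases as running up to a reference date is one possible choice; another would be to report them separately from closed cases.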

Interactions:


Result of the Data:

Link to the project: https://anuto.shinyapps.io/info-201-final/