Working with Lincoln Police Traffic Stop Data

R
data
cleaning
Importing and cleaning multi-sheet Excel files for analysis.

Playing around with traffic stop data from the Lincoln Police Department Open Data Portal. The data comes as an Excel file with multiple sheets—one per year—so the first task is getting everything into a single data frame without too much hassle.

Setup

library(tidyverse)
library(readxl)
library(lubridate)
library(here)

Reading Multi-Sheet Excel Files

The data lives in an Excel file with a separate sheet for each year. Rather than reading each sheet manually, I use excel_sheets() to grab all the sheet names, then map() to read them all at once and list_rbind() to stack them into one data frame. The names_to argument keeps track of which sheet each row came from.

path <- here("posts", "lpd-traffic", "data", "lpd-stops.xlsx")

data_raw <- path |> 
  excel_sheets() |> 
  set_names() |> 
  map(read_excel, path = path) |> 
  list_rbind(names_to = "year")

Cleaning Up Dates and Times

The raw date and time columns need some work before they’re useful. I’m adding a few derived variables: month as a label, a time-of-day category, and a cleaner time format.

data <- data_raw |>
  mutate(
    month = month(DATE, label = TRUE, abbr = FALSE),
    date = as.Date(DATE),
    time_of_day = case_when(
      hour(TIME) >= 5 & hour(TIME) < 12 ~ "Morning",
      hour(TIME) >= 12 & hour(TIME) < 17 ~ "Afternoon",
      hour(TIME) >= 17 & hour(TIME) < 21 ~ "Evening",
      TRUE ~ "Night"
    ),
    time = format(TIME, "%I:%M %p")
  )

glimpse(data)

Exporting a Subset for Teaching

I use a subset of this data for a workshop, so I pull out just the 2023 stops and save it as an SPSS file for students using JAMOVI or SPSS.

library(haven)

data_2023 <- data |> 
  filter(year == "2023")

write_sav(data_2023, here("posts", "lpd-traffic", "data", "lpd_2023.sav"))

Debugging Excel Import Issues

If you’re having trouble reading an Excel file, this chunk can help figure out what’s going wrong. It tries to read each sheet one at a time and reports any errors.

path <- here("posts", "lpd-traffic", "data", "lpd-stops.xlsx")

# Check the file exists
file.exists(path)

# List all sheets
sheet_names <- excel_sheets(path)
print(sheet_names)

# Try reading each sheet
for (sheet in sheet_names) {
  tryCatch({
    df <- read_excel(path, sheet = sheet)
    message("Read sheet: ", sheet, " (", nrow(df), " rows)")
  }, error = function(e) {
    message("Error reading sheet: ", sheet)
    message(e$message)
  })
}
Back to top