Data science project plan

Project submission

Author

Student ID:

Introduction

[Write your introduction here, explaining the context and importance of your chosen topic]

Working title

[Your project title here]

Data

[List and briefly describe the datasets you plan to use]

Research question

[State your main research question here]

Initial analysis

[Describe your planned analysis approach and include any preliminary data exploration]

The following code aggregates the collision data by hour of day and accident severity:

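library(dplyr) # provides mutate() and count()
# Extract the hour of day, then count collisions per hour and severity
collisions_hourly = collisions_2023 |>
  mutate(time = lubridate::hour(datetime)) |>
  count(time, accident_severity)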
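
To show what this aggregation produces, here is a minimal sketch that applies the same steps to a small invented data frame. The object names `collisions_toy` and `collisions_toy_hourly` and all the values in them are hypothetical. Only the column names `datetime` and `accident_severity` are taken from the code above, and the real `collisions_2023` object may differ in structure.

```r
library(dplyr)

# Hypothetical toy data standing in for collisions_2023 (values invented)
collisions_toy = data.frame(
  datetime = as.POSIXct(
    c("2023-01-01 08:15:00", "2023-01-01 08:45:00",
      "2023-01-02 08:05:00", "2023-01-02 17:30:00"),
    tz = "UTC"
  ),
  accident_severity = c("Slight", "Slight", "Serious", "Slight")
)

# Same steps as above: extract the hour, then count rows per hour and severity
collisions_toy_hourly = collisions_toy |>
  mutate(time = lubridate::hour(datetime)) |>
  count(time, accident_severity)

collisions_toy_hourly
```

The result has one row per combination of hour and severity, with the number of collisions in the column `n`. Combinations that never occur in the data are absent rather than recorded as zero.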

Initial data exploration

A visualisation of the data is shown below:

[Figure: plot of the first 9 of the dataset's 36 attributes]

[1] "Metropolitan Police" "Metropolitan Police" "Metropolitan Police"
[4] "Metropolitan Police" "Metropolitan Police" "Metropolitan Police"

Questions

[List any questions or challenges you anticipate]

Reproducibility

Notes on how to reproduce this analysis are provided in the code chunks above. The full code is available in the .qmd file.

You could include details on how you created the submitted zip file, e.g.:

I rendered this document to a PDF file with the following command:

quarto::quarto_render(
  "project-plan.qmd",
  output_format = "pdf",
  output_file = "project-plan.pdf"
)

I created a zip file with the files needed to reproduce this analysis with the following command:

zip(
  zipfile = "submission.zip",
  files = c("project-plan.qmd", "project-plan.pdf", "wy.gpkg")
)

References