Coursework submission 1: Data science project plan and reproducible code

This is a formative (non-assessed but required) submission that will help you develop your final coursework. The deadline is 28th February 2025, 13:59.

What to Submit

Submit a .zip file containing two key items:

  1. A concise PDF document (recommended length: 2 pages, absolute maximum: 5 pages) outlining:
    • Your chosen transport-related topic
    • The main dataset(s) you plan to use
    • Your research question
    • At least 2 academic references (see the Quarto Citation Guide for details)
    • Any initial analysis or questions you have
  2. Reproducible code as a .qmd file showing how you accessed and processed your data
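
As a rough illustration of item 2, a minimal .qmd file might look like the sketch below. The title, the file name references.bib, the citation key example_key and the chunk label are all hypothetical placeholders, and the chunk assumes the pct package is installed; it simply lists the region names from the pct::pct_regions data mentioned later in this brief. Treat it as a starting point under those assumptions, not a required template.

````markdown
---
title: "Coursework submission 1: project plan"   # hypothetical title
format: pdf
bibliography: references.bib                     # hypothetical file holding your references
---

## Topic, dataset and research question

A sentence citing a source with Quarto's citation syntax [@example_key].

```{r}
#| label: get-data
# Illustrative only: list the region names available in the pct package.
# Assumes pct is installed, e.g. via install.packages("pct").
region_names = pct::pct_regions$region_name
head(region_names)
```

## References
````

Rendering this with `quarto render` should produce the PDF, provided a LaTeX installation is available and example_key is replaced by a key that exists in your .bib file. Leaving out a csl option keeps Quarto's default referencing style.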

Key Requirements

  • Maximum .zip file size: 30 MB
  • Submit via Turnitin
  • AI tools can be used in an assistive role (must be acknowledged)
  • Use the default Quarto referencing style
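
Before uploading, the file size limit above can be checked from R. The sketch below uses a hypothetical helper, zip_size_ok(), and assumes 1 MB means 1,000,000 bytes (the stricter reading of "30 MB"); the throwaway file stands in for your own .zip.

```r
# Hypothetical helper: TRUE if the file at `path` is within the size limit.
# Assumption: 1 MB = 1,000,000 bytes (the stricter reading of "30 MB").
zip_size_ok = function(path, limit_mb = 30) {
  stopifnot(file.exists(path))
  file.size(path) <= limit_mb * 1e6
}

# Demonstration on a throwaway file standing in for your .zip
demo_file = tempfile(fileext = ".zip")
writeBin(raw(1000), demo_file)   # write 1000 zero bytes
zip_size_ok(demo_file)
```

To check a real submission, pass the path to your own .zip file instead of demo_file.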

Topics and Datasets

Some suggested areas include:

  • Road safety analysis
  • Infrastructure and travel behaviour
  • Traffic congestion patterns
  • Public transport accessibility
  • Active travel infrastructure
  • Transport equity studies
  • Other transport-related topics are encouraged

Specific examples could include:

  • What is the relationship between travel behaviour (e.g. as manifested in origin-destination data represented as desire lines, routes and route networks) and road traffic casualties in a transport region (e.g. London, West Midlands and other regions in the pct::pct_regions$region_name data)?

  • Analysis of a large transport dataset, e.g. https://www.nature.com/articles/sdata201889

  • Infrastructure and travel behaviour

    • What are the relationships between specific types of infrastructure and travel, e.g. between fast roads and walking?
    • How do official sources of infrastructure data (e.g. the CID) compare with crowd-sourced datasets such as OpenStreetMap (which can be accessed with the new osmextract R package)?
    • Using new data sources to support transport planning, e.g. using data from https://telraam.net/ or https://dataforgood.facebook.com/dfg/tools/high-resolution-population-density-maps
  • Changing transport systems

    • Modelling change in transport systems, e.g. by comparing before/after data for different countries/cities: which countries had the hardest lockdowns, and where have changes been longer term? See https://github.com/ActiveConclusion/COVID19_mobility for open data.
    • How have movement patterns changed during the Coronavirus pandemic, and what impact is that likely to have in the long term? (see here for some graphics on this)
  • Software / web development

    • Creating a package to make a particular data source more accessible, see https://github.com/ropensci/stats19 and https://github.com/elipousson/crashapi examples
    • Development of a data dashboard, e.g. using Quarto Dashboards
    • Development of a web app, e.g. using the shiny package
  • Road safety - how can we make roads and transport systems in general safer?

    • Influence of Road Infrastructure:
      • Assessing the role of well-designed pedestrian crossings, roundabouts, and traffic calming measures in preventing road accidents.
      • Investigating the correlation between road surface quality (e.g., potholes, uneven surfaces) and the frequency of accidents.
    • Influence of Traffic Management:
      • Assessing the role of traffic lights and speed cameras in preventing road accidents.
      • Investigating the correlation between the frequency of accidents and the presence of traffic calming measures (e.g., speed bumps, chicanes, road narrowing, etc.).
    • Legislation and Enforcement:
      • Assessing the role of speed limits in preventing road accidents.
  • Traffic congestion - how can we reduce congestion?

    • Data Collection and Analysis:
      • Utilizing real-time traffic data from platforms like Waze and Google Maps to forecast congestion patterns.
      • Analyzing historical traffic data to identify recurring congestion patterns and anticipate future traffic bottlenecks.
    • Machine Learning and Predictive Modeling:
      • Designing machine learning models that use past and current traffic data to predict future congestion levels.

Support and Feedback

  • Feedback will be provided within 15 working days

For full details including assessment criteria, formatting guidelines, and academic integrity requirements, see the assessment brief.