Coursework submission 1: Data science project plan and reproducible code

This is a formative (non-assessed but required) submission that will help you develop your final coursework. The deadline is 28th February 2025, 13:59.

What to Submit

Submit a .zip file containing two key items:

  1. A concise PDF document (recommended length: 2 pages, absolute maximum: 5 pages) outlining:
    • Your chosen transport-related topic
    • The main dataset(s) you plan to use
    • Your research question
    • At least 2 academic references (see Quarto Citation Guide for details)
    • Any initial analysis or questions you have
  2. Reproducible code as a .qmd file showing how you accessed and processed your data

Template and example submission

See the template.qmd file (and rendered result) for guidance on the structure of your submission. An example submission is available in the d2/example.qmd file (rendered here).

See an example .zip file with the files needed to reproduce this analysis at github.com/itsleeds/tds/releases/.

See the source code of these files, including the .bib files for creating references, in the course repository: github.com/itsleeds/tds/tree/main/d2.

Key Requirements

  • Maximum .zip file size: 30 MB
  • Submit via Turnitin
  • AI tools can be used in an assistive role (must be acknowledged)
  • Use the default Quarto referencing style
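
Before uploading, you may find it useful to check your .zip file against the requirements above. The following is a minimal sketch in Python, not an official checker: the function name check_submission is hypothetical, and the 30 MB limit is assumed to mean 30 × 1,000,000 bytes (the stricter of the two common readings). It only checks the file size and that at least one .pdf and one .qmd file are present.

```python
import zipfile
from pathlib import Path

# Assumption: "30 MB" is read as decimal megabytes (the stricter interpretation).
MAX_BYTES = 30 * 1000 * 1000


def check_submission(zip_path):
    """Return a list of problems found; an empty list means the basic checks pass."""
    problems = []
    path = Path(zip_path)

    # Check the size of the .zip file itself, not of its uncompressed contents.
    size = path.stat().st_size
    if size > MAX_BYTES:
        problems.append(f"zip is {size} bytes, over the {MAX_BYTES}-byte limit")

    # Check that the two required items are present somewhere in the archive.
    with zipfile.ZipFile(path) as zf:
        names = [name.lower() for name in zf.namelist()]
    if not any(name.endswith(".pdf") for name in names):
        problems.append("no .pdf file found")
    if not any(name.endswith(".qmd") for name in names):
        problems.append("no .qmd file found")

    return problems
```

For example, check_submission("submission.zip") (a hypothetical file name) returns an empty list when the size limit is met and both file types are present.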

Writing tips

See documentation on figures, technical writing and the visual editor mode from quarto.org for help with creating figures and citations.

Topics and Datasets

Some suggested areas include:

  • Road safety analysis
  • Infrastructure and travel behaviour
  • Traffic congestion patterns
  • Public transport accessibility
  • Active travel infrastructure
  • Transport equity studies
  • Other transport-related topics are encouraged

Specific examples could include:

  • What is the relationship between travel behaviour (e.g. as manifested in origin-destination data represented as desire lines, routes and route networks) and road traffic casualties in a transport region (e.g. London, West Midlands and other regions in the pct::pct_regions$region_name data)?

  • Analysis of a large transport dataset, e.g. https://www.nature.com/articles/sdata201889

  • Infrastructure and travel behaviour

    • What are the relationships between specific types of infrastructure and travel, e.g. between fast roads and walking?
    • How do official sources of infrastructure data (e.g. the CID) compare with crowd-sourced datasets such as OpenStreetMap (which can be accessed with the new osmextract R package)?
    • Using new data sources to support transport planning, e.g. using data from https://telraam.net/ or https://dataforgood.facebook.com/dfg/tools/high-resolution-population-density-maps
  • Changing transport systems

    • Modelling change in transport systems, e.g. by comparing before/after data for different countries/cities: which countries had the hardest lockdowns, and where have changes been longer term? See here for open data: https://github.com/ActiveConclusion/COVID19_mobility
    • How have movement patterns changed during the Coronavirus pandemic, and what impact is that likely to have in the long term? (See here for some graphics on this.)
  • Software / web development

    • Creating a package to make a particular data source more accessible, see https://github.com/ropensci/stats19 and https://github.com/elipousson/crashapi examples
    • Development of a data dashboard, e.g. using Quarto Dashboards
    • Development of a web app, e.g. using the shiny package
  • Road safety - how can we make roads and transport systems in general safer?

    • Influence of Road Infrastructure:
      • Assessing the role of well-designed pedestrian crossings, roundabouts, and traffic calming measures in preventing road accidents.
      • Investigating the correlation between road surface quality (e.g., potholes, uneven surfaces) and the frequency of accidents.
    • Influence of Traffic Management:
      • Assessing the role of traffic lights and speed cameras in preventing road accidents.
      • Investigating the correlation between the frequency of accidents and the presence of traffic calming measures (e.g., speed bumps, chicanes, road narrowing, etc.).
    • Legislation and Enforcement:
      • Assessing the role of speed limits in preventing road accidents.
  • Traffic congestion - how can we reduce congestion?

    • Data Collection and Analysis:
      • Utilizing real-time traffic data from platforms like Waze and Google Maps to forecast congestion patterns.
      • Analyzing historical traffic data to identify recurring congestion patterns and anticipate future traffic bottlenecks.
    • Machine Learning and Predictive Modeling:
      • Designing machine learning models that use past and current traffic data to predict future congestion levels.
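
As a starting point for the congestion ideas above, a simple baseline can help you judge whether a more complex model adds anything. The sketch below, in Python, averages historical delay by hour of day and uses that average as the forecast. It is a naive baseline rather than a machine learning model; the function names are hypothetical and the observations are made-up toy values for illustration only, not real traffic data.

```python
from collections import defaultdict
from statistics import mean


def hourly_profile(observations):
    """Average delay per hour of day from (hour, delay_minutes) pairs."""
    by_hour = defaultdict(list)
    for hour, delay in observations:
        by_hour[hour].append(delay)
    return {hour: mean(delays) for hour, delays in by_hour.items()}


def predict_delay(profile, hour):
    """Baseline forecast: the historical mean for that hour, or None if unseen."""
    return profile.get(hour)


# Toy, made-up observations for illustration only: (hour of day, delay in minutes).
toy = [(8, 12.0), (8, 14.0), (13, 4.0), (17, 15.0), (17, 17.0)]
profile = hourly_profile(toy)
```

Any predictive model you design can then be compared against predict_delay on held-out data: if it does not beat the hourly mean, the added complexity is hard to justify.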

Support and Feedback

  • Feedback will be provided within 15 working days

For full details including assessment criteria, formatting guidelines, and academic integrity requirements, see the assessment brief.