Introduction to transport data science


Module: Transport Data Science

Robin Lovelace

2024-10-31

Who: Transport Data Science team

Robin Lovelace

  • Associate Professor of Transport Data Science
  • Researching transport futures and active travel planning
  • R developer and teacher, author of Geocomputation with R

Yuanxuan Yang

  • Lecturer in Data Science of Transport
  • New and Emerging Forms of Data: Investigating novel data sources and their applications in urban mobility and transport planning.

TDS Team II

Malcolm Morgan

  • Senior researcher at ITS with expertise in routing + web
  • Developer of the Propensity to Cycle Tool and PBCC

Zhao Wang

  • Civil Engineer and Data Scientist with expertise in machine learning

Demonstrators

  • Juan Pablo Fonseca Zamora

You!

What is transport data science?

  • The application of data science to transport datasets and problems
  • Raising the question…
  • What is data science?
  • A discipline “that allows you to turn raw data into understanding, insight, and knowledge” (Grolemund, 2016)

In other words…

  • Statistics that is actually useful!

Why take Transport Data Science?

  • New skills (cutting-edge R and/or Python packages)
  • Potential for impact
  • Allows you to do new things with data
  • It might get you a job!

Live demo: npt.scot web app

The history of TDS

  • 2017: Transport Data Science created, led by Dr Charles Fox, a computer scientist and author of the book Data Science for Transport (Fox, 2018)

  • The focus was on databases and Bayesian methods

  • 2019: I inherited the module, which was attended by ITS students

  • Summer 2019: Python code published in the module ‘repo’:

History of TDS II

  • January 2020: Module made available to students on the Data Science MSc course
  • March 2020: Switch to online teaching
  • 2021-2023: Updated module, focus on methods
  • 2024: Switch to combined practical sessions and lectures
  • 2025+: Expand: online course? Book? Stay in touch!

Essential reading

  • Chapter 12 (Transportation) of Geocomputation with R, an open book on geographic data in R (available free online) (Lovelace et al. 2019)
  • Reproducible Road Safety Research with R (RRSRR): https://itsleeds.github.io/rrsrr/

Core reading materials

  • R for Data Science, an introduction to data science with R (available free online)
  • Python equivalent

Optional

There are many good resources on data science for transport applications. Do your own research and reading! The following are good:

  • If you’re interested in network analysis/Python, see this paper on analysing OSM data in Python (Boeing and Waddell, 2017) (available online)

  • If you’re interested in the range of transport modelling tools, see Lovelace (2021).

For more references, see the bibliography at github.com/ITSLeeds/TDS

Objectives

  • Understand the structure of transport datasets

  • Understand how to obtain, clean and store transport-related data

  • Gain proficiency in command-line tools for handling large transport datasets

  • Produce data visualizations, static and interactive

  • Learn how to join together the components of transport data science into a cohesive project portfolio
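
To make the first two objectives concrete, here is a minimal sketch of an obtain, clean and summarise workflow using only the Python standard library. It is an illustration rather than module material: the origin-destination (OD) table, the zone names, the trip counts and the helper functions `read_od`, `clean_od` and `mode_share` are all hypothetical and invented for this example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical origin-destination (OD) data: one row per zone pair and mode.
# Zone names and trip counts are invented purely for illustration.
RAW_CSV = """origin,destination,mode,trips
A,B,car,120
A,B,bicycle,30
A,B,bus,50
B,C,car,80
B,C,bicycle,
B,C,bus,20
"""


def read_od(text):
    """Obtain: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_od(rows):
    """Clean: drop rows with a missing trip count and convert counts to int."""
    cleaned = []
    for row in rows:
        if row["trips"].strip() == "":
            continue  # skip incomplete records rather than guessing a value
        cleaned.append({**row, "trips": int(row["trips"])})
    return cleaned


def mode_share(rows):
    """Summarise: share of all trips made by each mode."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["mode"]] += row["trips"]
    grand_total = sum(totals.values())
    return {mode: n / grand_total for mode, n in totals.items()}


od = clean_od(read_od(RAW_CSV))
shares = mode_share(od)
for mode, share in sorted(shares.items()):
    print(f"{mode}: {share:.1%}")
```

Real transport datasets are usually far larger and are more often handled with dedicated data-frame and geographic packages in R or Python, but the same three steps apply.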

Assessment (for those doing this as credit-bearing)

  • You will build up a portfolio of work
  • 100% coursework assessed, you will submit by
  • Written in code - will be graded for reproducibility
  • Code chunks and figures are encouraged
  • You will submit a non-assessed 2-page PDF + qmd

Schedule

Feedback

The module is taught by two really well organised and enthusiastic professors, great module, the seminars, structured and unstructured learning was great and well thought out, all came together well

I wish this module was 60 credits instead of 15 because i just want more of it.