Reading List
This reading list contains key resources for the Transport Data Science module, organized by topic.
1 Core Reading
- R for Data Science (Wickham et al., 2023)
- This is an excellent and very popular applied introduction to data science with R, covering the Tidyverse and data visualization. It is open access and based on open code. See github.com/hadley/r4ds for insights into how Quarto can be used to embed code in written outputs.
- Geocomputation with R (Lovelace et al., 2025)
- A guide to geographic data analysis, visualization, and modeling with R.
- The Transportation chapter, which can be found online at r.geocompx.org/transport.html, is a key resource for this module.
- Geocomputation with Python (Dorman et al., 2025)
- A resource for working with geographic data in Python, covering both vector and raster data models. It is core reading only if you are using Python for the practical sessions.
2 Skills Development
There is a wealth of material, in physical books and online, that teaches the skills needed for this course. The advantage of online materials is that they can be updated more easily and are often free to access. Some key resources are listed below. Search online for topics you are interested in, and see the Quarto gallery of books and the bookdown.org website for more resources.
2.1 Key Skills
Quarto documentation (Allaire et al., 2024)
- Quarto is the software used to create the Transport Data Science course materials, as well as numerous websites, presentations, dashboards, and books. It is a powerful tool for creating reproducible documents with code and data.
- See the technical writing page of Quarto’s documentation for key information on how to add references, figure captions, and more.
Introduction to GitHub (Heis, 2025)
- A good starting point for learning how to use GitHub for version control and collaboration.
- See also GitHub's introduction to dev containers at docs.github.com/en/codespaces/
2.2 Python
- Course Materials for: Geospatial Data Science (Szell, 2025)
- Course materials covering various aspects of geospatial data science, including data analysis, visualization, and working with street networks using Python.
- Modern Polars (Heavey, n.d.)
- A side-by-side comparison of the Polars and Pandas libraries.
- A course on Geographic Data Science (Arribas-Bel, 2019)
- Free and open source online book on using GeoPandas and other Python libraries for geographic data analysis.
- Python for Data Analysis (McKinney, 2022)
- Data wrangling with Pandas, NumPy, and Jupyter, written by the creator of the Pandas library.
2.3 R
- Advanced R
- A comprehensive guide to advanced programming in R, covering topics such as functional programming and object-oriented programming.
3 Software and Tools
- stats19 (Lovelace et al., 2019)
- R package for working with official road crash data
- stplanr: A Package for Transport Planning (Lovelace and Ellison, 2018)
- R package for transport planning with various routing and analysis functions
- OSMnx: New Methods for Acquiring, Constructing, Analyzing, and Visualizing Complex Street Networks (Boeing, 2017)
- Useful, if slightly out of date, paper for anyone working with street network data in Python.
- A/B Street (Carlino et al., 2022)
- A traffic simulation game exploring how small changes to streets can improve transportation in cities. Useful for understanding the impact of urban design on transport systems.
- osm2streets (Carlino, 2025)
- Tool for converting OpenStreetMap data to detailed street networks, useful for transport modeling and analysis.
- See also the Python bindings, which can convert OSM data into polygons representing streets as GeoPandas dataframes, at github.com/a-b-street/osm2streets-py
- od2net (Carlino, 2024)
- Tool for converting origin-destination data into network flows, useful for transport modeling and analysis.
4 Research Applications
- The Propensity to Cycle Tool (Lovelace et al., 2017)
- Case study of an open source transport planning tool
- Growing Urban Bicycle Networks (Szell et al., 2021)
- This paper explores methods for auto-suggesting transport network improvements, with reference to reproducible Python code
5 Data Visualization
- The Visual Display of Quantitative Information (Tufte, 2001)
- Classic work on the principles of data visualization
- Visualization Curriculum (Heer, 2021)
- A data visualization curriculum of interactive notebooks, using Vega-Lite and Altair. This book contains a series of Python-based Jupyter notebooks, with a corresponding set of JavaScript notebooks available online on Observable.
6 Miscellaneous
- Data Science for Transport: A Self-Study Guide with Computer Exercises (Fox, 2018)
- An introduction to transport data science with hands-on examples. Slightly out of date as of 2025.
- Reproducible Road Safety Research with R (Lovelace, 2020)
- Introductory guide for analyzing road safety data in R
- Open source tools for geographic analysis in transport planning (Lovelace, 2021)
- Review of open source tools available for transport planning and analysis.
- Python for Data Science (Turrell et al., 2025)
- A modern guide to data science using Python, based on R for Data Science, with practical examples and clear explanations.
- The Geography of Transport Systems (Rodrigue et al., 2013)
- Comprehensive textbook on transport geography and systems
- Modelling Transport (Ortúzar S. and Willumsen, 2001)
- Foundational text on transport modeling methods
- Building Reproducible Analytical Pipelines with R (Rodrigues, n.d.)
- A guide to the data engineering side of data science, with a focus on reproducibility and automation.
- Papers investigating the relationships between new contraflow interventions and traffic levels and collision rates in London (Tait et al., 2024, 2023)
See the full bibliography on Zotero for more resources, and feel free to suggest additions by opening an issue in the tds issue tracker.