The best way to learn is by exploring data and answering your own questions. Here are some datasets that can help you investigate questions like:
What is the average daily number of people/bikes/passengers/cars?
What is the typical daily/weekly/monthly demand profile?
Where are the points with the highest demand/flows?
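As a rough illustration of the first two questions, the sketch below computes an average daily count and a day-of-week profile from a small made-up table of hourly counts. The table `toy_counts`, its column names (`site`, `datetime`, `count`) and its values are assumptions for this example only; the real datasets below use their own column names.

```r
library(dplyr)
library(lubridate)

# Toy hourly counts for one hypothetical site over two weeks (made-up values):
# 10 per hour on weekdays, 4 per hour at weekends. 2019-01-07 is a Monday.
toy_counts <- tibble(
  site = "A",
  datetime = seq(ymd_hm("2019-01-07 00:00"), by = "hour", length.out = 24 * 14),
  count = rep(c(rep(10, 24 * 5), rep(4, 24 * 2)), times = 2)
)

# Daily totals per site
daily_totals <- toy_counts |>
  mutate(date = as_date(datetime)) |>
  group_by(site, date) |>
  summarise(daily_count = sum(count), .groups = "drop")

# Average daily count per site
avg_daily <- daily_totals |>
  group_by(site) |>
  summarise(avg_daily_count = mean(daily_count), .groups = "drop")

# Typical weekly profile: mean daily total by day of the week (Monday first)
weekly_profile <- daily_totals |>
  mutate(weekday = wday(date, label = TRUE, week_start = 1)) |>
  group_by(site, weekday) |>
  summarise(mean_count = mean(daily_count), .groups = "drop") |>
  arrange(weekday)
```

The same pattern gives a daily profile if you group by `hour(datetime)` instead of the weekday, or a monthly one with `month(date)`.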
Some interesting datasets …
Let’s explore some interesting datasets. First, we will install (if necessary) and load the packages for these examples:
options(repos = c(CRAN = "https://cloud.r-project.org"))
if (!require("remotes")) install.packages("remotes")
pkgs = c(
  "sf",
  "tidyverse",
  "osmextract",
  "tmap",
  "maptiles"
)
remotes::install_cran(pkgs)
sapply(pkgs, require, character.only = TRUE)
sf tidyverse osmextract tmap maptiles
TRUE TRUE TRUE TRUE TRUE
Motorised vehicles counts: Leeds
Many cities and countries publish data from permanent traffic counters, e.g. ANPR cameras, induction loops or low-cost sensors. We are going to use data from the sensors in Leeds (available in Data Mill North):
leeds_car_location <- read_csv(
  "https://datamillnorth.org/download/e6q0n/9bc51361-d98e-47d3-9963-aeeca3fa0afc/Camera%20Locations.csv"
)
leeds_car_location_sf <- leeds_car_location |>
  st_as_sf(coords = c("X", "Y"),
           crs = 27700)
leeds_car_2019 <- read_csv(
  "https://datamillnorth.org/download/e6q0n/9e62c1e5-8ba5-4369-9d81-a46c4e23b9fb/Data%202019.csv"
)
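To look for the sites with the highest flows, the counts can be summarised per site and joined to the locations. The following is a minimal sketch with made-up data; the tables and column names (`site_id`, `X`, `Y`, `date`, `flow`) are assumptions, so check `names(leeds_car_location)` and `names(leeds_car_2019)` and adapt them before applying the idea to the Leeds data.

```r
library(dplyr)
library(sf)

# Hypothetical site locations in British National Grid coordinates (EPSG:27700)
toy_sites_sf <- tibble(
  site_id = c(1, 2, 3),
  X = c(430000, 431000, 429500),
  Y = c(433500, 434200, 432800)
) |>
  st_as_sf(coords = c("X", "Y"), crs = 27700)

# Hypothetical daily flows per site (made-up values)
toy_flows <- tibble(
  site_id = rep(c(1, 2, 3), each = 2),
  date = rep(as.Date(c("2019-03-01", "2019-03-02")), times = 3),
  flow = c(1200, 1000, 5200, 4800, 300, 500)
)

# Average daily flow per site
avg_flows <- toy_flows |>
  group_by(site_id) |>
  summarise(avg_daily_flow = mean(flow), .groups = "drop")

# Join the averages to the point geometry and sort, highest flow first
site_ranking_sf <- toy_sites_sf |>
  left_join(avg_flows, by = "site_id") |>
  arrange(desc(avg_daily_flow))
```

The result keeps its point geometry, so it can be mapped, for example with `tm_shape()` and `tm_dots()`.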
If you are interested in open traffic count datasets see this
Cycle counts for West Yorkshire
Some cities have dedicated infrastructure to count the number of people cycling at strategic points. We are going to use data from cycle counters in West Yorkshire, which you can find here:
leeds_bike_location <- read_csv(
  "https://datamillnorth.org/download/e1dmk/a8c8a11e-1616-4915-a897-9ca5ab4e03b8/Cycle%20Counter%20Locations.csv",
  skip = 1
)
leeds_bike_location_sf <- leeds_bike_location |>
  drop_na(Latitude, Longitude) |>
  st_as_sf(coords = c("Longitude", "Latitude"),
           crs = 4326) |>
  st_transform(27700)
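The `drop_na()` step matters because `st_as_sf()` stops with an error by default when a coordinate is missing. The transformation to EPSG:27700 (British National Grid) then puts the cycle counters in the same CRS as the camera locations. A small sketch with made-up coordinates (the `toy_counters` table is hypothetical):

```r
library(dplyr)
library(tidyr)
library(sf)

# Made-up counter table; the second row has no coordinates
toy_counters <- tibble(
  name = c("Counter 1", "Counter 2", "Counter 3"),
  Latitude = c(53.80, NA, 53.79),
  Longitude = c(-1.55, NA, -1.60)
)

toy_counters_sf <- toy_counters |>
  # Remove rows without coordinates before building the geometry
  drop_na(Latitude, Longitude) |>
  # coords takes x (longitude) first, then y (latitude)
  st_as_sf(coords = c("Longitude", "Latitude"), crs = 4326) |>
  # Reproject from WGS 84 to British National Grid (metres)
  st_transform(27700)

st_crs(toy_counters_sf)$epsg    # EPSG code of the transformed layer
st_coordinates(toy_counters_sf) # eastings and northings in metres
```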
The data for 2019:
leeds_bike_2019 <- read_csv(
  "https://datamillnorth.org/download/e1dmk/f13f5d49-6128-4619-a3ff-e6e12f88a71f/Cycle%20Data%202019.csv"
)
Other interesting datasets for you to explore are the cycling counters in Paris or Scotland.
Pedestrian Counts: Melbourne
Cities also monitor the number of pedestrians in key locations. We can use data from the sensors in Melbourne, accessible here:
melbourne_locations_sf <- st_read("https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/datasets/pedestrian-counting-system-sensor-locations/exports/geojson?lang=en&timezone=Europe%2FLondon")
Reading layer `OGRGeoJSON' from data source
`https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/datasets/pedestrian-counting-system-sensor-locations/exports/geojson?lang=en&timezone=Europe%2FLondon'
using driver `GeoJSON'
Simple feature collection with 136 features and 11 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 144.9286 ymin: -37.82591 xmax: 144.9864 ymax: -37.78935
Geodetic CRS: WGS 84
We will extract the hourly counts for December 2024:
melbourne_dec2024 <- read_csv("https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/datasets/pedestrian-counting-system-monthly-counts-per-hour/exports/csv?lang=en&refine=sensing_date%3A%222024%2F12%22&timezone=Australia%2FMelbourne&use_labels=true&delimiter=%2C")
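The `refine` parameter in the URL above selects the month: it is the percent-encoded form of `sensing_date:"2024/12"`. If you want a different month, one option is to build the URL programmatically. The helper `melbourne_counts_url()` below is hypothetical (it is not part of any package) and assumes the API keeps the same query parameters and the `YYYY/MM` filter format as the URL above.

```r
# Hypothetical helper: build the CSV export URL for a given year and month.
# Assumes the same query parameters as the December 2024 URL used above.
melbourne_counts_url <- function(year, month) {
  base_url <- paste0(
    "https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/datasets/",
    "pedestrian-counting-system-monthly-counts-per-hour/exports/csv"
  )
  # Filter in the form sensing_date:"YYYY/MM" (zero-padded month)
  refine <- sprintf('sensing_date:"%d/%02d"', year, month)
  paste0(
    base_url,
    "?lang=en",
    # reserved = TRUE also encodes ':', '"', '/' and ','
    "&refine=", utils::URLencode(refine, reserved = TRUE),
    "&timezone=", utils::URLencode("Australia/Melbourne", reserved = TRUE),
    "&use_labels=true",
    "&delimiter=", utils::URLencode(",", reserved = TRUE)
  )
}

# Reproduces the URL used for December 2024
url_dec2024 <- melbourne_counts_url(2024, 12)
```

The resulting string can be passed to `read_csv()` in the same way as the hard-coded URL.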
Public transport tap-in data: Bogotá
Public transport ridership data can be difficult to obtain. Fortunately, some cities whose systems are managed by a public organisation make this data available to the public. Bogotá’s integrated transport system publishes the tap-in data for the BRT system (see this). We will use one of the daily reports.
tm_stations_sf <- st_read("Estaciones_Troncales_de_TRANSMILENIO.geojson")
Reading layer `Estaciones_Troncales_de_TRANSMILENIO' from data source
`/home/runner/work/tds/tds/s1/Estaciones_Troncales_de_TRANSMILENIO.geojson'
using driver `GeoJSON'
Simple feature collection with 150 features and 25 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: -74.20546 ymin: 4.531715 xmax: -74.04359 ymax: 4.768822
Geodetic CRS: WGS 84
Monthly boarding data can be downloaded manually from the open data portal of TransMilenio here
url_tm <- "https://storage.googleapis.com/validaciones_tmsa/ValidacionTroncal/2024/consolidado_2024.zip"
u_bn <- basename(url_tm)
if (!file.exists(u_bn)) {
  download.file(url = url_tm,
                destfile = u_bn,
                mode = "wb")
}
tm_brt_2024 <- read_csv(unz(u_bn, "troncal_2024.csv"))
TfL’s crowding data is also a great source of ridership data. See this .
Network data from OSM
You may already be familiar with getting and using OSM data. This is an example of how to obtain a network that can be used by pedestrians.
my_coordinates <- c(-76.78893552474851, 18.01206727612776)
sf_point <- st_point(my_coordinates) |> st_sfc(crs = 4326)
sf_buffer <- st_buffer(sf_point, dist = 15e3)
tm_basemap("OpenStreetMap") +
  tm_shape(sf_buffer) +
  tm_borders()
my_network <- oe_get_network(sf_buffer, mode = "walking")
tm_shape(my_network) +
  tm_lines("highway")
Note: you can access a simplified network dataset from Ordnance Survey’s OpenRoads dataset.