source("https://git.io/JvGjF")Session 4: Geographic and origin-destination data
1 Introduction
In this session, we will learn how to use routing engines and network analysis. The contents of the session are as follows:
- We will start with reviewing the homework from the previous session
- A lecture on graph theory and routing. Slides here
- Session activities: routing in R
- Homework and next session
2 Review Homework (20 mins)
Your homework last week was to review some AI written code. Divide into groups of 2-3 people and share the question you asked AI, the code it produced, and the results it got.
- Does the code run first time?
- Do the results answer the question?
- Try modifying the code and the prompt to improve the code generated by the AI.
- Discuss in your groups the strengths and weaknesses of AI written code.
3 Routing Pratical
3.1 Setup
If you have not installed the packages before hand. You can use ITS Go to do an easy set-up of your computer
The packages we will be using are:
library(sf)               # Spatial data functions
library(lwgeom)           # For advanced spatial functions
library(tidyverse)        # General data manipulation
library(stplanr)          # General transport data functions
library(dodgr)            # Local routing and network analysis
library(opentripplanner)  # Connect to and use OpenTripPlanner
library(tmap)             # Make maps
library(osmextract)       # Download and import OpenStreetMap data
tmap_mode("plot")3.2 Using OpenTripPlanner to get routes
We have set up the Multi-modal routing service OpenTripPlanner for West Yorkshire. Try typing this URL https://otp.robinlovelace.net/ during the session into your browser. You should see something like this:
 Note if you see a grey map use the layers button in the top right to switch to a different basemap.
 Note if you see a grey map use the layers button in the top right to switch to a different basemap.
3.2.1 OTP Web GUI
Exercise
- Play with the web interface, finding different types of routes. What strengths/limitations can you find?
3.2.2 Connecting to OpenTripPlanner in R
To allow R to connect to the OpenTripPlanner server, we will use the opentripplanner package and the function otp_connect.
# ip = "localhost" # to run it on your computer (see final bonus exercise)
ip = "otp.robinlovelace.net" # an actual server
otpcon = otp_connect(
  hostname = ip, 
  ssl = TRUE,
  port = 443,
  router = "west-yorkshire"
)If you have connected successfully, then you should get a message “Router exists.”
Create a test route. Notice than in the web UI the coordinates are Lat/Lng but R uses Lng/Lat
routes_test = otp_plan(
  otpcon = otpcon,
  fromPlace = c(-1.55555, 53.81005),
  toPlace = c(-1.54710, 53.79519),
  mode = "WALK"
)You can use multiple modes and combinations try:
- mode = "WALK"
- mode = c("WALK","TRANSIT")
- mode = c("BICYCLE","TRANSIT")
- mode = "CAR"
- mode = c("CAR_PARK","TRANSIT")
Use some of the functions you have already learnt like head,plot, and summary to understand the data you have produced.
To get some more routes, we will start by importing some data. The NTEM_flow.geojson dataset the contains the top desire lines in West Yorkshire. It was produced from a transport model called the National Trip End Model and research from the University of Leeds.
u = "https://github.com/ITSLeeds/TDS/releases/download/22/NTEM_flow.geojson"
desire_lines = read_sf(u)
head(desire_lines)We will also download the points that represent the possible start and end point of trips in the model
u = "https://github.com/ITSLeeds/TDS/releases/download/22/NTEM_cents.geojson"
centroids = read_sf(u)
head(centroids)Exercise
- Plot the desire_linesandcentroidsobjects using thetmappackage to show the number of travellers on each desire_line and the locations of all centroids.
tmap_mode("plot") # Change to view for interactive map
tm_shape(desire_lines) +
  tm_lines(col = "all",
           lwd = "all",  
           lwd.scale = tm_scale_continuous(values.scale = 10), 
           col.scale = tm_scale_continuous(values  = "-viridis")) +
  tm_shape(centroids) +
  tm_dots(fill = "red")- Produce some different maps for each mode of travel in the desire_linesdataset. How do the numbers of travellers change for walking, driving, and train travel? See example plot below.
This dataset has desire lines, but most routing packages need start and endpoints, so we will extract the start and endpoints using the package lwgeom
Exercise
- Produce a data frame called desire_topwhich contains the top threedesire_linesfor all travellers. Hint?slice_max
- We need to extract start and end point from those desire lines. We would also like to give each place an ID value
# Extract start and end points
fromPlace = sf::st_sf(
  data.frame(id = desire_top$from),
  geometry = lwgeom::st_startpoint(desire_top)
)
toPlace = sf::st_sf(
  data.frame(id = desire_top$to),
  geometry = lwgeom::st_endpoint(desire_top)
)- Create a new object called routes_drive_top, with driving routes between the OD pairs represented in thedesire_topobject.
Calculate routes for the first three desire lines with the following command:
routes_drive_top = otp_plan(otpcon = otpcon,
  otpcon = otpcon,
  fromPlace = fromPlace,
  toPlace = toPlace,
  fromID = fromPlace$id,
  toID = toPlace$id,
  mode = "CAR"
)- Plot routes_drive_topusing thetmappackage mode. You should see something like the image below.
tm_shape(routes_drive_top) + tm_lines()3.2.3 Isochrones
We can also get Isochrones from OTP. In this case lets work out how far we can travel in one hour by cycling and public transport.
isochrone = otp_isochrone(otpcon, 
                          fromPlace = c(-1.558655, 53.807870), 
                          mode = c("BICYCLE","TRANSIT"),
                          maxWalkDistance = 3000)
isochrone$time = isochrone$time / 60 # Convert from seconds to minutes
tm_shape(isochrone) +
  tm_fill("time", alpha = 0.6)Experiment with some different isochrones by changing the mode, and start location. Ty overlaying the OD data on top of you isochrones. Can you see a relationship between travel time and travel demand?
To save you time and to prevent overloading the server, we have pre-generated some extra routes. Download these routes and load them into R.
u = "https://github.com/ITSLeeds/TDS/releases/download/22/routes_drive.geojson"
routes_drive = read_sf(u)
u = "https://github.com/ITSLeeds/TDS/releases/download/22/routes_transit.geojson"
routes_transit = read_sf(u)We will now join the number of drivers onto the driving routes.
Exercise
- Create a dataset called n_driverfromdesire_lineswhich only have the columnsfromtoanddrive. Hint?dplyr::selectand?sf::st_drop_geometry
- Join the n_driverdata onto theroutes_drivedata by linkingfromPlace = fromandtoPlace = to. Hint?dplyr::left_join.
- Plot the routes showing the number of drivers on each route.
3.3 Route Networks (also called flow maps)
The map above shows some useful information about where people drive. But it has a problem. When many routes overlap it hides some of the drivers. What would be more useful would be to add those drivers together so we can see the total number of drivers on each road. This is what a route network does.
We can produce a route network map using stplanr::overline.
rnet_drive = overline(routes_drive, "drive")Exercise
- Make a route network for driving and plot it using the tmappackage. How is is different from just plotting the routes?
Bonus Exercise
- Read the paper about the overline function.
3.3.1 Line Merging
Notice that routes_transit has returned separate rows for each mode (WALK, RAIL, BUS). Notice the route_option column shows that some routes have multiple options.
Let’s suppose you want a single line for each route.
Exercise
- Filter the routes_transitto contain only one route option per origin-destination pair and only the columnsfromPlacetoPlacedistancegeometry. Hintfilter()andselect
Now We will group the separate parts of the routes together.
routes_transit_group = routes_transit |>
  dplyr::group_by(fromPlace, toPlace) |>
  dplyr::summarise(distance = sum(distance))We now have a single row, but instead of a LINESTRING, we now have a mix of MULTILINESTRING and LINESTRING, we can convert to a LINESTRING by using st_line_merge(). Note how the different columns where summarised.
First, we must separate out the MULTILINESTRING and LINESTRING
routes_transit_group_ml = routes_transit_group[st_geometry_type(routes_transit_group) == "MULTILINESTRING", ]
routes_transit_group = routes_transit_group[st_geometry_type(routes_transit_group) != "MULTILINESTRING", ]
routes_transit_group_ml = st_line_merge(routes_transit_group_ml)
routes_transit_group = rbind(routes_transit_group, routes_transit_group_ml)Exercise
- Plot the transit routes, what do you notice about them?
Bonus Exercise:
- Redo exercise 16 but make sure you always select the fastest option. You may need to re-download the routes_transitdata if you have overwritten the original data.
3.4 Network Analysis (dodgr)
Note Some people have have problems running dodgr on Windows, if you do follow these instructions.
We will now analyse the road network using dodgr. Network analysis can take a very long time on large areas. So we will use the example of the Isle of Wight, which is ideal for transport studies as it is small, but has a full transport system including a railway and the last commercial hovercraft service in the world.
First we need to download the roads network from the OpenStreetMap using osmextract::oe_get. We will removed most of the paths and other features and just focus on the main roads. Then use dodgr::weight_streetnet to produce a graph of the road network.
# Download data from OpenSteetMap
roads = oe_get("Isle of Wight", extra_tags = c("maxspeed","oneway"))
# Remove non-road data
roads = roads[!is.na(roads$highway),]
# Only get some road types see https://wiki.openstreetmap.org/wiki/Key:highway
road_types = c("primary","primary_link",
               "secondary","secondary_link",
               "tertiary", "tertiary_link",
               "residential","unclassified")
roads = roads[roads$highway %in% road_types, ]
# Build a graph
graph = weight_streetnet(roads)We will find the betweenness centrality of the Isle of Wight road network. This can take a long time, so first lets check how long it will take.
estimate_centrality_time(graph)
centrality = dodgr_centrality(graph)We can convert a dodgr graph back into a sf data frame for plotting using dodgr::dodgr_to_sf
clear_dodgr_cache()
centrality_sf = dodgr_to_sf(centrality)Exercise
- Plot the centrality of the Isle of Wight road network. What can centrality tell you about a road network?
- Use dodgr::dodgr_contract_graphbefore calculating centrality, how does this affect the computation time and the results?
Bonus Exercises
- Work though the OpenTripPlanner vignettes Getting Started and Advanced Features to run your own local trip planner for the Isle of Wight.
Note To use OpenTripPlanner on your own computer requires Java 8. See the Prerequisites for more details. If you can’t install Java 8 try some of the examples in the vignettes but modify them for West Yorkshire.
- Read the - dodgrvignettes
- Read the MinorRoadTraffic vignette this is a Masters Dissertation topic where students attempt to predict traffic levels using network analysis. The vignette and the package contain some examples of Transport Data Science in action. 
4 Homework
Prepare you draft portfolio for submission on the 28th February.