Where are the buses?
Wellington’s bus system has been the subject of negative attention in the news and on Twitter. Also, I’m teaching a course in Data Science Practice and we’re just getting to a lab on maps with Leaflet. So I thought I’d make a map of Wellington buses and their lateness – people do tend to overestimate problems with public transit, and if they aren’t overestimating it, that’s also important to know. Information helps. For reasons explained below, the map isn’t in real time; it’s accumulated over nearly an hour.
The full code producing the map is here.
How it works
There isn’t an official API, but lots of people have noticed and used a set of URLs under www.metlink.org.nz/api/v1/...
, so the real-time data is available. The static timetable and route information is officially available here.
Because there isn’t an official API, we want to be careful about rate-limiting the queries. The robots.txt
file says
User-agent: *
Crawl-delay: 10
Disallow: /admin
Disallow: /dev
Disallow: /Security
so we’re allowed to read from the real-time API, but no more than one query per ten seconds. I’m using twenty seconds, in case I ever want to make some other queries while the code is running. The ratelimitr
package lets you create rate-limited functions. Here’s mine
download_route<-limit_rate(
function(route) fromJSON(paste0("https://www.metlink.org.nz/api/v1/ServiceLocation/",route)),
rate(n=1,period=20) )
We only get one route at a time, so the data will have to accumulate over multiple queries. It will take nearly an hour to do a full cycle through all the routes (including expresses, night buses, and services in nearby towns such as Masterton). I’ll make a list that holds all the routes, update one route at a time, and drop information that’s more than two hours old.
Here’s the download call
this_route<-tryCatch(download_route(route), error=function(e) NULL)
if (is.null(this_route)) next
if (!length(this_route$Services)) next
if (NROW(this_route$Services)==0) next
There’s a lot of tryCatch()
and a lot of checking because almost anything involving someone else’s computer can go wrong in so many ways.
Here’s the tidyverse-based data processing: splitting the timestamp up into fields, defining colour by lateness, and making a popup to say which route it is and when the information was downloaded
df<-this_route$Services %>%
select(Lat,Long, DelaySeconds, RecordedAtTime) %>%
separate(RecordedAtTime, into=c("y","mth","d","h","m","s","tz","tz2"),sep="[-:T+]") %>%
mutate(lateness=case_when(DelaySeconds< -180 ~"magenta",
DelaySeconds>= -180 & DelaySeconds<300~ "forestgreen",
DelaySeconds>=300 & DelaySeconds<600~"orange",
DelaySeconds>=600~"red")) %>%
mutate(popup=paste0(route,paste("<br/>",abs(round(DelaySeconds/60)),"mins",
ifelse(DelaySeconds<0,"early", "late"),"at",h,":",m)))
Now, the map. Leaflet makes this really easy
allpos[[route]]<-df
m <- leaflet(bind_rows(allpos)) %>%
addProviderTiles("CartoDB.Positron") %>%
addCircleMarkers(~as.numeric(Long),~as.numeric(Lat),col=~lateness,popup=~popup) %>%
addLegend("bottomright",title=paste("Wellington buses (past hour or so)"),
colors=c("magenta","forestgreen","orange","red"),
labels=c("Early", "On schedule","5-10 min late","10+ min late"))
For a map in a HTML rmarkdown document I’d just print m
, but here we want a separate map that automatically reloads over time. I’m using htmlwidgets
saveWidget(widget=m,"/Volumes/tlum005/buses/wellybus.html",selfcontained=FALSE)
tryCatch(
makeItReload("/Volumes/tlum005/buses/wellybus.html"),
error=function(e) {cat("Didn't reload\n")}
)
The function makeItReload
just sticks a bit of JavaScript into the HTML
makeItReload<-function(htmlfile){
contents<-readLines(htmlfile)
writeLines(gsub("</body>","<script type=\"text/javascript\"> setTimeout(function() { window.location.reload();}, 90*1000);</script></body>",contents, fixed=TRUE), htmlfile)
}
And that’s basically it. Rate-limited calls to a web API returning JSON, turned into a scrollable interactive map, to make the Wellington bus situation more transparent.