London Underground open data: much more than you ever wanted to know

A talk from EMF 2022 by eta

On Sunday June 5, 2022 in Stage A

Over the past year or so, I've become way too interested in tracking London Underground trains with greater precision than anyone else so far. Transport for London provides an API ("TrackerNet") to get departure boards for all the stations -- but (ab)using that information to track the journeys of each individual train has proven much harder than expected.

I'll explain how I built a system to squeeze useful insights out of an API not really designed to provide them -- including a fair few workarounds for TfL's capricious signalling systems, way more graph theory than I have any right doing, and why I now really hate the District Line. You'll be able to see how applying a carefully tuned pile of random maths lets me go from a chaotic jumble of data to a near-perfect model of the Tube map!

I'll also cover the practicalities of running this system: how I packaged it all up into a website people can actually use, without getting banned by TfL for overusing their API or overloading the server I run it on.

Don't let the mention of graph theory scare you off; this talk should be appropriate for all audiences, and should appeal to anyone with a casual interest in railways, open data, or infrastructure.