Creating searchable Post Office Inquiry transcripts

Talk by Matthew Somerville 👩‍👧

Saturday from 3:20 PM - 3:40 PM in Stage C

The Post Office Horizon IT Inquiry was established in 2020, led by retired High Court judge Sir Wyn Williams, to investigate the “failings which occurred with the Horizon IT system at the Post Office leading to the suspension, termination of subpostmasters’ contracts, prosecution and conviction of subpostmasters”. Transcripts of the inquiry hearings are published online, but only in PDF and plain-text formats. This talk will cover my history of scraping and parsing official information, how I have converted these transcripts into a linkable, searchable, and readable version, how I automatically keep the site up to date, and why I've done this.

