It’s largely technical — we develop a new set of tools for tracking cluster evolution using geolocated business website text and metadata, then test it out by tracking the growth of tech and creative activity in Shoreditch / Tech City.
This stuff is important: clusters matter economically, but are hard to observe at scale, especially when the firms in them are doing novel things that don’t easily show up in conventional datasets. Not being able to see industry space properly is a particular problem for tech, and for other industries being disrupted by tech (finance, clean, retail, creative, and so on). Clusters’ physical space is also hard to track: boundaries shift in ways that regular administrative boxes struggle to capture.
Our approach uses firms’ own web text (how they describe themselves) together with postcode-level information that places firms precisely on the ground. We then use machine learning to group firms into industry spaces. The grouping is flexible, so a firm can belong to more than one space, and we look at how different spaces grow or shrink during the 2000s. We also use the postcode data to show microgeographies within a cluster.
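To make the idea concrete, here is a toy sketch in Python. It is not the method from the paper: it swaps the machine learning step for a simple keyword-similarity score, and the firms, postcodes, seed vocabularies and the 0.2 threshold are all made up for illustration. What it does show is the shape of the pipeline: score each firm's web text against several industry spaces, let a firm sit in more than one space, then count memberships by year and by postcode district.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical seed vocabularies for two "industry spaces" (illustrative only).
SEEDS = {
    "tech": "software app platform data cloud developer",
    "creative": "design branding film advertising studio agency",
}

# Made-up firm records: (name, postcode, first year observed, website text).
FIRMS = [
    ("Firm A", "EC2A 3AY", 2004, "We build cloud software and data platform tools"),
    ("Firm B", "EC1V 9BD", 2008, "A design studio building app and software products"),
    ("Firm C", "E1 6JE", 2008, "Branding and advertising agency with a film studio"),
]


def bag_of_words(text):
    """Lower-case the text and count alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def memberships(text, threshold=0.2):
    """Return every space whose similarity clears the (assumed) threshold.

    A firm can land in several spaces at once, or in none.
    """
    vec = bag_of_words(text)
    scores = {space: cosine(vec, bag_of_words(seed)) for space, seed in SEEDS.items()}
    return {space: score for space, score in scores.items() if score >= threshold}


def summarise(firms):
    """Count memberships per (space, year) and per postcode district."""
    by_year = Counter()
    by_district = defaultdict(Counter)
    for _name, postcode, year, text in firms:
        district = postcode.split()[0]  # outward code, e.g. "EC2A"
        for space in memberships(text):
            by_year[(space, year)] += 1
            by_district[district][space] += 1
    return by_year, by_district


by_year, by_district = summarise(FIRMS)
for (space, year), n in sorted(by_year.items()):
    print(space, year, n)
```

In this toy data, Firm B describes itself in both tech and creative terms, so it is counted in both spaces. That is the kind of overlap a single industry code would hide.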
We test-drive our design on Shoreditch / Tech City. The cluster is well known: I, among others, have written about it a lot in the past (see this, this with Emma Vandore and Georgina Voss, and this with Emma). This body of collective knowledge gives us the ground truth we need to check whether all our data work generates sensible results.
Happily for us, we’re able both to reproduce the main stylised facts about the cluster and to generate some extra, more fine-grained insight (the kind of thing that’s hard to get from fieldwork alone, which is what most previous work has relied on).
Here’s the link to the paper again — happy reading!