Physicists explain things to me

Max Nathan
Aug 19, 2016


(c) 1961 Resnais / Robbe-Grillet

Universal laws for cities turn out not to be universal. How helpful are they?

Last month I took part in the 2016 Cities as Complex Systems symposium in Schloss Herrenhausen, a painstakingly-restored stately home and gardens straight out of Last Year in Marienbad. Resnais’ film is a brilliant study in confusion, the limits of knowledge and communication failure (see picture): a decent summary of my feelings at the workshop’s end.

The symposium was a mix of physicists, mathematicians, planners and architects, even an archaeologist, with a couple of economists and geographers thrown in. Even so, physics and maths dominated the conversation, and around 75% of the papers used the superlinear scaling approach currently beloved of scientists who are urbanists.

Scaling is a deceptively simple way to think about cities and urban growth. Developed by Geoffrey West, Denise Pumain, Luis Bettencourt and others, the idea is that the kind of universal scaling laws that are observed in the natural world also apply to cities.

Biological scaling laws are common (larger creatures live longer, consume more energy and so on), but these links get weaker as organisms get bigger. However, Bettencourt and co argue that as cities get bigger, they exhibit superlinearity: specifically, innovation, wages and productivity rise faster than the urban population. Those aspects of urban economies follow a power law. Superlinearity also applies to some ‘bads’ (resource consumption, crime) and even to measures of behaviour, like walking speed. Other features, such as housing and household utility consumption, scale roughly linearly with city size, while infrastructure provision scales sublinearly.

Unlike the natural world, then, the pace of urban social and economic life is always accelerating. In theory, Bettencourt and co suggest, urban scaling in innovation and growth means that there is no limit to city size. In practice, growing urban cores may run out of innovations, resources or both, leading to population collapse, then a recovery period that gradually speeds up as scaling laws kick in again. In their seminal 2007 PNAS paper, Bettencourt, West and others provide striking evidence of superlinearity in US metro areas, and some evidence for European and Chinese city-regions. They also argue that scaling relationships should form the basis for making urban policy.

Huge if true. So does it work? And is it useful?

There are at least three candidate universal laws for the systems of cities: superlinear scaling, Zipf’s Law and Gibrat’s Law. In summary:

  • Gibrat’s Law argues that urban systems follow a bell curve (technically, the log-normal distribution in Figure 1). This is because growth rates are delinked from city size. Specifically, cities follow a stochastic growth process, and power laws don’t hold. The implication is that for whatever outcome, on average there’s no difference between the growth rates of the largest cities and the smallest.
Figure 1. Gibrat’s Law. Source: Giesen, Zimmermann and Suedekum (2010).
  • Zipf’s Law argues that urban systems follow a power law distribution with an exponent of one. Specifically, this assumes stochastic growth, as in Gibrat, plus a lower bound on city size and an upper bound on the number of cities. This gives us the linear slope in Figure 2: the most-populous city in a given system will be twice as big as the next biggest city, three times as large as the third biggest, and so on.
Figure 2. Zipf’s Law. Source: Gabaix (2014).
  • Superlinear scaling argues that city systems follow a power law with an exponent greater than one. This assumes growth is disproportionate to size, so that the biggest city is more than twice as big as the second on some outcome, more than three times as big as the third, and so on. In Figure 3, the scaling coefficient is beta, which takes the value 1.12: in US metros, wages scale superlinearly with city size.
Figure 3. Superlinear scaling. Source: Bettencourt et al (2007).
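
To make the last two bullets concrete, here is a minimal Python sketch. It assumes the textbook rank-size form of Zipf’s Law (the city of rank r has population equal to the largest city’s population divided by r) and the usual scaling form, outcome = Y0 * population ** beta. The function names and the 8 million starting population are illustrative assumptions, not taken from the papers cited above.

```python
def zipf_sizes(largest, n_cities):
    """Populations implied by the rank-size rule: size at rank r = largest / r."""
    return [largest / rank for rank in range(1, n_cities + 1)]


def scaling_ratio(beta, population_factor):
    """Factor by which an outcome grows when population grows by
    population_factor, assuming outcome = Y0 * population ** beta."""
    return population_factor ** beta


if __name__ == "__main__":
    # Hypothetical urban system whose largest city has 8 million people.
    print(zipf_sizes(8_000_000, 4))
    # With beta = 1.12 (the wage exponent quoted for US metros), doubling
    # population multiplies total wages by a little more than two.
    print(round(scaling_ratio(1.12, 2), 2))
```

With beta equal to one the ratio is exactly two, which is linear scaling; any exponent above one means the outcome outpaces population.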

In practice, none of these laws seem to be truly universal. For example, using conventional metro areas, Eeckhout (2004) finds that Zipf’s Law holds for the biggest (US) cities, but overall, the city system follows Gibrat’s Law. Rozenfeld et al (2011) define cities using a clustering method, and find that Zipf’s Law holds for the whole of the US and UK urban systems. Jens Suedekum and colleagues (2010), using data for eight countries across the world, find Zipf-like power laws for the biggest and smallest cities, but a Gibrat-style log-normal distribution for everywhere else.

However, at the workshop Jens presented some newer work, using consistently-defined units across the whole EU28, which finds that the biggest European cities are too small to follow a power law. Similarly, Henry Overman and Patricia Rice (2008) find that Zipf’s Law holds for some of the English urban system, but cities like Manchester, Liverpool and Birmingham are smaller than they ‘should’ be (Figure 4).

Figure 4. Undersized English metros? Source: Overman and Rice (2008).

And as Elsa Arcaute and colleagues have shown in a series of papers, superlinear scaling doesn’t seem to hold in the UK or in France. The UK’s urban system, dominated by London, is very different from the US system. Economic outcomes scale linearly, but there is no superlinearity, no matter how city boundaries are defined. In France, we can generate the result on some city definitions, but not others. Bettencourt and co use Larger Urban Zones for their European results — but we know that these are often administrative boxes which don’t match up to real urban geographies.

So it seems that none of these ‘universal’ laws holds everywhere: there are clearly US and European differences, and results also depend on how cities are defined in the first place. (Neither do these laws hold in relative outliers: see this work by Dani Arribas-Bel and colleagues looking at Zipf and Gibrat in Australian cities.)

Now let’s take a step back. Even if superlinearity in cities does hold, what can it tell us? Here I was left feeling pretty confused.

Technically speaking, testing for superlinearity simply involves regressing the log of some outcome measure on the log of city population across a set of cities. That is, it’s a bivariate correlation, which suggests a link between the two variables in play. However, it doesn’t tell us anything about the causal links between the two, or whether they exist at all. And on its own, it tells us nothing about how this relationship has come about.
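
The fitting step described above can be sketched in a few lines of Python. This is an illustration on synthetic data, not the procedure or data of any paper discussed here: the city populations, the noise level, the constant of 20.0 and the ‘true’ exponent of 1.12 are all assumptions chosen for the example. The ordinary least squares fit is written out by hand to keep the block self-contained.

```python
import math
import random


def fit_scaling_exponent(populations, outcomes):
    """OLS slope and intercept of log(outcome) on log(population)."""
    xs = [math.log(p) for p in populations]
    ys = [math.log(y) for y in outcomes]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    beta = sxy / sxx
    log_y0 = y_mean - beta * x_mean
    return beta, log_y0


def synthetic_cities(n_cities=200, true_beta=1.12, noise_sd=0.1, seed=42):
    """Made-up cities whose outcome follows Y0 * N ** beta, plus noise."""
    rng = random.Random(seed)
    populations, outcomes = [], []
    for _ in range(n_cities):
        # Populations spread log-uniformly between 50,000 and 10 million.
        pop = math.exp(rng.uniform(math.log(5e4), math.log(1e7)))
        # Multiplicative log-normal noise around the assumed power law.
        outcome = 20.0 * pop ** true_beta * math.exp(rng.gauss(0.0, noise_sd))
        populations.append(pop)
        outcomes.append(outcome)
    return populations, outcomes


if __name__ == "__main__":
    pops, wages = synthetic_cities()
    beta_hat, _ = fit_scaling_exponent(pops, wages)
    print(f"estimated beta: {beta_hat:.3f}")
```

A fitted slope above one is what the scaling literature calls superlinearity. The sketch recovers the exponent only because the data were generated from a power law in the first place; on real data the same fit says nothing about why the relationship arises.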

Superlinear scaling studies often don’t seem very interested in explaining their findings, except in the most speculative way. For physicists and mathematicians — at least those at CTCS — the best explanation is the most parsimonious, and the focus is on the city, not the actors in it.

For economists and geographers, this is puzzling. Like other empirical social scientists, we are interested in understanding social reality in as much detail as possible, ideally going down to individual firms, people and communities. We’d prefer simple explanations, but not at the cost of missing critical detail, or understanding the underlying mechanisms. A parsimonious model that simply links two things is not an explanation at all. Sure, urban economists are also interested in whether power laws hold for cities. But these super high-level correlations are only the start of the conversation.

This mindset reflects some fundamental constraints in the kinds of knowledge that social scientists believe are available to them. Dani Rodrik convincingly argues that universal theories in economics are best considered as ‘scaffolding’: they provide useful, high-level starting points for our understanding, but none appear to be truly universal, all of the time. No one theory of value, or of the business cycle, for example, has turned out to be complete. More detailed explanations only tend to hold for particular times and places, and the models that underpin them are similarly contingent. In turn, Rodrik suggests, we should be sceptical about the reach of universal laws in any social sciences. Social reality is a lot messier than pure maths or theoretical physics, and less amenable to generalisation.

This is why economists tend to start with very simple models, then enrich these — often by drawing insights from other disciplines, such as psychology, politics, history and sociology. As Rodrik puts it, economic knowledge advances by deploying the right framework (or frameworks) for the question at hand. I’m not claiming that economics or quantitative analysis is all we need — to understand urban places we’ll often need to move beyond these. More on that below. I’m not singling out the scaling literature either — one could make similar criticisms about those data science enthusiasts who believe that billions of observations sweep away the need for actual research.

For example, urban economists would say that in OECD countries, a positive link between city size and (say) productivity could be entirely explained by the kind of people who live in cities: those people might be more productive wherever they live. Or it could be that cities themselves help people and firms become more productive, through matching, sharing and learning effects.

In practice, we know that both of these processes are in play, because we have a large body of empirical work, in many of those OECD countries, on sorting, agglomeration economies and the specific channels that underpin these things. In turn, there is a wealth of models in urban economics, economic geography, and other fields, that provide theoretical underpinnings for the results we observe.

To be fair, superlinear scaling studies sometimes draw on these bodies of work. But much of this is simply gestured to in the introductions to papers: links aren’t drawn, connections are unexplored. More seriously, a great deal of crucial existing work in economics and geography seems to be missing in the scaling literature. While a few researchers are interested in segregation, there is little or no modelling of the cost of living in cities, either housing costs or the costs of goods and services. Cities are usually considered as bounded objects, where all the action takes place inside the boundary, or comes through some unexplained external shock (such as ‘technological progress’); only a handful of papers even attempt to look at urban systems, and model how changes in one place might affect others. Equally, models of urban decline or stasis don’t seem to feature. And as discussed above, there are still few attempts to properly link city-level outcomes to what people, firms or communities do at the micro level — although agent-based models shown at CTCS take steps towards this.

I also see this as a two-way problem. Economists, geographers and other social scientists don’t always reach out to social physicists as much as they should. And we should do much more to learn from the data sources and analytical techniques common in the scaling field, as well as computer science / data science more broadly. Here are some good examples in geography; here are a couple of examples in economics. And here’s something on methods.

Universal laws are always appealing, especially if they promise to cut through the messiness of social reality with some scientific rigour. However, even if superlinear scaling holds, it seems to deal with real-world urban complexity by simply boxing it off. In practice, it does not provide the policy guide that some of its proponents would like. Notably, it is orders of magnitude less informative than urban economics frameworks, which are regularly called out as too minimal by other social scientists. (As a quantitative economic geographer, one of the strangest moments at the workshop was finding myself demanding more richness and detail.)

However, taken as conceptual scaffolding, superlinear scaling can help us develop richer explanations of specific city systems and urban processes. To do so requires going beyond scaling: as Rodrik suggests, we need a range of other models, inside and outside economics, to tackle specific features of the city or the urban system. It would be fascinating to see what could emerge out of such cross-field exercises. Here is one example in the QJE which formally links a global feature (the shape of world population growth) to underlying endogenous growth processes in cities (the presence of new technology and associated knowledge spillovers). This paper has some excellent suggestions for geographers, as does Mike Batty’s book.

If you don’t like assemblage as an image, think in terms of layering: moving between high-level and street-level perspectives, and between structured, quantitative methods and qualitative techniques, drawing on different literatures as need be. Or if you’re more of a geographer, use Ian Gordon’s object-orientated strategy: rotate the city through different angles; zoom in on particular groups and processes; pull out to cover different time periods and events. Again, this is multi-discipline as well as multi-method. And it’s this almost cinematic approach that probably comes closest to capturing the real complexities of urban life.

Many thanks to Dani Arribas-Bel and Diane Coyle for helpful comments. Apologies to Rebecca Solnit for co-opting her book title. Updated November 2021 to include more detail (text, graphics) on Gibrat, Zipf and superlinearity.



Max Nathan

UCL & CEP. Co-founder @centreforcities & @whatworksgrowth. Urbanism, economics, innovation, migration and public policy. My views. I’m at