Mapping the evolution of cities

A central component of our model of urban evolution is Signals. Signals are representations or messages that convey information about the forms, groups, and activities that characterize a place. In the contemporary city, social media review sites are increasingly important Signals. They mediate the information a potential user receives about a place, and in turn potentially shape their ideas about the sort of activities, people, and forms they can expect to encounter. 

In this study, we develop methodologies for “reading the city” through the signals conveyed by Yelp reviews. We show that Yelp reviews provide a window onto the collective representations of a city and its neighbourhoods. The methodology enables us to identify stability and change as well as convergence and divergence among neighbourhood representations. This technique can form the basis for a kind of sociological observatory that tracks trends not only in business activity but also in the latent collective meaning of places.

The abstract is below, along with some key conclusions. The paper is published here, and a pre-print can be freely accessed here.

Abstract: This paper develops novel methods for using Yelp reviews as a window into the collective representations of a city and its neighbourhoods. Basing analysis on social media data such as Yelp is a challenging task because review data is highly sparse and direct analysis may fail to uncover hidden trends. To this end, we propose a deep autoencoder approach for embedding the language of neighbourhood-based business reviews into a reduced dimensional space that facilitates similarity comparison of neighbourhoods and their change over time. Our model improves performance in distinguishing real and fake neighbourhood descriptions derived from real reviews, increasing performance in the task from an average accuracy of 0.46 to 0.77. This improvement in performance indicates that this novel application of embedded language analysis permits us to uncover comparative trends in neighbourhood change through the lens of their venues’ reviews, providing a computational methodology for reading a city through its neighbourhoods. The resulting toolkit makes it possible to examine a city’s current sociological trends in terms of its neighbourhoods’ collective identities.
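To make the embedding idea in the abstract more concrete, here is a minimal sketch, not the paper's implementation. It assumes each neighbourhood is represented by a small vector of term frequencies (the matrix `X` below is invented toy data), and it uses a shallow one-hidden-layer autoencoder as a stand-in for the deep autoencoder the paper describes. The names `train_autoencoder` and `cosine_similarity_matrix` are hypothetical helpers introduced only for illustration.

```python
import numpy as np

# Hypothetical toy data: each row stands for one neighbourhood, each column
# for the relative frequency of one term in its reviews. Real review data
# would be far larger and much sparser.
X = np.array([
    [0.9, 0.8, 0.0, 0.1, 0.0, 0.0],
    [0.8, 0.9, 0.1, 0.0, 0.0, 0.1],
    [0.0, 0.1, 0.9, 0.8, 0.0, 0.0],
    [0.1, 0.0, 0.8, 0.9, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.0, 0.9, 0.8],
    [0.0, 0.1, 0.0, 0.1, 0.8, 0.9],
])


def train_autoencoder(X, hidden_dim=2, epochs=500, lr=0.5, seed=0):
    """Fit a one-hidden-layer autoencoder (tanh encoder, linear decoder)
    with full-batch gradient descent on mean squared reconstruction error.
    Returns an encode function and the per-epoch loss history."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden_dim))
    b1 = np.zeros(hidden_dim)
    W2 = rng.normal(0.0, 0.1, (hidden_dim, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)        # low-dimensional code
        X_hat = H @ W2 + b2             # reconstruction
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared error.
        g_out = 2.0 * err / err.size
        g_W2 = H.T @ g_out
        g_b2 = g_out.sum(axis=0)
        g_Z = (g_out @ W2.T) * (1.0 - H ** 2)
        g_W1 = X.T @ g_Z
        g_b1 = g_Z.sum(axis=0)
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2

    def encode(V):
        return np.tanh(V @ W1 + b1)

    return encode, losses


def cosine_similarity_matrix(Z):
    """Pairwise cosine similarity between the rows of Z."""
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    Zn = Z / np.clip(norms, 1e-12, None)
    return Zn @ Zn.T


encode, losses = train_autoencoder(X)
Z = encode(X)                            # one 2-d embedding per neighbourhood
print(np.round(cosine_similarity_matrix(Z), 2))
```

Comparing neighbourhoods in the embedded space `Z` rather than on the raw term matrix is the design choice the abstract motivates: with sparse review data, two neighbourhoods may share few exact words while still occupying nearby positions in the reduced space.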

This ability to identify areas with coherent and stable meaning on the basis of review texts is a central contribution of this paper. Most research uses qualitative and survey methods to identify shared local meaning, often suggesting linkages between neighbourhood identity, solidarity, and local advocacy. Our techniques capture similar information about identity and offer a platform for pursuing such linkages, but in a way that is scalable, reliable, and more systematically comparable.

An additional benefit of our approach concerns the fact that even the most stable neighbourhoods do have vocabulary change, despite minimal semantic change. This is a key feature of the embedding space: the vocabulary itself is discarded, and meaning is retained. However, as a result of this, it is critical to note that individual words may easily change year-on-year, even in the most stable neighbourhood. The collective meaning of an area transcends changes in specific words, such that a neighbourhood like the Annex is consistently recognized as having a similar meaning within the discursive space of the city, even if specific words used to describe it change. This is an example of how the computational analysis pursued here carries forward key ideas from cultural research, namely about the holistic and relational character of meaning….
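The contrast between vocabulary change and semantic stability can be illustrated with two simple measures. This is a sketch under stated assumptions, not the paper's procedure: the word sets and embedding vectors below are invented placeholders for one neighbourhood in two consecutive years, and `jaccard` and `cosine` are hypothetical helper names.

```python
import math


def jaccard(a, b):
    """Vocabulary overlap: shared distinct words divided by all distinct words."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cosine(u, v):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical inputs: prominent review words for one neighbourhood in two
# consecutive years, and made-up embeddings of the same two years.
words_year1 = {"bookstore", "patio", "students", "pub", "vintage"}
words_year2 = {"cafe", "terrace", "students", "bar", "thrift"}
vec_year1 = [0.62, 0.10, 0.75]
vec_year2 = [0.60, 0.14, 0.77]

print("vocabulary overlap:", jaccard(words_year1, words_year2))
print("embedding similarity:", cosine(vec_year1, vec_year2))
```

In this toy case the word sets barely overlap while the embeddings point in nearly the same direction, which is the pattern the passage describes for a stable neighbourhood.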

The high degree of accuracy that the classifier was able to obtain in the low dimensional space demonstrates the validity of using this review space to examine neighbourhoods. It is important to emphasize that the ‘fake’ neighbourhoods are indeed real, contiguous sections of the city, meaning that the model was able to distinguish at a high conceptual level the differences between these two sets. Reviews reliably map onto the real neighbourhoods, and the official neighbourhoods exhibit coherent meanings….

…These techniques might be considered a form of sociological observatory to complement recent uses of Yelp data for economic monitoring. The economic version uses social media data to identify patterns and trends in business activity before or beyond what official government data permit. Our sociological version uses similar data to identify patterns and trends in collective representations and identities. As a result of this, it expands the range of approaches available to researchers, allowing for more nuanced and effective research of neighbourhood change.