A typology of distance-based measures of spatial concentration
Introduction
In the article on “spatial economics” in the New Palgrave Dictionary of Economics, Gilles Duranton wrote “On the empirical front, a first key challenge is to develop new tools for spatial analysis. With very detailed data becoming available, new tools are needed. Ideally, all the data work should be done in continuous space to avoid border biases and arbitrary spatial units.” (Duranton, 2008). In recent years, economists have made considerable efforts in that direction. The measurement of the spatial concentration of activities is certainly one of the most striking examples: it has been considerably renewed in the last decade by the development of distance-based methods (Combes et al., 2008). The motivation for distance-based methods can be summarized briefly. Economists traditionally employ disproportionality methods (terminology used by Bickenbach and Bode, 2008), which rely on a discrete definition of space. Under this approach, the territory being analyzed is divided into several mutually exclusive zones (e.g. a country is divided into regions) and the spatial concentration of activities is evaluated at a given level of observation with, for example, the Gini (1912) index, the Ellison and Glaeser (1997) index or the entropy indices of overall localization (Cutrini, 2009). However, the issues arising from discrete spaces are now well known and are linked to the Modifiable Areal Unit Problem (MAUP; Openshaw and Taylor, 1979; Arbia, 1989): both the position of the zoning boundaries and the level of observation affect the results (Briant et al., 2010). A first attempt to limit the MAUP's effects is to combine discrete measures with autocorrelation measures. The motivation is the following: the spatial concentration results provided by discrete measures are not affected by a permutation of zones, that is, they ignore where zones lie relative to one another (see Arbia, 2001b, for an illustrative example).
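The permutation argument can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a plain Gini index of zone counts (not the reference-weighted variants often used in economic geography), and the eight zones and their plant counts are hypothetical.

```python
import numpy as np

def gini(values):
    """Plain Gini index of non-negative zone counts (0 means equal shares)."""
    x = np.sort(np.asarray(values, dtype=float))  # sorting discards zone positions
    n = x.size
    ranks = np.arange(1, n + 1)
    return float(np.sum((2 * ranks - n - 1) * x) / (n * np.sum(x)))

# Hypothetical plant counts in eight zones laid out along a line.
clustered = np.array([0, 0, 0, 40, 40, 0, 0, 0])  # the two busy zones are adjacent
scattered = np.array([40, 0, 0, 0, 0, 0, 0, 40])  # same counts, zones permuted

print(gini(clustered), gini(scattered))  # the two values are identical
```

Because the index depends only on the sorted counts, it cannot distinguish the two layouts, which is why a measure of similarity between zones carries additional information.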
As autocorrelation measures evaluate the degree of similarity between zones, they can complement the spatial concentration estimates (Guillain and Le Gallo, 2010). Some authors also try to correct aspatial concentration results to some extent by integrating the degree of autocorrelation into the spatial concentration indices (Guimarães et al., 2011). This approach can be of interest if data is only available at the aggregated level of the zone. A second line of research has undoubtedly been explored more over the last decade. This second approach does not merely limit the effects of the MAUP but solves it: the basic idea is to remove any zoning of space. New spatial concentration indices are needed to take geography into account more effectively (Marcon and Puech, 2003). This has encouraged the development of distance-based methods, which are continuous functions of space. Distance-based measures provide information about concentration at all scales simultaneously and do not rely on zoning. They therefore use individual rather than aggregated data. The seminal work by Ripley (1976, 1977) introduced the best known of the existing distance-based methods: the K function. It was quickly taken up by field scientists in ecology (see the handbooks by Diggle, 1983, and Cressie, 1993, for instance), but its use remained incidental in economics (Arbia, 1989; Arbia and Espa, 1996; Barff, 1987; Feser and Sweeney, 2000; Sweeney and Feser, 1998) until the works of Marcon and Puech (2003, 2010) and of Duranton and Overman (2002), who introduced an alternative approach.
In this paper, we propose a typology of distance-based methods. There are two main reasons behind our work. First, a great variety of distance-based methods are used by economists today. This varied toolbox may confuse economists who are interested in testing a hypothesis rather than a methodology, so a survey of the state of the art may be helpful. Second, we provide a unified theoretical framework by showing that all distance-based methods rely on counting the number of neighbors of points, normalizing this number by space or by another number of neighbors, averaging the results in the appropriate way and finally normalizing the result. Monte-Carlo simulations of the null hypothesis allow the data to be tested against it and can also solve remaining issues. Under this framework, if objects (for example plants) attract each other, more neighbors (other plants) will be found around them on average than if they were distributed randomly and independently. In short, these methods are variations on the same framework to gauge spatial concentration. The typology can therefore help readers choose the appropriate distance-based tool to answer their question.
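The counting, normalizing and testing steps described above can be sketched in a few lines. This is a minimal illustration, not any of the estimators reviewed in the paper: it assumes a unit-square study area, ignores edge effects, takes complete spatial randomness as the null hypothesis, and all function names (`mean_neighbors`, `k_hat`, `csr_envelope`) and the simulated plants are hypothetical.

```python
import numpy as np

def mean_neighbors(points, r):
    """Average number of other points lying within distance r of each point."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    return (dist <= r).sum(axis=1).mean()

def k_hat(points, r, area=1.0):
    """Naive K-type statistic: mean neighbor count divided by point density."""
    density = (len(points) - 1) / area
    return mean_neighbors(points, r) / density

def csr_envelope(n, r, n_sim=200, level=0.05, seed=0):
    """Monte-Carlo quantile band of k_hat for n uniform points in the unit square."""
    rng = np.random.default_rng(seed)
    sims = np.array([k_hat(rng.random((n, 2)), r) for _ in range(n_sim)])
    return np.quantile(sims, [level / 2, 1 - level / 2])

# Hypothetical clustered pattern: 100 plants scattered tightly around 5 centers.
rng = np.random.default_rng(42)
centers = rng.random((5, 2))
plants = centers[rng.integers(0, 5, 100)] + rng.normal(0, 0.02, (100, 2))
plants = np.clip(plants, 0, 1)

r = 0.05
low, high = csr_envelope(len(plants), r)
observed = k_hat(plants, r)
print(observed, (low, high), np.pi * r ** 2)  # last value: benchmark without edge effects
```

An observed value above the upper bound of the simulated band indicates more neighbors at distance r than random, independent locations would produce. Replacing the density in `k_hat` with a count of neighbors of another type would give a relative rather than a topographic benchmark.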
The paper is organized as follows. In the first part, we briefly present the common framework and basic vocabulary. Then, all the available distance-based measures are introduced. The third part builds a typology of these methods, showing that they follow the same pattern but differ because they rest on different theoretical choices. The last part discusses each tool's properties and its relevance for addressing economic questions.
Basic principles
Before presenting distance-based measures in detail, we give a general overview of the framework these functions share.
When studying the location of activities, economists document the spatial distribution of one kind of entity (points), for example shops with a given activity. Their aim is to detect phenomena of attraction (also called aggregation, agglomeration, localization),
The g function
The second-order property of a point pattern characterizes the relation between points: attraction, repulsion or independence. It is defined as the ratio between the joint probability of finding two points in two places x and y and the product of the probabilities of finding each of them. For practical purposes, this property is assumed to depend only on the distance between the points (as it does not change with direction, the point process is said to be isotropic). A
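The ratio described in this section can be written compactly. The notation below is an assumption borrowed from the standard point-process literature, since the symbols used in the source are not recoverable here: λ(x) stands for the density of points at x and λ₂(x, y) for the joint density of finding points at both x and y.

```latex
% Assumed notation: \lambda(x) is the density of points at x,
% \lambda_2(x, y) the joint (second-order) density at x and y.
g(x, y) = \frac{\lambda_2(x, y)}{\lambda(x)\,\lambda(y)},
\qquad
g(r) = g(x, y) \ \text{whenever} \ \lVert x - y \rVert = r .
% g(r) = 1: independence; g(r) > 1: attraction; g(r) < 1: repulsion.
```

With this convention, values of g(r) above 1 correspond to attraction at distance r, values below 1 to repulsion, and a value of 1 to independence.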
A typology of distance-based methods
In what follows, we shall show that all of these functions can be built empirically following the same five steps. First, neighbors are counted around each point at or within a distance r; sometimes weights are summed instead. Second, an average number of neighbors is calculated. Third, this average is divided by a local reference z(r). In accordance with the typology of Brülhart and Traeger (2005), we shall use the following vocabulary:
- Topographic measures use space as their benchmark: the
Discussion
The aim of the previous section was to propose a common framework for understanding the construction of the most popular distance-based methods. In this section, we discuss how well those functions address economic questions.
Conclusion
A decade ago, disproportionality methods such as the Gini or Ellison and Glaeser indices were classical tools for economists. Quite logically, methods were then developed to take advantage of the knowledge of the exact position of objects and to solve issues linked to the Modifiable Areal Unit Problem (Openshaw and Taylor, 1979). The first were statistics based on the distance to the nearest neighbor of points, after Clark and Evans (1954). They have been superseded by the distance-based measures of
Acknowledgements
We thank the editor, two anonymous referees and participants at the 61st Congress of the French Economic Association (Paris), Hotelling Seminar (Université de Paris-Sud / ENS Cachan) and the 12th International Workshop Spatial Econometrics and Statistics (Orléans, France). The second author gratefully acknowledges financial support from the LET (Université de Lyon, CNRS, ENTPE), IUT de Sceaux and AAP Attractivité 2014 (Université de Paris-Sud). This work has benefited from an “Investissement
References (101)
- et al. The global agglomeration of multinational firms. J. Int. Econ. (2014)
- et al. Clusters of firms in an inhomogeneous space: the high-tech industries in Milan. Econ. Model. (2012)
- Modelling the geography of economic activities on a continuous space. Papers Reg. Sci. (2001)
- et al. Location patterns of service industries in France: a distance-based approach. Reg. Sci. Urban Econ. (2013)
- et al. An anatomy of the geographical concentration of Canadian manufacturing industries. Reg. Sci. Urban Econ. (2015)
- et al. Dots to boxes: do the size and shape of spatial units jeopardize economic geography estimations? J. Urban Econ. (2010)
- Using entropy measures to disentangle regional from national localization patterns. Reg. Sci. Urban Econ. (2009)
- et al. On the use of Ripley's K-function and its derivatives to analyze domain size. Biophys. J. (2009)
- et al. Measuring economic localization: Evidence from Japanese firm-level data. J. Japanese Int. Econ. (2012)
- et al. A class of spatial econometric methods in the empirical analysis of clusters of firms in the space. Empir. Econ. (2008)
- Spatial Data Configuration in Statistical Analysis of Regional Economic and Related Problems
- The role of spatial effects in the empirical analysis of regional concentration. J. Geogr. Syst.
- Statistica Economica Territoriale
- Spatstat: an R package for analyzing spatial point patterns. J. Stat. Softw.
- Non- and semi-parametric estimation of interaction in inhomogeneous point patterns. Stat. Neerl.
- Industrial clustering and the organization of production: a point pattern analysis of manufacturing in Cincinnati, Ohio. Annal. Assoc. Am. Geogr.
- Fitzgerald: a return to the neighborhood and its contemporary structural and geographical contexts. Prof. Geogr.
- Comments on Ripley's paper. J. Royal Stat. Soc.
- Disproportionality measures of concentration, specialization, and localization. Int. Reg. Sci. Rev.
- Exploring and modeling fire department emergencies with a spatio-temporal marked point process. Case Stud. Bus. Indus. Govern. Stat.
- Measuring and testing spatial mass concentration with micro-geographic data. Spatial Econ. Anal.
- An account of geographic concentration patterns in Europe. Reg. Sci. Urban Econ.
- Productivity and the density of economic activity. Am. Econ. Rev.
- Distance to nearest neighbor as a measure of spatial relationships in populations. Ecology
- Transport costs: measures, determinants, and regional policy implications for France. J. Econ. Geogr.
- Economic Geography: The Integration of Regions and Nations
- Statistics for Spatial Data
- Measuring segregation at the micro level: an application of the M measure to multi-ethnic residential neighbourhoods in Amsterdam. Tijdschr. voor Econ. Soc. Geogr.
- Second-order analysis of spatial clustering for inhomogeneous populations. Biometrics
- Second-order analysis of inhomogeneous spatial point processes using case-control data. Biometrics
- Statistical Analysis of Spatial Point Patterns
- A kernel method for smoothing point process data. Applied Statistics
- SPLANCS: spatial point pattern analysis code in S-plus. Comput. Geosci.
- Testing for localization using micro-geographic data. Rev. Econ. Stud.
- Exploring the detailed location patterns of UK manufacturing industries using microgeographic data. J. Reg. Sci.
- Geographic concentration in U.S. manufacturing industries: a dartboard approach. J. Political Econ.
- What causes industry agglomeration? Evidence from coagglomeration patterns. Am. Econ. Rev.
- A grid-based method for sampling and analysing spatially ambiguous plants. J. Veg. Sci.
- A tool for the quantitative spatial analysis of complex cellular systems. IEEE Transac. Image Process.
- A test for the coincident economic and spatial clustering of business enterprises. J. Geogr. Syst.
- The Logic of British and American Industry: A Realistic Analysis of Economic Structure and Government
- Issues in the measurement of localization. Environ. Plan. A