TY - JOUR
T1 - Fast Geographically Weighted Regression (FastGWR)
T2 - a scalable algorithm to investigate spatial process heterogeneity in millions of observations
AU - Li, Ziqi
AU - Fotheringham, Stewart
AU - Li, WenWen
AU - Oshan, Taylor
N1 - Funding Information:
This work was supported by the National Science Foundation [1455349,1758786].
Publisher Copyright:
© 2018 Informa UK Limited, trading as Taylor & Francis Group.
PY - 2019/1/2
Y1 - 2019/1/2
N2 - Geographically Weighted Regression (GWR) is a widely used tool for exploring spatial heterogeneity of processes over geographic space. GWR computes location-specific parameter estimates, which makes its calibration process computationally intensive. The maximum number of data points that can be handled by current open-source GWR software is approximately 15,000 observations on a standard desktop. In the era of big data, this places a severe limitation on the use of GWR. To overcome this limitation, we propose a highly scalable, open-source FastGWR implementation based on Python and the Message Passing Interface (MPI) that scales to the order of millions of observations. FastGWR optimizes memory usage along with parallelization to boost performance significantly. To illustrate the performance of FastGWR, a hedonic house price model is calibrated on approximately 1.3 million single-family residential properties from a Zillow dataset for the city of Los Angeles, which is the first effort to apply GWR to a dataset of this size. The results show that FastGWR scales linearly as the number of cores within the High-Performance Computing (HPC) environment increases. It also outperforms currently available open-source GWR software packages, with drastic reductions in computation time: it runs up to thousands of times faster on a standard desktop.
KW - GWR
KW - Geographically Weighted Regression
KW - parallel computing
KW - spatial analysis
KW - statistical software
UR - http://www.scopus.com/inward/record.url?scp=85054525344&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85054525344&partnerID=8YFLogxK
U2 - 10.1080/13658816.2018.1521523
DO - 10.1080/13658816.2018.1521523
M3 - Article
AN - SCOPUS:85054525344
SN - 1365-8816
VL - 33
SP - 155
EP - 175
JO - International Journal of Geographical Information Science
JF - International Journal of Geographical Information Science
IS - 1
ER -