Building a longitudinal geospatial dataset of micro-businesses in Mexico City
Author(s)
Farah, Irene
Kelling, Claire
Leng, Xinyi
Zhao, Yihan
Issue Date
2026-06-10
Keyword(s)
geospatial data
DENUE
point pattern
food retail
Date of Ingest
2026-06-11T09:25:16-05:00
Geographic Coverage
Mexico
Abstract
We introduce a methodology and open-source code for constructing a longitudinal dataset from georeferenced business data in Mexico’s National Statistical Directory of Economic Units (DENUE). The data include each business’s coordinates and other geographic characteristics, but imprecise coordinates across years, the lack of business IDs before 2015, and concerns about these recently introduced IDs make longitudinal analysis of business continuity difficult. Focusing on food businesses in Mexico City, we analyze data from 2010 (n = 416,898) and 2020 (n = 470,363) to build a longitudinal dataset that tracks businesses over time. We address key issues, including imprecise geographic coordinates, missing or incorrect values in critical variables, and discrepancies in business names across years. Our approach combines spatial and string-matching techniques, achieving an 84% F1 score (a measure that balances precision and recall) when validated against official identifiers. Although our case study centers on food businesses, the methodology can be adapted to other industries, geographies, and similar snapshot-based datasets lacking unique identifiers, serving as a replicable tool for broader urban analytics. The dataset and code are publicly available on GitHub, providing researchers and practitioners with valuable resources to analyze economic and spatial dynamics in cities.
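To make the idea of combining spatial and string matching concrete, here is a minimal Python sketch. It is not the authors' published code. The record layout (`key`, `name`, `lat`, `lon`), the function names, the 50 m distance cutoff, the 0.8 name-similarity cutoff, and the greedy one-to-one assignment are all assumptions chosen for illustration, and the standard library's `difflib` stands in for whatever string metric the actual pipeline uses.

```python
import math
from difflib import SequenceMatcher


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def name_similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1] between two business names."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_snapshots(old, new, max_dist_m=50.0, min_sim=0.8):
    """Greedily link each record in `old` to at most one record in `new`.

    A candidate must lie within `max_dist_m` meters and have a name similarity
    of at least `min_sim`; among candidates, the most similar name wins.
    Both thresholds are hypothetical defaults, not values from the paper.
    """
    matches = {}
    used = set()
    for o in old:
        best, best_sim = None, min_sim
        for n in new:
            if n["key"] in used:
                continue
            if haversine_m(o["lat"], o["lon"], n["lat"], n["lon"]) > max_dist_m:
                continue
            sim = name_similarity(o["name"], n["name"])
            if sim >= best_sim:
                best, best_sim = n, sim
        if best is not None:
            matches[o["key"]] = best["key"]
            used.add(best["key"])
    return matches


def f1_score(predicted, truth):
    """F1 of predicted links against ground-truth links (dicts of old key to new key)."""
    tp = sum(1 for k, v in predicted.items() if truth.get(k) == v)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

The nested loop compares every pair of records, which is only workable for small samples. At the scale of the snapshots described above (hundreds of thousands of records each), a spatial index or grid-based blocking step would be needed before the name comparison.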