LibCity-Dataset: a standardized and comprehensive dataset for urban spatial-temporal data mining

  • Jingyuan Wang*
  • Wenjun Jiang
  • Jiawei Jiang

  * Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

The LibCity-Dataset represents a significant contribution to the field of urban spatial-temporal data mining. This dataset uniquely integrates macro traffic state data with micro trajectory data, providing researchers with comprehensive and diverse urban spatial-temporal data. Specifically, we begin by collecting and processing existing open-source spatial-temporal data. Subsequently, we independently collected Beijing taxi trajectory data through third-party interfaces. This data bridges the gap in the scarcity of current open-source vehicle trajectory data. The distinctive aspect of the LibCity-Dataset lies in its innovative approach of standardizing the storage format, achieved through the implementation of atomic files. By adopting this standardized format, diverse data sources are harmonized, enabling effortless application of spatial-temporal prediction models across various datasets. The uniform storage format not only simplifies experimentation but also expedites the advancement of spatial-temporal prediction research, acting as a catalyst for further innovation. This Data Note provides a comprehensive insight into the creation methodology of the LibCity-Dataset, including data collection and processing methodology, data description, data validation, and usage notes. By facilitating open-source collaboration and setting a benchmark for standardization within the spatial-temporal prediction domain, this dataset aims to foster increased research cooperation and knowledge sharing.
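As a rough illustration of why a uniform storage format makes it easier to apply one model across datasets, the sketch below pivots a small long-format table of traffic observations into a time-by-entity matrix. The abstract does not specify the atomic-file schema, so the column names (`dyna_id`, `type`, `time`, `entity_id`, `traffic_speed`), the sample values, and the helper `load_state_matrix` are all assumptions made for this example and are not taken from the dataset itself.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in an assumed long-format layout: one row per
# (time, entity) observation. Column names and values are illustrative only.
SAMPLE_DYNA = """dyna_id,type,time,entity_id,traffic_speed
0,state,2023-01-01T00:00:00Z,1,61.5
1,state,2023-01-01T00:00:00Z,2,58.0
2,state,2023-01-01T00:05:00Z,1,60.2
3,state,2023-01-01T00:05:00Z,2,57.1
"""


def load_state_matrix(text, value_column):
    """Pivot long-format rows into (timestamps, entity_ids, matrix).

    matrix[i][j] holds the value for timestamps[i] and entity_ids[j],
    or None when that observation is missing.
    """
    by_time = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(text)):
        by_time[row["time"]][row["entity_id"]] = float(row[value_column])

    # ISO 8601 timestamps in a single time zone sort correctly as strings.
    timestamps = sorted(by_time)
    entity_ids = sorted({e for obs in by_time.values() for e in obs})
    matrix = [[by_time[t].get(e) for e in entity_ids] for t in timestamps]
    return timestamps, entity_ids, matrix


timestamps, entity_ids, matrix = load_state_matrix(SAMPLE_DYNA, "traffic_speed")
```

Under these assumptions, any dataset stored with the same columns could be loaded by the same function, changing only the `value_column` argument.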

Original language: English
Article number: liad021
Journal: Intelligent Transportation Infrastructure
Volume: 2
State: Published - 2023

Keywords

  • Open Source Dataset
  • Spatial-temporal Data
  • Traffic State Data
  • Trajectory Data
