Second-order destination inference using semi-supervised self-training for entry-only passenger data

  • Carnegie Mellon University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Automated data collection in urban transportation systems produces a large volume of passenger data. However, much of this data is incomplete, which limits insight into passenger mobility. A common problem is that entry-only passenger data lacks destination information. Traditional approaches for estimating passenger destinations rely on heuristics that can recover only some of the missing destinations. To deal with the remaining incomplete data, this paper proposes, for the first time, a second-order inference methodology that uses semi-supervised self-training to infer the missing destinations. The methodology involves the design of a base learner that predicts the missing destinations from the statistics of a selected similarity-based “training set”, and the design of a selection strategy that adds new data with high prediction confidence to the training set. To further improve the inference, we incorporate personal history priors into the base learner. We evaluate our designs using two data sources: a real-data-inspired simulation of traffic and passenger behavior in the city of Porto, Portugal, and real bus Automated Fare Collection (AFC) data collected from the same city. The experimental results show that, compared to baseline methods that do not use self-training, our approach significantly improves inference performance and achieves high accuracy.
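
The self-training loop described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the trip features (boarding stop, boarding hour), the `similar` rule, the majority-vote base learner, and the `threshold` value are all assumptions chosen for illustration, and the personal history priors are not modeled. It only shows the general pattern of predicting from similar labeled trips, accepting confident predictions, and growing the training set.

```python
from collections import Counter


def similar(a, b, max_hour_gap=1):
    """Hypothetical similarity rule: same boarding stop, boarding hours close."""
    return a[0] == b[0] and abs(a[1] - b[1]) <= max_hour_gap


def predict(trip, training):
    """Base learner: most frequent destination among similar labeled trips.

    Returns (destination, confidence); confidence is the winning vote share.
    """
    votes = Counter(dest for features, dest in training if similar(trip, features))
    if not votes:
        return None, 0.0
    dest, n = votes.most_common(1)[0]
    return dest, n / sum(votes.values())


def self_train(labeled, unlabeled, threshold=0.75, max_rounds=10):
    """Iteratively move confident predictions into the training set."""
    training = list(labeled)
    pending = dict(enumerate(unlabeled))  # index -> trip still without a destination
    inferred = {}
    for _ in range(max_rounds):
        # Selection strategy: keep only predictions at or above the threshold.
        accepted = {}
        for idx, trip in pending.items():
            dest, confidence = predict(trip, training)
            if dest is not None and confidence >= threshold:
                accepted[idx] = dest
        if not accepted:
            break  # nothing new to learn from; stop early
        for idx, dest in accepted.items():
            training.append((pending.pop(idx), dest))
            inferred[idx] = dest
    return inferred


# Toy data: ((boarding_stop, boarding_hour), destination)
labeled = [(("A", 8), "X"), (("A", 8), "X"), (("B", 17), "Y"), (("B", 17), "Z")]
unlabeled = [("A", 9), ("A", 10), ("B", 17), ("C", 8)]
print(self_train(labeled, unlabeled))  # {0: 'X', 1: 'X'}
```

In this toy run the trip `("A", 10)` has no similar labeled trip at first and is only resolved in the second round, after `("A", 9)` has been added to the training set. The trip at stop `B` stays unlabeled because its two candidate destinations tie, and the trip at stop `C` has no similar trips at all.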

Original language: English
Title of host publication: BDCAT 2017 - Proceedings of the 4th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies
Publisher: Association for Computing Machinery, Inc
Pages: 255-264
Number of pages: 10
ISBN (Electronic): 9781450355490
State: Published - 5 Dec 2017
Externally published: Yes
Event: 4th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2017 - Austin, United States
Duration: 5 Dec 2017 to 8 Dec 2017

Publication series

Name: BDCAT 2017 - Proceedings of the 4th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies

Conference

Conference: 4th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2017
Country/Territory: United States
City: Austin
Period: 5/12/17 to 8/12/17

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 11 - Sustainable Cities and Communities

Keywords

  • Inference
  • Self-training
  • Semi-supervised learning
  • Transport
