Abstract
Automated data collection in urban transportation systems produces a large volume of passenger data. However, much of this data is incomplete, which limits insight into passenger mobility. A very common case is entry-only passenger data, in which destination information is unavailable. Traditional approaches for estimating passenger destinations rely on heuristics that can recover only some of the missing destinations. To deal with the remaining incomplete data, this paper proposes, for the first time, a second-order inference methodology that leverages semi-supervised self-training to infer the missing destinations. The methodology involves the design of a base learner, which predicts the missing destinations from the statistics of a selected similarity-based “training set”, and the design of a selection strategy, which picks newly predicted data with high prediction confidence to update the training set. To further improve the inference, we incorporate personal history priors into the base learner. We evaluate our designs using two data sources: a real-data-inspired simulation of traffic and passenger behavior in the city of Porto, Portugal, and real bus Automated Fare Collection (AFC) data collected from the same city. The experimental results show that, compared to baseline methods that do not use self-training, our approach significantly improves the inference performance and achieves notably high accuracies.
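
The self-training loop described in the abstract (a base learner that predicts from a similarity-based training set, plus a selection step that promotes confident predictions into that set) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `(origin, period)` trip features, the `similarity` weights, the smoothed confidence score, and the `threshold` value are all assumptions chosen only to make the mechanism visible, and the personal history priors are left out.

```python
from collections import defaultdict


def similarity(a, b):
    """Hypothetical similarity between two (origin, period) trips."""
    if a[0] != b[0]:
        return 0.0          # different boarding stop: not comparable
    return 1.0 if a[1] == b[1] else 0.5


def predict(trip, training_set, smoothing=1.0):
    """Base learner (assumed form): similarity-weighted vote over labelled trips.

    Returns (destination, confidence). The smoothing term lowers the
    confidence of predictions that rest on little evidence.
    """
    votes = defaultdict(float)
    for features, destination in training_set:
        weight = similarity(trip, features)
        if weight > 0:
            votes[destination] += weight
    if not votes:
        return None, 0.0
    best = max(votes, key=votes.get)
    return best, votes[best] / (sum(votes.values()) + smoothing)


def self_train(labelled, unlabelled, threshold=0.55, max_rounds=10):
    """Repeatedly move confident predictions into the training set."""
    training_set = list(labelled)
    pending = dict(enumerate(unlabelled))
    inferred = {}
    for _ in range(max_rounds):
        # Score every pending trip against the current training set.
        accepted = {}
        for idx, trip in pending.items():
            destination, confidence = predict(trip, training_set)
            if confidence >= threshold:
                accepted[idx] = destination
        if not accepted:
            break           # nothing confident enough: stop iterating
        # Selection step: confident predictions become training data.
        for idx, destination in accepted.items():
            training_set.append((pending.pop(idx), destination))
            inferred[idx] = destination
    return inferred, sorted(pending)


# Toy data: trips whose destination is already known (e.g. from heuristics).
labelled = [(("A", "am"), "X"), (("A", "am"), "X"), (("B", "pm"), "Y")]
# Entry-only trips with no destination.
unlabelled = [("A", "am"), ("A", "pm"), ("C", "am")]

inferred, unresolved = self_train(labelled, unlabelled)
print(inferred, unresolved)
```

In this toy run the trip `("A", "pm")` is not confident enough in the first round and is accepted only after `("A", "am")` has been added to the training set, which is the effect self-training is meant to provide. The trip from stop `C` has no similar labelled trips and stays unresolved.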
| Original language | English |
|---|---|
| Title of host publication | BDCAT 2017 - Proceedings of the 4th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies |
| Publisher | Association for Computing Machinery, Inc |
| Pages | 255-264 |
| Number of pages | 10 |
| ISBN (Electronic) | 9781450355490 |
| State | Published - 5 Dec 2017 |
| Externally published | Yes |
| Event | 4th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2017 - Austin, United States; Duration: 5 Dec 2017 → 8 Dec 2017 |
Publication series
| Name | BDCAT 2017 - Proceedings of the 4th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies |
|---|---|
Conference
| Conference | 4th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2017 |
|---|---|
| Country/Territory | United States |
| City | Austin |
| Period | 5 Dec 2017 → 8 Dec 2017 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 11 Sustainable Cities and Communities
Keywords
- Inference
- Self-training
- Semi-supervised learning
- Transport
Fingerprint
Dive into the research topics of 'Second-order destination inference using semi-supervised self-training for entry-only passenger data'. Together they form a unique fingerprint.