Biochemical oxygen demand estimation using explainable ensemble learning methods

Research output: Contribution to journal › Article › peer-review

Abstract

Biochemical oxygen demand over five days (BOD5) is a cornerstone indicator of organic pollution, yet its retrieval from remote sensing is hindered by its non-optically active nature. We present an explainable ensemble-learning framework that predicts BOD5 in Hong Kong's marine waters by fusing multi-year (2019–2023) Sentinel-2 imagery with cyclic temporal features and four physicochemical and climatic proxies: chlorophyll-a (Chl-a), salinity, suspended solids (SS) and temperature. Each proxy is first estimated and then used to predict BOD5, using CatBoost, LightGBM, XGBoost and Random Forest. XGBoost best captures Chl-a (r = 0.81) and temperature (r = 0.99), whereas CatBoost excels for salinity (r = 0.93), SS (r = 0.85) and ultimately BOD5 (r = 0.88). SHapley Additive exPlanations (SHAP) reveal the dominant predictors, and spatio-temporal mapping across four representative dates shows persistently elevated Chl-a, SS and BOD5 and depressed salinity in the eutrophic Deep Bay zone. This transparent, high-accuracy framework can guide the Environmental Protection Department in prioritizing field sampling and streamlining pollution mitigation.
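The abstract mentions cyclic temporal features without saying how they are built. One common approach, shown here only as a minimal sketch and not as the authors' method, maps the day of the year onto a sine/cosine pair so that late December and early January end up close together. The function names `cyclic_day_of_year` and `cyclic_distance` are hypothetical, and the choice of day of year as the temporal variable is an assumption.

```python
import math
from datetime import date
from typing import Tuple


def cyclic_day_of_year(d: date) -> Tuple[float, float]:
    """Encode a date's position in the year as a (sin, cos) pair.

    Assumption: day of year is the temporal variable; the abstract
    does not state which temporal features were actually used.
    """
    # Use the real length of the year so leap years wrap correctly.
    days_in_year = (date(d.year + 1, 1, 1) - date(d.year, 1, 1)).days
    # Zero-based day index, so 1 January maps to angle 0.
    day_index = d.timetuple().tm_yday - 1
    angle = 2.0 * math.pi * day_index / days_in_year
    return math.sin(angle), math.cos(angle)


def cyclic_distance(a: date, b: date) -> float:
    """Euclidean distance between two dates in the encoded space."""
    return math.dist(cyclic_day_of_year(a), cyclic_day_of_year(b))


if __name__ == "__main__":
    # Adjacent days across the year boundary stay close together...
    print(cyclic_distance(date(2022, 12, 31), date(2023, 1, 1)))
    # ...while dates half a year apart sit on opposite sides of the circle.
    print(cyclic_distance(date(2023, 1, 1), date(2023, 7, 1)))
```

With a raw day-of-year number, 31 December (365) and 1 January (1) would look maximally different to a tree ensemble; the two-column encoding removes that artificial jump at the cost of one extra feature.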

Original language: English
Article number: 101835
Journal: Remote Sensing Applications: Society and Environment
Volume: 41
State: Published - Jan 2026

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 14 - Life Below Water

Keywords

  • Biochemical oxygen demand
  • Ensemble machine learning
  • Explainable artificial intelligence
  • Hong Kong marine water quality
  • Sentinel-2
