Abstract
Efficient laparoscopic scene segmentation holds significant potential for surgical assistive intelligence and image-guided task autonomy in robotic surgery. However, the abdominal cavity, with its intricate tissues and surgical tools under varying conditions, makes it difficult to balance segmentation accuracy and efficiency. To address this problem, we propose PLDKD-Net, a novel pixel-level student-teacher knowledge distillation (KD) framework in which the student model selectively distills the teacher's knowledge while exploring rich visual features with a graph-based fusion mechanism for efficient segmentation. Specifically, we first introduce our confidence-based KD (Confi-KD) scheme, in which a pixel-level confidence generator (PCG) assesses the teacher's performance by discriminatively evaluating its probability map against the raw image, generating a confidence map that enables selective KD for the student model. To balance the model's accuracy and efficiency, we devise a novel heterogeneous student architecture with a bi-stream visual parsing pipeline that captures multi-scale and inter-spatial visual features. These features are then fused using a relational graph convolutional network (RGCN), which adaptively tunes the fusion degrees of multi-latent knowledge, ensuring complete visual parsing while avoiding computational redundancy. We extensively validate PLDKD-Net on two public laparoscopic benchmarks, EndoVis18 and CholecSeg8K, and on in-house surgical videos. The experimental results demonstrate superior quantitative and qualitative performance compared to state-of-the-art methods. With the selective KD mechanism, our model achieves performance competitive with, or even higher than, that of the cumbersome teacher model while running at quasi-real-time efficiency, which demonstrates its potential for intelligent robotic surgical scene understanding.
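The abstract does not state the distillation loss itself, so the following is only a minimal sketch of what a confidence-weighted, pixel-level KD term could look like. It assumes a per-pixel KL divergence between teacher and student class distributions, weighted by a confidence map and normalized by the total confidence. The KL form, the normalization, and the names `softmax` and `confidence_weighted_kd_loss` are illustrative assumptions, not the authors' formulation. In the paper the confidence map comes from the PCG; here it is simply passed in as an array.

```python
import numpy as np


def softmax(logits, axis=0):
    """Numerically stable softmax along the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def confidence_weighted_kd_loss(student_logits, teacher_logits, confidence, eps=1e-8):
    """Hypothetical confidence-weighted pixel-level KD loss.

    student_logits, teacher_logits: arrays of shape (C, H, W).
    confidence: array of shape (H, W) with values in [0, 1]; assumed to be
        produced elsewhere (for example by a confidence generator).
    Returns a scalar: the confidence-weighted mean of per-pixel KL(teacher || student).
    """
    p_teacher = softmax(teacher_logits, axis=0)
    p_student = softmax(student_logits, axis=0)

    # Per-pixel KL divergence, summed over the class axis -> shape (H, W).
    kl = (p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps))).sum(axis=0)

    # Pixels with low confidence contribute little to the distillation signal.
    return float((confidence * kl).sum() / (confidence.sum() + eps))
```

Under these assumptions, pixels where the teacher is judged unreliable (confidence near 0) are effectively excluded from distillation, which is one simple way to realize the "selective" behaviour the abstract describes.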
| Original language | English |
|---|---|
| Journal | IEEE Transactions on Instrumentation and Measurement |
| State | Accepted/In press - 2025 |
| Externally published | Yes |
Keywords
- Endoscopic segmentation
- graph neural network
- knowledge distillation
- laparoscopic vision
Fingerprint
Dive into the research topics of 'PLDKD-Net: Pixel-Level Discriminative Knowledge Distillation for Surgical Scene Segmentation with Graph-based Visual Parsing'. Together they form a unique fingerprint.