SemiPoint: Generalizing Cross-Scene Point Cloud Video Streaming With Semi-Supervised Learning and Residual-Aware Adaptation

Ao Yu; Hui Yang; Cuiyang Feng; Petteri Nurmi; Mohamed Cheriet; Pan Hui

doi:10.1109/TMM.2026.3680432

Hong Kong University of Science and Technology
Beijing University of Posts and Telecommunications
China University of Mining & Technology, Beijing
University of Helsinki

Research output: Contribution to journal › Journal Article › peer-review

Abstract

Viewport prediction algorithms are shedding new light on point cloud video (PCV) streaming. Most existing methodologies are trained with labeled frames (supervised learning) to reduce bandwidth consumption. However, the fully supervised paradigm requires labor-intensive video labeling, struggles to generalize to unfamiliar scenes, and thus produces noisy bitrate allocation outputs. In this study, we propose SemiPoint, a cross-scene PCV streaming framework that features a semi-supervised viewport prediction module and a residual-augmented deep reinforcement learning (DRL)-based bitrate adaptation module. The viewport prediction module employs a semi-supervised architecture that enhances scene generalization by exploiting unlabeled frames through unsupervised constraints. Furthermore, the DRL-based bitrate adaptation module incorporates a residual model that dynamically corrects abrupt viewport shifts through real-time residual compensation. Extensive experimental evaluations demonstrate that SemiPoint achieves superior performance compared to fully supervised approaches with limited labeled datasets. It demonstrates enhanced generalization capabilities in changing scenes and delivers more reliable bitrate adaptation in scenarios involving sudden head/body movements.

Original language	English
Journal	IEEE Transactions on Multimedia
DOIs	https://doi.org/10.1109/TMM.2026.3680432
Publication status	In press - 2026

Keywords

bitrate adaptation
deep reinforcement learning
Point cloud video streaming
semi-supervised learning
viewport prediction

Cite this

@article{bfd5ab9c30104d76a9c043bd274479ce,
title = "SemiPoint: Generalizing Cross-Scene Point Cloud Video Streaming With Semi-Supervised Learning and Residual-Aware Adaptation",
abstract = "Viewport prediction algorithms are shedding new light on point cloud video (PCV) streaming. Most existing methodologies are trained with labeled frames (supervised learning) to reduce bandwidth consumption. However, the fully supervised paradigm requires labor-intensive video labeling, struggles to generalize to unfamiliar scenes, and thus produces noisy bitrate allocation outputs. In this study, we propose SemiPoint, a cross-scene PCV streaming framework that features a semi-supervised viewport prediction module and a residual-augmented deep reinforcement learning (DRL)-based bitrate adaptation module. The viewport prediction module employs a semi-supervised architecture that enhances scene generalization by exploiting unlabeled frames through unsupervised constraints. Furthermore, the DRL-based bitrate adaptation module incorporates a residual model that dynamically corrects abrupt viewport shifts through real-time residual compensation. Extensive experimental evaluations demonstrate that SemiPoint achieves superior performance compared to fully supervised approaches with limited labeled datasets. It demonstrates enhanced generalization capabilities in changing scenes and delivers more reliable bitrate adaptation in scenarios involving sudden head/body movements.",
keywords = "bitrate adaptation, deep reinforcement learning, Point cloud video streaming, semi-supervised learning, viewport prediction",
author = "Ao Yu and Hui Yang and Cuiyang Feng and Petteri Nurmi and Mohamed Cheriet and Pan Hui",
note = "Publisher Copyright: {\textcopyright} 1999-2012 IEEE.",
year = "2026",
doi = "10.1109/TMM.2026.3680432",
language = "English",
journal = "IEEE Transactions on Multimedia",
issn = "1520-9210",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
}

TY - JOUR
T1 - SemiPoint
T2 - Generalizing Cross-Scene Point Cloud Video Streaming With Semi-Supervised Learning and Residual-Aware Adaptation
AU - Yu, Ao
AU - Yang, Hui
AU - Feng, Cuiyang
AU - Nurmi, Petteri
AU - Cheriet, Mohamed
AU - Hui, Pan
PY - 2026
Y1 - 2026
AB - Viewport prediction algorithms are shedding new light on point cloud video (PCV) streaming. Most existing methodologies are trained with labeled frames (supervised learning) to reduce bandwidth consumption. However, the fully supervised paradigm requires labor-intensive video labeling, struggles to generalize to unfamiliar scenes, and thus produces noisy bitrate allocation outputs. In this study, we propose SemiPoint, a cross-scene PCV streaming framework that features a semi-supervised viewport prediction module and a residual-augmented deep reinforcement learning (DRL)-based bitrate adaptation module. The viewport prediction module employs a semi-supervised architecture that enhances scene generalization by exploiting unlabeled frames through unsupervised constraints. Furthermore, the DRL-based bitrate adaptation module incorporates a residual model that dynamically corrects abrupt viewport shifts through real-time residual compensation. Extensive experimental evaluations demonstrate that SemiPoint achieves superior performance compared to fully supervised approaches with limited labeled datasets. It demonstrates enhanced generalization capabilities in changing scenes and delivers more reliable bitrate adaptation in scenarios involving sudden head/body movements.
KW - bitrate adaptation
KW - deep reinforcement learning
KW - Point cloud video streaming
KW - semi-supervised learning
KW - viewport prediction
UR - https://www.scopus.com/pages/publications/105034784300
U2 - 10.1109/TMM.2026.3680432
DO - 10.1109/TMM.2026.3680432
M3 - Journal Article
AN - SCOPUS:105034784300
SN - 1520-9210
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -
