RG-RGD: Real-Time Small-Target RGB-D Depth Refinement for Robotic Laser Ablation

This repository contains the reference implementation for the manuscript:

RG-RGD: Real-Time Small-Target RGB-D Depth Refinement for Robotic Laser Ablation
Bowen Si, Dayong Ning, Jiaoyi Hou, Yongjun Gong, Ming Yi, Fengrui Zhang, Zhilei Liu
Manuscript submitted to Machines (MDPI), 2026

RG-RGD denotes residual-gated RGB-D depth refinement. The released code supports the public VOID benchmark experiment and the RGB-D/IMU self-supervised training pipeline used for the small-target robotic perception study.

Release Scope

Included:

Supervised RGB-D depth refinement on the public VOID benchmark.
RGB-D/IMU self-supervised training for small-target video sequences.
Reference implementations of BFS-SOFA, residual-gated Bayesian measurement fusion, UACSPN refinement, and IMU-assisted view-synthesis training.
Reproduction scripts, dataset-layout notes, citation metadata, and reproducibility checklists.

Not included:

VOID dataset files.
Private London plane fruit-ball RGB-D/IMU sequences.
Model checkpoints or local ViT weights.
Laser-head or gimbal firmware.
Deployment-level field-trial statistics.

The prototype-related code path is provided to show how refined local geometry can be consumed by a robotic workflow. Dataset files and hardware-control firmware are not redistributed in this repository.

Repository Layout

RG-RGD-Depth-Refinement/
|-- README.md
|-- LICENSE
|-- CITATION.cff
|-- OPEN_SOURCE_MANIFEST.md
|-- environment.yml
|-- requirements.txt
|-- requirements-optional.txt
|-- configs/
|   |-- void_paper_command.txt
|   `-- selfsup_paper_command.txt
|-- docs/
|   |-- CODE_ALIGNMENT.md
|   |-- DATA_PREPARATION.md
|   |-- REPRODUCE_VOID.md
|   |-- REPRODUCE_SELFSUP.md
|   `-- REPRODUCIBILITY_CHECKLIST.md
|-- scripts/
|   |-- run_void.sh
|   |-- run_selfsup.sh
|   `-- verify_release.py
`-- tools/
    |-- train_void_supervised.py
    `-- train_rgbd_imu_selfsup.py

Installation

Create the reference conda environment:

conda env create -f environment.yml
conda activate rgrgd

Alternatively, install the Python dependencies manually:

conda create -n rgrgd python=3.10 -y
conda activate rgrgd
pip install -r requirements.txt

If you use the manual route, install the PyTorch build that matches your CUDA version. The optional YOLO-based utilities are disabled in the default reproduction commands; install them only when running ablations:

pip install -r requirements-optional.txt

Dataset Preparation

This repository does not redistribute datasets. Users should prepare datasets locally and pass paths to the scripts.

VOID: download the official VOID release and point --root to a density folder such as void_release/void_1500.
RGB-D/IMU sequences: use the layout documented in docs/DATA_PREPARATION.md. Private London plane sequences are available from the corresponding author upon reasonable request, subject to institutional approval.

Reproducing the Main Experiments

VOID benchmark

bash scripts/run_void.sh /path/to/void_release/void_1500 runs/void_rgrgd

The default VOID command uses --vit_no_pretrained and does not download external ViT weights. Local ViT weights can be supplied with --vit_local_weights; report results obtained with that setting separately from the default run.

RGB-D/IMU self-supervised training

bash scripts/run_selfsup.sh /path/to/london_plane_rgbd_imu runs/selfsup_london_plane

If the RGB-D sensor stores depth in a different camera frame, register the depth frames to the RGB camera before training.
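
For sensors that do not register depth on-device, the step can be sketched as back-projecting each valid depth pixel, moving it into the RGB camera frame, and re-projecting it. The following is a minimal NumPy sketch under stated assumptions: a pinhole model without lens distortion, metric depth with 0 marking invalid pixels, and a known depth-to-RGB extrinsic. The function name `register_depth_to_rgb` and its arguments are hypothetical and are not part of this repository.

```python
import numpy as np

def register_depth_to_rgb(depth, K_d, K_rgb, T_rgb_d, rgb_shape):
    """Reproject a depth map from the depth camera into the RGB camera.

    depth      : (H, W) metric depth, 0 marks invalid pixels (assumption)
    K_d, K_rgb : 3x3 pinhole intrinsics of the depth and RGB cameras
    T_rgb_d    : 4x4 rigid transform mapping depth-camera points to the RGB camera
    rgb_shape  : (H_rgb, W_rgb) of the registered output
    """
    v, u = np.nonzero(depth > 0)
    z = depth[v, u].astype(np.float64)

    # Back-project valid pixels to 3D points in the depth-camera frame.
    x = (u - K_d[0, 2]) * z / K_d[0, 0]
    y = (v - K_d[1, 2]) * z / K_d[1, 1]
    pts = np.stack([x, y, z, np.ones_like(z)], axis=0)  # (4, N)

    # Move the points into the RGB-camera frame and drop points behind it.
    pts_rgb = (T_rgb_d @ pts)[:3]
    keep = pts_rgb[2] > 1e-6
    pts_rgb = pts_rgb[:, keep]
    z_rgb = pts_rgb[2]

    # Project with the RGB intrinsics and round to the nearest pixel.
    u_rgb = np.rint(K_rgb[0, 0] * pts_rgb[0] / z_rgb + K_rgb[0, 2]).astype(int)
    v_rgb = np.rint(K_rgb[1, 1] * pts_rgb[1] / z_rgb + K_rgb[1, 2]).astype(int)
    H, W = rgb_shape
    inside = (u_rgb >= 0) & (u_rgb < W) & (v_rgb >= 0) & (v_rgb < H)
    u_rgb, v_rgb, z_rgb = u_rgb[inside], v_rgb[inside], z_rgb[inside]

    # z-buffer: keep the nearest depth when several points land on one pixel.
    out = np.full((H, W), np.inf)
    np.minimum.at(out, (v_rgb, u_rgb), z_rgb)
    out[np.isinf(out)] = 0.0
    return out
```

Nearest-pixel splatting leaves small holes when the RGB resolution exceeds the depth resolution; whether to fill them before training is a separate choice that this sketch does not make.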

Full commands are recorded in:

configs/void_paper_command.txt
configs/selfsup_paper_command.txt

Paper-to-Code Mapping

Manuscript component	Main code location
Hybrid RGB-D feature extraction	`ViTSRGBStem`, `rgb_local`, `dep_stem` in `tools/train_*`
Dense depth hint from valid measurements	depth-hint preprocessing in both training scripts
Benefit-driven foveated scale head	`BFSHead`
Small-object focused attention	`SofaCrossAttention`
Self-play benefit supervision	two-pass training loop in `tools/train_rgbd_imu_selfsup.py`
Residual-gated depth prediction	`RGRGDDepthRefiner.forward()`
Bayesian measurement fusion	uncertainty heads and fusion block in `RGRGDDepthRefiner.forward()`
UACSPN propagation	`LiteLearnedPropRefiner`, `GaussianBPRefiner`, `UACSPNRefiner`
IMU-assisted pose decomposition	`PoseNet`, `IMUCache`
View-synthesis warping	`warp_src_to_tgt`
VOID benchmark experiment	`tools/train_void_supervised.py`
RGB-D/IMU self-supervised experiment	`tools/train_rgbd_imu_selfsup.py`

Additional notes are available in docs/CODE_ALIGNMENT.md.
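
For readers unfamiliar with the measurement-fusion row above, the general idea of fusing a predicted depth with a sensor measurement under Gaussian uncertainty can be sketched as a precision-weighted average. This is a generic textbook illustration in NumPy, not the fusion block in `RGRGDDepthRefiner.forward()`; the function name `fuse_gaussian`, its arguments, and the variance floor are assumptions made for this example.

```python
import numpy as np

def fuse_gaussian(pred, pred_var, meas, meas_var, valid):
    """Per-pixel product of two Gaussians (precision-weighted mean).

    pred, pred_var : predicted depth and its variance
    meas, meas_var : measured depth and its variance
    valid          : boolean mask, True where a measurement exists
    """
    # Floor the variances so the precisions stay finite (assumed epsilon).
    pred_var = np.maximum(pred_var, 1e-12)
    meas_var = np.maximum(meas_var, 1e-12)

    # Precision = inverse variance; missing measurements get zero precision.
    w_pred = 1.0 / pred_var
    w_meas = np.where(valid, 1.0 / meas_var, 0.0)

    fused_var = 1.0 / (w_pred + w_meas)
    fused = fused_var * (w_pred * pred + w_meas * np.where(valid, meas, 0.0))
    return fused, fused_var
```

Where no measurement is available, the output falls back to the prediction and its variance; where both are available, the fused variance is never larger than either input variance.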

Verification

Before sharing a modified version, check that both training entry points still compile:

python -m py_compile tools/train_void_supervised.py tools/train_rgbd_imu_selfsup.py

To run an end-to-end release check without external datasets:

python scripts/verify_release.py

The verification script creates small synthetic VOID-style and RGB-D/IMU-style datasets, runs one reduced training epoch for each entry point, and checks that checkpoints are written. It verifies software execution only and does not reproduce manuscript metrics.

When reporting new results, record the exact command, git commit hash or release tag, dataset split, seed, GPU, CUDA version, PyTorch version, and whether local ViT weights or optional YOLO utilities were used. A checklist is provided in docs/REPRODUCIBILITY_CHECKLIST.md.
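
One way to capture these fields automatically is a small helper that is called at the start of a run and whose output is stored next to the checkpoints. The sketch below is an illustration only: `collect_run_metadata` and its arguments are hypothetical names, not part of the released scripts, and the dataset split and feature flags must still be supplied by the caller.

```python
import json
import platform
import subprocess
import sys
from importlib import metadata

def _git_commit():
    """Return the current commit hash, or None outside a git checkout."""
    try:
        out = subprocess.run(["git", "rev-parse", "HEAD"],
                             capture_output=True, text=True, check=True)
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return None

def _package_version(name):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None

def collect_run_metadata(seed, dataset_split,
                         used_local_vit_weights=False, used_yolo_utils=False):
    """Gather the fields listed above into a JSON-serialisable dict."""
    info = {
        "command": " ".join(sys.argv),
        "git_commit": _git_commit(),
        "dataset_split": dataset_split,
        "seed": seed,
        "python": platform.python_version(),
        "torch": _package_version("torch"),
        "cuda": None,
        "gpu": None,
        "used_local_vit_weights": used_local_vit_weights,
        "used_yolo_utils": used_yolo_utils,
    }
    try:
        import torch  # optional here so the helper also runs without PyTorch
        info["cuda"] = torch.version.cuda
        if torch.cuda.is_available():
            info["gpu"] = torch.cuda.get_device_name(0)
    except ImportError:
        pass
    return info

if __name__ == "__main__":
    print(json.dumps(collect_run_metadata(seed=0, dataset_split="val"), indent=2))
```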

Reported Results

The manuscript reports the following headline values under the paper protocol:

Setting	Metric	Value
VOID benchmark	MAE	24.95 mm
VOID benchmark	iMAE	10.85
Self-collected ROI ablation	ROI geometric error reduction	15.3% with BFS-SOFA
Runtime at 320 × 320	model / end-to-end latency	44.57 ms / 72.70 ms mean

See the manuscript for the complete comparison tables, ablation settings, and runtime protocol.
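
As a reading aid for the MAE and iMAE rows, the two metrics can be sketched as follows. This assumes the convention commonly used for depth-completion benchmarks (MAE in millimetres, iMAE in 1/km, both over pixels with valid ground truth); the table above does not state the iMAE unit, and the manuscript's evaluation protocol, including any depth-range clipping, remains authoritative. The function name `mae_imae` and the `min_depth` threshold are hypothetical.

```python
import numpy as np

def mae_imae(pred_m, gt_m, min_depth=1e-3):
    """MAE in mm and iMAE in 1/km over valid ground-truth pixels.

    pred_m, gt_m : arrays of depth in metres with identical shape
    min_depth    : ground truth at or below this value is treated as invalid
    """
    valid = gt_m > min_depth
    gt = gt_m[valid]
    # Clamp predictions so the inverse depth stays finite.
    pred = np.maximum(pred_m[valid], min_depth)

    mae_mm = 1000.0 * np.mean(np.abs(pred - gt))                    # m -> mm
    imae_per_km = 1000.0 * np.mean(np.abs(1.0 / pred - 1.0 / gt))   # 1/m -> 1/km
    return mae_mm, imae_per_km
```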

Citation

Please cite the associated manuscript when using this code:

@article{si2026rgrgd,
  title   = {RG-RGD: Real-Time Small-Target RGB-D Depth Refinement for Robotic Laser Ablation},
  author  = {Si, Bowen and Ning, Dayong and Hou, Jiaoyi and Gong, Yongjun and
             Yi, Ming and Zhang, Fengrui and Liu, Zhilei},
  journal = {Machines},
  year    = {2026},
  note    = {Manuscript under review}
}

Machine-readable citation metadata are provided in CITATION.cff.

License

This code is released under the MIT License. The VOID dataset and third-party assets retain their own licenses; users are responsible for complying with those terms.
