BodySLAM is a cutting-edge, deep learning-based Simultaneous Localization and Mapping (SLAM) framework designed specifically for endoscopic surgical applications. By leveraging advanced AI techniques, BodySLAM brings enhanced depth perception and 3D reconstruction capabilities to various surgical settings, including laparoscopy, gastroscopy, and colonoscopy.
Our comprehensive paper detailing the BodySLAM framework is now available on arXiv:
BodySLAM: A Generalized Monocular Visual SLAM Framework for Surgical Applications
G. Manni, C. Lauretti, F. Prata, R. Papalia, L. Zollo, P. Soda
If you find our work useful in your research, please consider citing:
```bibtex
@misc{manni2024bodyslamgeneralizedmonocularvisual,
  title         = {BodySLAM: A Generalized Monocular Visual SLAM Framework for Surgical Applications},
  author        = {G. Manni and C. Lauretti and F. Prata and R. Papalia and L. Zollo and P. Soda},
  year          = {2024},
  eprint        = {2408.03078},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  url           = {https://arxiv.org/abs/2408.03078},
}
```
If you use the depth estimation module in your research, please also cite:
```bibtex
@misc{https://doi.org/10.48550/arxiv.2302.12288,
  doi       = {10.48550/ARXIV.2302.12288},
  url       = {https://arxiv.org/abs/2302.12288},
  author    = {Bhat, Shariq Farooq and Birkl, Reiner and Wofk, Diana and Wonka, Peter and Müller, Matthias},
  keywords  = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences},
  title     = {ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth},
  publisher = {arXiv},
  year      = {2023},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
```
In the challenging world of endoscopic surgeries, where hardware limitations and environmental variations pose significant obstacles, BodySLAM stands out by integrating deep learning models with strong generalization capabilities. Our framework consists of three key modules:
- Monocular Pose Estimation Module (MPEM): Estimates relative camera poses between consecutive frames using our novel CycleVO architecture
- Monocular Depth Estimation Module (MDEM): Predicts depth maps from single images using the Zoe model
- 3D Reconstruction Module (3DM): Combines pose and depth information for 3D scene reconstruction
- State-of-the-Art Depth Estimation: Utilizes the Zoe model for accurate monocular depth estimation
- Novel Pose Estimation: Implements CycleVO, a novel unsupervised method for pose estimation
- Cross-Setting Performance: Robust functionality across various endoscopic surgical environments
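The 3DM's core operation, combining a predicted depth map with a camera pose to produce 3D points, can be sketched as a standard back-projection through the camera intrinsics. This is only an illustrative outline, not the repository's actual implementation; the `backproject` function and the toy intrinsics below are assumptions for the example:

```python
import numpy as np

def backproject(depth, K, pose):
    """Back-project a depth map into world-space 3D points.

    depth: (H, W) metric depth map (e.g. from the MDEM)
    K:     (3, 3) camera intrinsics
    pose:  (4, 4) camera-to-world transform (e.g. from the MPEM)
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    # Rays in camera coordinates, scaled by per-pixel depth
    cam = (np.linalg.inv(K) @ pix.T) * depth.reshape(1, -1)
    cam_h = np.vstack([cam, np.ones((1, cam.shape[1]))])
    world = (pose @ cam_h)[:3].T  # (H*W, 3) point cloud
    return world

# Toy example: 2x2 depth map, identity pose
K = np.array([[1.0, 0.0, 0.5], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]])
pts = backproject(np.ones((2, 2)), K, np.eye(4))
print(pts.shape)  # (4, 3)
```

Fusing the per-frame clouds across a sequence (deduplication, filtering, meshing) is additional work handled by the 3DM itself.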
We're actively refactoring our codebase to enhance usability and performance. Here's our current progress:
- Monocular Depth Estimation Module (MDEM)
- Monocular Pose Estimation Module (MPEM)
- 3D Reconstruction Module (3DM)
- Integration and Testing
We've included several examples to help you get started with BodySLAM:
- Basic Depth Estimation: Demonstrates the fundamental pipeline for estimating depth from a single image.

  ```shell
  python examples/depth_estimation/basic_depth_estimation.py
  ```
- Depth Map Scaling and Colorization: Shows how to scale and colorize depth maps for better visualization.

  ```shell
  python examples/depth_estimation/depth_map_scaling.py
  ```
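Scaling and colorization boils down to normalizing the depth values and mapping them to a color ramp. A minimal sketch of that idea (the `colorize_depth` helper is illustrative, not the script's actual code):

```python
import numpy as np

def colorize_depth(depth, cmap=None):
    """Normalize a depth map to [0, 255] and map it to an RGB image."""
    d = depth.astype(np.float64)
    d = (d - d.min()) / max(d.max() - d.min(), 1e-8)  # scale to [0, 1]
    gray = (d * 255).astype(np.uint8)
    if cmap is None:
        # Fallback: replicate the gray channel into RGB
        return np.stack([gray] * 3, axis=-1)
    # e.g. pass a matplotlib colormap such as plt.get_cmap("magma")
    return (cmap(d)[..., :3] * 255).astype(np.uint8)

depth = np.array([[0.5, 1.0], [1.5, 2.0]])  # toy metric depths
rgb = colorize_depth(depth)
print(rgb.shape, rgb.dtype)  # (2, 2, 3) uint8
```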
- Batch Processing: Illustrates how to process multiple images for depth estimation and colorization.

  ```shell
  python examples/depth_estimation/batch_processing.py
  ```
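Batch processing is essentially iterating an input directory and pairing each image with an output path. A hedged sketch of that bookkeeping, assuming a hypothetical `batch_paths` helper rather than the script's real interface:

```python
import tempfile
from pathlib import Path

def batch_paths(input_dir, output_dir, exts=(".png", ".jpg")):
    """Yield (input image, output depth-map path) pairs, creating output_dir."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    images = sorted(p for p in Path(input_dir).iterdir()
                    if p.suffix.lower() in exts)
    for img in images:
        yield img, out / f"{img.stem}_depth.png"

# Demo on a throwaway directory
src = Path(tempfile.mkdtemp())
(src / "frame_001.jpg").touch()
(src / "notes.txt").touch()  # non-image files are skipped
pairs = list(batch_paths(src, src / "depth"))
print([(i.name, o.name) for i, o in pairs])  # [('frame_001.jpg', 'frame_001_depth.png')]
```

The depth model would then be run once per pair, writing the colorized result to the output path.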
- Single Pair Processing: Estimate relative pose between two consecutive frames.

  ```shell
  python examples/pose_estimation/run_cycle_pose.py --mode pair \
      --model_path path/to/model.pth \
      --input frame1.jpg \
      --input2 frame2.jpg \
      --output pose.txt
  ```
- Sequence Processing: Process an entire sequence of frames.

  ```shell
  python examples/pose_estimation/run_cycle_pose.py --mode sequence \
      --model_path path/to/model.pth \
      --input path/to/sequence \
      --output sequence_poses.txt
  ```
- Dataset Processing: Process multiple sequences in a dataset.

  ```shell
  python examples/pose_estimation/run_cycle_pose.py --mode dataset \
      --model_path path/to/model.pth \
      --input path/to/dataset \
      --output path/to/results
  ```
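Since the MPEM outputs relative poses between consecutive frames, recovering a camera trajectory means chaining those 4x4 transforms. This is the standard SE(3) composition, sketched here for illustration (the `accumulate_poses` name is an assumption, not the repository's API):

```python
import numpy as np

def accumulate_poses(relative_poses):
    """Chain 4x4 relative transforms (frame t -> t+1) into absolute poses."""
    traj = [np.eye(4)]  # first frame defines the world origin
    for rel in relative_poses:
        traj.append(traj[-1] @ rel)  # compose with the previous absolute pose
    return traj

# Two unit translations along x should end at x = 2
step = np.eye(4)
step[0, 3] = 1.0
traj = accumulate_poses([step, step])
print(traj[-1][0, 3])  # 2.0
```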
- Clone the repository:

  ```shell
  git clone https://github.com/yourusername/BodySLAM.git
  cd BodySLAM
  ```
- Create and activate a virtual environment:

  ```shell
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```
- Install the required packages:

  ```shell
  pip install -r requirements.txt
  ```
```
BodySLAM/
├── src/
│   ├── depth_estimation/
│   │   └── interface.py
│   └── pose_estimation/
│       └── interface.py
├── examples/
│   ├── depth_estimation/
│   │   └── basic_depth_estimation.py
│   └── pose_estimation/
│       └── run_cycle_pose.py
└── tests/
```
- 3D Reconstruction Module: Integration of pose and depth for complete 3D reconstruction (28/01/2025)
- Pre-trained Models: Ready-to-use models for different surgical settings (29/01/2025)
- Enhanced Documentation: More detailed tutorials and API documentation
Q: Will the training dataset for CycleVO be released?
A: No, the training dataset for CycleVO will not be released to the public. However, we will release the pre-trained model weights.

Q: Where can I find the Hamlyn Dataset?
A: The Hamlyn Dataset can be accessed here.

Q: Where can I find the EndoSLAM Dataset?
A: The EndoSLAM Dataset can be accessed here.
We welcome contributions! If you're interested in improving BodySLAM, please check our Contributing Guidelines (coming soon).
BodySLAM is released under the MIT License.
For questions or support, please open an issue on our GitHub repository.