Holistic panoramic 3D scene understanding using spherical harmonics

Credit: Ulsan National Institute of Science and Technology

An advanced artificial intelligence (AI) technology has been developed that can extract three-dimensional (3D) spatial structure and object information within indoor environments using just a single 360-degree panoramic photograph. This breakthrough is expected to significantly impact fields requiring precise spatial understanding, including augmented reality (AR), mixed reality (MR), and digital twin applications.

Led by Professor Kyungdon Joo from the Graduate School of Artificial Intelligence at UNIST, the research team introduced HUSH (Holistic Panoramic 3D Scene Understanding using Spherical Harmonics), an AI model capable of simultaneously extracting spatial configurations and internal object details from panoramic images with remarkable accuracy.

In AR and MR technologies, integrating digital content with real-world spaces requires AI systems to accurately interpret and represent information such as wall and furniture positions, as well as distances between objects. Traditionally, achieving this level of understanding has required multiple images from different angles or expensive equipment, such as depth sensors.

The HUSH model advances beyond these limitations by utilizing only a single 360-degree panoramic image to derive this information. Although panoramic images can capture a wider scene in a single shot, their spherical distortion makes precise analysis challenging. Conventional methods attempt to mitigate this by segmenting the image and repeatedly applying standard AI models, but this often results in information loss or computational inefficiency.

To address these issues, the research team employed Spherical Harmonics (SH), a mathematical technique that naturally models the spherical geometry of panoramic images. This method decomposes the scene into frequency components: low-frequency components effectively represent broad, flat areas such as ceilings and floors, while high-frequency components capture detailed structures such as furniture and objects, thereby enhancing accuracy.
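The frequency decomposition described above can be illustrated with a minimal sketch (this is not the HUSH implementation): a panoramic signal on an equirectangular grid is projected onto spherical-harmonic basis functions, and a smooth, large-scale signal (like a ceiling-to-floor brightness gradient) concentrates almost entirely in the low-degree coefficients, while high-degree coefficients stay near zero. The grid size and toy signal here are illustrative choices.

```python
import numpy as np
from scipy.special import sph_harm  # complex spherical harmonics Y_l^m

# Equirectangular panorama grid: theta = azimuth [0, 2*pi), phi = colatitude [0, pi]
H, W = 64, 128
phi = np.linspace(0.0, np.pi, H)                        # vertical axis (colatitude)
theta = np.linspace(0.0, 2 * np.pi, W, endpoint=False)  # horizontal axis (azimuth)
TH, PH = np.meshgrid(theta, phi)

# Toy "scene" signal: a smooth vertical gradient (bright ceiling, dark floor),
# standing in for the broad, flat regions the article mentions.
f = np.cos(PH)

# Spherical area element for numerical integration on the equirectangular grid.
dA = (np.pi / (H - 1)) * (2 * np.pi / W) * np.sin(PH)

def coeff(l, m):
    """Project f onto Y_l^m: c_lm = integral of f * conj(Y_lm) over the sphere."""
    Y = sph_harm(m, l, TH, PH)  # note SciPy's argument order: order m, degree l
    return np.sum(f * np.conj(Y) * dA)

# cos(phi) is proportional to Y_1^0, so the degree-1 coefficient dominates
# and higher-degree ("high-frequency") coefficients are negligible.
c10 = abs(coeff(1, 0))  # low-frequency: broad structure
c50 = abs(coeff(5, 0))  # high-frequency: fine detail (none in this toy signal)
print(f"|c_1,0| = {c10:.3f}, |c_5,0| = {c50:.3f}")
```

For a real panorama with furniture and object edges, the high-degree coefficients would carry substantial energy, which is the property the decomposition exploits.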

Jongsung Lee, the first author of the study, explained, “Spherical Harmonics are traditionally used in virtual view synthesis for representing color and lighting of objects or scenes. Recognizing their capacity to analyze data on a spherical surface, we innovatively applied SH to panoramic image-based spatial reconstruction for the first time.”

The HUSH model demonstrated superior accuracy in depth prediction and other spatial understanding tasks compared to existing 3D scene reconstruction models. Remarkably, it can infer multiple spatial details from a single image, offering both high performance and computational efficiency.

Professor Joo emphasized, “This technology has broad potential applications in real-world scenarios where precise understanding of indoor spaces is essential—such as AR and MR environments, or creating immersive media that enable user interaction from just one image.”

This research was presented at CVPR 2025 (Conference on Computer Vision and Pattern Recognition), held in Nashville, from June 11 to 15, 2025.

More information:
Jongsung Lee, Harin Park, Byeong-Uk Lee, and Kyungdon Joo, "HUSH: Holistic Panoramic 3D Scene Understanding using Spherical Harmonics," in Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

Poster: cvpr.thecvf.com/virtual/2025/poster/33754

Project page: vision3d-lab.github.io/hush/

Provided by
Ulsan National Institute of Science and Technology


Citation:
HUSH: Holistic panoramic 3D scene understanding using spherical harmonics (2025, July 9)
retrieved 9 July 2025
from https://techxplore.com/news/2025-07-hush-holistic-panoramic-3d-scene.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.