Navigating with Spatial Intelligence: A Survey of Scene Graph-Based Object Goal Navigation

Chi GUO; Aolin LI; Yiyue MENG

doi:10.1051/wujns/2025305405

Open Access

Issue: Wuhan Univ. J. Nat. Sci. Volume 30, Number 5, October 2025
Page(s): 405 - 426
DOI: https://doi.org/10.1051/wujns/2025305405
Published online: 04 November 2025

Wuhan University Journal of Natural Sciences, 2025, Vol.30 No.5, 405-426

CLC number: TP18

Chi GUO (郭迟)¹,²,³,†, Aolin LI (李奥林)¹,⁴ and Yiyue MENG (孟怡悦)¹,⁵

¹ GNSS Research Center, Wuhan University, Wuhan 430072, Hubei, China
² Hubei Luojia Laboratory, Wuhan 430072, Hubei, China
³ Artificial Intelligence Institute, Wuhan University, Wuhan 430072, Hubei, China
⁴ School of Geodesy and Geomatics, Wuhan University, Wuhan 430072, Hubei, China
⁵ Electronic Information School, Wuhan University, Wuhan 430072, Hubei, China

† Corresponding author.

Received: 25 September 2024

Abstract

Autonomous mobile robots are now widely used across many industries, and autonomous navigation, a basic capability of such robots, has become a research hotspot. Classical navigation techniques, which rely on pre-built maps, struggle to cope with complex and dynamic environments. With the development of artificial intelligence, learning-based navigation technologies have emerged. Instead of relying on pre-built maps, the agent perceives the environment and makes decisions through visual observation, enabling end-to-end navigation. A key challenge is to enhance the generalization ability of the agent in unfamiliar environments. To tackle this challenge, it is necessary to endow the agent with spatial intelligence, that is, the ability to transform visual observations into insights, insights into understanding, and understanding into actions. To this end, relevant research uses scene graphs to represent the environment. We refer to this approach as scene graph-based object goal navigation. In this paper, we concentrate on scene graphs, offering a formal description and a computational framework of object goal navigation. We provide a comprehensive summary of the methods for constructing and applying scene graphs. Additionally, we present experimental evidence that highlights the critical role of scene graphs in improving navigation success. We also outline promising research directions centered on scene graphs. Overall, this paper shows how scene graphs endow the agent with spatial intelligence, aiming to raise awareness of the importance of scene graphs in the field of intelligent navigation.

Key words: object goal navigation / scene graph / spatial intelligence / deep reinforcement learning

Cite this article: GUO Chi, LI Aolin, MENG Yiyue. Navigating with Spatial Intelligence: A Survey of Scene Graph-Based Object Goal Navigation[J]. Wuhan Univ J of Nat Sci, 2025, 30(5): 405-426.

Biography: GUO Chi, Ph.D., Professor, research direction: BeiDou applications, unmanned system navigation and location-based services.

Foundation item: Supported by the Major Science and Technology Project of Hubei Province of China (2022AAA009), the Open Fund of Hubei Luojia Laboratory

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
