zikele

Such a life is a joy in itself

LookOut: Real-World Humanoid Egocentric Navigation

2508.14466v1

Title (Translated from Chinese)#

LookOut: Real-World Humanoid Egocentric Navigation

English Title#

LookOut: Real-World Humanoid Egocentric Navigation

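Abstract (Translated from Chinese)#

Predicting collision-free future trajectories from first-person observations is essential for applications such as humanoid robotics, VR/AR, and assistive navigation. In this work, we introduce a challenging problem: predicting a sequence of future 6D head poses from first-person video. Specifically, we predict both the translation and the rotation of the head, so as to learn the active information-gathering behavior expressed through head-turning events. To address this task, we propose a framework that reasons over temporally aggregated 3D latent features and models the geometric and semantic constraints of both the static and dynamic parts of the environment. Motivated by the lack of training data in this area, we further contribute a data collection pipeline based on Project Aria glasses and present a dataset collected with it. Our dataset, named the Aria Navigation Dataset (AND), contains 4 hours of recordings of users navigating real-world scenes. It covers a variety of situations and navigation behaviors, providing a valuable resource for learning real-world first-person navigation policies. Extensive experiments show that our model learns human-like navigation behaviors such as waiting/slowing down, rerouting, and looking around for traffic, while generalizing to unseen environments. Visit our project webpage at https://sites.google.com/stanford.edu/lookout.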

English Abstract#

The ability to predict collision-free future trajectories from egocentric observations is crucial in applications such as humanoid robotics, VR/AR, and assistive navigation. In this work, we introduce the challenging problem of predicting a sequence of future 6D head poses from an egocentric video. In particular, we predict both head translations and rotations to learn the active information-gathering behavior expressed through head-turning events. To solve this task, we propose a framework that reasons over temporally aggregated 3D latent features, which models the geometric and semantic constraints for both the static and dynamic parts of the environment. Motivated by the lack of training data in this space, we further contribute a data collection pipeline using the Project Aria glasses, and present a dataset collected through this approach. Our dataset, dubbed Aria Navigation Dataset (AND), consists of 4 hours of recording of users navigating in real-world scenarios. It includes diverse situations and navigation behaviors, providing a valuable resource for learning real-world egocentric navigation policies. Extensive experiments show that our model learns human-like navigation behaviors such as waiting/slowing down, rerouting, and looking around for traffic while generalizing to unseen environments. Check out our project webpage at https://sites.google.com/stanford.edu/lookout.

Article Page#

LookOut: Real-World Humanoid Egocentric Navigation

PDF Access#

View the Chinese PDF - 2508.14466v1
