Chinese Title#
ECHO:面向頻率的變長信號分層編碼
English Title#
ECHO: Frequency-aware Hierarchical Encoding for Variable-length Signal
Chinese Abstract#
預訓練基礎模型在視覺和語言領域表現出色,但它們在通用機器信號建模方面的潛力 —— 涵蓋聲學、振動和其他工業傳感器數據 —— 仍未得到充分探索。 現有的基於子帶編碼器的方法已取得有競爭力的結果,但受限於固定的輸入長度,以及缺乏顯式的頻率位置編碼。 在本工作中,我們提出了一種新穎的基礎模型,該模型結合了先進的帶分割架構與相對頻率位置嵌入,能夠在任意採樣配置下實現精確的頻譜定位。 該模型支持任意長度的輸入,無需填充或分段,生成的嵌入表示保留了時間和頻譜保真度。 我們在 SIREN(https://github.com/yucongzh/SIREN)上評估了我們的方法,這是一個新提出的用於機器信號編碼的大規模基準,它統一了多個數據集,包括所有 DCASE 任務 2 挑戰(2020-2025)和廣泛使用的工業信號語料庫。 實驗結果表明,在異常檢測和故障識別方面,我們的方法始終表現出最先進的性能,證實了所提出模型的有效性和泛化能力。 我們在 https://github.com/yucongzh/ECHO 上開源了 ECHO。
English Abstract#
Pre-trained foundation models have demonstrated remarkable success in vision and language, yet their potential for general machine signal modeling (covering acoustic, vibration, and other industrial sensor data) remains under-explored. Existing approaches using sub-band-based encoders have achieved competitive results but are limited by fixed input lengths and the absence of explicit frequency positional encoding. In this work, we propose a novel foundation model that integrates an advanced band-split architecture with relative frequency positional embeddings, enabling precise spectral localization across arbitrary sampling configurations. The model supports inputs of arbitrary length without padding or segmentation, producing a concise embedding that retains both temporal and spectral fidelity. We evaluate our method on SIREN (https://github.com/yucongzh/SIREN), a newly introduced large-scale benchmark for machine signal encoding that unifies multiple datasets, including all DCASE Task 2 challenges (2020-2025) and widely used industrial signal corpora. Experimental results demonstrate consistent state-of-the-art performance in anomaly detection and fault identification, confirming the effectiveness and generalization capability of the proposed model. We have open-sourced ECHO at https://github.com/yucongzh/ECHO.
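To make the idea of a relative frequency positional embedding more concrete, the following is a minimal sketch and not the ECHO implementation. It assumes that each sub-band is located by its center bin expressed as a fraction of the Nyquist range, and that this fraction is encoded with a standard sinusoidal embedding. The function names, the band width, and the `scale` factor are all hypothetical choices made for illustration.

```python
import numpy as np


def relative_band_positions(n_freq_bins: int, band_width: int) -> np.ndarray:
    """Center of each sub-band as a fraction of the full frequency axis, in [0, 1].

    Assumption: the spectrogram is cut into consecutive bands of `band_width`
    bins, and the last band may be narrower.
    """
    starts = np.arange(0, n_freq_bins, band_width)
    ends = np.minimum(starts + band_width, n_freq_bins)
    centers = (starts + ends - 1) / 2.0  # center bin index of each band
    return centers / (n_freq_bins - 1)   # 0 = DC, 1 = Nyquist


def sinusoidal_embedding(positions: np.ndarray, dim: int, scale: float = 1000.0) -> np.ndarray:
    """Encode continuous positions in [0, 1] as sin/cos features of size `dim`."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    half = dim // 2
    # Geometrically spaced angular rates, as in Transformer positional encodings.
    rates = np.exp(-np.log(10000.0) * np.arange(half) / half)
    angles = scale * positions[:, None] * rates[None, :]
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)


# Two hypothetical STFT configurations with different numbers of frequency bins.
pos_a = relative_band_positions(n_freq_bins=513, band_width=32)
pos_b = relative_band_positions(n_freq_bins=257, band_width=32)
emb_a = sinusoidal_embedding(pos_a, dim=64)
emb_b = sinusoidal_embedding(pos_b, dim=64)
print(emb_a.shape, emb_b.shape)
```

In this sketch the number of bands changes with the spectral resolution, while the embedding size per band stays fixed, so one encoder could in principle consume bands from either configuration.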