Meet JEANIE: A Similarity Measure for 3D Skeleton Sequences via Temporal-Viewpoint Alignment
International Journal of Computer Vision (IF 19.5), Pub Date: 2024-05-06, DOI: 10.1007/s11263-024-02070-2
Lei Wang, Jun Liu, Liang Zheng, Tom Gedeon, Piotr Koniusz

Video sequences exhibit significant nuisance variations (undesired effects) in the speed of actions, temporal locations, and subjects’ poses, leading to temporal-viewpoint misalignment when comparing two sets of frames or evaluating the similarity of two sequences. Thus, we propose Joint tEmporal and cAmera viewpoiNt alIgnmEnt (JEANIE) for sequence pairs. In particular, we focus on 3D skeleton sequences, whose camera and subjects’ poses can be easily manipulated in 3D. We evaluate JEANIE on skeletal Few-shot Action Recognition (FSAR), where matching the temporal blocks (temporal chunks that make up a sequence) of support-query sequence pairs well, by factoring out nuisance variations, is essential due to the limited samples of novel classes. Given a query sequence, we create several views of it by simulating several camera locations. We then match a support sequence with the view-simulated query sequences, as in the popular Dynamic Time Warping (DTW). Specifically, each support temporal block can be matched to the query temporal block with the same or adjacent (next) temporal index, and to adjacent camera views, to achieve joint local temporal-viewpoint warping. JEANIE selects the smallest distance among matching paths with different temporal-viewpoint warping patterns, an advantage over DTW, which only performs temporal alignment. We also propose an unsupervised FSAR akin to clustering of sequences with JEANIE as a distance measure. JEANIE achieves state-of-the-art results on NTU-60, NTU-120, Kinetics-skeleton and UWA3D Multiview Activity II on supervised and unsupervised FSAR, and their meta-learning inspired fusion.
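
The joint temporal-viewpoint warping described in the abstract can be illustrated with a small dynamic program. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function names (`jeanie_like_distance`, `dtw_distance`), the `max_view_step` parameter, the single 1D axis of simulated views, the DTW-style step set, and the precomputed distance tensor `dist` are all hypothetical choices made for illustration. It shows how allowing the path to move between adjacent views can yield a smaller distance than temporal alignment alone.

```python
import numpy as np


def jeanie_like_distance(dist, max_view_step=1):
    """Smallest accumulated distance over joint temporal-viewpoint paths.

    dist[t, s, v] is assumed to hold the distance between support block t
    and query block s as seen from simulated camera view v (hypothetical
    layout; views are assumed to lie on a single ordered axis).
    """
    T, S, V = dist.shape
    acc = np.full((T, S, V), np.inf)
    acc[0, 0, :] = dist[0, 0, :]  # a path may start at any view
    for t in range(T):
        for s in range(S):
            if t == 0 and s == 0:
                continue
            for v in range(V):
                # Only the same or adjacent views are reachable in one step.
                lo = max(0, v - max_view_step)
                hi = min(V, v + max_view_step + 1)
                best = np.inf
                # DTW-style temporal predecessors (assumed step set).
                for pt, ps in ((t - 1, s), (t, s - 1), (t - 1, s - 1)):
                    if pt >= 0 and ps >= 0:
                        best = min(best, acc[pt, ps, lo:hi].min())
                acc[t, s, v] = dist[t, s, v] + best
    return float(acc[-1, -1, :].min())  # a path may end at any view


def dtw_distance(cost):
    """Plain temporal alignment: the same recursion with a single view."""
    return jeanie_like_distance(cost[:, :, None])


# Toy example: view 0 fits the first block, view 1 fits the last block.
d = np.full((2, 2, 2), 5.0)
d[0, 0, 0] = 0.0
d[1, 1, 1] = 0.0
joint = jeanie_like_distance(d)
per_view = [dtw_distance(d[:, :, v]) for v in range(d.shape[2])]
print(joint, per_view)
```

In this toy case the joint path switches from view 0 to view 1 and accumulates a smaller distance than DTW restricted to either single view. Since a single-view path is one of the paths the joint recursion considers, the joint distance here can never exceed the best per-view DTW distance.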




Updated: 2024-05-08