US 12,254,681 B2
Multi-modal test-time adaptation
Yi-Hsuan Tsai, Santa Clara, CA (US); Bingbing Zhuang, San Jose, CA (US); Samuel Schulter, New York, NY (US); Buyu Liu, Cupertino, CA (US); Sparsh Garg, San Jose, CA (US); Ramin Moslemi, Pleasanton, CA (US); and Inkyu Shin, Daejeon (KR)
Assigned to NEC Corporation, Tokyo (JP)
Filed by NEC Laboratories America, Inc., Princeton, NJ (US)
Filed on Sep. 6, 2022, as Appl. No. 17/903,393.
Claims priority of provisional application 63/279,715, filed on Nov. 16, 2021.
Claims priority of provisional application 63/241,137, filed on Sep. 7, 2021.
Prior Publication US 2023/0081913 A1, Mar. 16, 2023
Int. Cl. G06K 9/00 (2022.01); G01S 17/89 (2020.01); G06V 10/776 (2022.01); G06V 10/80 (2022.01)

CPC G06V 10/811 (2022.01) [G01S 17/89 (2013.01); G06V 10/776 (2022.01)]

20 Claims

1. A method for multi-modal test-time adaptation, comprising:

inputting a digital image into a pre-trained Camera Intra-modal Pseudo-label Generator (C-Intra-PG);

inputting a Lidar point cloud set into a pre-trained Lidar Intra-modal Pseudo-label Generator (L-Intra-PG);

applying a fast 2-dimension (2D) model, F^2D, and a slow 2D model, S^2D, to the inputted digital image to apply pseudo-labels to the digital image;

applying a fast 3-dimension (3D) model, F^3D, and a slow 3D model, S^3D, to the inputted Lidar point cloud set to apply pseudo-labels to the Lidar point cloud set;

fusing pseudo-label predictions from the fast (F^2D, F^3D) models and the slow (S^2D, S^3D) models through an Inter-modal Pseudo-label Refinement (Inter-PR) module to obtain robust pseudo-labels;

measuring a prediction consistency for each of the digital image pseudo-labels and Lidar pseudo-labels separately;

selecting confident pseudo-labels from the robust pseudo-labels and measured prediction consistencies to form a final cross-modal pseudo-label set as a self-training signal; and

updating batch parameters of the Camera Intra-modal Pseudo-label Generator and Lidar Intra-modal Pseudo-label Generator utilizing the self-training signal.
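
The fusing, consistency-measuring, and selecting steps of claim 1 can be illustrated with a minimal sketch for a single point that has both a camera prediction and a Lidar prediction. The claim does not fix the formulas, so everything below is an assumption made for illustration only: the fast and slow outputs are averaged within each modality, prediction consistency is scored as 1 / (1 + KL(fast || slow)), the modality with the higher consistency supplies the label, and a fixed threshold of 0.5 decides confidence. The function names (kl_divergence, intra_modal_fuse, consistency, select_pseudo_label) are hypothetical and do not come from the patent.

```python
import math


def kl_divergence(p, q, eps=1e-8):
    """KL(p || q) for two discrete class distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def intra_modal_fuse(fast, slow):
    """Assumed fusion rule: average the fast and slow class probabilities."""
    return [(f + s) / 2.0 for f, s in zip(fast, slow)]


def consistency(fast, slow):
    """Assumed consistency score in (0, 1]: higher when fast and slow agree."""
    return 1.0 / (1.0 + kl_divergence(fast, slow))


def select_pseudo_label(f2d, s2d, f3d, s3d, threshold=0.5):
    """Return (label, source) for one point, or (None, None) if not confident.

    f2d/s2d are class probabilities from the fast/slow 2D models,
    f3d/s3d from the fast/slow 3D models.
    """
    fused_2d = intra_modal_fuse(f2d, s2d)
    fused_3d = intra_modal_fuse(f3d, s3d)
    c2d = consistency(f2d, s2d)
    c3d = consistency(f3d, s3d)

    # Assumed hard selection: trust the modality whose models agree more.
    if c2d >= c3d:
        probs, score, source = fused_2d, c2d, "2D"
    else:
        probs, score, source = fused_3d, c3d, "3D"

    # Points whose best consistency is below the threshold are ignored.
    if score < threshold:
        return None, None

    label = max(range(len(probs)), key=probs.__getitem__)
    return label, source


# Camera models agree closely; Lidar models disagree, so 2D supplies the label.
print(select_pseudo_label([0.9, 0.1], [0.88, 0.12], [0.2, 0.8], [0.7, 0.3]))
```

In this sketch the labels that survive the threshold would play the role of the self-training signal. The final step of the claim, updating batch parameters of the two generators with that signal, is not shown, because it depends on the network architecture and training framework.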