US 12,437,525 B2
Method and apparatus with self-attention-based image recognition
Seohyung Lee, Yongin-si (KR); Dongwook Lee, Suwon-si (KR); Changbeom Park, Seoul (KR); and Byung In Yoo, Seoul (KR)
Assigned to Samsung Electronics Co., Ltd., Suwon-si (KR)
Filed by SAMSUNG ELECTRONICS CO., LTD., Suwon-si (KR)
Filed on Apr. 14, 2022, as Appl. No. 17/720,681.
Claims priority of application No. 10-2021-0155779 (KR), filed on Nov. 12, 2021.
Prior Publication US 2023/0154171 A1, May 18, 2023
Int. Cl. G06V 10/82 (2022.01); G06N 3/08 (2023.01); G06V 10/40 (2022.01)
CPC G06V 10/82 (2022.01) [G06N 3/08 (2013.01); G06V 10/40 (2022.01)]
25 Claims
 
1. A processor-implemented method with self-attention, comprising:
generating three-dimensional (3D) query data and 3D key data by performing a convolution operation based on a 3D feature map;
generating two-dimensional (2D) vertical data based on an averaging-based vertical projection of the generated 3D query data and the generated 3D key data;
generating 2D horizontal data based on an averaging-based horizontal projection of the generated 3D query data and the generated 3D key data;
generating an intermediate attention result through a multiplication based on the generated 2D vertical data and the generated 2D horizontal data;
generating a final attention result through a multiplication based on the generated intermediate attention result and the 3D feature map; and
applying the generated final attention result to the 3D feature map.
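
The steps recited in claim 1 can be illustrated with a minimal NumPy sketch. This is only one possible reading of the claim language, not the patented implementation. Everything here is an assumption made for illustration: the function names `conv1x1` and `projected_self_attention`, the (C, H, W) layout, the use of a 1x1 convolution for the query and key, the elementwise combination of the query and key projections, the per-channel outer product as the "multiplication" for the intermediate result, and the residual addition as the way the final result is "applied".

```python
import numpy as np


def conv1x1(x, w):
    """1x1 convolution: mix channels at every spatial position.

    x: feature map of shape (C_in, H, W); w: weights of shape (C_out, C_in).
    """
    return np.einsum("oc,chw->ohw", w, x)


def projected_self_attention(x, w_q, w_k):
    """Hypothetical sketch of the claimed steps for x of shape (C, H, W).

    Assumes w_q and w_k are (C, C) so the attention map matches x in shape.
    """
    # 3D query and key data from a convolution on the 3D feature map.
    q = conv1x1(x, w_q)
    k = conv1x1(x, w_k)

    # Averaging-based vertical projection: average over the height axis.
    # Query and key projections are combined elementwise (an assumption).
    vertical = q.mean(axis=1) * k.mean(axis=1)      # shape (C, W)

    # Averaging-based horizontal projection: average over the width axis.
    horizontal = q.mean(axis=2) * k.mean(axis=2)    # shape (C, H)

    # Intermediate attention result: per-channel outer product of the
    # horizontal and vertical data (an assumption about "multiplication").
    intermediate = horizontal[:, :, None] * vertical[:, None, :]  # (C, H, W)

    # Final attention result: multiply with the 3D feature map.
    final = intermediate * x

    # Apply the final attention result to the feature map (residual add,
    # again an assumption).
    return x + final


# Small usage example with identity weights on an all-ones feature map.
x = np.ones((2, 3, 4))
out = projected_self_attention(x, np.eye(2), np.eye(2))
print(out.shape)
```

Under these assumptions the full H x W attention map is never built from pairwise position comparisons; it is reconstructed from two averaged 2D projections, which is the apparent motivation for projecting before multiplying.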