| CPC G06V 10/82 (2022.01) [G06N 3/08 (2013.01); G06V 10/40 (2022.01)] | 25 Claims |

|
1. A processor-implemented method with self-attention, comprising:
generating three-dimensional (3D) query data and 3D key data by performing a convolution operation based on a 3D feature map;
generating two-dimensional (2D) vertical data based on an averaging-based vertical projection of the generated 3D query data and the generated 3D key data;
generating 2D horizontal data based on an averaging-based horizontal projection of the generated 3D query data and the generated 3D key data;
generating an intermediate attention result through a multiplication based on the generated 2D vertical data and the generated 2D horizontal data;
generating a final attention result through a multiplication based on the generated intermediate attention result and the 3D feature map; and
applying the generated final attention result to the 3D feature map.
|
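
The steps recited in claim 1 can be illustrated with a minimal NumPy sketch. It is one possible reading of the claim language, not the patented implementation. The claim does not say how the projections are combined, so several choices below are assumptions made only for illustration: the convolution is taken to be a 1x1 convolution (a per-pixel channel mix), "vertical projection" is read as averaging over the width axis and "horizontal projection" as averaging over the height axis, each 2D attention map is normalized with a softmax, the intermediate attention result is formed as an outer product of the two maps, and "applying" the result is read as a residual addition. The names `projected_self_attention`, `w_query`, and `w_key` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def projected_self_attention(feature_map, w_query, w_key):
    """Illustrative reading of claim 1 (assumptions noted in comments).

    feature_map: array of shape (C, H, W), the 3D feature map.
    w_query, w_key: arrays of shape (D, C), assumed 1x1 convolution weights.
    """
    # 3D query and key data via a (assumed 1x1) convolution: shape (D, H, W).
    query = np.einsum("dc,chw->dhw", w_query, feature_map)
    key = np.einsum("dc,chw->dhw", w_key, feature_map)

    # Averaging-based vertical projection (assumed: mean over width) -> (D, H).
    q_vert, k_vert = query.mean(axis=2), key.mean(axis=2)
    # 2D vertical data: (H, H) map; the softmax is an assumption.
    vertical = softmax(q_vert.T @ k_vert, axis=-1)

    # Averaging-based horizontal projection (assumed: mean over height) -> (D, W).
    q_horz, k_horz = query.mean(axis=1), key.mean(axis=1)
    # 2D horizontal data: (W, W) map; the softmax is an assumption.
    horizontal = softmax(q_horz.T @ k_horz, axis=-1)

    # Intermediate attention result: assumed outer product, shape (H, W, H, W).
    intermediate = np.einsum("ik,jl->ijkl", vertical, horizontal)

    # Final attention result: multiply the intermediate result with the feature map.
    final = np.einsum("ijkl,ckl->cij", intermediate, feature_map)

    # Apply the final attention result (assumed: residual addition).
    return feature_map + final


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 6, 5))        # C=8, H=6, W=5
w_q = rng.standard_normal((4, 8))         # D=4
w_k = rng.standard_normal((4, 8))
out = projected_self_attention(x, w_q, w_k)
print(out.shape)
```

Under these assumptions the two attention maps have H×H and W×W entries, compared with (H·W)×(H·W) entries for attention computed over every pair of positions. The sketch materializes the outer product only to mirror the claim's separate "intermediate attention result" step.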