US 12,333,811 B2
Permutation invariant convolution (PIC) for recognizing long-range activities, generating global representation of input streams and classifying activity based on global representation
Noureldien Mahmoud Elsayed Hussein, Amsterdam (NL); Efstratios Gavves, Amsterdam (NL); and Arnold Wilhelmus Maria Smeulders, Amsterdam (NL)
Assigned to QUALCOMM Incorporated, San Diego, CA (US)
Appl. No. 17/769,246
Filed by QUALCOMM Technologies, Inc., San Diego, CA (US)
PCT Filed Nov. 13, 2020, PCT No. PCT/US2020/060595
§ 371(c)(1), (2) Date Apr. 14, 2022,
PCT Pub. No. WO2021/097359, PCT Pub. Date May 20, 2021.
Claims priority of application No. 20190100517 (GR), filed on Nov. 15, 2019.
Prior Publication US 2024/0135708 A1, Apr. 25, 2024
Int. Cl. G06V 20/40 (2022.01); G06V 10/82 (2022.01); G06V 20/58 (2022.01); G06V 30/19 (2022.01)
CPC G06V 20/46 (2022.01) [G06V 10/82 (2022.01); G06V 20/41 (2022.01); G06V 20/49 (2022.01); G06V 20/582 (2022.01); G06V 20/44 (2022.01); G06V 30/19173 (2022.01)]
30 Claims
1. A method, comprising:
segmenting an input stream to generate a plurality of frame sets, each frame set including a plurality of frames;
identifying, by a permutation invariant convolutional layer of a neural network, for each frame set from the plurality of frame sets, a frame with a highest likelihood of including one or more actions of a set of predefined actions;
generating a global representation of the input stream from pooled representations of the identified frames; and
classifying a long-range activity based on the global representation.
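
The four steps of claim 1 can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the function names, the dot-product scoring against learned kernels, the mean pooling across frame sets, and the linear classifier are all assumptions chosen for illustration. The sketch only shows why taking a maximum within each frame set makes the result independent of the order of frames inside that set.

```python
import numpy as np


def segment(stream, set_size):
    """Split (T, D) per-frame features into (S, set_size, D) frame sets.

    Trailing frames that do not fill a whole set are dropped (an assumption).
    """
    n_sets = stream.shape[0] // set_size
    return stream[: n_sets * set_size].reshape(n_sets, set_size, stream.shape[1])


def identify_frames(frame_sets, kernels):
    """For each frame set, pick the frame with the highest action score.

    kernels is a hypothetical (K, D) matrix, one row per predefined action.
    A frame's score is its largest dot product with any kernel.
    """
    scores = frame_sets @ kernels.T              # (S, set_size, K)
    best = scores.max(axis=2).argmax(axis=1)     # (S,) index of the chosen frame
    selected = scores[np.arange(len(best)), best]  # (S, K) chosen frame's scores
    return best, selected


def classify_activity(stream, kernels, classifier, set_size):
    """Segment, identify, pool into a global representation, then classify."""
    frame_sets = segment(stream, set_size)
    best, selected = identify_frames(frame_sets, kernels)
    global_repr = selected.mean(axis=0)          # (K,) pooled over frame sets
    logits = classifier @ global_repr            # classifier: hypothetical (C, K)
    return int(logits.argmax()), best, global_repr
```

Because the selection depends only on which frames are present in a set, shuffling the frames inside any frame set leaves the global representation and the predicted class unchanged, barring exact score ties.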