US 12,443,654 B2
Searchable video with binary vectors
Dharanish Kedarisetti, Woburn, MA (US); and Yanyan Hu, Andover, MA (US)
Assigned to MOTOROLA SOLUTIONS, INC., Chicago, IL (US)
Filed by MOTOROLA SOLUTIONS, INC., Chicago, IL (US)
Filed on Jan. 12, 2023, as Appl. No. 18/096,234.
Prior Publication US 2024/0241908 A1, Jul. 18, 2024
Int. Cl. G06F 16/732 (2019.01); H04N 19/94 (2014.01)
CPC G06F 16/732 (2019.01) [H04N 19/94 (2014.11)]
15 Claims
 
1. A non-transitory machine-readable medium comprising instructions that, when executed by a processor, cause the processor to:
receive metadata describing an event that is detected within video captured by a camera;
generate a grid for a scene of the video, wherein the grid comprises a multiscale overlapping grid, wherein the multiscale overlapping grid includes multiple grids with different scales, and wherein each of the multiple grids represents an entirety of the scene of the video;
determine respective cells, of the multiple grids of the multiscale overlapping grid, that contain the event;
encode the event as a binary vector;
generate sets of respective binary vectors for the multiple grids, the sets of the respective binary vectors each including the binary vector encoding the event for a respective cell that contains the event, and wherein other binary vectors of the respective binary vectors indicate that associated cells are lacking the event;
associate the sets of the respective binary vectors with respective multiple grids of the multiscale overlapping grid;
store the sets of the respective binary vectors in a queryable datastore;
receive a query to find a given event, similar to the event, within the video;
encode the query as a query binary vector representative of the given event;
compare the query binary vector to the sets of the respective binary vectors; and
when the query binary vector is similar to the binary vector found in the sets of the respective binary vectors, identify the respective cells of the multiple grids that contain the event,
wherein when the event is found in a first multiscale grid that includes only one cell, the scene of the video is determined to include the event, and
wherein when the event is found in a second multiscale grid that includes a plurality of cells, the event is identified in a respective region of the scene of the video that includes the respective cell associated with the respective binary vector.
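
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The attribute-to-bit vocabulary (`ATTRIBUTE_BITS`), the grid scales (`SCALES`), the normalized bounding-box format, the all-zero vector for cells lacking the event, and the Hamming-distance threshold are all assumptions made for illustration; the claim does not specify any of them.

```python
# Illustrative sketch only: names, scales and the encoding are assumptions.
EMPTY = 0           # assumed: an all-zero vector marks a cell lacking the event
SCALES = (1, 2, 4)  # assumed: 1x1, 2x2 and 4x4 grids, each covering the whole scene

# Hypothetical attribute vocabulary; the claim does not define the encoding.
ATTRIBUTE_BITS = {"person": 0, "vehicle": 1, "red": 2, "blue": 3, "moving": 4}


def encode(attributes):
    """Encode attribute names as an integer used as a binary vector."""
    vector = 0
    for name in attributes:
        vector |= 1 << ATTRIBUTE_BITS[name]
    return vector


def cells_containing(box, n):
    """Return the (row, col) cells of an n x n grid overlapped by a box.

    box is (x0, y0, x1, y1) in coordinates normalized to the range 0..1.
    """
    x0, y0, x1, y1 = box
    last = n - 1
    cols = range(min(int(x0 * n), last), min(int(x1 * n), last) + 1)
    rows = range(min(int(y0 * n), last), min(int(y1 * n), last) + 1)
    return [(r, c) for r in rows for c in cols]


def index_event(attributes, box):
    """Build one set of binary vectors per grid scale for a single event."""
    vector = encode(attributes)
    index = {}
    for n in SCALES:
        grid = {(r, c): EMPTY for r in range(n) for c in range(n)}
        for cell in cells_containing(box, n):
            grid[cell] = vector
        index[n] = grid
    return index


def hamming(a, b):
    """Count the bit positions in which two binary vectors differ."""
    return bin(a ^ b).count("1")


def search(index, query_attributes, max_distance=1):
    """Return, per grid scale, the cells whose vector is similar to the query."""
    query = encode(query_attributes)
    return {
        n: [cell for cell, stored in grid.items()
            if stored != EMPTY and hamming(query, stored) <= max_distance]
        for n, grid in index.items()
    }


index = index_event(["person", "red", "moving"], (0.6, 0.1, 0.7, 0.2))
hits = search(index, ["person", "red"])
```

In this sketch, a hit in the 1x1 grid means only that the scene includes the event, while hits in the 2x2 and 4x4 grids narrow it to progressively smaller regions of the scene.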
 
6. A non-transitory machine-readable medium comprising instructions that, when executed by a processor, cause the processor to:
receive a query to find an event within video;
encode the query as a query binary vector representative of the event;
compare the query binary vector to sets of respective binary vectors generated for the video to find similar events,
wherein a scene of the video is associated with a grid, wherein the grid comprises a multiscale overlapping grid,
wherein the multiscale overlapping grid includes multiple grids with different scales, and each of the multiple grids represents an entirety of the scene of the video,
wherein the sets of respective binary vectors are respectively associated with the multiple grids,
wherein the sets of the respective binary vectors each include a respective binary vector encoding a given event, similar to the event, for a respective cell that contains the given event, and wherein other binary vectors of the respective binary vectors indicate that associated cells are lacking the event; and
return a result indicative of a similarity of the query binary vector to one or more respective binary vectors of the sets of respective binary vectors,
wherein the instructions are further to:
when the query binary vector is similar to the respective binary vector found in the sets of the respective binary vectors, identify the respective cells of the multiple grids that contain the event,
when the event is found in a first multiscale grid that includes only one cell that includes the entirety of the scene of the video, the result indicates that the entirety of the scene of the video includes the event, and
when the event is found in a second multiscale grid that includes a plurality of cells, the result indicates that the event is identified in a respective region of the scene of the video that includes the respective cell associated with the respective binary vector.
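
The query side recited in claim 6 can be sketched as follows, under stated assumptions. The datastore layout (a dictionary keyed by grid scale, then by cell), the 8-bit vector width, the similarity measure (one minus the normalized Hamming distance), and the 0.75 threshold are hypothetical choices; the claim requires only that a result indicative of similarity be returned.

```python
# Illustrative sketch only: the layout, vector width and threshold are assumptions.
VECTOR_BITS = 8  # assumed vector width


def similarity(a, b, bits=VECTOR_BITS):
    """Return 1.0 for identical vectors, decreasing with each differing bit."""
    return 1.0 - bin(a ^ b).count("1") / bits


def cell_region(n, row, col):
    """Return the normalized (x0, y0, x1, y1) rectangle of a cell in an n x n grid."""
    size = 1.0 / n
    return (col * size, row * size, (col + 1) * size, (row + 1) * size)


def run_query(datastore, query_vector, threshold=0.75):
    """Compare the query vector to every stored vector and report the matches."""
    results = []
    for n, cells in datastore.items():
        for (row, col), stored in cells.items():
            if stored == 0:  # assumed: all-zero vector means the cell lacks the event
                continue
            score = similarity(query_vector, stored)
            if score >= threshold:
                results.append({
                    "grid": n,
                    "cell": (row, col),
                    "similarity": score,
                    "whole_scene": n == 1,  # single-cell grid covers the entire scene
                    "region": cell_region(n, row, col),
                })
    return results


# Hypothetical stored sets for a 1x1 grid and a 2x2 grid of the same scene.
datastore = {
    1: {(0, 0): 0b10101},
    2: {(0, 0): 0, (0, 1): 0b10101, (1, 0): 0, (1, 1): 0},
}
results = run_query(datastore, 0b00101)
```

Each result carries both the similarity score and the region it applies to, so a caller could distinguish a match that says only that the scene includes the event from one that localizes it to the top-right quadrant.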
 
11. A non-transitory machine-readable medium comprising instructions that, when executed by a processor, cause the processor to:
generate a set of binary vectors representative of events detected within a video;
wherein each binary vector is associated with a spatial dimension, a temporal dimension, or the spatial and temporal dimensions of the video;
wherein different subsets of the set of binary vectors are associated with different scales of a multiscale overlapping grid defined for a scene of the video, and
wherein the multiscale overlapping grid includes multiple grids with different scales, and wherein each of the multiple grids represents an entirety of the scene of the video,
wherein each of the different subsets includes a respective binary vector representing a given event for a respective cell that contains the given event, and other binary vectors of each of the different subsets indicate that associated cells are lacking the given event,
wherein when a first multiscale grid includes only one cell, the first multiscale grid is for determining that an entirety of the scene of the video includes the given event, and
wherein when a second multiscale grid includes a plurality of cells, the second multiscale grid is for determining a respective region of the scene of the video that includes the respective cell, associated with the respective binary vector, that includes the given event.
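
Claim 11 associates binary vectors with spatial and/or temporal dimensions of the video. One way to picture this is a sketch that keys each subset of vectors by a time bucket and a grid scale. The 60-second bucket length, the two scales, the point-based event location, and the choice to OR together vectors of events that fall in the same cell are all assumptions made for illustration and are not stated in the claim.

```python
# Illustrative sketch only: bucket length, scales and merging rule are assumptions.
SCALES = (1, 2)      # assumed: a 1x1 grid and a 2x2 grid over the whole scene
BUCKET_SECONDS = 60  # assumed length of one temporal cell


def build_subsets(events):
    """Group binary vectors into subsets keyed by (time bucket, grid scale).

    events is an iterable of (timestamp_seconds, x, y, vector), where x and y
    are normalized to the range 0..1 and vector is an integer bit vector.
    """
    subsets = {}
    for timestamp, x, y, vector in events:
        bucket = int(timestamp // BUCKET_SECONDS)
        for n in SCALES:
            # Every cell starts as 0, meaning the cell lacks any event.
            cells = subsets.setdefault(
                (bucket, n), {(r, c): 0 for r in range(n) for c in range(n)}
            )
            row = min(int(y * n), n - 1)
            col = min(int(x * n), n - 1)
            cells[(row, col)] |= vector  # assumed: merge co-located events with OR
    return subsets


# Two hypothetical events, one in the first minute and one in the second.
events = [(10.0, 0.8, 0.2, 0b0101), (75.0, 0.1, 0.9, 0b0011)]
subsets = build_subsets(events)
```

Here `subsets[(0, 2)]` holds the 2x2 grid for the first minute, with the event's vector in the top-right cell and zero vectors elsewhere, while `subsets[(0, 1)]` records only that the scene included the event during that minute.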