US 12,456,159 B1
Systems and methods for object detection in spherical videos
Amine Chadli, San Mateo, CA (US); Rawia Mhiri Ep Hakim, Lille (FR); Ralph McEntagart, Ville en Sallaz (FR); and Amine Belhakimi, Paris (FR)
Assigned to GoPro, Inc., San Mateo, CA (US)
Filed by GoPro, Inc., San Mateo, CA (US)
Filed on Jun. 14, 2024, as Appl. No. 18/744,422.
Int. Cl. G06T 3/16 (2024.01); G06T 5/80 (2024.01); G06T 7/20 (2017.01); G06T 7/70 (2017.01); G06V 20/40 (2022.01)

CPC G06T 3/16 (2024.01) [G06T 5/80 (2024.01); G06T 7/20 (2013.01); G06T 7/70 (2017.01); G06V 20/46 (2022.01); G06T 2207/10016 (2013.01)]

20 Claims

1. A system for object detection in spherical videos, the system comprising:

one or more physical processors configured by machine-readable instructions to:

obtain video information defining a spherical video, the spherical video having a progress length, the spherical video including spherical visual content viewable as a function of progress through the progress length, wherein the spherical visual content has a field of view of 360 degrees;

generate multiple perspective projections of the spherical visual content, individual perspective projections providing a two-dimensional view of an extent of the spherical visual content, adjacent perspective projections having an overlap, wherein the multiple perspective projections of the spherical visual content are generated without use of equirectangular projection;

perform object detection in the multiple perspective projections, the object detection including identification of objects depicted within the multiple perspective projections, determination of placement of the identified objects, and generation of scores for the identified objects, the scores for the identified objects indicating confidence of the object detection for the identified objects, the placement of the identified objects including positions and sizes of the identified objects in the multiple perspective projections, wherein a given object is identified within a given perspective projection, the given object being a given distance from a boundary of the given perspective projection, the given object having a given score;

modify one or more of the scores for the identified objects based on proximity of the identified objects to boundaries of the multiple perspective projections, wherein the given score for the given object is modified based on the given distance of the given object from the boundary of the given perspective projection;

project the placement of the identified objects within the multiple perspective projections to the spherical visual content;

identify multiple detections of a single object within the identified objects;

filter out one or more of the multiple detections of the single object from the identified objects as being redundant detection based on the scores for the identified objects; and

perform object tracking in the spherical video based on the projected placement of the identified objects in the spherical visual content.
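
The score-modification, projection, and redundancy-filtering steps recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The linear penalty, the pixel margin, the pinhole camera model with yaw-only rotation, the angular threshold, and every function and field name below are assumptions introduced for illustration; the claim does not specify any of them.

```python
import math

def boundary_distance(box, width, height):
    """Smallest pixel distance from a box (x, y, w, h) to any edge of its projection."""
    x, y, w, h = box
    return min(x, y, width - (x + w), height - (y + h))

def adjust_score(score, distance, margin=32.0):
    """Assumed rule: scale the score down linearly inside `margin` pixels of a boundary."""
    if distance >= margin:
        return score
    return score * max(distance, 0.0) / margin

def pixel_to_direction(u, v, width, height, fov_deg, yaw_deg):
    """Map a pixel of a pinhole view (rotated by yaw about the vertical axis)
    to a unit direction on the sphere."""
    f = (width / 2) / math.tan(math.radians(fov_deg) / 2)  # focal length in pixels
    x, y, z = u - width / 2, v - height / 2, f
    n = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / n, y / n, z / n
    a = math.radians(yaw_deg)
    return (x * math.cos(a) + z * math.sin(a), y, -x * math.sin(a) + z * math.cos(a))

def angle_between(d1, d2):
    """Angle in degrees between two unit directions."""
    dot = sum(p * q for p, q in zip(d1, d2))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))

def filter_redundant(detections, max_angle_deg=5.0):
    """Greedy filter: visit detections by descending score and drop any detection
    with the same label that lies within `max_angle_deg` of one already kept."""
    kept = []
    for det in sorted(detections, key=lambda d: d["score"], reverse=True):
        if all(det["label"] != k["label"]
               or angle_between(det["direction"], k["direction"]) > max_angle_deg
               for k in kept):
            kept.append(det)
    return kept
```

In this sketch, an object that is cut off near the edge of one projection has its score reduced, so a detection of the same object from the overlapping adjacent projection, where it sits farther from the boundary, is the one kept after filtering.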