US 12,190,587 B2
Recursive segment to scene segmentation for cloud-based coding of HDR video
Harshad Kadu, Santa Clara, CA (US); Guan-Ming Su, Fremont, CA (US); Neeraj J. Gadgil, Pune (IN); and Tsung-Wei Huang, Sunnyvale, CA (US)
Assigned to Dolby Laboratories Licensing Corporation, San Francisco, CA (US)
Appl. No. 18/044,771
Filed by Dolby Laboratories Licensing Corporation, San Francisco, CA (US)
PCT Filed Sep. 17, 2021, PCT No. PCT/US2021/050838 § 371(c)(1), (2) Date Mar. 9, 2023, PCT Pub. No. WO2022/061089, PCT Pub. Date Mar. 24, 2022.
Claims priority of provisional application 63/080,255, filed on Sep. 18, 2020.
Claims priority of application No. 20196876 (EP), filed on Sep. 18, 2020.
Prior Publication US 2023/0343100 A1, Oct. 26, 2023
Int. Cl. G06V 20/40 (2022.01); G06T 5/40 (2006.01); H04N 19/142 (2014.01); H04N 19/192 (2014.01); H04N 19/98 (2014.01)

CPC G06V 20/49 (2022.01) [G06T 5/40 (2013.01); H04N 19/142 (2014.11); H04N 19/192 (2014.11); H04N 19/98 (2014.11)]

13 Claims

1. A method for segmenting a video segment into scenes using a cloud-based system for encoding high dynamic range video, the method comprising:

receiving in a current computing node of the cloud-based system a first video sequence comprising video frames in a high dynamic range;

generating for each video frame in the first video sequence a frame-based forward reshaping function mapping the video frame from the high dynamic range to a second dynamic range lower than the high dynamic range;

generating, using a set of scene cuts for the first video sequence, a set of primary scenes for the first video sequence;

generating a second set of scenes for the first video sequence based on the set of primary scenes, wherein a primary scene belonging to a parent scene with video frames to be coded across the current computing node and a neighbor computing node of the cloud-based system is divided into secondary scenes, wherein, given a primary scene, generating a list of secondary scenes for the primary scene comprises:

initializing a set of secondary scenes and a set of violation scenes based on the set of primary scenes; and

generating one or more sets of smoothness thresholds based on the frame-based forward reshaping functions, wherein generating the one or more smoothness thresholds comprises computing a first set of smoothness thresholds ℑ_j^{DC} for each frame j in the first video sequence,

ℑ_j^{DC} = χ_j − χ_{j−1},

wherein

χ_j = (1/(H×W)) Σ_b T_j^F(b) × h_j^v(b),

wherein T_j^F(b) denotes a frame-based forward reshaping function for frame j in the first video sequence as a function of input codewords b, h_j^v(b) denotes a histogram of the j-th frame in the first video sequence, and H and W denote width and height values for the frames in the first video sequence;

generating for each scene in the second set of scenes a scene-based forward reshaping function mapping the video frames in the scene from the high dynamic range to the second dynamic range;

applying the scene-based forward reshaping functions to the video frames in the first video sequence to generate an output video sequence comprising video frames in the second dynamic range; and

compressing the output video sequence to generate a coded bitstream.
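
The first set of smoothness thresholds recited in claim 1 can be illustrated with a short sketch. It is not the patented implementation. It assumes that each frame-based forward reshaping function is stored as a lookup table indexed by input codeword b, that each histogram counts every pixel of an H×W plane, and that the first frame, which has no predecessor, receives a value of 0.0. The function names dc_value and dc_thresholds are hypothetical. Under the histogram assumption, χ_j equals the average reshaped pixel value of frame j, so ℑ_j^{DC} is the change in that average between consecutive frames.

```python
from typing import List, Sequence


def dc_value(flut: Sequence[float], hist: Sequence[int],
             height: int, width: int) -> float:
    """chi_j = (1 / (H * W)) * sum_b T_j^F(b) * h_j^v(b).

    flut: forward reshaping lookup table T_j^F, one entry per codeword b
          (storage as a lookup table is an assumption of this sketch).
    hist: histogram h_j^v, number of pixels at each codeword b.
    """
    if len(flut) != len(hist):
        raise ValueError("lookup table and histogram must cover the same codewords")
    return sum(t * h for t, h in zip(flut, hist)) / (height * width)


def dc_thresholds(fluts: Sequence[Sequence[float]],
                  hists: Sequence[Sequence[int]],
                  height: int, width: int) -> List[float]:
    """Return chi_j - chi_{j-1} for every frame j of the sequence.

    Frame 0 has no predecessor; assigning it 0.0 is an assumption of
    this sketch, not something stated in the claim.
    """
    chi = [dc_value(f, h, height, width) for f, h in zip(fluts, hists)]
    return [0.0] + [chi[j] - chi[j - 1] for j in range(1, len(chi))]


if __name__ == "__main__":
    # Two 2x2 frames with four possible codewords (toy data).
    fluts = [[0, 1, 2, 3], [0, 2, 4, 6]]
    hists = [[1, 1, 1, 1], [0, 0, 2, 2]]
    print(dc_thresholds(fluts, hists, height=2, width=2))  # [0.0, 3.5]
```

In the toy data, χ_0 = 6/4 = 1.5 and χ_1 = 20/4 = 5.0, so the value for frame 1 is 3.5.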