US 11,990,131 B2
Method for processing a video file comprising audio content and visual content comprising text content
Jain Rahul, Thergaon Pune (IN); Sen Rudreshwar, Bangalore (IN); Goyal Anuj, Pune (IN); Chavan Dhananjay, Pune (IN); Sinha Utsav, Kolkata (IN); and Shekhar Bavanari, Telangana (IN)
Assigned to BULL SAS, Les Clayes sous Bois (FR)
Filed by ATOS GLOBAL IT SOLUTIONS AND SERVICES PRIVATE LIMITED, Maharashtra (IN)
Filed on Jul. 15, 2021, as Appl. No. 17/376,352.
Claims priority of application No. 20187791 (EP), filed on Jul. 24, 2020.
Prior Publication US 2022/0028391 A1, Jan. 27, 2022
Int. Cl. G10L 15/26 (2006.01)

CPC G10L 15/26 (2013.01)

13 Claims

1. A method for processing a video file, said video file comprising audio content and visual content, the visual content comprising text content, wherein the method comprises:

extracting, by a processing circuit comprising a processor and a memory, the text content in the visual content;

generating, by the processing circuit, a context information for the audio content based on the text content extracted from said visual content;

converting, by the processing circuit, the audio content into text by using the context information generated based on the text content extracted from the visual content of the video file;

generating, by the processing circuit, an additional context information for the audio content based on the text obtained by converting the audio content;

combining, by the processing circuit, the context information generated based on the text content extracted from the visual content with the additional context information in order to obtain a combined context information; and

re-converting, by the processing circuit, the audio content into text by using the combined context information.
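
The two-pass flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It makes several assumptions: the on-screen text is taken as already extracted (a real system would run OCR on the frames), the audio is modelled as a list of segments that each carry candidate transcriptions, context information is a simple keyword count, and "conversion using context" is approximated by picking the candidate that shares the most weighted keywords with the context. All names (`tokens`, `build_context`, `transcribe`, `process_video`) are hypothetical.

```python
from collections import Counter

# Assumption: a tiny stop-word list so that function words do not count as context.
STOPWORDS = frozenset({"the", "a", "an", "of", "and", "to", "in", "is", "on", "we"})


def tokens(text):
    """Lower-case the text, strip basic punctuation, and drop stop words."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w and w not in STOPWORDS]


def build_context(text):
    """Hypothetical context information: keyword -> number of occurrences."""
    return Counter(tokens(text))


def transcribe(audio, context):
    """Toy stand-in for context-aware speech-to-text.

    `audio` is a list of segments; each segment is a list of candidate
    transcriptions ordered by (assumed) acoustic score. For each segment the
    candidate with the highest context weight is chosen; ties keep the first.
    """
    chosen = []
    for candidates in audio:
        best = max(candidates, key=lambda c: sum(context[w] for w in set(tokens(c))))
        chosen.append(best)
    return " ".join(chosen)


def process_video(frame_texts, audio):
    """Run the claimed sequence of steps on toy inputs."""
    # Steps 1-2: text content from the visual content -> context information.
    visual_context = build_context(" ".join(frame_texts))
    # Step 3: first conversion of the audio, using the visual context.
    first_pass = transcribe(audio, visual_context)
    # Step 4: additional context information from the first transcript.
    additional_context = build_context(first_pass)
    # Step 5: combine both contexts (Counter addition sums the weights).
    combined_context = visual_context + additional_context
    # Step 6: re-convert the audio, using the combined context.
    second_pass = transcribe(audio, combined_context)
    return first_pass, second_pass


if __name__ == "__main__":
    frames = ["Kubernetes cluster setup"]
    audio = [
        ["we deploy the cuban eighties cluster", "we deploy the kubernetes cluster"],
        ["each note runs pods", "each node runs pods"],
        ["kubernetes schedules every node"],
        ["a node joins the cluster"],
    ]
    first, second = process_video(frames, audio)
    print("first pass: ", first)
    print("second pass:", second)
```

In this toy input, the visual context resolves "kubernetes" on the first pass but cannot decide between "note" and "node". The additional context built from the first transcript contains "node" twice, so the second pass, run with the combined context, selects "each node runs pods".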