US 11,893,988 B2
Speech control method, electronic device, and storage medium
Song Yang, Beijing (CN); Saisai Zou, Beijing (CN); Jieyi Cao, Beijing (CN); and Junyao Shao, Beijing (CN)
Assigned to BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD., Beijing (CN)
Filed by BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD., Beijing (CN)
Filed on Jun. 24, 2021, as Appl. No. 17/357,598.
Claims priority of application No. 202011211760.9 (CN), filed on Nov. 3, 2020.
Prior Publication US 2021/0319795 A1, Oct. 14, 2021
Int. Cl. G10L 15/22 (2006.01); G06F 16/635 (2019.01); G10L 15/05 (2013.01)
CPC G10L 15/22 (2013.01) [G06F 16/635 (2019.01); G10L 15/05 (2013.01); G10L 2015/223 (2013.01)]
12 Claims
 
1. A speech control method, comprising:
acquiring target audio data sent by a client, the target audio data comprising audio data collected by the client within a target duration before wake-up and audio data collected by the client after wake-up;
performing speech recognition on the target audio data; and
controlling the client based on an instruction recognized from a second audio segment of the target audio data in response to recognizing a wake-up word from a first audio segment at the beginning of the target audio data, wherein the second audio segment is later than the first audio segment or has an overlapping portion with the first audio segment, and wherein a duration of the first audio segment is greater than the target duration;
wherein the method further comprises:
deleting an audio segment of the target duration at a beginning position of the target audio data to acquire retained audio data in response to not recognizing the wake-up word from the first audio segment, or not recognizing the instruction from the second audio segment;
re-performing speech recognition on the retained audio data to obtain a re-divided first audio segment and a re-divided second audio segment; and
controlling the client based on an instruction recognized from the re-divided second audio segment.
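
The flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: audio is modeled as a list of string tokens rather than PCM samples, durations are counted in frames, and the names `WAKE_WORD`, `KNOWN_INSTRUCTIONS`, `split_segments`, and `handle_target_audio` are hypothetical. Two further assumptions are made: the first segment has a fixed length, and the retry path does not re-check the wake-up word, because the claim text only recites acting on the instruction from the re-divided second segment.

```python
from typing import List, Optional, Tuple

# Stand-ins for illustration only; a real system would run a speech recognizer.
WAKE_WORD = "hello-device"                    # hypothetical wake-up word
KNOWN_INSTRUCTIONS = {"play-music", "stop"}   # hypothetical instruction set


def recognize_wake_word(segment: List[str]) -> bool:
    """Return True if the wake-up word appears in the segment."""
    return WAKE_WORD in segment


def recognize_instruction(segment: List[str]) -> Optional[str]:
    """Return the first known instruction in the segment, or None."""
    for frame in segment:
        if frame in KNOWN_INSTRUCTIONS:
            return frame
    return None


def split_segments(audio: List[str], first_len: int) -> Tuple[List[str], List[str]]:
    """Divide audio into a first segment (at the beginning) and a later second segment."""
    return audio[:first_len], audio[first_len:]


def handle_target_audio(
    audio: List[str], target_duration: int, first_segment_duration: int
) -> Optional[str]:
    """Return the instruction to control the client with, or None if there is none."""
    # The claim requires the first segment to be longer than the target duration.
    if first_segment_duration <= target_duration:
        raise ValueError("first segment must be longer than the target duration")

    # First pass: wake-up word in the first segment, instruction in the second.
    first, second = split_segments(audio, first_segment_duration)
    if recognize_wake_word(first):
        instruction = recognize_instruction(second)
        if instruction is not None:
            return instruction

    # Retry: delete the leading audio of the target duration, then re-divide
    # the retained audio and recognize again.
    retained = audio[target_duration:]
    _, second = split_segments(retained, first_segment_duration)
    return recognize_instruction(second)
```

For example, with `target_duration=2` and `first_segment_duration=3`, the input `["n", "n", "n", "n", "hello-device", "play-music"]` fails the first pass because the wake-up word falls outside the first three frames. Deleting the leading two frames shifts the segment boundary, and the retry recovers `"play-music"`.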