US 12,154,559 B2
Speech recognition device and method
Chanwon Seo, Suwon-si (KR); Yehoon Kim, Suwon-si (KR); and Sojung Yun, Suwon-si (KR)
Assigned to SAMSUNG ELECTRONICS CO., LTD., Suwon-si (KR)
Appl. No. 16/770,243
Filed by SAMSUNG ELECTRONICS CO., LTD., Suwon-si (KR)
PCT Filed Dec. 19, 2018, PCT No. PCT/KR2018/016219 § 371(c)(1), (2) Date Jun. 5, 2020, PCT Pub. No. WO2019/124963, PCT Pub. Date Jun. 27, 2019.
Claims priority of application No. 10-2017-0175338 (KR), filed on Dec. 19, 2017.
Prior Publication US 2020/0372911 A1, Nov. 26, 2020
Int. Cl. G10L 15/22 (2006.01); G10L 15/20 (2006.01); G10L 15/25 (2013.01)

CPC G10L 15/22 (2013.01) [G10L 15/20 (2013.01); G10L 15/25 (2013.01)]

8 Claims

1. A speech recognition device comprising:

a microphone;

a memory storing information about at least one pre-set output location of a voice signal from at least one external device; and

a processor configured to:

receive a voice signal through the microphone,

generate voice characteristic data by determining whether an output location of the voice signal corresponds to the at least one pre-set output location, by determining a number of output locations from which the voice signal is output using a data recognition model and by determining whether the voice signal is a reconstructed signal from a compressed signal by analyzing waveform and frequency of the voice signal based on a neural network,

based on the voice characteristic data, identify whether the voice signal is a user-spoken voice signal or a voice signal output from the at least one external device, wherein the identifying comprises determining the voice signal as more likely being a voice signal output from the at least one external device than being a user-spoken signal based on determining that the voice signal is output from a plurality of output locations, and determining the voice signal as a voice signal output from the at least one external device based on determining that the voice signal is the reconstructed signal from the compressed signal,

based on the voice signal being identified as the user-spoken voice signal, identify the voice signal as a voice command and perform an operation corresponding to the voice command.
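
The decision flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the three voice characteristics (pre-set location match, number of output locations, and whether the signal is reconstructed from a compressed signal) have already been produced upstream by the data recognition model and the neural network. All names, the score weights, and the threshold are hypothetical and chosen only for illustration. Treating a pre-set location match as evidence of an external device is likewise an assumption, since the claim does not state how that determination is weighed.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class VoiceCharacteristics:
    """Hypothetical container for the voice characteristic data of claim 1."""
    matches_preset_location: bool  # output location equals a stored pre-set location
    num_output_locations: int      # number of locations the signal is output from
    is_reconstructed: bool         # signal was reconstructed from a compressed signal


def external_device_score(c: VoiceCharacteristics) -> float:
    """Return an illustrative likelihood that an external device produced the signal.

    The 0.5 weights are assumptions, not values from the patent.
    """
    score = 0.0
    if c.matches_preset_location:
        score += 0.5  # assumption: a pre-set location match suggests a device
    if c.num_output_locations > 1:
        score += 0.5  # claim 1: plural output locations make a device more likely
    return score


def identify_source(c: VoiceCharacteristics, threshold: float = 0.5) -> str:
    """Classify the signal as 'external_device' or 'user'."""
    # Claim 1: a reconstructed (decompressed) signal is determined to be device output.
    if c.is_reconstructed:
        return "external_device"
    if external_device_score(c) >= threshold:
        return "external_device"
    return "user"


def handle_voice_signal(
    c: VoiceCharacteristics,
    command: str,
    run_command: Callable[[str], str],
) -> Optional[str]:
    """Run the command only when the signal is identified as user-spoken."""
    if identify_source(c) == "user":
        return run_command(command)
    return None  # device output is not treated as a voice command
```

In this sketch a signal from a single, non-pre-set location that is not reconstructed is treated as user-spoken and its command is executed; any reconstructed signal is rejected regardless of the other characteristics.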