US 12,131,738 B2
Electronic apparatus and method for controlling thereof
Jaeyoung Roh, Suwon-si (KR); Hejung Yang, Suwon-si (KR); Hojun Jin, Suwon-si (KR); and Donghan Jang, Suwon-si (KR)
Assigned to SAMSUNG ELECTRONICS CO., LTD., Suwon-si (KR)
Filed by SAMSUNG ELECTRONICS CO., LTD., Suwon-si (KR)
Filed on Sep. 14, 2022, as Appl. No. 17/944,401.
Application 17/944,401 is a continuation of application No. PCT/KR2021/012883, filed on Sep. 17, 2021.
Claims priority of application No. 10-2021-0000983 (KR), filed on Jan. 5, 2021.
Prior Publication US 2023/0017927 A1, Jan. 19, 2023
Int. Cl. G10L 15/22 (2006.01); G10L 15/02 (2006.01); G10L 15/26 (2006.01)

CPC G10L 15/22 (2013.01) [G10L 15/02 (2013.01); G10L 15/26 (2013.01); G10L 2015/223 (2013.01)]

15 Claims

1. An electronic apparatus comprising:

a microphone;

a communication interface;

a memory configured to store at least one instruction; and

a processor configured to execute the at least one instruction to:

obtain a user voice input for registering a wake-up voice input via the microphone;

input the user voice input into a trained neural network model to obtain a first feature vector corresponding to text included in the user voice input;

receive a verification data set determined based on information related to the text included in the user voice input from an external server via the communication interface;

input a verification voice input included in the verification data set into the trained neural network model to obtain a second feature vector corresponding to the verification voice input; and

identify whether to register the user voice input as the wake-up voice input based on a similarity between the first feature vector and the second feature vector.

11. A method of controlling an electronic apparatus, the method comprising:

obtaining a user voice input for registering a wake-up voice input;

inputting the user voice input into a trained neural network model to obtain a first feature vector corresponding to text included in the user voice input;

receiving a verification data set determined based on information related to the text included in the user voice input from an external server;

inputting a verification voice input included in the verification data set into the trained neural network model to obtain a second feature vector corresponding to the verification voice input; and

identifying whether to register the user voice input as the wake-up voice input based on a similarity between the first feature vector and the second feature vector.
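
The registration decision recited in claims 1 and 11 can be illustrated with a minimal sketch. The claims do not name a similarity measure, a threshold, or the direction of the decision, so everything specific here is an assumption made for illustration: cosine similarity, a fixed threshold of 0.8, and rejection of the candidate when any verification voice input is at least that similar to it. The names `embed`, `should_register`, and `toy_vectors` are hypothetical. `embed` stands in for the trained neural network model, and `verification_voices` stands in for the verification data set already received from the external server.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_register(
    user_voice: object,
    verification_voices: Sequence[object],
    embed: Callable[[object], Vector],
    threshold: float = 0.8,
) -> bool:
    """Decide whether to register user_voice as the wake-up voice input.

    ASSUMPTIONS (not stated in the claims): cosine similarity as the measure,
    a fixed threshold, and rejection when any verification voice input is at
    least that similar to the user voice input.
    """
    first = embed(user_voice)  # first feature vector
    for voice in verification_voices:
        second = embed(voice)  # second feature vector
        if cosine_similarity(first, second) >= threshold:
            return False  # too close to a verification input: do not register
    return True


# Hypothetical toy data: a lookup table replaces the neural network model.
toy_vectors = {
    "hello robot": [1.0, 0.0],
    "hello robin": [0.9, 0.1],
    "play music": [0.0, 1.0],
}
embed = toy_vectors.__getitem__

accepted = should_register("hello robot", ["play music"], embed)
rejected = should_register("hello robot", ["hello robin", "play music"], embed)
```

With this toy data, `accepted` is `True` because the only verification vector is orthogonal to the candidate, and `rejected` is `False` because "hello robin" lies almost parallel to it. An implementation that instead treats high similarity as grounds for registration would invert the comparison. The claims cover either reading.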