US 12,118,996 B2
Method for processing voice signals of multiple speakers, and electronic device according thereto
Young Ho Han, Suwon-si (KR); Nam Hoon Kim, Suwon-si (KR); Jae Young Roh, Suwon-si (KR); Chi Youn Park, Suwon-si (KR); Kyung Min Lee, Suwon-si (KR); Keun Seok Cho, Suwon-si (KR); and Jong Youb Ryu, Suwon-si (KR)
Assigned to Samsung Electronics Co., Ltd., Suwon-si (KR)
Filed by Samsung Electronics Co., Ltd., Suwon-si (KR)
Filed on Oct. 27, 2022, as Appl. No. 17/975,074.
Application 17/975,074 is a continuation of application No. 16/755,383, granted, now 11,495,222, previously published as PCT/KR2018/013821, filed on Nov. 13, 2018.
Claims priority of application No. 10-2017-0175339 (KR), filed on Dec. 19, 2017.
Prior Publication US 2023/0040938 A1, Feb. 9, 2023
Int. Cl. G10L 15/00 (2013.01); G10L 15/02 (2006.01); G10L 15/10 (2006.01); G10L 15/22 (2006.01)

CPC G10L 15/22 (2013.01) [G10L 15/02 (2013.01); G10L 15/10 (2013.01)]

20 Claims

1. An electronic device comprising:

a receiver receiving a speech signal;

memory storing one or more computer programs; and

one or more processors communicatively coupled to the receiver and the memory,

wherein the one or more computer programs include computer-executable instructions that, when executed by the one or more processors, cause the electronic device to:

control the receiver to receive the speech signal,

determine whether the speech signal comprises speech signals of a plurality of different speakers,

in response to determining that the speech signal comprises the speech signals of the plurality of different speakers, detect feature information from a speech signal of each speaker,

based on the feature information, determine relationships between speech content of the plurality of different speakers,

based on the determined relationships between the speech content of the plurality of different speakers, determine that speech content of a first speaker among the plurality of different speakers and speech content of a second speaker among the plurality of different speakers are generated in a same speech domain and that conflicts occur between the speech content of the first speaker and the speech content of the second speaker, and

based on the determining that conflicts occur and the determined relationships between the speech content of the plurality of different speakers, control the electronic device and at least one other electronic device to perform an operation corresponding to each speech content of the plurality of different speakers.
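
The control flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the feature information has already been extracted from each speaker's speech signal and reduced to a hypothetical `Utterance` record with a `domain` and a `request` field. The names `has_multiple_speakers`, `find_conflicts`, and `plan_operations`, the device names, and the round-robin assignment rule are all illustrative assumptions that do not come from the claim.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """Hypothetical feature record for one speaker's speech content."""
    speaker: str
    domain: str   # assumed label for the speech domain, e.g. "media"
    request: str  # assumed normalized request text


def has_multiple_speakers(utterances):
    """True when the received speech contains more than one speaker."""
    return len({u.speaker for u in utterances}) > 1


def find_conflicts(utterances):
    """Pairs from different speakers that share a domain but differ in request."""
    conflicts = []
    for i, a in enumerate(utterances):
        for b in utterances[i + 1:]:
            if (a.speaker != b.speaker and a.domain == b.domain
                    and a.request != b.request):
                conflicts.append((a, b))
    return conflicts


def plan_operations(utterances, devices):
    """Map each request to a device (assumed rule: spread conflicts round-robin)."""
    plan = {d: [] for d in devices}
    if not has_multiple_speakers(utterances):
        plan[devices[0]].extend(u.request for u in utterances)
        return plan
    conflicted = {u for pair in find_conflicts(utterances) for u in pair}
    slot = 0
    for u in utterances:
        if u in conflicted:
            # Conflicting requests are placed on successive devices.
            plan[devices[slot % len(devices)]].append(u.request)
            slot += 1
        else:
            # Non-conflicting requests stay on the primary device.
            plan[devices[0]].append(u.request)
    return plan


utterances = [
    Utterance("A", "media", "play movie"),
    Utterance("B", "media", "play music"),
    Utterance("A", "weather", "weather today"),
]
plan = plan_operations(utterances, ["tv", "speaker"])
```

In this example, speakers A and B make different requests in the same assumed `media` domain, so the two requests are placed on different devices, while the non-conflicting weather request stays on the primary device.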