US 12,300,266 B2
Computing device for attention-based joint training with noise suppression model for sound event detection technology robust against noise environment and method of thereof
Joon-Hyuk Chang, Seoul (KR); and Jin Young Son, Seoul (KR)
Assigned to IUCF-HYU (INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY), Seoul (KR)
Filed by IUCF-HYU (INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY), Seoul (KR)
Filed on Oct. 5, 2022, as Appl. No. 17/960,563.
Claims priority of application No. 10-2021-0177369 (KR), filed on Dec. 13, 2021.
Prior Publication US 2023/0186940 A1, Jun. 15, 2023
Int. Cl. G10L 25/51 (2013.01); G10L 15/20 (2006.01); G10L 25/30 (2013.01)

CPC G10L 25/51 (2013.01) [G10L 15/20 (2013.01); G10L 25/30 (2013.01)]

7 Claims

1. A computing device for a sound event detection (SED) technology robust against a noise environment, the computing device comprising:

a memory;

an input module configured to obtain a sound event input; and

a processor connected to the memory and the input module and configured to execute at least one instruction stored in the memory,

wherein the processor is configured to obtain a joint model for performing noise suppression and SED on the sound event input and identify a type of the sound event input by using the joint model, and

a noise suppression model and an SED model are joined as the joint model,

wherein the joint model is a model generated by further being trained after the noise suppression model and the SED model are joined,

wherein the SED model in the joint model comprises a plurality of feature extraction layers to which an output of the noise suppression model in the joint model is input, and a classification layer disposed at an end of the plurality of feature extraction layers, and

wherein weights of the classification layer are fixed and not updated during the training after the noise suppression model and the SED model are joined.
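
The training arrangement recited in claim 1 (a noise suppression model feeding the feature extraction layers of an SED model, with the classification layer's weights held fixed during the joint training) can be illustrated with a minimal sketch. This is not the patented implementation: the one-weight `Layer` class, the layer names, the squared-error loss, the learning rate, and the initial weights are all hypothetical choices made only to keep the example self-contained. It shows that a frozen classification layer still passes gradients back to the layers before it, while its own weight is never updated.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """Hypothetical toy layer with a single weight: output = weight * input."""
    name: str
    weight: float
    trainable: bool = True


def forward(layers, x):
    """Run x through the layers in order and keep every intermediate activation."""
    acts = [x]
    for layer in layers:
        acts.append(layer.weight * acts[-1])
    return acts


def joint_training_step(layers, x, target, lr=0.01):
    """One SGD step on squared error over the whole joint model.

    Frozen layers still propagate the gradient upstream, but their own
    weights are left untouched.
    """
    acts = forward(layers, x)
    grad_out = 2.0 * (acts[-1] - target)      # dLoss/dOutput of the last layer
    for i in reversed(range(len(layers))):
        layer = layers[i]
        grad_w = grad_out * acts[i]           # dLoss/dWeight of this layer
        grad_out = grad_out * layer.weight    # gradient passed to the previous layer
        if layer.trainable:
            layer.weight -= lr * grad_w       # skipped for the frozen classifier
    return (acts[-1] - target) ** 2           # loss measured before the update


# Assumed layout: noise suppression -> SED feature extraction -> classification.
joint_model = [
    Layer("noise_suppression", 0.5),
    Layer("sed_feature_1", 0.8),
    Layer("sed_feature_2", 1.2),
    Layer("sed_classifier", 0.9, trainable=False),  # fixed during joint training
]

classifier_before = joint_model[-1].weight
losses = [joint_training_step(joint_model, x=1.0, target=1.0) for _ in range(200)]
```

In this sketch the loss falls as the noise suppression and feature extraction weights adapt, while `joint_model[-1].weight` stays equal to `classifier_before` throughout.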