CPC G10L 25/63 (2013.01) [G06F 17/16 (2013.01); G06F 17/18 (2013.01); G06N 3/047 (2023.01); G10L 25/30 (2013.01)] | 20 Claims |
1. A system comprising:
one or more memory devices comprising:
an audio bi-directional recurrent encoder that generates an audio feature vector for one or more words in an acoustic sequence;
a textual bi-directional recurrent encoder that generates a textual feature vector for the one or more words in a textual sequence corresponding to the acoustic sequence;
a multi-hop neural attention model that generates an attention output at each hop that alternates from utilizing the textual feature vector and the audio feature vector as context; and
a hidden feature vector generator that generates a hidden feature vector based on the attention output and one or more of the audio feature vector and the textual feature vector; and
one or more processors configured to cause the system to determine an emotion of the acoustic sequence based on the hidden feature vector.
|
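For illustration only, the alternation recited in claim 1 can be sketched in plain Python. The sketch assumes dot-product attention, takes the last audio encoder state as the initial context, and forms the hidden feature vector by concatenation. None of these choices is stated in the claim, and the names `attend`, `multi_hop_attention`, and `hidden_feature_vector` are hypothetical.

```python
import math

def attend(states, context):
    """Dot-product attention over per-word states (assumed scoring function)."""
    scores = [sum(c * h for c, h in zip(context, state)) for state in states]
    peak = max(scores)  # subtract the maximum for a numerically stable softmax
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(states[0])
    # The attention output is the weighted sum of the states.
    return [sum(w * state[i] for w, state in zip(weights, states)) for i in range(dim)]

def multi_hop_attention(audio_states, text_states, hops):
    """Return one attention output per hop, alternating the attended modality."""
    context = audio_states[-1]  # assumption: initial context is the last audio state
    outputs = []
    for hop in range(hops):
        # Hops 1, 3, ... attend over text using an audio-derived context;
        # hops 2, 4, ... attend over audio using a text-derived context.
        target = text_states if hop % 2 == 0 else audio_states
        context = attend(target, context)
        outputs.append(context)
    return outputs

def hidden_feature_vector(attention_output, audio_states):
    """Assumed combination: concatenate the attention output with the last audio state."""
    return attention_output + audio_states[-1]

# Toy encoder outputs (made-up values): 2 audio states, 3 text states, dimension 2.
audio_states = [[1.0, 0.0], [0.0, 1.0]]
text_states = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]
hop_outputs = multi_hop_attention(audio_states, text_states, hops=3)
hidden = hidden_feature_vector(hop_outputs[-1], audio_states)
```

In this sketch the audio and textual states share one dimension so that the dot product is defined; a learned projection would be needed otherwise.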
10. A system comprising:
one or more memory devices comprising:
an audio encoder that generates an audio feature vector for one or more words in an acoustic sequence;
a textual encoder that generates a textual feature vector for the one or more words in a textual sequence corresponding to the acoustic sequence;
a first neural attention model that generates a first attention output by applying attention to the textual feature vector using the audio feature vector as context;
a first hidden feature vector generator that generates a first hidden feature vector based on the first attention output;
a second neural attention model that generates a second attention output by applying attention to the audio feature vector using the first hidden feature vector as context; and
a second hidden feature vector generator that generates a second hidden feature vector based on the second attention output and the audio feature vector; and
one or more processors configured to cause the system to determine an emotion of the acoustic sequence based on the first hidden feature vector and the second hidden feature vector.
|
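The two-hop data flow of claim 10 can be traced with a minimal sketch, under stated assumptions: dot-product attention, the last audio encoder state used as the audio context, the first hidden feature vector taken directly from the first attention output, concatenation for the second hidden feature vector, and a linear scoring layer with placeholder weights. The function names, the label set, and all numeric values are hypothetical and do not come from the claims.

```python
import math

def attend(states, context):
    """Dot-product attention over per-word states (assumed scoring function)."""
    scores = [sum(c * h for c, h in zip(context, state)) for state in states]
    peak = max(scores)  # subtract the maximum for a numerically stable softmax
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(states[0])
    return [sum(w * state[i] for w, state in zip(weights, states)) for i in range(dim)]

def two_hop_emotion(audio_states, text_states, class_weights, labels):
    audio_summary = audio_states[-1]  # assumption: last audio state is the context
    # Hop 1: attention over the textual states with the audio context.
    first_hidden = attend(text_states, audio_summary)
    # Hop 2: attention over the audio states with the first hidden vector as context.
    second_attention = attend(audio_states, first_hidden)
    # Second hidden vector: second attention output joined with the audio vector.
    second_hidden = second_attention + audio_summary
    # The emotion is scored from both hidden vectors (assumed linear layer + argmax).
    features = first_hidden + second_hidden
    logits = [sum(w * f for w, f in zip(row, features)) for row in class_weights]
    return labels[logits.index(max(logits))]

# Toy inputs (made-up values): dimension 2, so the joined feature vector has length 6.
audio_states = [[1.0, 0.0], [0.0, 1.0]]
text_states = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]
labels = ["angry", "happy", "neutral", "sad"]  # hypothetical label set
class_weights = [
    [0.0] * 6,
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],  # placeholder, untrained weights
    [0.0] * 6,
    [0.0] * 6,
]
emotion = two_hop_emotion(audio_states, text_states, class_weights, labels)
```

A trained system would learn the scoring weights; the placeholder matrix here only makes the data flow from the two hidden feature vectors to a label easy to follow.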