CPC G06N 3/08 (2013.01) [G06F 18/213 (2023.01); G06N 3/045 (2023.01); G06N 20/20 (2019.01); G06F 18/22 (2023.01)] | 8 Claims |
1. A method of embedding a sentence feature vector, which is performed by a computing device comprising one or more processors and a memory in which one or more programs to be executed by the one or more processors are stored, the method comprising:
acquiring a first sentence and a second sentence, each including one or more words;
extracting a first feature vector corresponding to the first sentence and a second feature vector corresponding to the second sentence by independently inputting each of the first sentence and the second sentence into a bidirectional encoder representations from transformers (BERT)-based feature extraction network; and
compressing the first feature vector and the second feature vector into a first compressed vector and a second compressed vector, respectively, by independently inputting each of the first feature vector and the second feature vector into a convolutional neural network (CNN)-based vector compression network,
wherein the CNN-based vector compression network comprises:
a plurality of convolution filters configured to reduce a dimension of an input feature vector, the input feature vector being the first feature vector or the second feature vector,
an activation function application unit configured to generate a plurality of 1×N activation vectors by applying a predetermined activation function to the feature vectors with the reduced dimension, and
a pooling layer configured to perform max pooling in which a plurality of elements located in a same row and a same column of the plurality of 1×N activation vectors are compared along a depth direction, and the element with a maximum value is selected as the element in the same row and the same column of a 1×N compressed vector,
wherein N is a natural number,
wherein the plurality of 1×N activation vectors have the same numbers of rows and columns,
wherein the BERT-based feature extraction network is trained by updating training parameters based on a similarity between the first compressed vector and the second compressed vector,
wherein the BERT-based feature extraction network comprises a Siamese network architecture composed of a first feature extraction network configured to receive the first sentence and extract the first feature vector and a second feature extraction network configured to receive the second sentence and extract the second feature vector,
wherein the CNN-based vector compression network comprises a Siamese network architecture composed of a first vector compression network configured to receive the first feature vector and compress the first feature vector into the first compressed vector and a second vector compression network configured to receive the second feature vector and compress the second feature vector into the second compressed vector, and
wherein the CNN-based vector compression network is provided separately from the BERT-based feature extraction network.
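The compression stage recited above can be illustrated with a minimal NumPy sketch. This is not the claimed implementation: the filter values, the use of a valid (dimension-reducing) 1-D convolution, and the choice of ReLU as the "predetermined activation function" are all assumptions made for illustration. Each filter produces one 1×N activation vector (all filters share a length, so the activation vectors have equal size), and the depth-wise max pooling compares elements at the same position across the stack of activation vectors to form the 1×N compressed vector.

```python
import numpy as np

def compress(feature_vec, filters):
    """Sketch of the claimed CNN-based vector compression (assumptions noted above).

    feature_vec: 1-D feature vector, standing in for the output of one branch
                 of the Siamese BERT-based feature extraction network.
    filters:     list of equal-length 1-D convolution kernels (hypothetical values).
    Returns a 1xN compressed vector, N = len(feature_vec) - len(filter) + 1.
    """
    activations = []
    for k in filters:
        # Valid-mode correlation reduces the dimension of the input feature vector.
        # (np.convolve flips the kernel, so we pre-flip it to get a correlation.)
        conv = np.convolve(feature_vec, k[::-1], mode="valid")
        # Assumed activation function: ReLU, yielding one 1xN activation vector.
        activations.append(np.maximum(conv, 0.0))
    stacked = np.stack(activations)   # shape: (depth, N) — one row per filter
    # Depth-wise max pooling: at each column, keep the maximum across filters.
    return stacked.max(axis=0)        # the 1xN compressed vector
```

In the claimed Siamese arrangement, the same `compress` (with shared filters) would be applied independently to the first and second feature vectors, and the similarity between the two compressed vectors would drive the parameter updates of the feature extraction network.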