US 12,430,373 B2
Method, computer device, and non-transitory computer-readable recording medium to provide search results based on multi-modal features
Seungkwon Choe, Seongnam-si (KR); Jieun Lee, Seongnam-si (KR); Sangyeon Kim, Seongnam-si (KR); Dongju Lee, Seongnam-si (KR); and Jisu Jeon, Seongnam-si (KR)
Assigned to NAVER CORPORATION, Seongnam-si (KR)
Filed by NAVER CORPORATION, Seongnam-si (KR)
Filed on Dec. 5, 2023, as Appl. No. 18/529,588.
Claims priority of application No. 10-2022-0174089 (KR), filed on Dec. 13, 2022; and application No. 10-2023-0043095 (KR), filed on Mar. 31, 2023.
Prior Publication US 2024/0193197 A1, Jun. 13, 2024
Int. Cl. G06F 16/33 (2025.01); G06F 16/334 (2025.01)
CPC G06F 16/3347 (2019.01)
10 Claims
 
1. A multi-modal search method for searching a product on a computer network performed by a computer device, wherein the computer device comprises at least one processor configured to execute computer-readable instructions included in a memory, the method comprising:
receiving a user query from a user for a product including a plurality of attributes of the product;
mapping the plurality of attributes of the product to a multi-modal embedding space;
decomposing the multi-modal embedding space into a delta space including non-linear information and a virtual attribute vector space supporting linear vector expression;
performing a vector operation between at least two attribute vectors associated with the plurality of attributes included in the user query in the virtual attribute vector space;
restoring the multi-modal embedding space by combining the delta space and the virtual attribute vector space, the restored multi-modal embedding space including an embedding vector acquired through the vector operation performed in the virtual attribute vector space; and
providing search results of the product in the user query based on the embedding vector in the restored multi-modal embedding space;
wherein the performing of the vector operation further comprises:
estimating a correction function that considers a non-linear error associated with the non-linear information using each attribute vector in the multi-modal embedding space; and
performing the vector operation between the attribute vectors associated with the attributes by adding or subtracting the correction function.
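
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not say how the decomposition or the correction function is computed. The sketch assumes that the virtual attribute vector space is a linear subspace with an orthonormal basis, that the delta space is the residual left after projecting onto that subspace, and that the correction function is the mean residual of the attribute vectors. The names `decompose`, `restore`, `mean_delta_correction`, `query_embedding`, and `search` are hypothetical.

```python
import numpy as np

def decompose(embedding, basis):
    """Split an embedding into a linear part and a residual (delta) part.

    Assumption: `basis` has orthonormal columns spanning the linear subspace
    that stands in for the virtual attribute vector space.
    """
    linear = basis @ (basis.T @ embedding)  # projection onto the subspace
    delta = embedding - linear              # what the linear part cannot express
    return linear, delta

def restore(linear, delta):
    """Recombine the two parts into a vector in the original embedding space."""
    return linear + delta

def mean_delta_correction(deltas):
    """Hypothetical correction term: the average residual of the attributes."""
    return np.mean(deltas, axis=0)

def query_embedding(attr_embeddings, signs, basis):
    """Add or subtract attribute vectors in the linear subspace, then restore.

    `signs` holds +1 or -1 per attribute (add or subtract that attribute).
    """
    parts = [decompose(e, basis) for e in attr_embeddings]
    linear = sum(s * lin for s, (lin, _) in zip(signs, parts))
    correction = mean_delta_correction([d for _, d in parts])
    return restore(linear, correction)

def search(query, catalog, top_k=3):
    """Rank catalog rows by cosine similarity to the query vector."""
    q = query / np.linalg.norm(query)
    c = catalog / np.linalg.norm(catalog, axis=1, keepdims=True)
    return np.argsort(-(c @ q))[:top_k]
```

In this sketch, a query that adds one attribute and subtracts another would call `query_embedding([e_a, e_b], [1, -1], basis)` and pass the result to `search` together with a matrix of product embeddings. A learned, non-linear correction could replace `mean_delta_correction` without changing the other functions.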