US 12,469,322 B2
Methods and systems for transfer learning of deep learning model based on document similarity learning
Sung Min Kim, Seongnam-si (KR); Kyoungho Choi, Seongnam-si (KR); and Kyuho Lee, Seongnam-si (KR)
Assigned to NAVER CORPORATION, Gyeonggi-Do (KR)
Filed by NAVER CORPORATION, Seongnam-si (KR)
Filed on Jun. 23, 2021, as Appl. No. 17/355,406.
Claims priority of application No. 10-2021-0007453 (KR), filed on Jan. 19, 2021.
Prior Publication US 2022/0230014 A1, Jul. 21, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06V 30/418 (2022.01); G06F 18/21 (2023.01); G06F 18/214 (2023.01); G06F 18/22 (2023.01); G06F 18/2415 (2023.01); G06F 40/30 (2020.01); G06N 3/096 (2023.01); G06V 30/416 (2022.01)
CPC G06V 30/418 (2022.01) [G06F 18/2148 (2023.01); G06F 18/2193 (2023.01); G06F 18/22 (2023.01); G06F 18/2415 (2023.01); G06F 40/30 (2020.01); G06N 3/096 (2023.01); G06V 30/416 (2022.01)]
18 Claims
1. A transfer learning method of a computer apparatus comprising at least one processor, the transfer learning method comprising:
extracting, by the at least one processor, a similar document pair set and a dissimilar document pair set from a document database, the similar document pair set including a plurality of similar document pairs having a common attribute, and the dissimilar document pair set including a plurality of dissimilar document pairs extracted randomly;
acquiring, by the at least one processor, a semantic similarity for each of the plurality of similar document pairs and each of the plurality of dissimilar document pairs by initially determining a calculated mathematical similarity depending on whether attributes of each of the plurality of similar document pairs and each of the plurality of dissimilar document pairs are identical or similar;
adjusting the calculated mathematical similarity by increasing the calculated mathematical similarity for each similar document pair of the similar document pair set and by decreasing the calculated mathematical similarity for each dissimilar document pair of the dissimilar document pair set;
pre-training, by the at least one processor, a similarity model using the plurality of similar document pairs, the plurality of dissimilar document pairs, and the semantic similarity to output a similarity between documents;
generating, by the at least one processor, a fine tuning model by replacing a first output function of the pre-trained similarity model with a second output function; and
training, by the at least one processor, the fine tuning model to output a score for a document input to the fine tuning model.
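The first three steps of claim 1 (extracting the pair sets, computing a mathematical similarity, and adjusting it up or down) can be illustrated with a minimal Python sketch. This is one possible reading, not the patented implementation: the toy database `DOCS`, the `author` attribute, the token-overlap (Jaccard) measure, and the fixed `margin` of 0.2 are all assumptions introduced here for illustration. The pre-training and fine-tuning steps are not covered.

```python
import random
from itertools import combinations

# Hypothetical toy database; "author" stands in for the common attribute.
DOCS = [
    {"id": 0, "author": "a", "text": "deep learning for document ranking"},
    {"id": 1, "author": "a", "text": "deep learning for document scoring"},
    {"id": 2, "author": "b", "text": "recipes for summer salads"},
    {"id": 3, "author": "b", "text": "recipes for winter soups"},
    {"id": 4, "author": "c", "text": "travel notes from the coast"},
]


def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; an assumed stand-in for the
    'calculated mathematical similarity' named in the claim."""
    ta, tb = set(a.split()), set(b.split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def extract_pairs(docs, attr, n_dissimilar, rng):
    """Similar pairs share `attr`; dissimilar pairs are drawn at random."""
    similar = [(x, y) for x, y in combinations(docs, 2) if x[attr] == y[attr]]
    # Random draws are not filtered by attribute, mirroring "extracted randomly".
    dissimilar = [tuple(rng.sample(docs, 2)) for _ in range(n_dissimilar)]
    return similar, dissimilar


def semantic_similarity(pair, is_similar, margin=0.2):
    """Raise the base similarity for similar pairs, lower it for dissimilar
    pairs, and clamp the result to [0, 1]. The margin is an assumption."""
    base = jaccard(pair[0]["text"], pair[1]["text"])
    adjusted = base + margin if is_similar else base - margin
    return min(1.0, max(0.0, adjusted))


def build_training_set(docs, attr="author", n_dissimilar=4, seed=0):
    """Return (text_a, text_b, target) rows for pre-training a similarity model."""
    rng = random.Random(seed)
    similar, dissimilar = extract_pairs(docs, attr, n_dissimilar, rng)
    rows = [(x["text"], y["text"], semantic_similarity((x, y), True))
            for x, y in similar]
    rows += [(x["text"], y["text"], semantic_similarity((x, y), False))
             for x, y in dissimilar]
    return rows


rows = build_training_set(DOCS)
```

Each row pairs two document texts with an adjusted target value, which a similarity model could then regress against during pre-training.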