| CPC G06F 40/295 (2020.01) [G16H 15/00 (2018.01); G06F 40/30 (2020.01)] | 15 Claims |

|
1. An information processing apparatus that creates a training input text which is used for training a natural language processing model and in which a part of term phrases is masked, from a document, the information processing apparatus comprising:
a processor,
wherein the processor is configured to train the natural language processing model which is able to predict a part of term phrases which has been masked by:
extracting a plurality of specific term phrases from the document;
deriving a degree of association indicating a degree of association among the plurality of specific term phrases;
selecting a target term phrase that is a term phrase as a target to be masked based on the degree of association;
determining a text of a same type as a text including the term phrase selected as the target term phrase; and
using the text including the target term phrase and the text of the same type as the training input text.
|
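The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation, and everything in it is an assumption made for illustration: the fixed `vocabulary` list used for term extraction, the co-occurrence count used as the degree of association, the `(text_type, text)` labels used to decide which texts are of the same type, the `[MASK]` token, and all function names.

```python
import re
from collections import Counter
from itertools import combinations

MASK = "[MASK]"  # assumed mask token, not specified by the claim


def extract_terms(text, vocabulary):
    """Return the vocabulary terms that occur in the text (case-insensitive)."""
    lowered = text.lower()
    return [term for term in vocabulary if term in lowered]


def association_degrees(texts, vocabulary):
    """Assumed scoring: count how often each term co-occurs with another term."""
    scores = Counter()
    for text in texts:
        for a, b in combinations(extract_terms(text, vocabulary), 2):
            scores[a] += 1
            scores[b] += 1
    return scores


def build_training_input(typed_texts, vocabulary):
    """Build one masked training input from (text_type, text) pairs."""
    scores = association_degrees([text for _, text in typed_texts], vocabulary)
    if not scores:
        return None  # no co-occurring terms, so nothing to mask
    # Select the most strongly associated term; sorting makes ties deterministic.
    target = max(sorted(scores), key=lambda term: scores[term])
    # Locate the first text that includes the target term phrase.
    index, text_type, text = next(
        (i, ty, tx) for i, (ty, tx) in enumerate(typed_texts) if target in tx.lower()
    )
    # Collect the other texts that carry the same type label.
    same_type = [tx for i, (ty, tx) in enumerate(typed_texts)
                 if ty == text_type and i != index]
    combined = " ".join([text] + same_type)
    # Mask every occurrence so the target does not leak from the same-type texts.
    masked = re.sub(re.escape(target), MASK, combined, flags=re.IGNORECASE)
    return {"target": target, "input": masked}


typed_texts = [
    ("findings", "A nodule is seen in the right upper lobe."),
    ("findings", "The nodule shows calcification."),
    ("impression", "Follow-up is recommended."),
]
vocabulary = ["nodule", "right upper lobe", "calcification"]
result = build_training_input(typed_texts, vocabulary)
print(result["target"])
print(result["input"])
```

In this sketch the masked text and its same-type companions are joined into a single string. A real pipeline might instead keep them as separate segments, and might derive the degree of association from a knowledge base or embedding similarity rather than raw co-occurrence.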