CPC G06F 40/40 (2020.01) [G06F 40/10 (2020.01); G06N 7/01 (2023.01); G06N 20/00 (2019.01)] | 22 Claims |
1. A method for generating, from a pre-trained language model, a target language model for controlled text generation, the target language model having minimal divergence from the pre-trained language model distribution, comprising:
(a) receiving a pre-trained language model having attributes with existing probability distributions over the pre-trained language model;
(b) receiving at least one target constraint, the received target constraint specifying an expectation of a target attribute over the target language model, the target language model approximating the pre-trained language model;
(c) computing parameters of an energy based model by applying the received target constraint to the pre-trained language model;
(d) obtaining samples from a reference policy;
(e) updating parameters of a target policy using the obtained samples from the reference policy and the energy based model;
(f) updating the reference policy with the target policy if a first distance between the target policy and an implicit probability distribution, the implicit probability distribution being represented by the energy based model, is smaller than a second distance between the reference policy and the implicit probability distribution represented by the energy based model, the first and second distances being calculated as a divergence;
(g) repeating (d), (e) and (f) until the target policy converges with the target constraint; and
(h) outputting the target policy as the target language model having minimal divergence from the pre-trained language model distribution and configured to generate controlled text with the target attribute over a probability distribution approximating a probability distribution specified by the target constraint.
|
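Steps (c) through (h) of claim 1 can be illustrated with a minimal toy sketch. It is not the claimed implementation. It assumes a "language model" that is just a categorical distribution over four outcomes, a single binary attribute `phi`, an energy based model of the exponential form `a(x) * exp(lam * phi(x))`, an importance-weighted policy-gradient update, and the forward KL divergence as the distance in step (f). All names (`fit_ebm`, `train`, `a`, `phi`, `target`) and all hyperparameters are hypothetical choices made for illustration; the claim itself does not fix any of them.

```python
import math
import random


def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]


def softmax(logits):
    m = max(logits)
    return normalize([math.exp(l - m) for l in logits])


def kl(p, q):
    """Forward KL divergence KL(p || q) for two categorical distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def moment(dist, phi):
    """Expectation of the attribute phi under dist."""
    return sum(d * f for d, f in zip(dist, phi))


def fit_ebm(a, phi, target, iters=100):
    """Step (c): bisection on lam so that p(x) ∝ a(x)*exp(lam*phi(x)) has E_p[phi] = target.

    Bisection is an assumption of this sketch; it works here because the
    moment is monotonically increasing in lam.
    """
    lo, hi = -20.0, 20.0
    lam = 0.0
    for _ in range(iters):
        lam = (lo + hi) / 2
        p = normalize([ai * math.exp(lam * f) for ai, f in zip(a, phi)])
        if moment(p, phi) < target:
            lo = lam
        else:
            hi = lam
    return lam


def train(a, phi, target, steps=2000, batch=64, lr=0.1, seed=0):
    rng = random.Random(seed)
    n = len(a)
    lam = fit_ebm(a, phi, target)
    P = [ai * math.exp(lam * f) for ai, f in zip(a, phi)]  # unnormalized EBM scores
    p = normalize(P)  # implicit distribution; tractable only because the toy space is tiny
    theta = [math.log(ai) for ai in a]  # target policy starts at the pre-trained model
    q = softmax(theta)                  # reference policy
    for _ in range(steps):              # step (g), with a fixed budget as the stopping rule
        pi = softmax(theta)
        xs = rng.choices(range(n), weights=q, k=batch)  # step (d)
        wsum = [0.0] * n
        for x in xs:
            wsum[x] += P[x] / q[x]      # importance weight P(x) / q(x)
        total = sum(wsum)
        # Step (e): average of w(x) * grad log pi(x) for a softmax policy.
        grad = [(wsum[j] - pi[j] * total) / batch for j in range(n)]
        theta = [t + lr * g for t, g in zip(theta, grad)]
        pi = softmax(theta)
        if kl(p, pi) < kl(p, q):        # step (f)
            q = pi
    return softmax(theta), p            # step (h)


a = [0.4, 0.3, 0.2, 0.1]   # toy "pre-trained" distribution
phi = [0, 0, 1, 1]         # binary target attribute
policy, p = train(a, phi, target=0.5)
print("attribute expectation under pre-trained model:", moment(a, phi))
print("attribute expectation under trained policy:", moment(policy, phi))
```

In this toy setting the divergences in step (f) can be computed exactly because the outcome space has only four elements. For a real language model they would have to be estimated from samples, which this sketch does not attempt.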