US 11,755,888 B1
Method and system for accelerating score-based generative models with preconditioned diffusion sampling
Li Zhang, Shanghai (CN); Hengyuan Ma, Shanghai (CN); Xiatian Zhu, Shanghai (CN); and Jianfeng Feng, Shanghai (CN)
Assigned to FUDAN UNIVERSITY, Shanghai (CN)
Filed by Fudan University, Shanghai (CN)
Filed on Jan. 9, 2023, as Appl. No. 18/151,923.
Int. Cl. G06N 3/0475 (2023.01)
CPC G06N 3/0475 (2023.01)
19 Claims
 
13. A system for accelerating score-based generative models (SGM), including a processor, which is adapted for running a computer program to implement computer instructions as follows:
setting a frequency mask (R) and a space mask (A) for a given dataset, wherein the frequency mask (R) and the space mask (A) both have the same shape as a data point in the dataset;
setting a target sampling iteration number (T);
sampling an initial sample (x_0) which is a random Gaussian vector, having the same shape as a data point in the dataset;
iterating, for each iteration number (t) from 1 to the target sampling iteration number (T), through steps comprising:
sampling a noise term which has the same shape as the initial sample;
applying a preconditioned diffusion sampling (PDS) operator (M) to the noise term to generate a preconditioned noise term, wherein the PDS operator is constructed as follows:
(a) using Fast Fourier Transform (FFT) to generate a first mapped vector by mapping an input vector in a space domain into the frequency domain;
(b) adjusting the first mapped vector using the frequency mask (R), wherein the frequency mask (R) regulates the frequency coordinates of the input vector and is in element-wise multiplication with the first mapped vector;
(c) mapping the adjusted first mapped vector back to the space domain by the inverse of Fast Fourier Transform to generate a second mapped vector; and
(d) using the space mask (A) to regulate the pixel coordinates of the input vector, wherein the space mask is in element-wise multiplication with the second mapped vector;
wherein the elements of the frequency mask (R) are all positive;
calculating a drift term by an update function which has input parameters including the sample of the previous iteration (x_{t−1}), the present iteration number, and the present step size;
applying the transpose of the PDS operator (M^T) and then applying the PDS operator (M) to the drift term, to generate a preconditioned drift term; and
calculating the summation of the preconditioned drift term and a scale term, and then taking the real part of the summation for the diffusion of the sample of the present iteration (x_t); wherein the scale term controls the scale of the noise term; and
outputting the sample of the final iteration (x_T), which obeys the distribution of the dataset.
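
The steps recited in claim 13 can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the patented implementation. The function names (`pds_apply`, `pds_apply_transpose`, `pds_sample`, `toy_update`), the all-ones masks, the square-root noise scale, and the toy update function are hypothetical choices made only so the example runs. The claim does not spell out whether the real part of the summation replaces the previous sample or is added to it; the sketch assumes a Langevin-style increment added to the previous sample.

```python
import numpy as np


def pds_apply(x, freq_mask, space_mask):
    """Apply the PDS operator M following steps (a)-(d)."""
    spectrum = np.fft.fft2(x)            # (a) space domain -> frequency domain
    spectrum = freq_mask * spectrum      # (b) element-wise frequency mask R
    back = np.fft.ifft2(spectrum)        # (c) frequency domain -> space domain
    return space_mask * back             # (d) element-wise space mask A


def pds_apply_transpose(x, freq_mask, space_mask):
    """Apply M^T (plain transpose, no complex conjugation).

    The DFT and its inverse are symmetric linear maps, so transposing
    M = A * IFFT(R * FFT(.)) reverses the order of the four factors.
    """
    return np.fft.fft2(freq_mask * np.fft.ifft2(space_mask * x))


def pds_sample(update_fn, freq_mask, space_mask, num_steps, step_size, rng):
    """Run num_steps preconditioned sampling iterations and return x_T."""
    # x_0: random Gaussian vector with the shape of a data point.
    x = rng.standard_normal(freq_mask.shape)
    for t in range(1, num_steps + 1):
        # Noise term, then preconditioned noise term M[noise].
        noise = rng.standard_normal(x.shape)
        noise = pds_apply(noise, freq_mask, space_mask)
        # Drift term from the update function (x_{t-1}, t, step size).
        drift = update_fn(x, t, step_size)
        # Preconditioned drift: apply M^T first, then M.
        drift = pds_apply_transpose(drift, freq_mask, space_mask)
        drift = pds_apply(drift, freq_mask, space_mask)
        # Assumption: the scale term is sqrt(step_size) times the noise.
        scale_term = np.sqrt(step_size) * noise
        # Assumption: the real part of the sum is added to x_{t-1}.
        x = x + np.real(drift + scale_term)
    return x


def toy_update(x, t, step_size):
    """Hypothetical update function: Langevin drift for a standard Gaussian,
    whose score is -x. A trained score network would take this place."""
    return -0.5 * step_size * x


shape = (8, 8)
R = np.ones(shape)   # placeholder frequency mask, all elements positive
A = np.ones(shape)   # placeholder space mask
sample = pds_sample(toy_update, R, A, num_steps=50, step_size=0.1,
                    rng=np.random.default_rng(0))
```

With both masks set to ones, M reduces to the identity and the loop becomes an ordinary Langevin-style sampler, which makes the sketch easy to check before substituting dataset-specific masks.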