US 11,755,888 B1
Method and system for accelerating score-based generative models with preconditioned diffusion sampling
Li Zhang, Shanghai (CN); Hengyuan Ma, Shanghai (CN); Xiatian Zhu, Shanghai (CN); and Jianfeng Feng, Shanghai (CN)
Assigned to FUDAN UNIVERSITY, Shanghai (CN)
Filed by Fudan University, Shanghai (CN)
Filed on Jan. 9, 2023, as Appl. No. 18/151,923.
Int. Cl. G06N 3/0475 (2023.01)

CPC G06N 3/0475 (2023.01)

19 Claims

13. A system for accelerating score-based generative models (SGM), including a processor, which is adapted for running a computer program to implement computer instructions as follows:

setting a frequency mask (R) and a space mask (A) for a given dataset, wherein the frequency mask (R) and the space mask (A) both have the same shape as a data point in the dataset;

setting a target sampling iteration number (T);

sampling an initial sample (x₀) which is a random Gaussian vector, having the same shape as a data point in the dataset;

conducting iteration, for each iteration number (t) from 1 to the target sampling iteration number (T), conducting steps comprising:

sampling a noise term which has the same shape as the initial sample (x₀);

applying a preconditioned diffusion sampling (PDS) operator (M) to the noise term to generate a preconditioned noise term, wherein the PDS operator is constructed as follows:

(a) using Fast Fourier Transform (FFT) to generate a first mapped vector by mapping an input vector in a space domain into the frequency domain;

(b) adjusting the first mapped vector using the frequency mask (R), wherein the frequency mask (R) regulates the frequency coordinates of the input vector and is in element-wise multiplication with the first mapped vector;

(c) mapping the adjusted first mapped vector back to the space domain by the inverse of Fast Fourier Transform to generate a second mapped vector; and

(d) using the space mask (A) to regulate the pixel coordinates of the input vector, wherein the space mask is in element-wise multiplication with the second mapped vector;

wherein the elements of the frequency mask (R) are all positive;

calculating a drift term by an update function which has input parameters including the sample of the previous iteration (x_{t−1}), the present iteration number, and the present step size;

applying the transpose of the PDS operator (M^T) and then applying the PDS operator (M) to the drift term, to generate a preconditioned drift term; and

calculating the summation of the preconditioned drift term and a scale term, and then taking a real part of the summation for the diffusion of the sample of present iteration (x_t); wherein the scale term controls the scale of the noise term; and

outputting the sample of the final iteration (x_T), which obeys the distribution of the dataset.
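
The steps recited in claim 13 can be illustrated with a minimal NumPy sketch. It is not the patentee's implementation. The names `pds_apply`, `pds_apply_transpose`, `pds_sample`, `drift_fn`, `step_sizes`, and `noise_scales` are hypothetical. The sketch assumes that data points are 2-D arrays, that M^T is the plain (non-conjugate) matrix transpose of M, and that the scale term is a per-iteration scalar multiplied by the preconditioned noise term; the claim text shown here does not fix any of these details.

```python
import numpy as np

def pds_apply(x, R, A):
    """Steps (a)-(d): FFT, multiply by frequency mask R, inverse FFT, multiply by space mask A."""
    return A * np.fft.ifft2(R * np.fft.fft2(x))

def pds_apply_transpose(x, R, A):
    """Assumed M^T: the same factors in reverse order.

    The DFT matrix and its inverse are symmetric, so the plain transpose of
    M = diag(A) F^-1 diag(R) F is F diag(R) F^-1 diag(A).
    """
    return np.fft.fft2(R * np.fft.ifft2(A * x))

def pds_sample(drift_fn, R, A, T, step_sizes, noise_scales, rng):
    """Run T preconditioned iterations starting from a Gaussian x_0.

    drift_fn(x_prev, t, step) is a caller-supplied placeholder for the
    update function of the claim; its form is not specified here.
    """
    x = rng.standard_normal(R.shape)                 # initial sample x_0
    for t in range(1, T + 1):
        z = rng.standard_normal(x.shape)             # noise term
        z_pre = pds_apply(z, R, A)                   # preconditioned noise term
        d = drift_fn(x, t, step_sizes[t - 1])        # drift term
        d_pre = pds_apply(pds_apply_transpose(d, R, A), R, A)  # M M^T d
        # Sum of preconditioned drift and (assumed) scale term, real part only.
        x = np.real(d_pre + noise_scales[t - 1] * z_pre)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shape = (8, 8)
    R = rng.uniform(0.5, 1.5, shape)   # all-positive frequency mask
    A = np.ones(shape)                 # trivial space mask
    # Toy drift for illustration only: one Langevin-style step toward N(0, I).
    toy_drift = lambda x, t, eps: x - 0.5 * eps * x
    sample = pds_sample(toy_drift, R, A, T=5,
                        step_sizes=[0.1] * 5, noise_scales=[0.1] * 5, rng=rng)
    print(sample.shape)
```

With R and A both set to all ones, `pds_apply` reduces to the identity, so the loop falls back to the unpreconditioned update; this gives a convenient sanity check when experimenting with other masks.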