US 12,265,978 B2
Customized product performance prediction method based on heterogeneous data difference compensation fusion
Lemiao Qiu, Hangzhou (CN); Yang Wang, Hangzhou (CN); Shuyou Zhang, Hangzhou (CN); Zili Wang, Hangzhou (CN); and Huifang Zhou, Hangzhou (CN)
Assigned to ZHEJIANG UNIVERSITY, Hangzhou (CN)
Filed by ZHEJIANG UNIVERSITY, Hangzhou (CN)
Filed on Nov. 10, 2021, as Appl. No. 17/522,921.
Application 17/522,921 is a continuation of application No. PCT/CN2021/070983, filed on Jan. 8, 2021.
Claims priority of application No. 202011124136.5 (CN), filed on Oct. 20, 2020.
Prior Publication US 2022/0122103 A1, Apr. 21, 2022
Int. Cl. G06N 3/04 (2023.01); G06N 3/08 (2023.01); G06Q 30/0202 (2023.01)

CPC G06Q 30/0202 (2013.01) [G06N 3/04 (2013.01); G06N 3/08 (2013.01)]

4 Claims

1. A customized product performance prediction method based on heterogeneous data difference compensation fusion, comprising the following steps:

(1) collecting and obtaining data samples with a configuration parameter of a customized product as an input feature and performance of the customized product to be predicted as an output feature; collecting actual measurement performance data of an existing product, and constructing a historical actual measurement data set for performance prediction of the customized product; establishing a virtual simulation model of the customized product by using computer simulation software, obtaining performance data through simulation analysis, and constructing a calculation simulation data set for performance prediction of the customized product;

(2) performing data preprocessing on the historical actual measurement data set and the calculation simulation data set;

(3) performing difference compensation correction on the calculation simulation data set on the basis of the historical actual measurement data set: encoding the historical actual measurement data set and the calculation simulation data set on the basis of a depth auto-encoder, mapping the data samples from an input space into a feature space, so as to express key features of the data samples, denoting the encoded historical actual measurement data set as ESet_h, and denoting the encoded calculation simulation data set as ESet_s; dividing, through random sampling, the data set ESet_h into a training sample set, a verification sample set and a test sample set, which are denoted as a historical actual measurement training set ESet_htrain, a historical actual measurement verification set ESet_hvalid, and a historical actual measurement test set ESet_htest, respectively; and finally performing associated connection on the data set ESet_s and the data set ESet_htrain by using a neighborhood association method, performing difference compensation correction on the data set ESet_s by using the data set ESet_htrain by means of a similarity difference compensation method, and denoting the data set ESet_s after the difference compensation correction as MSet_s;

(4) selecting a BP neural network model as a performance prediction model of the customized product, and taking the input feature and the output feature selected in the step (1) as the input and output of the prediction model; using the calculation simulation data set after the difference compensation correction as the training sample set, and training and constructing an optimal BP neural network model combined with a tabu search algorithm; and then testing the model by using the historical actual measurement test set ESet_htest, so as to obtain a final performance prediction model of the customized product; and

(5) for a data sample to be predicted, firstly performing data preprocessing according to the processing of the calculation simulation data set in the step (2), and then inputting the data sample into the depth auto-encoder constructed in the step (3) for encoding, and finally inputting the encoded sample to be predicted into the prediction model constructed in the step (4) for prediction and obtaining the product performance of the customized product under different configuration parameter conditions;
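
The random division of ESet_h described in step (3) can be sketched as follows. This is a minimal illustration only: the function name `split_dataset`, the 70/15/15 proportions, and the fixed seed are assumptions for the example, since the claim specifies random sampling but no split ratios.

```python
import random


def split_dataset(samples, valid_frac=0.15, test_frac=0.15, seed=0):
    """Randomly partition samples into (train, valid, test) lists.

    The fractions are hypothetical; the claim only requires random sampling.
    """
    if valid_frac < 0 or test_frac < 0 or valid_frac + test_frac >= 1:
        raise ValueError("validation and test fractions must leave room for training data")
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)  # random sampling without replacement

    n_valid = round(len(samples) * valid_frac)
    n_test = round(len(samples) * test_frac)
    valid_idx = indices[:n_valid]
    test_idx = indices[n_valid:n_valid + n_test]
    train_idx = indices[n_valid + n_test:]

    def pick(idx):
        return [samples[i] for i in idx]

    return pick(train_idx), pick(valid_idx), pick(test_idx)


# Toy stand-in for the encoded measured set ESet_h: (feature, performance) pairs.
eset_h = [(float(i), 2.0 * i) for i in range(20)]
eset_htrain, eset_hvalid, eset_htest = split_dataset(eset_h)
```

The three returned lists are disjoint and together contain every sample of ESet_h exactly once.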

in the step (3), a neural network model is trained by using the historical actual measurement data set and the calculation simulation data set to serve as the depth auto-encoder for the data samples; the depth auto-encoder is composed of an input layer, an encoder, a feature expression layer, a decoder and an output layer, and both the encoder and the decoder comprise three hidden layers; the input and output of the depth auto-encoder are input feature vectors of the data samples, the layers are fully connected, an activation function between the input layer and the hidden layer, and an activation function between the hidden layer and the hidden layer are relu function, and an activation function between the hidden layer and the output layer is tanh function;
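
The layer arrangement of the depth auto-encoder can be illustrated with a forward pass in plain Python. This is a sketch under stated assumptions, not the patented implementation: the layer widths (16, 12, 8), the code size of 4, the random initialization, and the use of relu on the feature expression layer are all hypothetical choices, and training is omitted. Only the three-hidden-layer encoder and decoder, the full connectivity, the relu activations between hidden layers, and the tanh output activation come from the claim.

```python
import math
import random


def dense(weights, bias, v):
    """Fully connected layer: weights @ v + bias."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (hypothetical initialization)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_autoencoder(n_features, hidden=(16, 12, 8), n_code=4, seed=0):
    """Input -> 3 hidden -> feature expression layer -> 3 hidden -> output."""
    rng = random.Random(seed)
    sizes = [n_features, *hidden, n_code, *reversed(hidden), n_features]
    layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
    code_layer = len(hidden)  # index of the layer that produces the code
    return layers, code_layer


def forward(layers, code_layer, x):
    """Return (encoded feature vector, reconstruction of x)."""
    h, code = list(x), None
    for i, (weights, bias) in enumerate(layers):
        z = dense(weights, bias, h)
        if i == len(layers) - 1:
            h = [math.tanh(v) for v in z]   # hidden -> output: tanh
        else:
            h = [max(0.0, v) for v in z]    # all earlier layers: relu
        if i == code_layer:
            code = h
    return code, h


layers, code_layer = build_autoencoder(n_features=5)
code, reconstruction = forward(layers, code_layer, [0.2, -0.4, 0.9, 0.0, -1.0])
```

Because the output activation is tanh, reconstructions lie in [-1, 1], so this sketch assumes the preprocessing of step (2) scales the input features into that range.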

in the step (3), associated connection on the encoded calculation simulation data set ESet_s and the encoded historical actual measurement training set ESet_htrain is performed by using the neighborhood association method, the specific association process is: initializing an empty mark set for each data sample in the data set ESet_s; randomly selecting a data sample Sample_k in the data set ESet_htrain, taking the data sample Sample_k as the center, taking a neighborhood threshold ε as the radius, adding tags of data samples in the data set ESet_s within this neighborhood range into the mark set, wherein the added mark is the serial number of the data sample Sample_k, and at the same time, setting the access attribute of the data sample Sample_k as visited; traversing all data samples whose access attributes are unvisited in the data set ESet_htrain, and repeatedly adding marks for the data samples within the neighborhood range until the access attributes of all the data samples in the data set ESet_htrain are visited; and
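
The neighborhood association process can be sketched as below. The function name `associate`, the use of Euclidean distance for the neighborhood test, and the inclusive comparison (`<= eps`) are assumptions for illustration; the claim names a radius ε but does not state the metric or the boundary rule. The claim selects samples of ESet_htrain at random, whereas this sketch visits them in index order, which produces the same mark sets because the result does not depend on visiting order.

```python
import math


def associate(eset_s, eset_htrain, eps):
    """Return one mark set per simulation sample.

    marks[l] holds the serial numbers k of the measured training samples
    whose eps-neighborhood contains simulation sample l.
    """
    marks = [set() for _ in eset_s]        # empty mark set for each sample in ESet_s
    visited = [False] * len(eset_htrain)   # access attribute of each Sample_k

    for k, center in enumerate(eset_htrain):
        if visited[k]:
            continue
        for l, sample in enumerate(eset_s):
            if math.dist(center, sample) <= eps:  # assumed Euclidean neighborhood
                marks[l].add(k)
        visited[k] = True

    return marks


# Toy encoded feature vectors (hypothetical values).
eset_s = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
eset_htrain = [(0.1, 0.0), (0.9, 0.1)]
marks = associate(eset_s, eset_htrain, eps=0.5)
```

In this toy case the first two simulation samples are each associated with one measured sample, and the third, which lies far from all measured samples, keeps an empty mark set and would therefore be skipped by the subsequent correction.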

in the step (3), difference compensation correction on the encoded calculation simulation data set ESet_s is performed by using the historical actual measurement training set ESet_htrain on the basis of the similarity difference compensation method, the similarity difference compensation method is: traversing the data set ESet_s, and performing difference compensation correction on the output feature of the data sample for each data sample Sample_l whose mark tag is not empty according to the following formula:

where, ŷ_FEA^l represents an output feature vector of the data sample Sample_l after the difference compensation correction; y_FEA^l represents the output feature vector of the data sample Sample_l before the difference compensation correction, that is, the output feature vector obtained through simulation analysis; M represents the number of marks in the mark set of the data sample Sample_l, that is, the number of data samples in the data set ESet_htrain associated with the data sample Sample_l; S_z represents the Euclidean distance between the data sample Sample_l and the data sample in the z^th data set ESet_htrain associated with the data sample Sample_l, which measures the similarity between the two data samples; Δy_z represents an absolute difference between the output feature vector of the data sample Sample_l and the output feature vector of the data sample in the z^th data set ESet_htrain associated with the data sample Sample_l; y_real^z represents the output feature vector of the data sample in the z^th data set ESet_htrain associated with the data sample Sample_l; and α and β are hyperparameters.
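
The quantities defined above can be computed per simulation sample as in the following sketch. It covers only M, S_z and Δy_z as the text defines them; it does not combine them into the corrected output ŷ_FEA^l, and α and β are not used, because that combination is given by the claim's formula. The function name `compensation_terms` is hypothetical, and taking S_z between the encoded input feature vectors is an assumption.

```python
import math


def compensation_terms(x_l, y_l, marks_l, htrain_x, htrain_y):
    """Return (M, S, dY) for one simulation sample Sample_l.

    x_l, y_l  : encoded input features and simulated output vector of Sample_l
    marks_l   : mark set of Sample_l (serial numbers into ESet_htrain)
    htrain_x  : encoded input features of the measured training samples
    htrain_y  : measured output vectors y_real of those samples
    """
    ids = sorted(marks_l)
    m = len(ids)                                         # M: number of marks
    s = [math.dist(x_l, htrain_x[z]) for z in ids]       # S_z: Euclidean distance
    dy = [[abs(a - b) for a, b in zip(y_l, htrain_y[z])]  # Δy_z: absolute difference
          for z in ids]
    return m, s, dy


# Toy values (hypothetical): one simulation sample associated with two measured samples.
htrain_x = [(3.0, 4.0), (0.0, 1.0)]
htrain_y = [(12.0,), (9.5,)]
m, s, dy = compensation_terms((0.0, 0.0), (10.0,), {0, 1}, htrain_x, htrain_y)
```

A simulation sample with an empty mark set yields M = 0 and is left uncorrected, consistent with the claim's restriction to samples whose mark set is not empty.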