US 12,248,541 B1
	Watermark embedding method based on service invocation data
Mengxiang Wang, Beijing (CN); Yucheng Zhang, Beijing (CN); Qiang Fu, Beijing (CN); Fujun Wan, Beijing (CN); Xinyao Zhou, Beijing (CN); and Na Liu, Beijing (CN)
Assigned to China National Institute of Standardization, Beijing (CN)
Filed by China National Institute of Standardization, Beijing (CN)
Filed on Aug. 14, 2024, as Appl. No. 18/805,460.
Claims priority of application No. 202410373303.1 (CN), filed on Mar. 29, 2024.
Int. Cl. G06F 21/16 (2013.01); G06F 17/14 (2006.01)

CPC G06F 21/16 (2013.01) [G06F 17/14 (2013.01)]

9 Claims

1. A watermark embedding method based on service invocation data, characterized by comprising the following steps:

obtaining invocation data based on service, and preprocessing the invocation data;

obtaining key data through screening the preprocessed invocation data based on relevant weights, then, adding timestamps to the key data to obtain enhanced data; including:

a, calculating relevant weights applied to the invocation data:

wherein, a maximum value of data E is represented by max(E), a minimum value of data E is represented by min(E), a relevant weight of data E is represented by custom character (E), an initial relevant weight of data E is represented by custom character _o(E), a proportion of a category to which data E belongs in random sampling C is represented by custom character (cs(C)), a proportion of data invoked by class b is represented by custom character (b), an a^th nearest neighbor data of class b invocation data is represented by g_a, a data value of correlation E is represented by C[E], a sampling data value of an a^th invocation data E of a nearest neighbor is represented by s_a[E], an a^th sampling data is represented by s_a, a category to which data belongs in random sampling C is represented by cs(C), and a quantity of sampled data is represented by custom character; in addition, a nearest neighbor data is represented by g, a difference in correlation E between an invocation data C and a sampling data s_a is represented by df(E, C, s_a), and a difference in nearest neighbor g_a between the invocation data C and the sampling data s_a is represented by df(E, C, g_a);

performing a descending sort on the invocation data according to the relevant weights, presetting a threshold for these weights, and subsequently screening the dependent sets based on this threshold;
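
The sort-and-screen step can be illustrated with a minimal Python sketch. The weight formula of the claim is not reproduced in this text, so the relevant weights are taken as given inputs. The normalized difference `df` follows the conventional Relief-style form |C[E] - s[E]| / (max(E) - min(E)), which is an assumption, as are the function names, the sample attribute names, the sample threshold, and the choice to keep weights equal to the threshold.

```python
def df(e, c, s, e_max, e_min):
    """Normalized difference of attribute e between records c and s.

    Assumed Relief-style form; the claim's own expression is not shown in the text.
    """
    span = e_max - e_min
    if span == 0:
        return 0.0  # constant attribute: no difference can be measured
    return abs(c[e] - s[e]) / span


def screen_dependent_set(weights, threshold):
    """Sort attributes by relevant weight, descending, and keep those
    whose weight is at or above the preset threshold (inclusive by assumption)."""
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, w in ranked if w >= threshold]


# Hypothetical relevant weights for four invocation-data attributes.
weights = {"latency": 0.42, "caller_id": 0.91, "payload_size": 0.08, "status": 0.55}
dependent_set = screen_dependent_set(weights, threshold=0.4)
print(dependent_set)
```

With these sample values the attributes are ranked `caller_id`, `status`, `latency`, and `payload_size` is screened out.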

mapping a position of an exploratory factor and the dependent set, and an expression is:

wherein, a mapping function is represented by custom character (·), the a^th data related to the i^th exploratory factor is represented by q_i,a, the random number is represented by r, and the natural constant is represented by e; then, calculating a fitness value of the exploratory factor:

wherein, the fitness is represented by R, a misclassification rate is represented by er, the number of data in a dependent set is represented by M, an importance of the misclassification rate is represented by α, an importance of a dependent subset is represented by ω, and the number of selected dependent subsets is represented by M_L;
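
The fitness expression itself is not reproduced in this text. As a hedged illustration only, the sketch below assumes the weighted form commonly used in wrapper-style subset selection, R = α·er + ω·(M_L / M), built from the symbols defined above. The function name and the default values of `alpha` and `omega` are hypothetical.

```python
def fitness(er, m_selected, m_total, alpha=0.9, omega=0.1):
    """Fitness R of one exploratory factor (assumed form, lower is better).

    er          misclassification rate, in [0, 1]
    m_selected  number of selected dependent subsets (M_L)
    m_total     number of data in the dependent set (M)
    alpha       importance of the misclassification rate
    omega       importance of the dependent subset
    """
    if m_total <= 0:
        raise ValueError("m_total must be positive")
    return alpha * er + omega * (m_selected / m_total)


# A factor that misclassifies 10% of samples while keeping 5 of 20 items.
print(fitness(er=0.10, m_selected=5, m_total=20))
```

Under this assumed form, a factor improves its fitness either by lowering the misclassification rate or by selecting fewer items, with α and ω setting the trade-off.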

comparing the fitness of the exploratory factors, updating the global and local optimal solutions, and updating the position of the exploratory factor, and the expression is:

wherein, a velocity of an i^th exploratory factor in a d-dimension is represented by θ_i,d, the position of the i^th exploratory factor in the d-dimension is represented by q_i,d, an inertia weight of the exploratory factor is represented by ψ, and the learning factors are represented by β₁ and β₂, in addition, random constants are represented by r₁ and r₂, a global optimal position is represented by qs_i,d, an individual optimal position is represented by bs_i,d, and an updated position of the exploratory factor is represented by θ_i,d;
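
The update expression is not reproduced in this text. The symbols listed (inertia weight ψ, learning factors β₁ and β₂, random constants r₁ and r₂, individual and global optimal positions) match the standard particle-swarm update, so the sketch below assumes that standard form. The function name and default coefficient values are hypothetical.

```python
import random


def update_factor(q, theta, bs, qs, psi=0.7, beta1=1.5, beta2=1.5, rng=random):
    """One velocity/position update of an exploratory factor.

    Assumed standard particle-swarm form, applied per dimension d:
        theta' = psi*theta + beta1*r1*(bs - q) + beta2*r2*(qs - q)
        q'     = q + theta'
    q, theta  current position and velocity (lists of floats)
    bs, qs    individual and global optimal positions
    """
    new_q, new_theta = [], []
    for d in range(len(q)):
        r1, r2 = rng.random(), rng.random()
        v = (psi * theta[d]
             + beta1 * r1 * (bs[d] - q[d])
             + beta2 * r2 * (qs[d] - q[d]))
        new_theta.append(v)
        new_q.append(q[d] + v)
    return new_q, new_theta


rng = random.Random(42)  # seeded for repeatability
pos, vel = update_factor([0.2, 0.8], [0.0, 0.0], bs=[0.5, 0.5], qs=[1.0, 0.0], rng=rng)
print(pos, vel)
```

When a factor already sits on both optima with zero velocity, both attraction terms vanish and the factor stays in place, which is the expected fixed point of this update.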

implementing an adaptive t-distribution perturbation strategy, iterating continuously until a maximum quantity of iterations is reached, and then, outputting screened remaining data as the key data;
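
The claim names the adaptive t-distribution perturbation strategy without giving its expression. One common formulation in the metaheuristics literature perturbs each coordinate as q' = q + q·t(n), where t(n) is drawn from a Student t-distribution whose degrees of freedom equal the current iteration number n. The sketch below assumes that formulation; the function names are hypothetical.

```python
import math
import random


def student_t(dof, rng):
    """Draw one Student-t variate as Z / sqrt(chi2 / dof)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(dof / 2.0, 2.0)  # chi-square with `dof` degrees of freedom
    return z / math.sqrt(chi2 / dof)


def t_perturb(q, iteration, rng):
    """Perturb a position with t-distributed noise (assumed form q + q*t(iteration)).

    Early iterations (small dof) give heavy-tailed, exploratory jumps;
    later iterations approach Gaussian noise and refine locally.
    """
    if iteration < 1:
        raise ValueError("iteration must be >= 1")
    return [x + x * student_t(iteration, rng) for x in q]


rng = random.Random(7)  # seeded for repeatability
print(t_perturb([0.4, 1.2, 0.0], iteration=3, rng=rng))
```

In a full loop, the perturbed position would typically replace the current one only if its fitness improves; that acceptance rule is likewise an assumption and is not stated in the claim.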

b, calculating a nearest point and the distance to the key data:

wherein, the distance from the p^th nearest neighbor point to a c^th key data point is represented by custom character _c^(p), the dimension is represented by d, the number of dimensions is represented by e, a c^th sample in the d^th dimension is represented by Q_cd, and the p^th nearest neighbor point in the d^th dimension is represented by Q_pd;

calculating a sum of distances from nearest neighbors of a sample point to the key data:

wherein, the number of the nearest neighbors is represented by custom character, and a sum of the distances between the c^th key data and the nearest neighbors is represented by custom character;
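
The two distance steps of part b can be sketched as follows. The distance expression is not reproduced in this text; since it is defined over the coordinates Q_cd and Q_pd across all dimensions, the sketch assumes the Euclidean distance. The function names and sample points are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def neighbor_distance_sum(key_point, candidates, k):
    """Sum of distances from a key data point to its k nearest neighbors."""
    dists = sorted(euclidean(key_point, p) for p in candidates)
    return sum(dists[:k])


key = (0.0, 0.0)
others = [(3.0, 4.0), (1.0, 0.0), (0.0, 2.0), (10.0, 10.0)]
print(neighbor_distance_sum(key, others, k=2))  # nearest two are at distances 1.0 and 2.0
```

A small sum indicates a key data point in a dense region, and a large sum indicates a sparse region; the subsequent descending sort orders the key data by this quantity.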

performing a descending sort according to the sum of distances, presetting a range of values for neighborhood parameters, and distributing the neighborhood parameters equally to a neighborhood based on the sum of distances between the key data and neighboring points, and the expression is:

wherein, the neighborhood parameter is represented by custom character, the maximum value of the neighborhood parameter is represented by custom character _max, and the minimum value of the neighborhood parameter is represented by custom character _min, in addition, the sum of the distances between a first key data and the neighboring points is represented by custom character, the control parameter is represented by ζ, and the maximum value of the sum of distances between the key data and the neighboring points is represented by custom character;

calculating weights of local neighbors and the weights of an original local linear structure:

wherein, an enhanced weight is represented by χ_w, the weight of a neighboring sequence structure between the key data Q_c and the y^th neighbor is represented by U_c^y, the weight of the original local linear structure is represented by χ_L, and a 2-norm function is represented by ∥·∥₂, in addition, a y^th neighbor of the key data is represented by U_c^y, a c^th key data is represented by Q_c, the minimum parameter value function is represented by argmin(·), and an attenuation coefficient between the c^th key data and a y^th neighbor is represented by ψ_cy;

calculating the importance weight:

ϕ=δ₁χ_h+δ₂χ_L

wherein, an importance weight is represented by ϕ, the weight of a neighboring sequence structure is represented by χ_h, the sequence coefficient is represented by δ₁, and a linear coefficient is represented by δ₂;

taking the nearest neighbor points corresponding to the importance weights greater than or equal to 0.372 as insertion points for timestamps, and then, the enhanced data can be output by inserting timestamps;
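
The importance weight ϕ = δ₁χ_h + δ₂χ_L and the 0.372 cutoff translate directly into code. In the sketch below, the coefficient values, the record layout, and the neighbor identifiers are hypothetical; only the formula and the threshold come from the claim.

```python
THRESHOLD = 0.372  # cutoff stated in the claim (inclusive)


def importance_weight(chi_h, chi_l, delta1, delta2):
    """phi = delta1 * chi_h + delta2 * chi_L."""
    return delta1 * chi_h + delta2 * chi_l


def insertion_points(neighbors, delta1, delta2):
    """Return ids of nearest-neighbor points whose importance weight
    is greater than or equal to the threshold."""
    return [
        n["id"]
        for n in neighbors
        if importance_weight(n["chi_h"], n["chi_l"], delta1, delta2) >= THRESHOLD
    ]


# Hypothetical neighbor records with precomputed structure weights.
neighbors = [
    {"id": "A", "chi_h": 0.5, "chi_l": 0.3},      # phi = 0.4   -> selected
    {"id": "B", "chi_h": 0.2, "chi_l": 0.4},      # phi = 0.3   -> rejected
    {"id": "C", "chi_h": 0.372, "chi_l": 0.372},  # phi = 0.372 -> selected (boundary)
]
print(insertion_points(neighbors, delta1=0.5, delta2=0.5))
```

Timestamps would then be inserted at the selected points to produce the enhanced data.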

selecting a contribution degree of enhanced data to obtain high-quality data, then, encoding the high-quality data to generate encoded data; including:

calculating the distance between the enhanced data:

wherein, a dissimilarity degree between a j^th and an s^th data is represented by ω_js, the distance between the j^th and s^th data is represented by ρ_js, the conditional probability is represented by custom character, and the numerical distance between the j^th and the s^th data is represented by k_js;

calculating a cumulative contribution of the enhanced data:

wherein, the cumulative contribution degree is represented by custom character, the j^th explained variance ratio is represented by ξ_j, an offset value between the j^th and the s^th data is represented by F_js, the distance is represented by ρ, a genetic factor is represented by υ, and an average offset value is represented by F;

outputting the enhanced data with a cumulative contribution greater than 1 as high-quality data;

constructing a data watermark embedding model by employing the encoded data; and then, inputting the service invocation data to be embedded into the data watermark embedding model, and thus the embedding results can be output.