US 11,965,864 B1
Quantitative source apportionment based on nontarget high-resolution mass spectrometry (HRMS) data of pollution source and pollution receptor
Weiling Sun, Beijing (CN); Yitao Lyu, Beijing (CN); Qian Chen, Beijing (CN); and Jinren Ni, Beijing (CN)
Assigned to Peking University, Beijing (CN)
Filed by Peking University, Beijing (CN)
Filed on Aug. 14, 2023, as Appl. No. 18/449,672.
Claims priority of application No. 202310422604.4 (CN), filed on Apr. 20, 2023.
Int. Cl. G01N 30/72 (2006.01); G01N 30/86 (2006.01); G16C 20/20 (2019.01); G01N 30/02 (2006.01)
CPC G01N 30/7206 (2013.01) [G01N 30/8631 (2013.01); G16C 20/20 (2019.02); G01N 2030/025 (2013.01)]
5 Claims
 
1. A quantitative source apportionment method based on nontarget high-resolution mass spectrometry (HRMS) data of pollution sources and pollution receptors, comprising:
step 1: acquiring samples of pollution sources and pollution receptors, and pre-processing the samples to extract trace organic pollutants;
step 2: acquiring nontarget HRMS data of the samples obtained in step 1;
step 3: performing data pre-processing on raw HRMS data obtained through nontarget analysis in step 2 to obtain an HRMS dataset comprising a mass-to-charge ratio, a retention time, and a peak area of substances;
step 4: determining source-sink relationship information based on positions of the pollution sources and the pollution receptors, taking each remaining sample other than a background sample and the samples of the pollution source as a sink sample, one sink sample corresponding to one group, and determining source samples of each sink sample;
step 5: constructing, based on the source-sink relationship information obtained in step 4 and the HRMS dataset obtained in step 3, an input matrix for each sink sample group, and standardizing mass spectrometry data in the input matrix, comprising:
constructing a sink sample vector and a source sample vector for each group based on the mass spectrometry data obtained in step 3, representing a single sink sample by using a vector x, wherein a formula of x is x = (x_1, . . . , x_j, . . . , x_N), x_j represents a signal intensity of a j-th substance, and N represents the number of all substance types in the HRMS dataset obtained in step 3; and representing a known source sample i of the sink sample x by using a vector y_i, wherein a formula of y_i is y_i = (y_{i1}, . . . , y_{ij}, . . . , y_{iN}), y_{ij} represents a signal intensity of a j-th substance in the source sample i, 1 ≤ i ≤ K, K represents the number of all known source samples of the sink sample x, and the sink sample x comprises an unknown source sample, that is, a (K+1)-th source sample; and
constructing the input matrix (x^T, y_1^T, y_2^T, . . . , y_K^T), wherein, in a situation that the input matrix has a missing value, a value 0 is populated into the input matrix; and standardizing the input matrix, comprising performing a z-score standardization, a [0,1] standardization, or a standardization of maximum and minimum values to obtain a standardized input matrix; and
step 6: adopting an expectation-maximization method or a Bayesian method to quantitatively calculate contribution of each source sample based on the standardized input matrix, comprising one of:
substituting the vector x of the sink sample and the vector y_i of the source sample i into the expectation-maximization method, comprising randomly assigning a value for a contribution α_i of the source sample i to the sink sample x, and substituting the input matrix (x^T, y_1^T, y_2^T, . . . , y_K^T) into the expectation-maximization method to iterate the contribution α_i until convergence or until a maximum number of iterations is reached; or
substituting the vector x of the sink sample and the vector y_i of the source sample i into the Bayesian method, comprising initializing the contribution α_i by using a random source environment assignment, iteratively calculating the contribution α_i to update each vector according to a conditional distribution, and calculating a posterior probability until convergence or until the maximum number of iterations is reached;
wherein the quantitative source apportionment method further comprises: sending the contribution of each pollution source obtained from step 6 to a sewage treatment manager, and developing, according to the contribution of each source, a sewage treatment scheme by the sewage treatment manager to perform sewage treatment, to enhance sewage treatment plant efficacy.
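
Steps 5 and 6 of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and several choices are assumptions made only for illustration: the function names (build_input_matrix, min_max_scale, em_contributions) are hypothetical; the standardization is a min-max scaling applied per sample (per column), a direction the claim does not specify; the expectation-maximization step is a plain mixture-proportion EM with fixed source profiles; and the profile of the unknown (K+1)-th source is taken as the part of the sink profile that no known source covers, which is a heuristic rather than anything stated in the claim.

```python
import numpy as np

def build_input_matrix(x, sources):
    """Stack the sink vector x (N,) and K source vectors (K, N) as columns.

    Returns an N x (K+1) matrix (x^T, y_1^T, ..., y_K^T); missing values
    (NaN) are populated with 0, as described in step 5.
    """
    matrix = np.column_stack([x, *sources]).astype(float)
    return np.nan_to_num(matrix, nan=0.0)

def min_max_scale(matrix):
    """Min-max scale each column (sample) to [0, 1]. Assumption: per-sample scaling."""
    lo = matrix.min(axis=0)
    span = matrix.max(axis=0) - lo
    span = np.where(span == 0, 1.0, span)  # avoid division by zero
    return (matrix - lo) / span

def em_contributions(matrix, max_iter=1000, tol=1e-8, eps=1e-9, seed=0):
    """Estimate contributions alpha_1..alpha_K plus an unknown source alpha_{K+1}.

    Assumes nonnegative intensities and that every known source has at
    least one nonzero intensity.
    """
    x, known = matrix[:, 0], matrix[:, 1:].T           # shapes (N,), (K, N)
    weights = x / x.sum()                               # sink profile
    known = known / known.sum(axis=1, keepdims=True)    # source profiles
    # Hypothetical unknown-source profile: sink signal not covered by any
    # known source, plus eps so that every substance has nonzero support.
    unknown = np.maximum(weights - known.max(axis=0), 0.0) + eps
    profiles = np.vstack([known, unknown / unknown.sum()])

    rng = np.random.default_rng(seed)
    alpha = rng.dirichlet(np.ones(len(profiles)))       # random initial contributions
    for _ in range(max_iter):
        mix = np.maximum(alpha @ profiles, np.finfo(float).tiny)
        # E-step: share of substance j in the sink attributed to source i.
        resp = alpha[:, None] * profiles / mix
        # M-step: contribution = intensity-weighted average responsibility.
        new_alpha = resp @ weights
        converged = np.abs(new_alpha - alpha).max() < tol
        alpha = new_alpha
        if converged:
            break
    return alpha

# Toy data: six substances, two known sources, and one substance (index 4)
# that appears only in the sink. NaN marks a missing value.
x = np.array([1/3, 1/6, 0.15, 0.15, 0.2, np.nan])
sources = np.array([[10.0, 5.0, 0.0, 0.0, 0.0, 0.0],
                    [0.0, 0.0, 8.0, 8.0, 0.0, 0.0]])
standardized = min_max_scale(build_input_matrix(x, sources))
alpha = em_contributions(standardized)
print(alpha)
```

In this toy case the two known sources and the sink-only substance have disjoint signals, so the estimated contributions come out close to 0.5, 0.3, and 0.2 for source 1, source 2, and the unknown source. A z-score standardization would produce negative values, which this particular mixture formulation cannot handle, which is why the sketch uses min-max scaling.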