| CPC G06F 16/353 (2019.01) [G06F 18/24323 (2023.01)] | 11 Claims |

|
1. An automatic industry classification method, comprising determining a scope of target patents, wherein the automatic industry classification method further comprises following steps:
step 1: defining a target industry tree;
step 2: generating marks on the target industry tree;
step 3: performing a rough classification for the target patents by using the marks; and
step 4: performing a fine classification for the target patents according to a result of the rough classification,
wherein step 1 further comprises:
defining an industry tree $I = \{i_1, \ldots, i_j, \ldots, i_n\}$, wherein $i_j \in I$ and is a first level industry, $j$ is a serial number of the first level industry, $1 \le j \le n$, and $n$ is a number of all leaf nodes of $I$; and
setting $i_{jkl\ldots} = \{i_{jkl}, \ldots, i_{jkl\ldots t}\}$ as any non-leaf node of $I$, wherein a degree of each node other than the leaf nodes is greater than or equal to 2, $k$ is a serial number of a second level industry, $l$ is a serial number of a third level industry, and $t$ is a serial number of a penultimate level industry;
wherein the determining of the scope of the target patents is to manually determine the scope of the target patents to be classified,
wherein step 3 comprises determining nodes above the leaf nodes; and wherein step 3 further comprises following sub-steps:
step 31: generating a node set V of a graph;
step 32: arranging the marks;
step 33: generating an edge set E of the graph;
step 34: generating an adjacency matrix; and
step 35: performing node division,
wherein step 35 further comprises following sub-steps:
step 351: generating a degree matrix $D = \operatorname{diag}(d_1, d_2, \ldots, d_{l+u})$, having a diagonal element $d_i = \sum_{j=1}^{l+u} W_{ij}$, wherein $u$ is a number of unmarked nodes, and $W_{ij}$ is an adjacency matrix;
step 352: generating a marked matrix, and a nonnegative $(l+u) \times |\gamma|$ marked matrix $F = (F_1^T, F_2^T, \ldots, F_{l+u}^T)^T$, wherein an element of an i-th row $F_i = (F_{i1}, F_{i2}, \ldots, F_{i|\gamma|})$ is a marked vector of an International Patent Classification (IPC) in the node set, a classification rule is $\gamma_i = \arg\max_{1 \le j \le |\gamma|} F_{ij}$, wherein $\gamma$ is a set of industries, and $T$ represents a transposition of a matrix;
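step 353: initializing the nonnegative marked matrix $F$, for $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, |\gamma|$, [formula not reproduced in source];
step 354: constructing a propagation matrix [formula not reproduced in source], wherein [formula not reproduced in source], and $d$ represents diagonal elements of the degree matrix $D$;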
step 355: generating an iterative calculation formula $F(t+1) = \alpha B F(t) + (1-\alpha) Y$, wherein $\alpha \in (0,1)$ is a parameter, $F(t)$ is a result of a t-th iteration, and $Y$ is an initial matrix;
step 356: iterating the iterative calculation formula to convergence to obtain a state $F^* = \lim_{t \to \infty} F(t) = (1-\alpha)(M - \alpha B)^{-1} Y$ under convergence, wherein $M$ is a unit matrix; and
step 357: performing a prediction of the unmarked nodes $\gamma_i = \arg\max_{1 \le j \le |\gamma|} F^*_{ij}$, wherein $l+1 \le i \le l+u$.
|
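
The node division of steps 351 to 357 can be illustrated with a minimal Python sketch. It is a hedged illustration, not the claimed implementation. Two formulas are not reproduced in the text above, so the sketch fills them with common label-propagation choices that are assumptions here: the propagation matrix is taken as $B = D^{-1/2} W D^{-1/2}$, and the initial matrix $Y$ has $Y_{ij} = 1$ when marked node $i$ carries industry $j$ and 0 otherwise. The function names, the toy graph, and the value $\alpha = 0.5$ are likewise hypothetical.

```python
import math

def propagation_matrix(W):
    """Steps 351 and 354: degrees d_i = sum_j W_ij, then a normalized matrix.
    ASSUMPTION: B = D^{-1/2} W D^{-1/2}; the source formula is not reproduced.
    W is assumed symmetric and nonnegative."""
    n = len(W)
    d = [sum(row) for row in W]
    return [[W[i][j] / math.sqrt(d[i] * d[j]) if W[i][j] else 0.0
             for j in range(n)] for i in range(n)]

def initial_matrix(labels, n_classes):
    """Step 353. ASSUMPTION: one-hot rows for marked nodes, zero rows for
    unmarked nodes (label None)."""
    return [[1.0 if lab == j else 0.0 for j in range(n_classes)]
            for lab in labels]

def propagate_labels(W, labels, n_classes, alpha=0.5, tol=1e-12, max_iter=10_000):
    """Steps 355 to 357: iterate F(t+1) = alpha*B*F(t) + (1-alpha)*Y until the
    largest entry change drops below tol, then take the row-wise argmax."""
    n = len(W)
    B = propagation_matrix(W)
    Y = initial_matrix(labels, n_classes)
    F = [row[:] for row in Y]
    for _ in range(max_iter):
        F_next = [[alpha * sum(B[i][k] * F[k][j] for k in range(n))
                   + (1 - alpha) * Y[i][j]
                   for j in range(n_classes)] for i in range(n)]
        change = max(abs(F_next[i][j] - F[i][j])
                     for i in range(n) for j in range(n_classes))
        F = F_next
        if change < tol:
            break
    predicted = [max(range(n_classes), key=lambda j: F[i][j]) for i in range(n)]
    return F, predicted

# Hypothetical toy graph: the path 0 - 2 - 3 - 1 - 4 with unit edge weights.
# Nodes 0 and 1 are marked (industries 0 and 1); nodes 2, 3, 4 are unmarked.
W = [
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
]
labels = [0, 1, None, None, None]

F_star, predicted = propagate_labels(W, labels, n_classes=2)
print(predicted[2:])  # predictions for the unmarked nodes: [0, 1, 1]
```

In this sketch the converged matrix is a fixed point of the iteration, so it satisfies $F^* = \alpha B F^* + (1-\alpha) Y$, which is the same condition as the closed form $F^* = (1-\alpha)(M - \alpha B)^{-1} Y$ of step 356. Iterating avoids an explicit matrix inverse, which may matter when the node set is large.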