US 12,423,971 B1
Methods for identifying pine wood nematode-infected discolored woods in mixed coniferous and broadleaf forests
Haifeng Lin, Nanjing (CN); Tao Chen, Nanjing (CN); Chengxuan Li, Nanjing (CN); and Hongping Zhou, Nanjing (CN)
Assigned to NANJING FORESTRY UNIVERSITY, Nanjing (CN)
Filed by NANJING FORESTRY UNIVERSITY, Jiangsu (CN)
Filed on May 20, 2025, as Appl. No. 19/213,978.
Claims priority of application No. 202410735437.3 (CN), filed on Jun. 7, 2024.
Int. Cl. G06V 20/10 (2022.01); G06V 10/72 (2022.01); G06V 10/77 (2022.01); G06V 10/82 (2022.01); G06V 20/17 (2022.01)

CPC G06V 20/188 (2022.01) [G06V 10/72 (2022.01); G06V 10/7715 (2022.01); G06V 10/82 (2022.01); G06V 20/17 (2022.01)]

7 Claims

1. A method for identifying a pine wood nematode-infected discolored wood in a mixed coniferous and broadleaf forest, comprising:

inputting a pine forest image of the mixed coniferous and broadleaf forest to be identified into a trained identification model for the pine wood nematode-infected discolored wood to identify the pine wood nematode-infected discolored wood in the mixed coniferous and broadleaf forest, wherein the trained identification model is improved based on a You Only Look Once version 5 small (YOLOv5s) model by:

connecting a feature-filtering module after a Neck;

constructing a feature-enhancing module to replace a C3 module in a Backbone;

constructing a convolution-transformer module based on multi-head self-attention (MSHA) to connect after a last layer of residual units in the Backbone;

constructing, by Group Shuffle Convolution (GSConv), a multi-scale feature fusion layer to replace an ordinary convolution in the Neck, wherein

a formula for the feature-filtering module is:

where F′ is a feature map of fusion space and channel attention output by the feature-filtering module, F denotes a feature map input to the feature-filtering module, σ is a Sigmoid activation function, Conv₁ is a 1×1 convolution operation, and Conv₃ is a 3×3 convolution operation, Concat is a splicing operation, δ denotes a ReLU activation function, Avgpool_c and Maxpool_c are an average pooling operation and a maximum pooling operation along a channel dimension, Avgpool_s and Maxpool_s are respectively an average pooling operation and a maximum pooling operation along a spatial dimension, and ⊗ denotes element-by-element multiplication,

a formula for the feature-enhancing module is:

where F_out is a feature map output by the feature-enhancing module, f is a feature map input to the feature-enhancing module, y_g is a null convolution output of a plurality of branches with different null rates and g=1,2,3, BN is a Batch Norm normalization operation, δ denotes a ReLU activation function, j denotes the null rates corresponding to the plurality of branches and j=3,5,7, DConv denotes a null convolution operation, and k denotes a convolution kernel size;

a formula for the convolution-transformer module based on the MSHA is:

where MultiHead(X) is a feature output of the convolution-transformer module based on the MSHA, Z_i is an output of each self-attention head, X_i is a part of a feature X inputted into the convolution-transformer module based on the MSHA, W_i^Q, W_i^K, W_i^V are linear transformation weight matrices of each self-attention head, R_wi, R_hi are positional encodings, Concat is the splicing operation, and i=1,2,3,4.
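
The four-head structure recited above (each part X_i projected by its own W_i^Q, W_i^K, W_i^V, the head outputs Z_i then spliced by Concat) can be illustrated with a minimal pure-Python sketch. It is not the claimed formula, which is not reproduced in this text. The sketch assumes that X is split evenly along the channel dimension, that each head uses standard scaled dot-product attention, and that the positional encodings R_wi and R_hi may be left out for brevity. All function names (split_channels, attention_head, multi_head) are hypothetical.

```python
import math

NUM_HEADS = 4  # the claim recites i = 1, 2, 3, 4


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def identity(n):
    return [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]


def split_channels(x, num_heads=NUM_HEADS):
    """Split X (tokens x channels) into equal channel slices X_1..X_h (assumed split axis)."""
    channels = len(x[0])
    if channels % num_heads != 0:
        raise ValueError("channel count must be divisible by the number of heads")
    width = channels // num_heads
    return [[row[h * width:(h + 1) * width] for row in x] for h in range(num_heads)]


def attention_head(x_i, w_q, w_k, w_v):
    """Z_i for one head: softmax(Q K^T / sqrt(d)) V. Positional encodings omitted (assumption)."""
    q, k, v = matmul(x_i, w_q), matmul(x_i, w_k), matmul(x_i, w_v)
    scale = math.sqrt(len(q[0]))
    scores = [[s / scale for s in row] for row in matmul(q, transpose(k))]
    return matmul(softmax_rows(scores), v)


def multi_head(x, weights):
    """Concat(Z_1, ..., Z_4): splice the per-head outputs along the channel axis."""
    heads = [attention_head(x_i, *w) for x_i, w in zip(split_channels(x), weights)]
    return [sum((head[t] for head in heads), []) for t in range(len(x))]


# Toy input: 3 identical tokens with 8 channels, identity projections per head.
tokens = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0] for _ in range(3)]
weights = [(identity(2), identity(2), identity(2)) for _ in range(NUM_HEADS)]
output = multi_head(tokens, weights)
```

With identical tokens and identity projections, every attention row is uniform, so the spliced output has the same shape as the input and reproduces it up to rounding. In a BoTNet-style variant, R_wi and R_hi would typically contribute an additional term to the attention scores before the softmax; whether the claimed formula does exactly that is not confirmed by the text here.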