CPC G16H 50/20 (2018.01) [A61B 3/0025 (2013.01); A61B 3/14 (2013.01); A61B 5/4803 (2013.01); G06T 5/30 (2013.01); G06T 5/40 (2013.01); G06T 7/0012 (2013.01); G06T 7/12 (2017.01); G06V 10/7715 (2022.01); G06V 10/82 (2022.01); G06V 20/70 (2022.01); G10L 15/02 (2013.01); G10L 15/05 (2013.01); G10L 15/16 (2013.01); G10L 15/1815 (2013.01); G10L 15/22 (2013.01); G10L 21/0224 (2013.01); G10L 25/18 (2013.01); G10L 25/21 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30041 (2013.01); G06T 2207/30096 (2013.01); G06T 2207/30101 (2013.01); G06V 2201/03 (2022.01); G10L 2015/025 (2013.01)] | 9 Claims |
1. A neural-network-based ophthalmologic intelligent consultation method, comprising:
at step S1, acquiring a consultation voice of a to-be-diagnosed patient, performing correction filtering on the consultation voice to acquire a filtered voice, framing the filtered voice into a consultation voice frame sequence, and extracting voice features from the voice frames to acquire a consultation voice feature sequence, wherein framing the filtered voice into the consultation voice frame sequence comprises:
at step S11, performing primary framing on the filtered voice based on a preset framing window length to acquire a framed voice sequence;
at step S12, performing windowing processing on the framed voice sequence to acquire a windowed voice sequence based on the following mainlobe windowing algorithm:

H(w) = {0.5·A(w) + 0.25·[A(w − 2π/(N−1)) + A(w + 2π/(N−1))]} · e^(−jw(N−1)/2)

wherein H(w) refers to a frequency value of the w-th windowed voice in the windowed voice sequence, A(w) refers to a frequency value of the w-th framed voice in the framed voice sequence, π is the symbol of Pi, N refers to a voice window length corresponding to the mainlobe windowing algorithm, e is the symbol of the Euler number, and j is the symbol of the imaginary unit; and
at step S13, calculating an average zero-crossing rate and a short-time voice energy of the windowed voice sequence, and performing endpoint detection on the windowed voice sequence based on the average zero-crossing rate and the short-time voice energy to acquire the consultation voice frame sequence;
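For illustration only, a minimal Python sketch of steps S11 through S13, assuming a 16 kHz sampling rate, a 400-sample frame with a 160-sample hop, a Hanning window as the mainlobe window, and fixed illustrative thresholds for the energy/zero-crossing endpoint test; none of these values is fixed by the claim:

```python
import numpy as np

def frame_voice(signal, frame_len=400, hop=160):
    """Step S11: primary framing with a preset framing window length."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    return np.stack([signal[i * hop:i * hop + frame_len] for i in range(n_frames)])

def window_frames(frames):
    """Step S12: windowing; a Hanning window is one common mainlobe choice."""
    return frames * np.hanning(frames.shape[1])

def endpoint_detect(frames, energy_thresh=1e-3, zcr_thresh=0.25):
    """Step S13: keep frames whose short-time energy is high and whose
    average zero-crossing rate is low (voiced speech); real dual-threshold
    endpoint detection refines segment boundaries more carefully."""
    energy = np.mean(frames ** 2, axis=1)                 # short-time voice energy
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) / 2, axis=1)
    return frames[(energy > energy_thresh) & (zcr < zcr_thresh)]

# Usage on a synthetic one-second signal: a 220 Hz tone embedded in low noise.
sig = np.random.randn(16000) * 0.01
sig[4000:12000] += np.sin(2 * np.pi * 220 * np.arange(8000) / 16000)
voice_frames = endpoint_detect(window_frames(frame_voice(sig)))
```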
at step S2, performing phoneme recognition on the consultation voice feature sequence to acquire a consultation phoneme sequence, transcoding, by a self-attention mechanism, the consultation phoneme sequence into a consultation text, performing text segmentation and vectorization operations on the consultation text sequentially to acquire consultation text features, and performing semantics recognition on the consultation text features to acquire an ophthalmologically-described disease;
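Read literally, the self-attention transcoding of step S2 amounts to scaled dot-product attention over phoneme embeddings. The NumPy sketch below shows only that arithmetic, with a single head and invented dimensions; an actual transcoder would be a trained encoder-decoder, which the claim does not spell out:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a phoneme embedding sequence X
    of shape (seq_len, d_model); the output is a contextualized sequence a
    decoder could map onto consultation text tokens."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)        # row-wise softmax
    return weights @ V

rng = np.random.default_rng(0)
d = 16                                                   # illustrative model width
phoneme_embeddings = rng.normal(size=(10, d))            # 10 phonemes (made up)
context = self_attention(phoneme_embeddings,
                         *(rng.normal(size=(d, d)) for _ in range(3)))
```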
at step S3, acquiring an eye picture set of the to-be-diagnosed patient, screening out a sharp eye picture group from the eye picture set, performing a gray-level filtering operation on the sharp eye picture group to acquire a filtered eye picture group, and performing primary picture segmentation and size equalization operations on each filtered eye picture in the filtered eye picture group to acquire a standard eyeball picture group;
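As a sketch of step S3, assuming OpenCV, a variance-of-the-Laplacian sharpness screen, a median filter standing in for the gray-level filtering, and an illustrative 256 by 256 output size; the primary eyeball segmentation is omitted:

```python
import cv2
import numpy as np

def screen_and_standardize(pictures, sharp_thresh=100.0, size=(256, 256)):
    """Keep sharp pictures, gray-filter them, and equalize their sizes.
    Threshold, filter kernel, and output size are illustrative choices."""
    standard_group = []
    for img in pictures:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        if cv2.Laplacian(gray, cv2.CV_64F).var() < sharp_thresh:
            continue                                  # blurred: screened out
        filtered = cv2.medianBlur(gray, 5)            # gray-level filtering
        standard_group.append(cv2.resize(filtered, size))  # size equalization
    return standard_group
```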
at step S4, performing a secondary picture segmentation operation on the standard eyeball picture group to acquire an eye white picture group, a pupil picture group, and a blood vessel picture group, and extracting eye white features from the eye white picture group, pupil features from the pupil picture group, and blood vessel features from the blood vessel picture group; and
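One crude way to picture the secondary segmentation of step S4: a Hough circle as the pupil, bright pixels as the eye white, and a black-hat morphology map as a rough blood vessel response. Every parameter below is invented for illustration, and the subsequent feature extraction (histograms, CNN embeddings, or otherwise) is not shown:

```python
import cv2
import numpy as np

def secondary_segment(eyeball_gray):
    """Split a standardized grayscale eyeball picture into rough pupil,
    eye white, and blood vessel regions; parameters are illustrative."""
    circles = cv2.HoughCircles(eyeball_gray, cv2.HOUGH_GRADIENT, dp=1.2,
                               minDist=100, param1=80, param2=30,
                               minRadius=10, maxRadius=80)
    pupil_mask = np.zeros_like(eyeball_gray)
    if circles is not None:                           # most circular region
        x, y, r = np.round(circles[0, 0]).astype(int)
        cv2.circle(pupil_mask, (x, y), r, 255, thickness=-1)
    white_mask = cv2.inRange(eyeball_gray, 180, 255)  # bright sclera pixels
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (9, 9))
    vessel_map = cv2.morphologyEx(eyeball_gray, cv2.MORPH_BLACKHAT, kernel)
    return pupil_mask, white_mask, vessel_map
```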
at step S5, performing lesion feature analysis on the eye white features, the pupil features and the blood vessel features to acquire an ophthalmologically-observed disease, and generating a consultation result based on the ophthalmologically-observed disease and the ophthalmologically-described disease;
wherein performing lesion feature analysis on the eye white features, the pupil features and the blood vessel features to acquire an ophthalmologically-observed disease comprises:
identifying eye white disease semantics from the eye white features, identifying pupil disease semantics from the pupil features, and identifying blood vessel disease semantics from the blood vessel features;
converging the eye white disease semantics, the pupil disease semantics and the blood vessel disease semantics to form an eye disease semantics set;
performing feature coding on each eye disease semantics in the eye disease semantics set to acquire disease semantics feature codes;
generating a multi-head disease semantics vector set corresponding to the disease semantics feature codes based on a pre-trained disease analysis model, and calculating standard disease semantics vectors corresponding to the multi-head disease semantics vector set through a multi-head attention mechanism of the disease analysis model; and
performing normalization and feature decoding operations on the standard disease semantics vectors in sequence to acquire an ophthalmologically-observed disease;
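The multi-head step can be pictured with a stock multi-head attention layer; the width and head count below are invented, and torch.nn.MultiheadAttention merely stands in for the claim's pre-trained disease analysis model:

```python
import torch

d_model, heads = 64, 4                      # illustrative width and head count
attention = torch.nn.MultiheadAttention(d_model, heads, batch_first=True)

# One semantic feature code each for eye white, pupil, and blood vessels.
semantic_codes = torch.randn(1, 3, d_model)
standard_vecs, _ = attention(semantic_codes, semantic_codes, semantic_codes)
```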
wherein the disease analysis model is a support vector machine model trained on a large number of labeled disease semantics samples, and the ophthalmologically-observed disease can be acquired by performing normalization on the standard disease semantics vectors using a softmax function and performing a feature decoding operation on the normalized vectors using a multilayer perceptron;
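The normalization-then-decoding step maps onto a softmax followed by a small multilayer perceptron; the dimensions and the mean-pooling over the semantics set are illustrative choices, not taken from the claim:

```python
import torch

standard_vecs = torch.randn(1, 3, 64)      # stand-in standard disease semantics vectors
decoder = torch.nn.Sequential(             # multilayer perceptron decoder
    torch.nn.Linear(64, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 10),              # 10 candidate diseases (made up)
)
normalized = torch.softmax(standard_vecs, dim=-1)   # softmax normalization
logits = decoder(normalized).mean(dim=1)            # feature decoding + pooling
observed_disease = int(logits.argmax(dim=-1))       # ophthalmologically-observed disease index
```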
wherein generating the consultation result based on the ophthalmologically-observed disease and the ophthalmologically-described disease comprises: generating an observed-disease result based on the ophthalmologically-observed disease and the standard eyeball picture group, generating a described-disease result based on the ophthalmologically-described disease and the consultation text, and splicing the observed-disease result and the described-disease result into the consultation result.
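Finally, the splicing of the two partial results could be as simple as pairing each disease with its supporting evidence and concatenating; the record layout below is hypothetical:

```python
def splice_consultation_result(observed_disease, eyeball_pictures,
                               described_disease, consultation_text):
    """Pair each finding with its evidence and splice into one result."""
    return {
        "observed": {"disease": observed_disease, "evidence": eyeball_pictures},
        "described": {"disease": described_disease, "evidence": consultation_text},
    }
```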