| CPC G06F 40/284 (2020.01) [G06F 18/22 (2023.01); G06F 40/205 (2020.01)] | 14 Claims |

|
1. A computer implemented method for calculating a similarity score, the method comprising:
an interface module for computer-based speech recognition, the interface module configured to recognize speech by:
natural language processing, by a natural language processor, a plurality of sentences from natural language queries, the natural language processing comprising:
receiving, by a tokenization module, a first sentence as a first input;
receiving, by the tokenization module, a second sentence as a second input, wherein the first sentence and the second sentence originate from the natural language queries;
tokenizing, by the tokenization module, the first input to generate a first sequence of tokens;
tokenizing, by the tokenization module, the second input to generate a second sequence of tokens;
determining, by a token matcher, a similarity of tokens between the first sequence of tokens and the second sequence of tokens to generate token pairs;
determining, by the token matcher, a distance of relative positions of the token pairs in the first tokenized sequence and the second tokenized sequence, wherein the relative positions comprise positions of each token of the token pairs within the first and second sentences; and
generating, by the token matcher, a score value that indicates a degree to which the first sentence matches the second sentence based on restricting matches to a maximum value of the distance of relative positions of the token pairs in the first tokenized sequence and the second tokenized sequence, wherein the maximum value of the distance of relative positions of the token pairs is dynamically restricted based on a language's fundamental semantic variability for arrangement, without altering context, and a language model preserving the semantic variability; and wherein the generating comprises bi-directionally determining the distance of the relative positions of the token pairs in the first tokenized sequence and the second tokenized sequence; and
wherein the interface module forwards, based on the match, instructions to one or more applications running on one or more computer systems.
|
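The steps recited in claim 1 (tokenizing two sentences, pairing tokens, limiting pairs by the distance between their positions, and scoring in both directions) can be illustrated with a minimal sketch. This is not the patented implementation. The names `tokenize`, `directed_matches`, `similarity_score`, and `max_distance` are hypothetical. Token similarity is reduced to exact equality after lowercasing, and `max_distance` is a fixed parameter rather than a value derived dynamically from a language model, which the claim does not specify in detail.

```python
import re


def tokenize(sentence):
    """Split a sentence into lowercase word tokens (a simplifying assumption)."""
    return re.findall(r"\w+", sentence.lower())


def directed_matches(source, target, max_distance):
    """Count source tokens that pair with an unused, equal target token
    whose position differs by at most max_distance."""
    used = [False] * len(target)
    count = 0
    for i, token in enumerate(source):
        lo = max(0, i - max_distance)
        hi = min(len(target), i + max_distance + 1)
        for j in range(lo, hi):
            if not used[j] and target[j] == token:
                used[j] = True  # each target token joins at most one pair
                count += 1
                break
    return count


def similarity_score(first, second, max_distance=2):
    """Average the matched fraction in both directions (first to second,
    then second to first), giving a score between 0.0 and 1.0."""
    a, b = tokenize(first), tokenize(second)
    if not a or not b:
        return 0.0
    forward = directed_matches(a, b, max_distance)
    backward = directed_matches(b, a, max_distance)
    return 0.5 * (forward / len(a) + backward / len(b))


if __name__ == "__main__":
    for limit in (0, 1, 2):
        score = similarity_score("turn on the light", "turn the light on", limit)
        print(limit, score)
```

In this sketch, a larger `max_distance` tolerates more word reordering: the two example sentences share every token, so the score rises as the positional limit is relaxed.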