| CPC G06F 16/24522 (2019.01) [G06F 16/248 (2019.01); G06F 16/3329 (2019.01); G06F 40/30 (2020.01); G06N 5/04 (2013.01); G06N 20/00 (2019.01); G06V 30/414 (2022.01)] | 19 Claims |

|
1. A method for document processing, the method comprising:
obtaining a question dataset that comprises one or more source questions for document processing by a machine-learning question-and-answer system that provides answer data in response to question data submitted by a user,
analyzing the source question to determine a specificity level for the source question, wherein analyzing the source question comprises determining the source question is overly verbose based on a comparison of the determined specificity level to one or more specificity threshold values,
modifying the source question from the question dataset to generate one or more augmented questions that have equivalent semantic meanings as that of the source question,
processing a document with the one or more augmented questions,
wherein modifying the source question comprises: simplifying the source question, in response to a determination that the source question is overly verbose, to exclude one or more semantic elements of the source question to generate a terse question with an equivalent semantic meaning to the source question, and
adding the one or more augmented questions to an augmented question dataset.
|
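The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation and rests on several assumptions. The specificity level is approximated by a word count. The threshold value `VERBOSE_THRESHOLD`, the `FILLER_WORDS` set, and the `qa_model` callable are hypothetical stand-ins. Dropping filler words is only a rough proxy for excluding semantic elements while preserving meaning.

```python
from typing import Callable, Dict, List

# Hypothetical low-information words; the claim does not say which
# semantic elements are excluded.
FILLER_WORDS = {"could", "you", "please", "tell", "me", "exactly", "kindly"}

# Hypothetical specificity threshold (in words); the claim gives no value.
VERBOSE_THRESHOLD = 8


def specificity_level(question: str) -> int:
    """Assumed proxy for specificity: the number of whitespace-separated words."""
    return len(question.split())


def is_overly_verbose(question: str, threshold: int = VERBOSE_THRESHOLD) -> bool:
    """Compare the specificity level against the threshold."""
    return specificity_level(question) > threshold


def simplify(question: str) -> str:
    """Generate a terse question by dropping the assumed filler words."""
    words = question.rstrip("?").split()
    kept = [w for w in words if w.lower().strip(",.") not in FILLER_WORDS]
    return " ".join(kept) + "?"


def augment(source_questions: List[str]) -> List[str]:
    """Build the augmented question dataset from the source questions."""
    augmented: List[str] = []
    for question in source_questions:
        if is_overly_verbose(question):
            augmented.append(simplify(question))
    return augmented


def process_document(
    document: str,
    questions: List[str],
    qa_model: Callable[[str, str], str],
) -> Dict[str, str]:
    """Run each augmented question against the document.

    qa_model is a placeholder for the question-and-answer system.
    """
    return {q: qa_model(document, q) for q in questions}


if __name__ == "__main__":
    source = [
        "Could you please tell me exactly what the total invoice amount is?",
        "What is the total?",
    ]
    augmented_dataset = augment(source)
    # Trivial stand-in model that returns the whole document as the answer.
    answers = process_document(
        "Total invoice amount: 120 USD.", augmented_dataset, lambda doc, q: doc
    )
    print(augmented_dataset)
    print(answers)
```

In this sketch only the first source question exceeds the assumed threshold, so only it yields a terse augmented question. A real system would likely need a semantic similarity check to confirm that the terse question keeps the meaning of the source question.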