| CPC G06F 40/58 (2020.01) [G06F 40/53 (2020.01); G06N 20/00 (2019.01)] | 13 Claims |

|
1. A system for automatic augmentation of sign language translation training data, the system comprising:
one or more processors configured to:
store, in a database, a sequence of sign language glosses and a sequence of spoken-language words in pairs; and
train an artificial intelligence (AI)-based model to perform an inference operation, including recognition and translation of sign languages, using augmented training data from the database based on a result of augmenting the pairs stored in the database,
the augmenting including:
finding a matching of a gloss and a word that have a same meaning from the sequence of sign language glosses and the sequence of spoken-language words;
substituting the found gloss and word with another alternative gloss and another alternative word; and
generating, as the augmented training data, a new pair of a sequence of sign language glosses and a sequence of spoken-language words to augment the pairs stored in the database, including reorganizing the arrangement of glosses in the sequence of sign language glosses and connecting the reorganized sequence to the sequence of spoken-language words as a pair,
wherein the generating the new pair further includes
masking a word to substitute in the sequence of spoken-language words, and
inputting the masked word to the artificial intelligence (AI)-based model, which is already trained to infer a masked word in a sequence of words, to determine an alternative word.
|
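The augmenting steps recited in claim 1 (matching, masking, substituting, and reorganizing) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the claim: the `LEXICON` table, the `[MASK]` token, the `toy_fill_mask` stand-in for the already-trained masked-word model, and the use of a random shuffle as one possible way of reorganizing the gloss arrangement.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical lexicon: each gloss maps to the spoken-language word
# that has the same meaning. A real system would use a far larger resource.
LEXICON: Dict[str, str] = {"HOUSE": "house", "FIRE": "fire", "SCHOOL": "school"}
WORD_TO_GLOSS: Dict[str, str] = {word: gloss for gloss, word in LEXICON.items()}


def find_match(glosses: List[str], words: List[str]) -> Optional[Tuple[int, int]]:
    """Return (gloss index, word index) of the first gloss/word pair
    that share a meaning, or None if no such pair exists."""
    for gi, gloss in enumerate(glosses):
        word = LEXICON.get(gloss)
        if word in words:
            return gi, words.index(word)
    return None


def mask_word(words: List[str], idx: int, mask_token: str = "[MASK]") -> List[str]:
    """Return a copy of the word sequence with the word at idx masked."""
    masked = list(words)
    masked[idx] = mask_token
    return masked


def toy_fill_mask(masked_words: List[str]) -> str:
    """Stand-in for a model already trained to infer a masked word.
    This toy version ignores its input and always proposes 'school'."""
    return "school"


def augment_pair(
    glosses: List[str],
    words: List[str],
    fill_mask: Callable[[List[str]], str],
    rng: random.Random,
) -> Optional[Tuple[List[str], List[str]]]:
    """Build one new (glosses, words) pair from an existing pair."""
    match = find_match(glosses, words)
    if match is None:
        return None
    gi, wi = match

    # Mask the word to be substituted and ask the model for an alternative.
    alt_word = fill_mask(mask_word(words, wi))
    alt_gloss = WORD_TO_GLOSS.get(alt_word)
    if alt_gloss is None or alt_word == words[wi]:
        return None  # no usable alternative with a known gloss

    # Substitute the matched gloss and word with their alternatives.
    new_glosses = list(glosses)
    new_glosses[gi] = alt_gloss
    new_words = list(words)
    new_words[wi] = alt_word

    # Assumption: a random shuffle stands in for reorganizing the glosses.
    rng.shuffle(new_glosses)
    return new_glosses, new_words


if __name__ == "__main__":
    pair = augment_pair(
        ["HOUSE", "FIRE"],
        ["the", "house", "is", "on", "fire"],
        toy_fill_mask,
        random.Random(0),
    )
    print(pair)
```

In this sketch the new pair is returned to the caller rather than written to a database; storing it alongside the original pairs would correspond to augmenting the stored training data before the model is trained.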