| CPC G10L 15/193 (2013.01) [H03M 7/30 (2013.01)] | 9 Claims |

|
1. A method for compressing finite-state transducer (FST) data to reduce memory usage in a computing device, comprising:
acquiring to-be-compressed FST data, wherein the FST data comprises state transition data and state data, and wherein the FST data is used in at least one of text retrieval, search engine, natural language processing, machine translation, speech recognition, signal processing and automated control;
decomposing the state transition data based on first data categories to acquire first decomposition data, comprising:
decomposing the state transition data based on data categories of signal label, weight and next state identifier, to acquire signal label decomposition data, weight decomposition data and next state identifier decomposition data;
after decomposing the state transition data based on the first data categories to acquire the first decomposition data, removing output signal label decomposition data from the signal label decomposition data in a case that information presented by the FST data is suitable to be presented by finite-state automaton (FSA) data; and removing the weight decomposition data in a case that the information presented by the FST data is suitable to be presented by Trie data;
decomposing the state data based on second data categories to acquire second decomposition data;
sequentially arranging, for each of the first data categories, the first decomposition data of the first data category, to acquire first arrangement data of the first data category;
alternately arranging the first arrangement data and the second decomposition data according to a sequential order used in the first arrangement data, to acquire second arrangement data;
performing classification statistics on the first arrangement data and the second arrangement data to acquire index data; and
combining the first arrangement data, the second arrangement data, and the index data, to obtain the compressed FST data, wherein the compressed FST data is stored in a memory of the computing device and reduces memory resource consumption during the at least one of text retrieval, search engine, natural language processing, machine translation, speech recognition, signal processing and automated control.
|
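As an illustration only, the column-wise decomposition of the state transition data in claim 1 might be sketched as follows. The claim does not specify data layouts. The `Arc` record, the `decompose` function, the `as_fsa` and `as_trie` flags, and the per-state offset table used as "index data" are all hypothetical names and assumptions introduced for this sketch. The alternating arrangement with state data is not modelled.

```python
from typing import NamedTuple


class Arc(NamedTuple):
    """One state transition (hypothetical record layout, not from the claim)."""
    src: int       # source state identifier
    ilabel: int    # input signal label
    olabel: int    # output signal label
    weight: float  # transition weight
    dst: int       # next state identifier


def decompose(arcs, num_states, as_fsa=False, as_trie=False):
    """Split transitions into one sequential array per data category.

    Returns (columns, offsets). offsets[s]:offsets[s + 1] is the slice of
    every column that belongs to state s. This offset table is an assumed
    stand-in for the claim's "index data".
    """
    # Arrange transitions sequentially by source state (stable sort).
    ordered = sorted(arcs, key=lambda a: a.src)

    # One array per category: signal labels, weight, next state identifier.
    columns = {
        "ilabel": [a.ilabel for a in ordered],
        "olabel": [a.olabel for a in ordered],
        "weight": [a.weight for a in ordered],
        "next_state": [a.dst for a in ordered],
    }

    # FSA-representable data needs no output labels.
    if as_fsa:
        del columns["olabel"]
    # Trie-representable data needs no weights.
    if as_trie:
        del columns["weight"]

    # Count outgoing arcs per state, then prefix-sum into offsets.
    offsets = [0] * (num_states + 1)
    for a in ordered:
        offsets[a.src + 1] += 1
    for s in range(num_states):
        offsets[s + 1] += offsets[s]

    return columns, offsets


arcs = [Arc(1, 5, 6, 0.5, 2), Arc(0, 3, 4, 1.0, 1), Arc(0, 7, 8, 2.0, 2)]
columns, offsets = decompose(arcs, num_states=3, as_fsa=True)
```

In this sketch, storing each category in its own array means a whole category can be dropped without touching the others, which is what makes the FSA and Trie reductions in the claim cheap to apply.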