CPC G10L 15/063 (2013.01) [G10L 15/02 (2013.01); G10L 15/1822 (2013.01); G10L 15/26 (2013.01); G10L 17/22 (2013.01); G10L 2015/0631 (2013.01)] | 20 Claims |
1. An apparatus for automatic generation and update of a knowledge graph from one or more multi-modal sources, the apparatus comprising:
a speaker diarization module configured for:
partitioning an input audio stream into audio segments;
classifying speakers of the audio segments as agent or customer; and
clustering the audio segments based on speaker classification;
an audio transcription module configured for transcribing the clustered audio segments to transcripts based on an acoustic model;
a speech parsing module configured for:
extracting entities of interest and schema of relations from the transcripts; and
labelling words of the transcripts corresponding to the extracted entities of interest with a plurality of pre-defined tags from a domain-specific language model;
a conversation parsing module configured for:
updating a dynamic information word set VD with the labelled words of the transcripts;
updating a static information word set VS based on the extracted schema of relations from the transcripts;
retrieving one or more sentence patterns from the domain-specific language model; and
generating pairs of question and answer based on the dynamic information word set VD, the static information word set VS and the one or more sentence patterns; and
a knowledge graph container configured for updating a knowledge graph by:
receiving the extracted entities of interest and schema of relations;
representing the extracted entities of interest as nodes in the knowledge graph; and
representing the extracted schema of relations as labels and edges between nodes in the knowledge graph.
|
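As an illustration only, the conversation parsing module and the knowledge graph container recited in claim 1 might be sketched as follows. The claim does not specify data structures or how the sentence patterns are applied, so everything here is an assumption: the names `KnowledgeGraph`, `update_word_sets`, and `generate_qa_pairs` are hypothetical, relations are assumed to be (head, label, tail) triples, VD is assumed to map labelled words to their tags, VS is assumed to hold relation labels, and each sentence pattern is assumed to be a question template keyed by relation label.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Relation = Tuple[str, str, str]  # assumed shape: (head entity, relation label, tail entity)


@dataclass
class KnowledgeGraph:
    """Hypothetical container: entities are nodes, relations are labelled edges."""
    nodes: Set[str] = field(default_factory=set)
    edges: Set[Relation] = field(default_factory=set)

    def update(self, entities, relations: List[Relation]) -> None:
        # Represent the extracted entities of interest as nodes.
        self.nodes.update(entities)
        # Represent each extracted relation as a labelled edge between two nodes.
        for head, label, tail in relations:
            self.nodes.update((head, tail))
            self.edges.add((head, label, tail))


def update_word_sets(vd: Dict[str, str], vs: Set[str],
                     labelled_words: Dict[str, str],
                     relations: List[Relation]) -> None:
    """Update VD with labelled words and VS with relation labels (assumed layout)."""
    vd.update(labelled_words)
    vs.update(label for _, label, _ in relations)


def generate_qa_pairs(vd: Dict[str, str], vs: Set[str],
                      patterns: Dict[str, str],
                      relations: List[Relation]) -> List[Tuple[str, str]]:
    """Fill a question pattern with the head entity; the tail entity is the answer."""
    pairs = []
    for head, label, tail in relations:
        pattern = patterns.get(label)
        # Only use relations whose label is in VS and whose words are in VD.
        if pattern and label in vs and head in vd and tail in vd:
            pairs.append((pattern.format(head=head), tail))
    return pairs


# Toy data standing in for the output of the speech parsing module.
labelled_words = {"Gold Plan": "PRODUCT", "30 dollars": "PRICE"}
relations = [("Gold Plan", "monthly_fee", "30 dollars")]
patterns = {"monthly_fee": "What is the monthly fee of {head}?"}

vd: Dict[str, str] = {}
vs: Set[str] = set()
update_word_sets(vd, vs, labelled_words, relations)
qa_pairs = generate_qa_pairs(vd, vs, patterns, relations)

graph = KnowledgeGraph()
graph.update(labelled_words, relations)
```

With the toy data above, `qa_pairs` holds one question and answer pair built from the `monthly_fee` pattern, and `graph` holds two nodes joined by one labelled edge. The diarization, transcription, and speech parsing modules are left out because they depend on acoustic and language models that the claim does not detail.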