US 12,282,738 B2
Natural language processing to extract skills characterization
Lei Xu, Shanghai (CN); and Deng Feng Wan, Shanghai (CN)
Assigned to SAP SE, Walldorf (DE)
Filed by SAP SE, Walldorf (DE)
Filed on May 3, 2022, as Appl. No. 17/735,525.
Prior Publication US 2023/0359820 A1, Nov. 9, 2023
Int. Cl. G06F 40/00 (2020.01); G06F 40/166 (2020.01); G06F 40/279 (2020.01); G06Q 10/0631 (2023.01)
CPC G06F 40/279 (2020.01) [G06F 40/166 (2020.01); G06Q 10/063112 (2013.01)]
20 Claims
 
1. A system for characterizing natural language text units, the system comprising:
at least one processor programmed to perform operations comprising:
using a plurality of text units to train a bidirectional model to generate context vectors, the plurality of text units indicating job descriptions;
accessing a set of annotated text units from a corpus of text units describing job descriptions, a first annotated text unit of the set of annotated text units comprising:
a first span comprising a first set of ordered words from the first annotated text unit; and
first annotation data describing a job skill associated with the first span;
applying the bidirectional model to the set of annotated text units to generate a plurality of span context vectors;
using the plurality of span context vectors generated with the bidirectional model to train a span prediction model, the span prediction model comprising a first probability function configured to provide a probability that a span prediction model input indicates a first job skill and a second probability function to provide a probability that the span prediction model input indicates a second job skill; and
applying the span prediction model to at least a portion of the plurality of text units to generate a plurality of span characterizations, a first span characterization corresponding to a first span indicating that the first span describes the first job skill.
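
The data flow recited in claim 1 can be illustrated with a minimal Python sketch. This is not the patented implementation. Every name in it (`train_bidirectional_model`, `context_vectors`, `span_context_vector`, `SpanPredictionModel`, `characterize`), the toy job-description text, and the two skill labels are hypothetical. The "bidirectional model" here is assumed to be a simple decayed bag-of-words swept left to right and right to left, standing in for whatever trained bidirectional encoder an actual system would use, and the span prediction model is assumed to be one independent logistic function per job skill.

```python
import math

def train_bidirectional_model(text_units):
    """Stand-in for training: build a vocabulary index from the text units."""
    vocab = {}
    for unit in text_units:
        for word in unit.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def context_vectors(vocab, words):
    """Per-token vectors: a left-to-right half concatenated with a right-to-left half."""
    def sweep(seq):
        state, out = [0.0] * len(vocab), []
        for word in seq:
            state = [0.5 * s for s in state]  # older context decays by half per step
            if word in vocab:
                state[vocab[word]] += 1.0
            out.append(state)
        return out
    forward = sweep(words)
    backward = sweep(words[::-1])[::-1]
    return [f + b for f, b in zip(forward, backward)]

def span_context_vector(vocab, words, start, end):
    """Mean of the token context vectors over words[start:end]."""
    vecs = context_vectors(vocab, words)[start:end]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

class SpanPredictionModel:
    """One probability function (a logistic unit) per job skill."""
    def __init__(self, skills, dim):
        self.w = {s: [0.0] * dim for s in skills}
        self.b = {s: 0.0 for s in skills}

    def probabilities(self, vec):
        out = {}
        for s in self.w:
            z = sum(wi * xi for wi, xi in zip(self.w[s], vec)) + self.b[s]
            out[s] = 1.0 / (1.0 + math.exp(-z))
        return out

    def train(self, examples, epochs=200, lr=0.1):
        # Plain stochastic gradient descent on the per-skill logistic loss.
        for _ in range(epochs):
            for vec, skill in examples:
                probs = self.probabilities(vec)
                for s in self.w:
                    err = probs[s] - (1.0 if s == skill else 0.0)
                    self.w[s] = [wi - lr * err * xi for wi, xi in zip(self.w[s], vec)]
                    self.b[s] -= lr * err

# Hypothetical annotated text units: (text, (span start, span end), job skill).
annotated = [
    ("must write python code daily", (2, 4), "programming"),
    ("experience with python scripting required", (2, 4), "programming"),
    ("strong public speaking skills", (1, 3), "communication"),
    ("comfortable with public presentations", (2, 4), "communication"),
]
text_units = [text for text, _, _ in annotated] + ["daily python scripting tasks"]

vocab = train_bidirectional_model(text_units)
model = SpanPredictionModel(["programming", "communication"], dim=2 * len(vocab))
model.train([
    (span_context_vector(vocab, text.split(), a, b), skill)
    for text, (a, b), skill in annotated
])

def characterize(text, start, end):
    """Return (most probable skill, per-skill probabilities) for one span."""
    probs = model.probabilities(span_context_vector(vocab, text.lower().split(), start, end))
    return max(probs, key=probs.get), probs
```

Under these assumptions, a span characterization for an unannotated text unit could be obtained by enumerating candidate spans and calling, for example, `characterize("daily python scripting tasks", 1, 3)`. Keeping a separate probability function per skill mirrors the first and second probability functions recited in the claim, and it means the probabilities for different skills need not sum to one.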