US 12,299,022 B2
Language model-based data object extraction and visualization
Anirvan Mukherjee, Brooklyn, NY (US); Craig De Souza, Jersey City, NJ (US); Edgar Gomes de Araujo, São Paulo (BR); Johannes Beil, Copenhagen (DK); Jessica Winssinger, New York, NY (US); Michael Zullo, Scarsdale, NY (US); Rushad Heerjee, New York, NY (US); and Shubhankar Sachdev, New York, NY (US)
Assigned to Palantir Technologies Inc., Denver, CO (US)
Filed by Palantir Technologies Inc., Denver, CO (US)
Filed on Apr. 11, 2024, as Appl. No. 18/632,900.
Claims priority of provisional application 63/589,894, filed on Oct. 12, 2023.
Claims priority of provisional application 63/589,911, filed on Oct. 12, 2023.
Claims priority of provisional application 63/497,930, filed on Apr. 24, 2023.
Claims priority of provisional application 63/497,933, filed on Apr. 24, 2023.
Prior Publication US 2024/0354322 A1, Oct. 24, 2024
Int. Cl. G06F 16/00 (2019.01); G06F 16/334 (2025.01); G06F 18/2415 (2023.01); G06N 3/0895 (2023.01)

CPC G06F 16/3344 (2019.01) [G06F 18/2415 (2023.01); G06N 3/0895 (2023.01)]

20 Claims

1. A computerized method, performed by a computing system having one or more hardware computer processors and one or more computer-readable storage devices storing software instructions executable by the computing system, the computerized method comprising:

receiving text data from a data source;

generating a first prompt for a large language model (“LLM”), the first prompt comprising at least the text data;

transmitting the first prompt to the LLM;

receiving a first output from the LLM in response to the first prompt, the first output comprising at least a data triple extracted from the text data, the data triple including a first entity, a second entity, and a relationship between the first entity and the second entity;

generating a second prompt for the LLM, the second prompt comprising at least the data triple;

transmitting the second prompt to the LLM;

receiving a second output from the LLM, the second output comprising at least a classified triple, the classified triple including a first entity type that the first entity is classified to, a second entity type that the second entity is classified to, and a relationship type that the relationship between the first entity and the second entity is classified to;

executing, using the classified triple, a similarity search with reference to an ontology to determine that the classified triple at least partially matches one or more data object types defined in the ontology; and

in response to the determination, adding into a first database at least:

a first data object of a first data object type, the first data object representing the first entity, and

a second data object of a second data object type, the second data object representing the second entity.
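
For illustration only, the steps recited in claim 1 can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the patented implementation: the LLM is modeled as a hypothetical callable that takes a prompt string and returns a JSON string, the ontology is a hypothetical list of (entity type, relationship type, entity type) tuples, the "similarity search" is approximated with string similarity from the standard-library `difflib` module, and the "first database" is a plain list. The names `extract_triple`, `classify_triple`, `best_match`, `ingest`, the prompt wording, and the 0.6 threshold are all assumptions introduced here.

```python
import json
from dataclasses import asdict, dataclass
from difflib import SequenceMatcher
from typing import Callable, List, Optional, Tuple

LLM = Callable[[str], str]            # hypothetical: prompt in, JSON text out
OntologyEntry = Tuple[str, str, str]  # (first type, relationship type, second type)

# Hypothetical ontology, invented for this sketch.
ONTOLOGY: List[OntologyEntry] = [
    ("Person", "employed_by", "Organization"),
    ("Organization", "located_in", "Location"),
]

@dataclass(frozen=True)
class Triple:
    first: str
    relationship: str
    second: str

def extract_triple(llm: LLM, text: str) -> Triple:
    """First prompt: contains the text data; the output carries a data triple."""
    prompt = ("Extract one triple as JSON with keys "
              "first, relationship, second.\n\nText: " + text)
    data = json.loads(llm(prompt))
    return Triple(data["first"], data["relationship"], data["second"])

def classify_triple(llm: LLM, triple: Triple) -> Triple:
    """Second prompt: contains the triple; the output carries its types."""
    prompt = ("Classify each element as JSON with keys first_type, "
              "relationship_type, second_type.\n\nTriple: "
              + json.dumps(asdict(triple)))
    data = json.loads(llm(prompt))
    return Triple(data["first_type"], data["relationship_type"],
                  data["second_type"])

def best_match(classified: Triple, ontology: List[OntologyEntry],
               threshold: float = 0.6) -> Optional[OntologyEntry]:
    """Stand-in similarity search: mean string similarity over the three
    components, so a partial match can still clear the threshold."""
    def sim(a: str, b: str) -> float:
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    def score(entry: OntologyEntry) -> float:
        parts = (classified.first, classified.relationship, classified.second)
        return sum(sim(p, e) for p, e in zip(parts, entry)) / 3

    best = max(ontology, key=score)
    return best if score(best) >= threshold else None

def ingest(llm: LLM, text: str, ontology: List[OntologyEntry],
           database: List[dict]) -> bool:
    """Run both prompts, search the ontology, and add two data objects
    to the database only when a match is found."""
    triple = extract_triple(llm, text)
    classified = classify_triple(llm, triple)
    match = best_match(classified, ontology)
    if match is None:
        return False
    first_type, _, second_type = match
    database.append({"type": first_type, "name": triple.first})
    database.append({"type": second_type, "name": triple.second})
    return True
```

In this sketch the data object types written to the database come from the matched ontology entry rather than from the raw LLM labels, so a near-miss label such as "organisation" would still be stored under the ontology's own type name. A production system would more likely use embedding-based similarity and a real data store; both are replaced here with standard-library stand-ins to keep the example self-contained.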