US 11,989,507 B2
Computer implemented methods for the automated analysis or use of data, including use of a large language model
William Tunstall-Pedoe, Cambridgeshire (GB); Robert Heywood, Cambridgeshire (GB); Seth Warren, Cambridgeshire (GB); Paul Benn, Cambridgeshire (GB); Duncan Reynolds, Cambridgeshire (GB); Ayush Shah, Cambridgeshire (GB); Luci Krnic, Cambridgeshire (GB); and Ziyi Zhu, Cambridgeshire (GB)
Assigned to UNLIKELY ARTIFICIAL INTELLIGENCE LIMITED, Cambridgeshire (GB)
Filed by UNLIKELY ARTIFICIAL INTELLIGENCE LIMITED, Cambridgeshire (GB)
Filed on Apr. 17, 2023, as Appl. No. 18/301,615.
Application 18/301,615 is a continuation of application No. PCT/GB2023/050405, filed on Feb. 22, 2023.
Application PCT/GB2023/050405 is a continuation in part of application No. 18/001,368, previously published as PCT/GB2021/052196, filed on Aug. 24, 2021.
Claims priority of application No. 2202347 (GB), filed on Feb. 22, 2022; application No. 2219268 (GB), filed on Dec. 20, 2022; application No. 2300624 (GB), filed on Jan. 16, 2023; and application No. 2302085 (GB), filed on Feb. 14, 2023.
Prior Publication US 2023/0274086 A1, Aug. 31, 2023
Int. Cl. G06F 40/20 (2020.01); G06F 16/33 (2019.01); G06F 40/56 (2020.01)
CPC G06F 40/20 (2020.01) [G06F 16/3344 (2019.01); G06F 40/56 (2020.01)]
30 Claims
 
1. A method for ensuring that a large language model (LLM) generates original text, comprising the steps of:
(i) providing a first database of previous text that the LLM should not generate;
(ii) performing a beam search;
(iii) checking potential continuations generated by the LLM against the first database;
(iv) when a potential continuation generated by the LLM matches non-original text in the first database, adjusting the potential continuation generated by the LLM to no longer match non-original text in the first database;
the method further including a method for adding citations to text generated by the LLM comprising the steps of:
(v) providing a second database of text used to train the LLM which includes sources associated with each section of text stored;
(vi) checking sections of the potential continuation generated by the LLM against the second database; and
(vii) retrieving respective sources where respective sections of the potential continuation generated by the LLM match respective text contained within the second database.
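
The steps of claim 1 can be illustrated with a minimal sketch. This is not the patented implementation; it is one possible reading under stated assumptions. The names `toy_model`, `original_beam_search`, `find_citations`, `BLOCKED` and `SOURCE_INDEX` are hypothetical. The "LLM" is a hand-written lookup table, both databases are indexed by token n-grams, and the "adjusting" of step (iv) is approximated by discarding a matching continuation so that the beam search falls back to the next-best alternative.

```python
from typing import Callable, Dict, List, Set, Tuple

Tokens = Tuple[str, ...]
# Hypothetical stand-in for an LLM: maps a token prefix to (next_token, log_prob) pairs.
NextTokenFn = Callable[[Tokens], List[Tuple[str, float]]]


def matches_database(tokens: Tokens, blocked: Set[Tokens], n: int) -> bool:
    """Step (iii): True if the last n tokens form an n-gram in the first database."""
    return len(tokens) >= n and tuple(tokens[-n:]) in blocked


def original_beam_search(next_tokens: NextTokenFn, blocked: Set[Tokens],
                         n: int, beam_width: int, max_len: int) -> List[str]:
    """Steps (ii)-(iv): beam search that never keeps a blocked n-gram."""
    beams: List[Tuple[Tokens, float]] = [((), 0.0)]
    for _ in range(max_len):
        candidates: List[Tuple[Tokens, float]] = []
        for tokens, score in beams:
            for tok, logp in next_tokens(tokens):
                extended = tokens + (tok,)
                if matches_database(extended, blocked, n):
                    # Assumed form of "adjusting": drop the matching
                    # continuation so another candidate takes its place.
                    continue
                candidates.append((extended, score + logp))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return list(beams[0][0])


def find_citations(tokens: List[str], source_index: Dict[Tokens, str],
                   n: int) -> List[str]:
    """Steps (vi)-(vii): look up each n-gram section and collect its source."""
    found: List[str] = []
    for i in range(len(tokens) - n + 1):
        source = source_index.get(tuple(tokens[i:i + n]))
        if source is not None and source not in found:
            found.append(source)
    return found


def toy_model(prefix: Tokens) -> List[Tuple[str, float]]:
    """Hypothetical deterministic 'LLM' used only for this illustration."""
    table = {
        (): [("the", -0.25)],
        ("the",): [("quick", -0.25), ("slow", -1.0)],
        ("the", "quick"): [("brown", -0.25), ("red", -1.0)],
        ("the", "slow"): [("green", -0.5)],
    }
    return table.get(prefix, [])


# Step (i): first database of text the model should not reproduce (3-grams).
BLOCKED = {("the", "quick", "brown")}
# Step (v): second database mapping text sections (2-grams) to made-up sources.
SOURCE_INDEX = {
    ("the", "quick"): "Hypothetical Source B",
    ("quick", "red"): "Hypothetical Source A",
}

generated = original_beam_search(toy_model, BLOCKED, n=3, beam_width=2, max_len=3)
citations = find_citations(generated, SOURCE_INDEX, n=2)
print(generated)   # ['the', 'quick', 'red']
print(citations)   # ['Hypothetical Source B', 'Hypothetical Source A']
```

In this toy run the highest-probability continuation "the quick brown" matches the first database, so it is discarded and the search settles on "the quick red". A real system would need a scalable matching structure and a policy for how close a match must be; neither is specified by this sketch.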