US 12,353,451 B2
Method and system for protecting and removing private information used in large language models
Vijay Madisetti, Alpharetta, GA (US); and Arshdeep Bahga, Chandigarh (IN)
Assigned to Vijay Madisetti, Alpharetta, GA (US)
Filed by Vijay Madisetti, Alpharetta, GA (US)
Filed on Dec. 6, 2024, as Appl. No. 18/971,687.
Application 18/971,687 is a continuation of application No. 18/744,199, filed on Jun. 14, 2024.
Application 18/744,199 is a continuation in part of application No. 18/406,906, filed on Jan. 8, 2024, granted, now 12,158,904, issued on Dec. 3, 2024.
Application 18/406,906 is a continuation in part of application No. 18/470,487, filed on Sep. 20, 2023, granted, now 12,147,461, issued on Nov. 19, 2024.
Application 18/470,487 is a continuation of application No. 18/348,692, filed on Jul. 7, 2023, granted, now 12,001,462, issued on Jun. 4, 2024.
Claims priority of provisional application 63/551,548, filed on Feb. 9, 2024.
Claims priority of provisional application 63/604,909, filed on Dec. 1, 2023.
Claims priority of provisional application 63/604,910, filed on Dec. 1, 2023.
Claims priority of provisional application 63/602,675, filed on Nov. 27, 2023.
Claims priority of provisional application 63/469,571, filed on May 30, 2023.
Claims priority of provisional application 63/463,913, filed on May 4, 2023.
Prior Publication US 2025/0103628 A1, Mar. 27, 2025
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/3329 (2025.01); G06F 40/284 (2020.01)
CPC G06F 16/3329 (2019.01) [G06F 40/284 (2020.01)]
30 Claims
 
1. A method for removing unauthorized information associations from a large language model (LLM) that is pre-trained on training data comprising unauthorized data (UD), the method comprising:
receiving at a synthetic generator module a list of one or more UD instance-UD association pairs between real UD instances and UD associations identified for the real UD instances in the training data;
generating by the synthetic generator module one or more synthetic UD instance-UD association pairs comprising a synthetic UD instance-UD association pair from each real UD instance-UD association pair of the one or more real UD instance-UD association pairs, the synthetic UD instance-UD association pair being configured to one of reduce or remove influence of the real UD instance-UD association pair from which the synthetic UD instance-UD association pair was generated on an output of the LLM; and
generating by a fine tuner module a fine-tuned LLM by iteratively fine-tuning the LLM based upon the one or more synthetic UD instance-UD association pairs.
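
The three steps of claim 1 (receive real pairs, generate one synthetic pair per real pair, fine-tune iteratively on the synthetic pairs) can be illustrated with a minimal Python sketch. This is not the patented implementation. A small score table stands in for the LLM, and every name here (Pair, generate_synthetic_pairs, ToyAssociationModel, fine_tune, the placeholder pool, lr, max_rounds) is hypothetical. The claim does not state how a synthetic pair is formed or which training objective is used, so the sketch assumes the synthetic pair keeps the instance and swaps in a placeholder association, and that each update decays existing scores while reinforcing the synthetic one.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    instance: str     # UD instance, e.g. a person's name
    association: str  # UD association, e.g. an address linked to that name


def generate_synthetic_pairs(real_pairs, placeholder_pool, seed=0):
    """Hypothetical synthetic generator: one synthetic pair per real pair.

    Assumption: the instance is kept and the association is replaced by a
    placeholder that differs from the real association.
    """
    rng = random.Random(seed)
    synthetic = []
    for pair in real_pairs:
        candidates = [a for a in placeholder_pool if a != pair.association]
        synthetic.append(Pair(pair.instance, rng.choice(candidates)))
    return synthetic


class ToyAssociationModel:
    """Toy stand-in for an LLM: per-instance scores over associations."""

    def __init__(self, scores):
        self.scores = {k: dict(v) for k, v in scores.items()}

    def predict(self, instance):
        table = self.scores[instance]
        return max(table, key=table.get)

    def update(self, pair, lr):
        # Decay every existing score, then reinforce the synthetic association.
        table = self.scores.setdefault(pair.instance, {})
        for assoc in table:
            table[assoc] *= (1 - lr)
        table[pair.association] = table.get(pair.association, 0.0) + lr


def fine_tune(model, synthetic_pairs, lr=0.5, max_rounds=20):
    """Hypothetical fine tuner: repeat updates until the model outputs the
    synthetic association for every instance, or max_rounds is reached."""
    for round_no in range(1, max_rounds + 1):
        for pair in synthetic_pairs:
            model.update(pair, lr)
        if all(model.predict(p.instance) == p.association
               for p in synthetic_pairs):
            return round_no
    return max_rounds


# Fabricated example data, for illustration only.
real = [Pair("Alice Example", "12 Real Street")]
model = ToyAssociationModel({"Alice Example": {"12 Real Street": 5.0}})
synthetic = generate_synthetic_pairs(real, ["[REDACTED ADDRESS]"])
rounds = fine_tune(model, synthetic)
print(rounds, model.predict("Alice Example"))
```

In this toy setting the real association's score shrinks each round while the synthetic one grows, so the model's output for the instance eventually switches away from the real association. With an actual LLM the update step would instead be a gradient-based fine-tuning pass over text built from the synthetic pairs, which the sketch does not attempt to model.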