US 12,008,341 B2
Systems and methods for generating natural language using language models trained on computer code
Mark Chen, Cupertino, CA (US); Jaroslaw Tworek, San Francisco, CA (US); Ilya Sutskever, San Francisco, CA (US); Wojciech Zaremba, San Francisco, CA (US); Hee Woo Jun, San Francisco, CA (US); and Henrique Ponde De Oliveira Pinto, San Francisco, CA (US)
Assigned to OpenAI Opco, LLC, San Francisco, CA (US)
Filed by OpenAI Opco, LLC, San Francisco, CA (US)
Filed on May 23, 2023, as Appl. No. 18/321,921.
Application 18/321,921 is a continuation of application No. 18/321,852, filed on May 23, 2023.
Claims priority of provisional application 63/389,326, filed on Jul. 14, 2022.
Prior Publication US 2024/0020116 A1, Jan. 18, 2024
Int. Cl. G06F 8/30 (2018.01); G06F 8/33 (2018.01); G06F 8/73 (2018.01)
CPC G06F 8/30 (2013.01) [G06F 8/33 (2013.01); G06F 8/73 (2013.01)]
19 Claims
 
1. A computer-implemented method, comprising:
training a machine learning model to generate natural language docstrings from computer code;
receiving one or more computer code samples at the trained machine learning model;
generating, via the trained machine learning model and based on the received one or more computer code samples, one or more candidate natural language docstrings representing natural language text, each of the one or more candidate natural language docstrings being associated with at least a portion of the one or more computer code samples;
identifying at least one of the one or more candidate natural language docstrings that provides an intent of the at least a portion of the one or more computer code samples;
outputting from the trained machine learning model the at least one identified natural language docstring with the at least a portion of the one or more computer code samples; and
receiving, at the machine learning model, a selection of the one or more computer code samples, wherein the machine learning model provides an automatic description of the selection and generates a template for building an additional machine learning model.
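The generate, identify, and output steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: `stub_generate` is a hypothetical stand-in for the trained machine learning model, `score_intent` is a toy keyword-overlap heuristic assumed here in place of whatever intent criterion the model applies, and all function names and the sample code are invented for illustration. The training step and the final template-generation limitation are not modeled.

```python
import ast
from typing import Callable, List, Tuple


def _first_function(code: str) -> ast.FunctionDef:
    """Return the first function definition found in the code sample."""
    tree = ast.parse(code)
    return next(n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef))


def stub_generate(code: str, n: int = 3) -> List[str]:
    """Hypothetical stand-in for sampling candidate docstrings from a trained model."""
    func = _first_function(code)
    args = ", ".join(a.arg for a in func.args.args)
    readable = func.name.replace("_", " ")
    candidates = [
        "Does something.",
        f"Function {func.name}.",
        f"Compute the {readable} of {args}.",
    ]
    return candidates[:n]


def score_intent(code: str, docstring: str) -> int:
    """Toy heuristic (an assumption): count words from the function name and
    its argument names that also appear in the candidate docstring."""
    func = _first_function(code)
    vocab = set(func.name.split("_")) | {a.arg for a in func.args.args}
    words = {w.strip(".,").lower() for w in docstring.split()}
    return len(vocab & words)


def describe(
    code: str,
    generate: Callable[[str], List[str]] = stub_generate,
    score: Callable[[str, str], int] = score_intent,
) -> Tuple[str, str]:
    """Generate candidates, identify the best-scoring one, and output it with the code."""
    candidates = generate(code)
    best = max(candidates, key=lambda d: score(code, d))
    return best, code


SAMPLE = "def running_mean(values, window):\n    return sum(values[-window:]) / window\n"
docstring, code = describe(SAMPLE)
print(docstring)
```

Passing `generate` and `score` as parameters keeps the selection logic separate from the model, so a real model call or a learned ranking function could be substituted without changing `describe`.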