US 12,461,723 B2
Learned graph optimizations for compilers
Yanqi Zhou, Sunnyvale, CA (US); Sudip Roy, Sunnyvale, CA (US); Amirali Abdolrashidi, Riverside, CA (US); Daniel Lin-Kit Wong, Pittsburgh, PA (US); Chao Ma, Mountain View, CA (US); Qiumin Xu, Santa Clara, CA (US); Hanxiao Liu, Santa Clara, CA (US); Phitchaya Mangpo Phothilimthana, Mountain View, CA (US); Shen Wang, Sunnyvale, CA (US); Anna Darling Goldie, San Francisco, CA (US); Azalia Mirhoseini, Mountain View, CA (US); and James Laudon, Madison, WI (US)
Assigned to Google LLC, Mountain View, CA (US)
Appl. No. 17/921,933
Filed by Google LLC, Mountain View, CA (US)
PCT Filed Jun. 7, 2021, PCT No. PCT/US2021/036250 § 371(c)(1), (2) Date Oct. 27, 2022, PCT Pub. No. WO2021/248138, PCT Pub. Date Dec. 9, 2021.
Claims priority of provisional application 63/035,640, filed on Jun. 5, 2020.
Prior Publication US 2023/0176840 A1, Jun. 8, 2023
Int. Cl. G06F 9/44 (2018.01); G06F 8/41 (2018.01); G06F 9/445 (2018.01); G06F 9/455 (2018.01); G06N 3/044 (2023.01)

CPC G06F 8/443 (2013.01) [G06F 8/451 (2013.01); G06N 3/044 (2023.01)]

20 Claims

1. A computer-implemented method performed by a system comprising one or more computers and configured to execute a compiler optimization network, the method comprising:

receiving an input program, wherein the input program defines a graph of operation modules, wherein each node in the graph is a respective operation module, and each edge between nodes in the graph represents one operation module receiving the output generated by another operation module;

providing a representation of the input program to the compiler optimization network comprising:

a graph-embedding network executable by the system that is configured to encode operation features and operation dependencies of the operation modules of the input program into a graph embedding representation, and

a policy network executable by the system that is configured to generate an optimization action for each of one or more nodes encoded in the graph embedding representation, wherein the policy network employs segmented recurrent attention layers to concurrently generate optimization actions for multiple different tasks, wherein each layer of the segmented recurrent attention layers is dedicated to generating a respective optimization action for a different task of the multiple different tasks;

obtaining, from the compiler optimization network, an output optimization plan comprising one or more optimization actions for each of the multiple different tasks of the input program; and

executing the input program using the output optimization plan.
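
As an informal illustration of the data flow recited in claim 1, the following minimal Python sketch walks through the steps: an input graph of operation modules, a graph embedding, one action per node for each task, and a resulting optimization plan. It is not the claimed implementation. The names `OpModule`, `embed_graph`, `policy`, and `TASKS`, the task names "placement" and "fusion", and the feature values are all hypothetical. Simple neighbor averaging stands in for the graph-embedding network, and a fixed scoring rule stands in for the policy network with segmented recurrent attention layers; no learned model is involved.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class OpModule:
    """One node of the input program graph (hypothetical structure)."""
    name: str
    features: List[float]
    # Names of modules whose output this module receives (the graph edges).
    inputs: List[str] = field(default_factory=list)


def embed_graph(ops: List[OpModule], rounds: int = 2) -> Dict[str, List[float]]:
    """Toy stand-in for a graph-embedding network.

    Each round mixes a node's vector with the mean of its producers' vectors,
    so the result reflects both operation features and dependencies.
    """
    emb = {op.name: list(op.features) for op in ops}
    for _ in range(rounds):
        updated = {}
        for op in ops:
            # Nodes without producers fall back to their own vector.
            neighbors = [emb[n] for n in op.inputs] or [emb[op.name]]
            mean = [sum(col) / len(neighbors) for col in zip(*neighbors)]
            updated[op.name] = [0.5 * a + 0.5 * b for a, b in zip(emb[op.name], mean)]
        emb = updated
    return emb


# Hypothetical tasks, each with a number of discrete actions to choose from.
TASKS = {"placement": 2, "fusion": 2}


def policy(emb: Dict[str, List[float]], tasks: Dict[str, int]) -> Dict[str, Dict[str, int]]:
    """Toy stand-in for a policy network: one 'head' per task.

    A fixed, task-dependent weighting of the embedding picks an action for
    every node. A real system would use learned parameters instead.
    """
    plan: Dict[str, Dict[str, int]] = {task: {} for task in tasks}
    for task_idx, (task, n_actions) in enumerate(tasks.items()):
        for name, vec in emb.items():
            score = sum((i + 1 + task_idx) * v for i, v in enumerate(vec))
            plan[task][name] = int(score * 10) % n_actions
    return plan


# Input program: matmul -> relu -> softmax (illustrative feature values).
ops = [
    OpModule("matmul", [1.0, 0.0]),
    OpModule("relu", [0.0, 1.0], inputs=["matmul"]),
    OpModule("softmax", [0.5, 0.5], inputs=["relu"]),
]

embedding = embed_graph(ops)
plan = policy(embedding, TASKS)  # one action per node for each task
print(plan)
```

In this sketch the output `plan` maps each task to a per-node action, which mirrors the "output optimization plan comprising one or more optimization actions for each of the multiple different tasks"; executing the program under that plan is outside the scope of the example.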