US 11,809,368 B2
Multi-threaded, self-scheduling processor
Tony M. Brewer, Plano, TX (US)
Assigned to Micron Technology, Inc., Boise, ID (US)
Filed by Micron Technology, Inc., Boise, ID (US)
Filed on Jul. 31, 2021, as Appl. No. 17/390,897.
Application 17/390,897 is a continuation of application No. 16/399,588, filed on Apr. 30, 2019, granted, now 11,119,972.
Claims priority of provisional application 62/667,666, filed on May 7, 2018.
Prior Publication US 2021/0357356 A1, Nov. 18, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 15/82 (2006.01); G06F 9/30 (2018.01); G06F 13/40 (2006.01); G06F 12/0875 (2016.01); G06F 9/48 (2006.01); G06F 9/50 (2006.01)
CPC G06F 15/82 (2013.01) [G06F 9/30101 (2013.01); G06F 9/4881 (2013.01); G06F 9/5016 (2013.01); G06F 12/0875 (2013.01); G06F 13/4027 (2013.01); G06F 2212/452 (2013.01)]
20 Claims
 
1. A system, comprising:
an interconnection network;
a first, main memory circuit coupled to the interconnection network, the main memory circuit configured to store operand data; and
a first processor circuit coupled to the interconnection network, the first processor circuit comprising:
a processor core configured to execute a plurality of instructions; and
a core control circuit coupled to the processor core, the core control circuit comprising:
an interconnection network interface coupled to the interconnection network to receive a work descriptor data packet, the interconnection network interface configured to decode the received work descriptor data packet into a received program count and at least one received argument for a corresponding execution thread;
a second, thread control memory circuit comprising a thread identifier pool register configured to store a plurality of thread identifiers, a program count register configured to store the received program count for the corresponding execution thread, and a data cache or a general-purpose register configured to store the at least one received argument for the corresponding execution thread;
an execution queue coupled to the second, thread control memory circuit, the execution queue configured to store one or more thread identifiers of the plurality of thread identifiers; and
a control logic and thread selection circuit coupled to the execution queue, the control logic and thread selection circuit configured, in response to receiving the work descriptor data packet, to automatically schedule execution of the corresponding execution thread by assigning a thread identifier of the plurality of thread identifiers to the corresponding execution thread of a plurality of execution threads, placing the thread identifier in the execution queue, and periodically selecting the thread identifier of the one or more thread identifiers in the execution queue for execution by the processor core of an instruction of the corresponding execution thread, of the plurality of instructions, the processor core configured to automatically commence execution of an instruction corresponding to the received program count using the at least one received argument without accessing the first, main memory circuit to obtain operand data.
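The scheduling flow recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the names WorkDescriptor, CoreControl, receive, step, PROGRAM, add, halt and run_demo are hypothetical, the "instructions" are plain Python callables, and thread selection is modeled as simple round-robin, which is one possible reading of "periodically selecting". The model shows a decoded work descriptor being assigned a thread identifier from a pool, its program count and arguments being stored in per-thread state, and the thread identifier being placed in an execution queue from which the core picks one instruction at a time, using only the register-held arguments.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WorkDescriptor:
    """Hypothetical decoded form of a work descriptor data packet."""
    program_count: int
    args: tuple


class CoreControl:
    """Toy model of the core control circuit; every name here is illustrative."""

    def __init__(self, num_threads):
        self.tid_pool = deque(range(num_threads))  # thread identifier pool
        self.pc = {}                               # program count per thread ID
        self.regs = {}                             # register-held arguments per thread ID
        self.exec_queue = deque()                  # thread IDs ready to execute

    def receive(self, packet):
        """Assign a thread ID, store the program count and arguments, enqueue the ID."""
        if not self.tid_pool:
            return None  # assumption: no free thread ID means the packet is not accepted
        tid = self.tid_pool.popleft()
        self.pc[tid] = packet.program_count
        self.regs[tid] = list(packet.args)
        self.exec_queue.append(tid)
        return tid

    def step(self, program):
        """Select the next queued thread ID and execute one of its instructions."""
        if not self.exec_queue:
            return None
        tid = self.exec_queue.popleft()
        instruction = program[self.pc[tid]]
        finished = instruction(self.regs[tid])  # uses register-held arguments only
        if finished:
            # Thread is done: hand back its registers and free the thread ID.
            del self.pc[tid]
            final_regs = self.regs.pop(tid)
            self.tid_pool.append(tid)
            return tid, final_regs
        self.pc[tid] += 1
        self.exec_queue.append(tid)  # round-robin: go to the back of the queue
        return tid, None


def add(regs):
    """Hypothetical instruction: regs[0] = regs[0] + regs[1]."""
    regs[0] = regs[0] + regs[1]
    return False


def halt(regs):
    """Hypothetical instruction: terminate the thread."""
    return True


PROGRAM = [add, halt]


def run_demo():
    core = CoreControl(num_threads=2)
    core.receive(WorkDescriptor(program_count=0, args=(1, 2)))
    core.receive(WorkDescriptor(program_count=0, args=(10, 20)))
    results = {}
    while core.exec_queue:
        tid, final_regs = core.step(PROGRAM)
        if final_regs is not None:
            results[tid] = final_regs[0]
    return results
```

In this model, calling run_demo() interleaves the two threads one instruction at a time and returns each thread's sum keyed by its thread identifier. Freed identifiers go back into the pool so that later work descriptors can reuse them. Refusing a packet when the pool is empty is an assumption of the sketch; the claim does not say how that case is handled.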