US 12,450,682 B2
Graphics processing unit (GPU)-based logic rewriting acceleration method
Yajun Ha, Shanghai (CN); and Lin Li, Shanghai (CN)
Assigned to SHANGHAITECH UNIVERSITY, Shanghai (CN)
Filed by SHANGHAITECH UNIVERSITY, Shanghai (CN)
Filed on Dec. 13, 2023, as Appl. No. 18/537,836.
Application 18/537,836 is a continuation of application No. PCT/CN2023/083573, filed on Mar. 24, 2023.
Claims priority of application No. 202310179690.0 (CN), filed on Feb. 27, 2023.
Prior Publication US 2024/0289914 A1, Aug. 29, 2024
Int. Cl. G06T 1/20 (2006.01); G06F 9/38 (2018.01)

CPC G06T 1/20 (2013.01) [G06F 9/3851 (2013.01)]

5 Claims

1. A graphics processing unit (GPU)-based logic rewriting acceleration method, comprising parallelizing sub-procedures of And-Inverter Graph (AIG)-based logic rewriting, wherein the parallelizing sub-procedures of AIG-based logic rewriting comprises the following steps:

on a central processing unit (CPU), asynchronously selecting, by a scheduler, a group of same-level nodes from an input AIG-based logic network; then copying the nodes to a memory of a GPU, such that the GPU sequentially enables a kernel to rewrite the nodes in parallel, and selecting, by the CPU, another group of nodes for the GPU to rewrite, which completely overlaps scheduling and rewriting procedures; and on the GPU, completing parallel node rewriting through cut enumeration, maximum fan-out-free cone (MFFC) computation, evaluation, and replacement, wherein

the cut enumeration adopts a parallel cut enumeration algorithm designed based on a subgraph level, to simultaneously process the same-level nodes from a low level to a high level; nodes in a subgraph used for replacement in a previous scheduling cycle are added to different LAN[slevel] lists based on a level of the subgraph, wherein slevel represents a local level of a node in the subgraph used for replacement, cuts of nodes in a same LAN[slevel] list are computed in parallel, and then Boolean functions corresponding to the cuts are computed; and

the MFFC computation adopts a top-down computation mode to parallelize a recursive MFFC computation algorithm, wherein starting from a cut list on which the MFFC computation requires to be performed, each MFFC computation of a cut is allocated to a thread block, threads in the same thread block collaborate to compute an MFFC of the cut, and after the thread block obtains the corresponding cut, following steps are performed:

step 1: adding a root node of a current cut to a corresponding MFFC set of the root node, wherein the MFFC is a set that stores all last MFFC nodes;

step 2: adding left and right children of the root node to a buffering region F1; extracting, by each thread, a node n from the buffering region F1; if all fan-outs of the node n are in a current MFFC set, evaluating a function IS_MFFC_NODE(n) as true; and in this case, adding all the fan-outs of the node n to a buffering region F2, and adding the node n to the MFFC set;

step 3: repeating the step 2 until all nodes in the buffering region F1 are processed, and then swapping roles of the F1 and the F2; and

step 4: processing one leaf node of the current cut by using a method that is the same as a method described in the steps 2 to 3, until all leaf nodes of the current cut are processed;

in a replacement stage, a lockless replacement algorithm is used to avoid lots of data contention in frequently modifying a logic graph, wherein to resolve a conflict that different threads with overlapping MFFCs delete a same node, a node scheduler is used to group nodes with non-overlapping MFFCs; and

to resolve a conflict that MFFCs possessed by different threads do not overlap, but one thread wants to share a node to be deleted by another thread, a replacement process is divided into two stages: in a first stage, each block of the GPU deletes an MFFC and constructs a new subgraph; and in a second stage, each block of the GPU combines all equivalent nodes into a hyper-node that supports concurrent deletion/creation operations for one node without introducing a lock between different threads.
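
For illustration only, the frontier-based MFFC computation recited in steps 1 to 3 may be sketched as follows. This is a minimal sequential sketch under stated assumptions, not the claimed GPU implementation. A plain Python loop stands in for the threads of one thread block. The names compute_mffc, build_fanouts, fanins, and fanouts, and the small example network, are hypothetical and do not appear in the claim. Two further assumptions are made: the next buffering region F2 receives the children (fan-ins) of an accepted node, because the traversal runs top-down from the root, whereas the claim text recites fan-outs; and the leaves of the cut are treated as a boundary, so the leaf handling of step 4 is not modeled.

```python
from typing import Dict, Iterable, List, Set, Tuple


def build_fanouts(fanins: Dict[str, Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Derive the fan-out sets from a fan-in table (AND node -> its two children)."""
    fanouts: Dict[str, Set[str]] = {}
    for node, children in fanins.items():
        for child in children:
            fanouts.setdefault(child, set()).add(node)
    return fanouts


def compute_mffc(root: str,
                 leaves: Iterable[str],
                 fanins: Dict[str, Tuple[str, str]],
                 fanouts: Dict[str, Set[str]]) -> Set[str]:
    """Sequential emulation of the top-down, frontier-based MFFC computation."""
    leaf_set = set(leaves)
    mffc: Set[str] = {root}                  # step 1: the root joins its MFFC set
    f1: List[str] = list(fanins.get(root, ()))  # step 2: children of the root go to F1
    while f1:
        f2: List[str] = []
        for n in f1:                         # on a GPU, each thread would take one node n
            if n in mffc or n in leaf_set:
                continue                     # already accepted, or cut boundary (assumption)
            # IS_MFFC_NODE(n): every fan-out of n already lies in the current MFFC set
            if all(fo in mffc for fo in fanouts.get(n, ())):
                mffc.add(n)
                # Assumption: enqueue the children of n (the claim text recites fan-outs)
                f2.extend(fanins.get(n, ()))
        f1 = f2                              # step 3: swap the roles of F1 and F2
    return mffc


# Hypothetical example network: a, b, c, d are primary inputs.
example_fanins = {
    "n1": ("a", "b"),
    "n2": ("b", "c"),
    "n3": ("n1", "n2"),
    "root": ("n3", "n1"),
    "x": ("n2", "d"),   # node outside the cone that also uses n2
}
example_fanouts = build_fanouts(example_fanins)
example_mffc = compute_mffc("root", ["a", "b", "c"], example_fanins, example_fanouts)
```

In this example, n2 is kept out of the MFFC because its fan-out x lies outside the cone of the root, while n1 and n3 are accepted. A node rejected in one round is enqueued again whenever another of its fan-outs is accepted, so the order in which the nodes of a buffering region are processed does not change the result of the sketch.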