Google researchers use deep reinforcement learning for optimizing chip design

Efficient chip design, and floorplanning in particular, is a linchpin for increasing the computational power of today's systems. However, the process takes substantial time, and efforts are under way to make it more efficient. With this in mind, researchers at Google have turned to machine learning to help tackle the problem.

In a recent paper titled “Chip Placement with Deep Reinforcement Learning” published on arXiv, the team at Google poses chip placement as a reinforcement learning (RL) problem. The trained model then places chip blocks, each of which is an individual module such as a memory subsystem, compute unit, or control logic system, onto a chip canvas.

Determining the layout of a chip block, a process called chip floorplanning, is one of the most complex and time-consuming stages of the chip design process. It involves placing the netlist onto a chip canvas (a 2D grid) such that power, performance, and area (PPA) are optimized, while adhering to constraints on density and routing congestion. Despite decades of research on this topic, human experts still need to iterate for weeks to produce solutions that meet multi-faceted design criteria.

The input to the deep reinforcement learning model is the chip netlist, the ID of the current node to be placed, and some netlist metadata. The netlist graph and the current node are passed through an edge-based graph neural network to generate embeddings of the partially placed graph and the candidate node.
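To make the idea of an edge-based graph network more concrete, here is a minimal sketch in Python. It is not the researchers' implementation: the article does not spell out the update rule, so the function name `edge_based_embeddings`, the fixed averaging used in place of learned layers, and the toy feature vectors are all assumptions chosen for illustration. Each edge embedding is computed from its two endpoint nodes, each node embedding is then refreshed from its incident edges, and a graph embedding is taken as the mean over edges.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def edge_based_embeddings(
    node_features: Dict[str, Vector],
    edges: List[Tuple[str, str, float]],
    rounds: int = 2,
) -> Tuple[Dict[str, Vector], Vector]:
    """Toy edge-based message passing (hypothetical, no learned weights).

    Each edge embedding is the weighted element-wise mean of its two
    endpoint embeddings. Each node embedding is then replaced by the mean
    of the embeddings of its incident edges. Returns the per-node
    embeddings and a graph embedding (the mean of all edge embeddings).
    """
    nodes = {name: list(vec) for name, vec in node_features.items()}
    dim = len(next(iter(nodes.values())))
    edge_embs: List[Vector] = []

    for _ in range(rounds):
        # Edge update: combine the two endpoints, scaled by the edge weight.
        edge_embs = [
            [w * 0.5 * (a + b) for a, b in zip(nodes[u], nodes[v])]
            for u, v, w in edges
        ]
        # Node update: average the embeddings of all incident edges.
        updated: Dict[str, Vector] = {}
        for name, current in nodes.items():
            incident = [
                emb for (u, v, _), emb in zip(edges, edge_embs) if name in (u, v)
            ]
            if incident:
                updated[name] = [sum(col) / len(incident) for col in zip(*incident)]
            else:
                updated[name] = current  # isolated node keeps its features
        nodes = updated

    if edge_embs:
        graph = [sum(col) / len(edge_embs) for col in zip(*edge_embs)]
    else:
        graph = [0.0] * dim
    return nodes, graph
```

Calling it with a small dictionary of block features and a list of `(block, block, weight)` connections returns one embedding per block plus one for the whole netlist. In a trained model the fixed averages would be replaced by learned layers.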

A feed-forward neural network then takes these embeddings as a concatenated input and outputs a learned representation that captures the useful features. A policy network uses this representation to generate a probability distribution over all possible grid cells onto which the current node could be placed. The entire process is illustrated in the GIF below. The chip on the left shows macro placement from scratch, while on the right, some initial placements are being fine-tuned.
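The policy step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's code: the function name `placement_distribution`, the single linear layer standing in for the feed-forward and policy networks, and the `occupied` mask that rules out already-filled cells are all hypothetical. The sketch concatenates the graph embedding, the current node's embedding, and the metadata, scores every grid cell, and applies a softmax so the scores become a probability distribution.

```python
import math
from typing import List, Sequence


def placement_distribution(
    graph_embedding: Sequence[float],
    node_embedding: Sequence[float],
    metadata: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
    occupied: Sequence[bool],
) -> List[float]:
    """Return a probability for each grid cell (hypothetical sketch).

    `weights` has one row per grid cell and one column per input feature.
    Cells flagged in `occupied` receive probability 0 (an assumed mask).
    """
    if all(occupied):
        raise ValueError("no free grid cell left for this node")

    # Concatenate the three inputs into one feature vector.
    features = list(graph_embedding) + list(node_embedding) + list(metadata)

    # One linear layer: a score (logit) per grid cell.
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]

    # Mask occupied cells so they cannot be chosen.
    masked = [
        float("-inf") if taken else logit
        for logit, taken in zip(logits, occupied)
    ]

    # Numerically stable softmax over the remaining cells.
    peak = max(masked)
    exps = [math.exp(value - peak) for value in masked]
    total = sum(exps)
    return [value / total for value in exps]
```

Sampling a cell from the returned list, or taking its highest-probability entry, gives the position for the current node. The next node is then scored against the updated, partially placed grid.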

With this setup, the researchers demonstrated an improvement in both efficiency and placement quality, stating that a process that would have taken human experts several weeks was completed in under six hours by their trained ML model.

“Our objective is to minimize PPA (power, performance, and area), and we show that, in under 6 hours, our method can generate placements that are superhuman or comparable on modern accelerator netlists, whereas existing baselines require human experts in the loop and take several weeks,” the researchers write in the paper.

Moving forward, the team believes that its model demonstrates a potent automatic chip placement method that could greatly accelerate chip design for any chip placement problem, which would also enable co-optimization with earlier stages of the chip design process.