OpenROAD – Key Milestones on the Road towards Good PPA

Overview

OpenROAD™ is the leading open-source digital place-and-route solution, providing easy access and removing the cost barriers of commercial VLSI design tools. OpenROAD’s vision and focus is to democratize semiconductor design automation from RTL-to-GDS.

The impact of OpenROAD’s innovation and outreach is visible in the rapid growth in OpenLane users through the #Efabless initiative. The #Efabless #ChipIgnite MPW program saw close to 150 successful ASIC and SoC tapeouts using the #SkyWater130 nm technology.

In 2021, OpenROAD focused mainly on PPA improvements. It has progressed rapidly from a fast, autonomous, no-human-in-the-loop (NHIL) flow towards silicon-worthy PPA. This fast-developing market segment addresses a clear niche in the design of low- to medium-complexity SoCs at technology nodes above 100 nm. A large community of software and hardware designers, design service providers, students and enthusiasts has embraced OpenROAD for a diverse range of applications and use models.

Other areas of progress include essential features for overall usability, timing optimization, extraction and flow control, which in turn significantly improve power, performance, and area (PPA).

Key PPA Enhancements

PPA improvements were made in the following design flow stages: logic synthesis, placement, timing optimization, and clock tree synthesis.

Overall, results show designs that are 30% faster and 20% denser, which is tantamount to a full technology node of improvement!

Timing and Routability based enhancements in Global placement

Good placement is key to achieving good PPA and minimizing design iterations. Global placement uses the global router during placement to estimate and avoid routing congestion. It also uses static timing analysis to consider slack as it places objects, keeping critical paths short. The integration of OpenROAD tools with a common database (OpenDB) and timing analysis (OpenSTA) enables rapid, incremental analysis and timing-based placement optimizations that deliver good PPA.

Incremental Parasitic Estimation for better timing

Early parasitic estimates are used in timing analysis during floorplanning. Parasitic estimation is done incrementally post global routing based on routing guides. This further improves timing analysis and signoff in OpenLane post global routing.

A Hierarchy-based Macro Placer, RTL-MP, for a good partitioning strategy

OpenROAD™ added RTL-MP, a hierarchical RTL and dataflow-driven macro placer for better PPA. The paper by Prof. Kahng, Ravi Varadarajan and Zhiang Wang will be presented at ISPD-2022.

The image below shows a RISC-V CPU in a commercial FinFET process with RTL-MP’s macro placement in the OpenROAD GUI, showing uniformly placed macros with good routing resource allocation and a compact die area.

AutoTuner – Using an ML and METRICS2.1 based framework for cloud-based distribution exploration

OpenROAD™ added the AutoTuner, an open, ML-based hyperparameter tuning framework and engine, to the OpenROAD RTL-GDSII flow. AutoTuner is an autonomous parameter tuning framework that uses Ray (https://www.ibm.com/cloud/blog/ray-on-ibm-cloud-code-engine) for commercial and academic RTL-to-GDS flows. It provides a generic interface where users can define parameter configurations as JSON objects, which enables AutoTuner to easily support various tools and flows. The tool also uses METRICS2.1 to capture the PPA of individual search trials. With the rich feature set of METRICS2.1, users can explore various reward functions that steer the flow towards different PPA goals.

The image below shows results of using an AutoTuner-based flow vs a custom design for the IBEX core on SKY130HD and ASAP7. Images were generated using the OpenROAD GUI.
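As a rough illustration of the idea, the sketch below defines a search space as a JSON object and runs a plain random search steered by a user-defined reward function. The parameter names (CORE_UTILIZATION, PLACE_DENSITY), the metric names and the run_flow stand-in are hypothetical; this is not AutoTuner's actual schema or search algorithm. A real setup would launch the RTL-to-GDS flow and read PPA values from the METRICS2.1 output instead of the toy model used here.

```python
import json
import random

# Hypothetical search space written as a JSON object (names are illustrative only).
SEARCH_SPACE_JSON = """
{
  "CORE_UTILIZATION": {"type": "int",   "min": 20,  "max": 60},
  "PLACE_DENSITY":    {"type": "float", "min": 0.4, "max": 0.9}
}
"""


def sample(space, rng):
    """Draw one parameter configuration from the search space."""
    config = {}
    for name, spec in space.items():
        if spec["type"] == "int":
            config[name] = rng.randint(spec["min"], spec["max"])
        else:
            config[name] = rng.uniform(spec["min"], spec["max"])
    return config


def run_flow(config):
    """Stand-in for one flow run; returns made-up PPA metrics, not real data."""
    util = config["CORE_UTILIZATION"] / 100.0
    density = config["PLACE_DENSITY"]
    return {
        "area": 1.0 / util,                         # higher utilization -> smaller die
        "wns": -abs(density - 0.65) - 0.5 * util,   # toy timing model
    }


def reward(metrics, area_weight=1.0, timing_weight=2.0):
    """Higher is better: penalize area and negative slack."""
    return -area_weight * metrics["area"] + timing_weight * metrics["wns"]


def tune(trials=50, seed=0):
    """Random search: keep the configuration with the highest reward."""
    rng = random.Random(seed)
    space = json.loads(SEARCH_SPACE_JSON)
    best = None
    for _ in range(trials):
        config = sample(space, rng)
        score = reward(run_flow(config))
        if best is None or score > best[0]:
            best = (score, config)
    return best


if __name__ == "__main__":
    score, config = tune()
    print(f"best reward {score:.3f} with {config}")
```

Changing the weights in reward shifts the search towards area or towards timing, which is the role the reward function plays in the description above.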

Direct ABC link into OpenROAD Improves Area

ABC, an open-source logic synthesis tool, is now linked natively into OpenROAD, which allows optimization to perform N tries of remapping per logic cone in order to find a better timing vs area tradeoff. This is part of the ongoing work to build the infrastructure for timing-driven remapping of logic using placement-based timing estimates.
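One way to picture the "N tries" selection step is sketched below. It assumes each remapping attempt for a logic cone yields a delay and an area estimate, and it keeps the smallest candidate that still meets the required delay. The Mapping type, the selection rule and the numbers are hypothetical; the source does not describe how OpenROAD or ABC actually rank candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """One remapping attempt for a logic cone (illustrative fields only)."""
    name: str
    delay_ns: float
    area_um2: float


def pick_mapping(candidates, required_delay_ns):
    """Prefer the smallest-area mapping that meets timing.

    If no candidate meets the required delay, fall back to the fastest one.
    """
    meeting = [m for m in candidates if m.delay_ns <= required_delay_ns]
    if meeting:
        return min(meeting, key=lambda m: m.area_um2)
    return min(candidates, key=lambda m: m.delay_ns)


if __name__ == "__main__":
    # Made-up results of N = 3 remapping tries for one cone.
    tries = [
        Mapping("try_1", delay_ns=1.20, area_um2=95.0),
        Mapping("try_2", delay_ns=0.95, area_um2=120.0),
        Mapping("try_3", delay_ns=0.98, area_um2=104.0),
    ]
    print(pick_mapping(tries, required_delay_ns=1.0).name)
```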

Multi Corner Extraction for accurate post-route extraction

OpenRCX supports fast, efficient multi-corner extraction, producing tuples of R and C values for all corners in a single run of the flow.

The Extraction Rules file can be extended to include table values for multiple corners that are then appended into a single file.

The following are benchmark results of running the JPEG design (included in the OpenROAD-flow-scripts repository) on SkyWater 130 nm technology.

jpeg_130 performance stats:

Corners   Elapsed time   Max memory
1         2 min 4 sec    1.4 GB
3         4 min 9 sec    1.5 GB
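To put these numbers in perspective, the short calculation below compares the single three-corner run with three separate one-corner runs. It assumes each separate run would take the same time as the reported one-corner run, which the benchmark itself does not state.

```python
def to_seconds(minutes, seconds):
    """Convert a 'min sec' pair to seconds."""
    return 60 * minutes + seconds


one_corner_s = to_seconds(2, 4)       # 124 s, from the 1-corner result
three_corner_s = to_seconds(4, 9)     # 249 s, from the 3-corner result

# Assumption: three separate single-corner runs cost 3x the 1-corner time.
separate_runs_s = 3 * one_corner_s    # 372 s

speedup = separate_runs_s / three_corner_s
per_corner_s = three_corner_s / 3
memory_increase = (1.5 - 1.4) / 1.4   # relative growth in peak memory

if __name__ == "__main__":
    print(f"speedup vs separate runs: {speedup:.2f}x")
    print(f"time per corner in the combined run: {per_corner_s:.0f} s")
    print(f"peak memory increase: {memory_increase:.1%}")
```

Under that assumption, the combined run takes roughly two thirds of the time of three separate runs, for a small increase in peak memory.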

An ECO flow for hold timing closure in OpenLane

A timing-driven ECO flow in OpenLane generates an optimized netlist based on post-route timing checks to fix hold violations. The ECO flow starts by checking the post-route timing report generated by OpenROAD™; a Python script then parses the report, inserts buffers and resizes cells. Inside the ECO loop, the Python script, detailed placement, global routing and detailed routing are called sequentially. Reports are regenerated on each pass, and the loop repeats until no hold violations are found.
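The control flow of such a loop can be sketched as follows. This is a toy model, not OpenLane's script: ToyDesign, its method names and the "each fix halves the violations" behaviour are hypothetical stand-ins for the real tools, and the max_iterations guard is an added assumption to keep the sketch from looping forever.

```python
from typing import Callable, Sequence


def eco_loop(check_hold: Callable[[], int],
             steps: Sequence[Callable[[], None]],
             max_iterations: int = 20) -> int:
    """Run the ECO steps in order until check_hold() reports zero violations.

    Returns the number of ECO iterations that were executed.
    """
    iterations = 0
    while check_hold() > 0:
        if iterations >= max_iterations:
            raise RuntimeError("hold violations remain after max_iterations")
        for step in steps:
            step()
        iterations += 1
    return iterations


class ToyDesign:
    """Stand-in for a routed design; records which stage ran and when."""

    def __init__(self, hold_violations):
        self.hold_violations = hold_violations
        self.log = []

    def report_hold_violations(self):
        self.log.append("sta")
        return self.hold_violations

    def insert_buffers_and_resize(self):
        self.log.append("eco_fix")
        self.hold_violations //= 2   # toy model: each fix halves the violations

    def detailed_placement(self):
        self.log.append("detailed_placement")

    def global_route(self):
        self.log.append("global_route")

    def detailed_route(self):
        self.log.append("detailed_route")


if __name__ == "__main__":
    design = ToyDesign(hold_violations=131)
    count = eco_loop(design.report_hold_violations,
                     [design.insert_buffers_and_resize,
                      design.detailed_placement,
                      design.global_route,
                      design.detailed_route])
    print(f"converged after {count} ECO iterations")
```

Timing is re-checked at the top of every pass, so the loop exits as soon as a report shows no hold violations.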

The following table shows the timing results of the ECO flow in OpenLane for the LiteX Management Core. All the timing reports are generated by OpenSTA.

LiteX Management Core

OpenLane with ECO

OpenSTA Timing Analysis (Clock: 25 ns)

ECO iteration   Setup max slack (ns)   Hold min slack (ns)   Number of violations   Inserted buffers
1               -16.16                 -1.48                 131                    219
2               -16.4                  -0.03                 15                     14
3               -16.44                 -0.09                 23                     19
4               -16.49                 -0.02                 9                      9
5               -16.41                 -0.05                 6                      6
6               -16.35                 -0.05                 3                      3
7               -16.44                 -0.03                 11                     7
8               -16.46                 -0.03                 4                      4
9               -16.42                 -0.01                 4                      4
10              -16.4                  0                     0                      0

As shown in the table above, the design has 131 hold violations with a worst-case hold slack of -1.48 ns in the 1st iteration. The ECO inserts 219 buffers into the block, which greatly reduces the number of violations. After 10 iterations, all the violations are resolved without unduly impacting setup time.
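A few aggregate figures can be read off the table with a short script. The values below are copied from the table; treating "Inserted buffers" as a per-iteration count (so that the total is their sum) is an assumption.

```python
# (iteration, setup slack ns, hold slack ns, hold violations, inserted buffers)
ECO_RESULTS = [
    (1, -16.16, -1.48, 131, 219),
    (2, -16.40, -0.03, 15, 14),
    (3, -16.44, -0.09, 23, 19),
    (4, -16.49, -0.02, 9, 9),
    (5, -16.41, -0.05, 6, 6),
    (6, -16.35, -0.05, 3, 3),
    (7, -16.44, -0.03, 11, 7),
    (8, -16.46, -0.03, 4, 4),
    (9, -16.42, -0.01, 4, 4),
    (10, -16.40, 0.00, 0, 0),
]

# Total buffers, assuming the column lists buffers added in each iteration.
total_buffers = sum(row[4] for row in ECO_RESULTS)

# Share of all buffers that were inserted in the first iteration.
first_iteration_share = ECO_RESULTS[0][4] / total_buffers

# Change in setup slack between the first and the last iteration.
setup_slack_change = ECO_RESULTS[-1][1] - ECO_RESULTS[0][1]

# Iterations in which the violation count went up relative to the previous one.
regressions = [
    cur[0]
    for prev, cur in zip(ECO_RESULTS, ECO_RESULTS[1:])
    if cur[3] > prev[3]
]

if __name__ == "__main__":
    print(f"total inserted buffers: {total_buffers}")
    print(f"share inserted in iteration 1: {first_iteration_share:.0%}")
    print(f"setup slack change: {setup_slack_change:.2f} ns")
    print(f"iterations with more violations than the previous one: {regressions}")
```

The regressions list shows that the violation count does not fall monotonically: fixing some paths can introduce a few new violations, which is why the flow iterates.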

Refer to documentation about this feature here: https://openlane.readthedocs.io/en/latest/docs/source/eco_flow.html?highlight=ECO#how-to-enable-the-eco-flow

OpenLane benefits from tighter integration with OpenROAD

OpenLane uses OpenROAD™ tools via a Python-based interface in its repository and thereby benefits directly from tight integration, an aligned vision and collaboration with the OpenROAD project team.

#Efabless users who design their chips using OpenROAD directly benefit from enhancements to OpenROAD in the form of new features, flow and PPA improvements. The OpenLane timing flow was updated to use incremental timing (STA) and extraction (SPEF) for better timing analysis, optimization and closure at various key flow stages.

OpenLane CI processes are closely aligned with OpenROAD and hence all enhancements are available in a timely manner to OpenLane users.

A chip timing analysis flow was added to OpenLane based on a hierarchical, block-level DEF physical design methodology that supports faster and more accurate data for timing closure. OpenRCX extracts parasitics for each block, and the results are stitched at the top level for a final, flat, full-chip timing signoff.

Growth of the OpenROAD GUI

OpenROAD™ added a GUI to help users visualize and debug their designs and thereby make decisions that impact PPA and routability.

The following are key GUI features that can also be accessed via a Tcl command interface:

The image below shows a final DEF file displayed in the GUI for a sample design (Coyote) containing macros and standard cells in sky130 technology, with controls for selectively displaying layout instances.

Logger

The logger infrastructure is called by all OpenROAD™ tools for clean, consistent, and actionable messaging. This greatly enhances usability and gives users direct feedback to trace the source of a problem and a way to fix it.

PDK Support

PDK support is vital to designers who care about manufacturability and predictive aspects of design. OpenROAD™ supports both public and private PDKs that are continuously validated through periodic updates and regression testing against a suite of representative designs and commonly supported technology nodes.

The following PDKs were added in 2021:

Private: Intel22, Sky90, GF12 – with improved rule support in detailed routing

Public: ASAP7, Skywater130 – with improved density

Detailed Placer Optimizer

OpenROAD™ added a new detailed placement optimizer based on research from Prof. Andrew Kennings at the University of Waterloo. The goal is to further improve the quality of detailed placement through an iterative standard cell move and swap based algorithm that optimizes wirelength. Results show early improvement in wirelength and worst case slack as seen in three design test cases using skywater130hs.
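The general idea of swap-based wirelength optimization can be shown with a toy example. The sketch below is not the optimizer described above: it assumes equal-sized cells (so any two cells can exchange positions legally), uses half-perimeter wirelength (HPWL) as the cost, and greedily keeps a swap only when it shortens the total. The cell names, nets and coordinates are made up.

```python
import itertools


def hpwl(nets, positions):
    """Half-perimeter wirelength summed over all nets."""
    total = 0
    for net in nets:
        xs = [positions[cell][0] for cell in net]
        ys = [positions[cell][1] for cell in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def swap_optimize(nets, positions):
    """Greedy pairwise swaps: keep a swap only if it reduces total HPWL."""
    positions = dict(positions)          # work on a copy
    best = hpwl(nets, positions)
    improved = True
    while improved:
        improved = False
        for a, b in itertools.combinations(sorted(positions), 2):
            positions[a], positions[b] = positions[b], positions[a]
            trial = hpwl(nets, positions)
            if trial < best:
                best = trial
                improved = True
            else:
                # Undo the swap when it does not help.
                positions[a], positions[b] = positions[b], positions[a]
    return positions, best


if __name__ == "__main__":
    # Four equal-sized cells in one row; each net connects two cells.
    start = {"A": (0, 0), "B": (1, 0), "C": (2, 0), "D": (3, 0)}
    nets = [("A", "C"), ("B", "D")]
    placed, length = swap_optimize(nets, start)
    print(f"HPWL before: {hpwl(nets, start)}, after: {length}")
```

Because a swap is accepted only when the cost strictly decreases, the loop always terminates, though it can stop in a local minimum.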