sivaganesh – The OpenROAD Project

Energy Efficient Design Starts with the Architecture

sivaganesh — Sat, 11 Feb 2023 07:08:13 +0000

Developed with funding from DARPA MTO’s Intelligent Design of Electronic Assets (IDEA) program, OpenROAD provides Application Specific Integrated Circuit (ASIC) and System on Chip (SoC) design teams with an open source, no-human-in-loop, 24-hour chip place-and-route solution. OpenROAD’s value proposition is that it brings down the barriers of cost, expertise and unpredictability of proprietary solutions which currently block semiconductor creators and innovators.

To date, OpenROAD has been used on over 600 tapeouts, on process nodes down to 12nm. Because of its ease-of-use, a growing number of companies have also started to use OpenROAD up front in the development process, during hardware and software architectural exploration. Doing predictive architectural exploration is especially valuable when testing revolutionary new architectures and algorithms where empirical power, performance, and area (PPA) data do not exist.

As commercial processes have become highly proprietary, open predictive technology models available as public PDKs, along with OpenROAD’s integrated RTL-GDSII flow, can fill the reliable-estimation gap long before detailed design, enabling clear visibility into PPA tradeoffs. For example, the ASAP7 open source 7 nm FinFET PDK, supported by OpenROAD, includes abstract models for schematic and layout entry, library characterization, synthesis, placement and routing, parasitic extraction, and HSPICE simulation.

Precision Innovations Inc. (PII) is the primary industrial developer of OpenROAD, providing custom support for the main applications, flows and PDKs for simple, barrier-free design exploration and subsequent implementation in OpenROAD.

Recently PII partnered with Ascenium to provide dedicated support for design estimation using the OpenROAD flow and ASAP7, streamlining Ascenium’s quest to deliver a new class of low-power, general-purpose processor, without an instruction set, through an optimized compiler interface. “The exponential influx of data (video, audio, sensor) accelerated by Deep Learning and 5G increases the importance of compute efficiency for energy management,” said Øyvind Harboe, VP of Engineering at Ascenium. “By leveraging OpenROAD as we explore architectures, we get rapid feedback to make important and reliable hardware and software trade-offs early in the design cycle, before we commit to physical implementation, at a fraction of the cost of using commercial tools. If you are someone who hankers to work on something insanely great, something different that can change the industry, leveraging open-source tools such as LLVM or OpenROAD, then please check out the career opportunities at our website, www.ascenium.com.”

If you too would like to talk with PII about dedicated support for your projects, please contact info@precisioninno.com

OpenROAD 2022: Year End Review

sivaganesh — Wed, 01 Feb 2023 08:08:22 +0000

2022 has been a very exciting and productive year for the OpenROAD project. We made good progress across main project areas including core technologies, tool quality, PPA outcomes and targeted outreach programs. This has led to a wider usage of our tools in various applications and an exponential growth in our user community.

Our core mission is to make OpenROAD-based IC design easy to use and learn at scale, and to foster system-level innovation. Here are some of the key project achievements of 2022 towards these goals.

Key Innovations in Core Technologies, Cloud and ML based execution

We added several new tools and enhanced core tool functionality in the database, partitioning, placement, floorplanning and routing tools to significantly improve runtime, design quality and flow robustness.

A Hierarchy-aware Macro placer

RTL-MP was created to give RTL designers more control over, and early insight into, physical implementation. It lets them make architectural tradeoffs through intelligent logic-aware choices, smart clustering and shaping of clouds of logic native to the logical hierarchy. This results in dense layouts with performance, utilization and area comparable to custom design.

The figure below shows an RTL-MP generated layout for a 12nm Black Parrot design. Macro placements were automatically deduced with knowledge of placement constraints to improve performance (fmax) and reduce wire length, achieving quality comparable to a custom layout.

Rapid Design Exploration and PPA Estimation using ML – AutoTuner

As part of our ongoing research to continually improve PPA without increasing runtime, we created the OpenROAD AutoTuner to enable designers to rapidly explore the design space for a range of design configurations by leveraging the power of machine learning and cloud computing. AutoTuner’s hyperparameter optimization allows designers to automatically set up and execute thousands of experiments on the cloud, thereby achieving the best possible PPA at a fraction of the cost and runtime.

AutoTuner test results on representative benchmarks (like IBEX on sky130) reveal up to a 3X improvement in runtime. These were tested using a Kubernetes Ray cluster over GCP with a provision for distributed detailed routing with one load balancer and 30 servers of 15 CPUs. Detailed testing on the router provided clear insights into architectural tradeoffs of distributed vs multi-threaded processing, single vs multiple machine configurations.
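The kind of search that AutoTuner automates can be pictured with a small random-search loop. This is only a sketch under stated assumptions: the parameter names, their ranges and the `toy_flow_score` function are hypothetical stand-ins for a real flow run, and the loop is plain random search, not AutoTuner's own search algorithms or its cloud execution.

```python
import random

# Hypothetical search space; real flow parameters and ranges will differ.
SEARCH_SPACE = {
    "core_utilization": (20, 60),    # percent
    "place_density": (0.40, 0.90),
    "clock_period_ns": (5.0, 15.0),
}


def sample_config(rng):
    """Draw one configuration uniformly from the search space."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in SEARCH_SPACE.items()}


def toy_flow_score(cfg):
    """Stand-in for a full place-and-route run; lower is better.

    A real evaluation would launch the flow and read back PPA metrics.
    """
    return (abs(cfg["core_utilization"] - 45) / 45
            + abs(cfg["place_density"] - 0.65)
            + cfg["clock_period_ns"] / 15.0)


def random_search(n_trials, seed=0):
    """Evaluate n_trials random configurations and keep the best one."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("inf")
    for _ in range(n_trials):
        cfg = sample_config(rng)
        score = toy_flow_score(cfg)
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


if __name__ == "__main__":
    cfg, score = random_search(n_trials=200)
    print(cfg, round(score, 3))
```

Because every trial is independent, a loop like this distributes naturally across many machines, which is what makes cloud execution attractive for this kind of exploration.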

Leveraging the Cloud for Efficient Resources and Performance

OpenROAD’s COPILOT (Cloud Optimized Physical Implementation Leveraging OpenROAD Technology) intelligently deploys cloud and compute resources to enhance runtime performance of the distributed router. Our research yielded intelligent interventions for predicting failed runs and focusing on stubborn, recurring subproblems such as DRC failures, allowing better control of batch executions. Early testing shows a 10X speedup in detailed routing in select test cases before reaching a plateau.

A new partitioner (TritonPart) aims to better manage design instance constraints and cluster engines to yield good global placement which is key to achieving good design quality and faster closure with a potential for a 1X speedup.

In 2022, OpenROAD added support for a tighter integration of design data into its open database (ODB), which significantly improved incremental design analysis and optimization. We began creating a full Python API to OpenROAD to make it easier to extend and to use Python ML-based libraries in applications; this further improved runtime performance, especially during optimization.

A hierarchical design methodology vastly simplifies design complexity, memory usage and runtime management for large designs (>500k instances). The timing model from netlist was enhanced to provide better support for design hierarchy and include extracted parasitics at block and interconnect signals for accurate top-level timing analysis.

The OpenROAD GUI was significantly enhanced. It supports better hierarchy browsing, viewing and analysis of problems such as off-grid pins and routing, rulers and markers, and heatmaps for hotspot analysis.

Improved QoR through Continuous Innovation and CI

In 2022 we sharpened our focus on PPA targets. The AutoTuner allows designers to get over a 3X runtime speedup over manual design, with best achievable PPA based on user constraint ranges for design parameters. Furthermore, users can capture key design metrics and continuously track QoR and interventions to improve them.

RTL-MP showed great potential for large designs in enabling RTL designers to make early design tradeoffs with rapid feedback from the floorplan and thus get better QoR without sacrificing speed.

We saw QoR improvements throughout our gallery of designs. Here is a sample public dashboard that captures and tracks QoR changes across nightly builds for a gallery of designs: https://dashboard.theopenroadproject.org/. Design metrics are captured for each nightly build and compared against golden runs.
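A comparison against golden runs can be as simple as checking each metric against a reference value with a tolerance. The sketch below illustrates that idea only; the metric names, values and the 2% tolerance are hypothetical and are not taken from the OpenROAD dashboard.

```python
# Hypothetical metric names; a real dashboard tracks many more.
LOWER_IS_BETTER = {"area_um2", "runtime_s", "drc_errors"}
HIGHER_IS_BETTER = {"worst_slack_ns"}


def compare_to_golden(golden, current, rel_tol=0.02):
    """Return the metrics that are worse than golden by more than rel_tol."""
    regressions = {}
    for name, ref in golden.items():
        new = current[name]
        margin = abs(ref) * rel_tol
        if name in LOWER_IS_BETTER and new > ref + margin:
            regressions[name] = (ref, new)
        elif name in HIGHER_IS_BETTER and new < ref - margin:
            regressions[name] = (ref, new)
    return regressions


# Illustrative numbers only.
golden = {"area_um2": 1000.0, "runtime_s": 600.0,
          "drc_errors": 0, "worst_slack_ns": 0.50}
nightly = {"area_um2": 1010.0, "runtime_s": 700.0,
           "drc_errors": 0, "worst_slack_ns": 0.30}

print(compare_to_golden(golden, nightly))
```

With these numbers, the 1% area increase stays inside the tolerance, while the runtime and worst-slack changes are reported as regressions.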

We plan to make such capabilities available to users to enable them to capture key design metrics and track QoR across design changes.

OpenROAD continued its synergy with the MacroPlacement effort to provide an open, transparent version of Google Brain’s deep reinforcement learning-based method.

Silicon Support and Tapeouts

We had several notable design successes: an intel16 tapeout by Army Labs (ARL) and a GF12-based mixed-signal design by the University of Michigan. The ARL tapeout also advanced capabilities in power network creation, along with automatic I/O pad placement functionality.

We also prioritized PDK support as a key enabler of open-source design; we added support for GF180MCU public PDK that launched the first Google sponsored shuttle on this node.

As part of the ongoing effort to improve tool quality and results, we implemented continuous, automated CI for over 100 Open MPW user designs from shuttles MPW2-7. This helps find QoR degradations, tool bugs and other flow issues. Using GitHub Actions-based workflows, users can automatically monitor and update their designs based on QoR or software update alerts from their CI. To date, OpenLane has been used in more than 600 tapeouts, from MPW5-7 onwards to MPW-8.

Enhancing Industry focus, User communications and Outreach

We made a lot of progress in our communication and outreach programs to expand our community and provide a better learning experience and more productive engagement. A new website now shares important updates and provides access to key resources such as tutorials, papers, blogs and user stories showcasing interesting use models and innovations in research: https://theopenroadproject.org/user-stories/. As a result, we have seen a rapid increase in user subscriptions through our website and GitHub.

OpenROAD is fast becoming the de facto open-source IC design tool for research and skill development. OpenROAD is an ongoing training partner for IC design courses with UCSC Extension, which offered two courses in the fall of 2022: Advanced Physical Design and Timing Closure (https://www.ucsc-extension.edu/certificates/vlsi-engineering/). These courses continue in 2023.

Several universities have signed MoUs to use OpenROAD-based design courses. The University of Costa Rica taught a semester-long course on microelectronics and basic VLSI design. OpenROAD continues to drive global research, innovation and workforce development at major universities and among the general body of users.

We offered three OpenROAD based internship projects at GSoC 2022 with a goal of fostering direct contribution in the form of tutorials, documentation and other development needs. This is a great opportunity for students to learn, contribute and advance their skills in EDA tools and IC design.

Tom Feist joined us as part of the Embedded Entrepreneur Initiative (EEI) to guide the project along a path towards industry adoption and product stabilization.

OpenROAD Focus for 2023

The following are key areas of focus planned for 2023:

- Enhance usability in software installation, distribution and execution.
- Runtime efficiency through effective partitioning, distribution and multi-threaded processing on a single machine and across the cloud.
- Major features support:
  - OpenROAD as a cockpit for fast, low-cost design exploration and prediction for high-confidence convergence and handoff to back-end implementation in OpenROAD and other EDA tools.
  - Support for UPF, dynamic power simulation and multiple power domains
  - A hierarchical macro placer that is more efficient for hierarchical placements. This will replace the default macro placer.
  - Basic support for DFT scan-chain integration
  - CTS improvements
  - Enhanced timing model for better accuracy
- Enhanced support for public and private PDKs for design exploration at key technology nodes
  - SkyWater (SKY90) enablement
  - Creation of proxy PDKs such as ASAP7 for accurate predictability for implementation in real applications
- QoR improvements: “Measure and track what you need to improve”
  - Closing the gap to measurable PPA targets for important designs. This includes auto-tuning of key tools in the flow (router, CTS, etc.)
  - Use important metrics through dynamic tracking and actionable insights, both internally and to enable users to track their design progress on an ongoing basis.
Education and Workforce Development

OpenROAD will expand initiatives through key partners to support and foster barrier-free, low/no-cost education for students, researchers and other professionals looking to upskill and gain employment in the semiconductor industry.

Finally, we would like to see you use OpenROAD in many more exciting applications and share your experiences with us! Reach out and share your ideas and comments: https://theopenroadproject.org/contact-us/

Stay tuned and Happy Learning in 2023 and onwards!

Implementation of RISCduino core using a Hierarchical Design Flow

sivaganesh — Thu, 05 Jan 2023 09:28:05 +0000

Dinesh Annayya is an ardent open-source EDA enthusiast and an expert user of OpenROAD and OpenLane. He developed a baseline RISCduino SoC, a single 32-bit RISC-V based controller compatible with the Arduino platform. He has submitted over 15 designs on Open MPW shuttles on sky130: https://github.com/dineshannayya/riscduino. During the course of his design journey, he successively improved the design architecture for better performance and enhanced functionality. His main motivation for using open-source EDA tools is to gauge quality of results and potential for commercial use.

A flat design approach forces design implementation to a single module which increases runtime and design complexity. Dinesh uses a hierarchical design flow methodology to reduce runtime, memory usage, and to meet his design, performance and area goals for implementation on the Caravel top-level SoC.

Continuous Architectural and Design Improvement

The hierarchical design flow methodology using OpenROAD and OpenLane significantly reduces runtime and eases design complexity for the target user-area die size (10 mm²) and the pre-defined pin constraints of the Caravel GPIO. Dinesh implemented three derivatives of the main RISCduino core: single, dual and quad, as shown in the figure below.

Find details here: https://github.com/dineshannayya/riscduino#readme

For his last design iteration, Dinesh was able to achieve a significant performance improvement (100 MHz at the typical corner) with increasingly dense designs and high utilization. Shown below are the 21 blocks, or Macros, at the top-level SoC.

This allowed him to focus on good block-level implementations, which after hardening as Macros were easily integrated at top-level. This also vastly simplified top-level routing and timing closure.

Implementation Using a Hierarchical Flow

Dinesh employs a hierarchical instead of a flat design methodology to better manage block-level performance for faster runtimes and better usage of memory. He uses a combination of a top-down approach for design partitioning, time budgeting and top-level placement, and a bottom-up approach to harden Macros, perform SoC integration and achieve final design convergence.

Here are the main steps:

Design Partitioning and Block-level Constraints

Design partitioning is based on functionality to minimize interconnect signals and combinational logic between the blocks. Here are some guiding rules that Dinesh used:

Rule-1: Too many design components fragment the floorplan, making it difficult to floorplan and close top-level timing. Group smaller and similar components of the functional units into blocks, each less than 0.5 mm².

Rule-2: A flat design approach leads to longer RTL-GDSII flow runtime and timing closure challenges at chip level. Logically partition the design into multiple blocks of around 0.5 mm² each.
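The area cap in these rules can be illustrated with a simple first-fit grouping. This is a sketch of the area side of the rules only: the component names and areas are hypothetical, and the grouping ignores functionality and connectivity, which the rules above treat as the primary criteria.

```python
def group_components(areas_mm2, max_block_mm2=0.5):
    """First-fit-decreasing grouping of components into blocks under an area cap."""
    blocks = []  # each block is {"members": [names], "area": total area in mm2}
    ordered = sorted(areas_mm2.items(), key=lambda kv: kv[1], reverse=True)
    for name, area in ordered:
        if area > max_block_mm2:
            raise ValueError(f"{name} alone exceeds {max_block_mm2} mm2; split it first")
        for block in blocks:
            if block["area"] + area <= max_block_mm2:
                block["members"].append(name)
                block["area"] += area
                break
        else:
            # No existing block has room, so start a new one.
            blocks.append({"members": [name], "area": area})
    return blocks


# Hypothetical component areas in mm2, for illustration only.
components = {"pinmux": 0.35, "qspi": 0.30, "usb": 0.12,
              "spi": 0.10, "uart": 0.06, "i2c": 0.05}

for block in group_components(components):
    print(block["members"], round(block["area"], 2))
```

In practice the grouping would start from the functional hierarchy and use the area cap as a check, rather than packing by area alone.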

Macro Placement

In this case a manual Macro placement is used to give better pin placement, block-level interconnects and feedthroughs for routing efficiency.

Rule 1: Manual Macro pin placement gives better global routing. Use OpenROAD to preview Macro connectivity and rearrange Macro pin placement.

Rule 2: Add a feedthrough partition to connect blocks to top-level I/O for congestion-free routing.

Feedthrough paths are defined from top-level I/O pins into and through the blocks to reduce long routes and congestion. Repeaters are added to these paths to maintain signal strength and avoid max slew and fanout violations.

These feedthrough partitions are manually inserted at 4 corners of the design. The partition I/O signals and position are based on the physical location of the corresponding top-level I/O ports and Macro pins. Finally, these partitions are hardened in met-3, so that they do not block global met-4/met-5 PDN stripes. Timing for feedthrough paths is analyzed by extracting the SPEF parasitics of the paths inside the partition and running timing analysis in OpenLane at top-level.

Time Budgeting and Setting Constraints

Dinesh estimated I/O budgets for each block using a good rule-of-thumb and subsequently re-adjusted the block-level SDC based on top-level hierarchical timing analysis.

Rule-1: Create a block-level SDC with I/O setup delay constraints at the Macro ports. Allocate 60% for external delay, with 40% total for block + 20% interconnect. Hold delay constraints: 1 ns external delay.

Rule-2: Run hierarchical timing analysis at top-level; if there are violations, try to re-adjust the I/O timing of the Macro SDC, re-harden it, and re-analyze the top-level timing. This is generally an iterative step until all constraints are met.
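As a worked example of the budgeting rule, the sketch below turns a clock period into an initial external-delay budget and formats it as SDC input-delay constraints. Only the 60% external share and the 1 ns hold delay come from the rule above; the 10 ns clock period, the port name `data_in` and the clock name `core_clk` are hypothetical, and the interconnect share is not modelled separately.

```python
def io_budget(clock_period_ns, external_fraction=0.60, hold_external_ns=1.0):
    """Split a clock period into external and block-internal budgets for a Macro port."""
    external = clock_period_ns * external_fraction
    return {
        "setup_external_delay_ns": external,
        "block_internal_budget_ns": clock_period_ns - external,
        "hold_external_delay_ns": hold_external_ns,
    }


def sdc_lines(port, clock, budget):
    """Format the budget as SDC input-delay constraints for one input port."""
    setup = budget["setup_external_delay_ns"]
    hold = budget["hold_external_delay_ns"]
    return [
        f"set_input_delay -max {setup:.2f} -clock [get_clocks {clock}] [get_ports {port}]",
        f"set_input_delay -min {hold:.2f} -clock [get_clocks {clock}] [get_ports {port}]",
    ]


# Hypothetical 10 ns clock period, port and clock names.
budget = io_budget(10.0)
for line in sdc_lines("data_in", "core_clk", budget):
    print(line)
```

These values are only a starting point; as Rule-2 describes, they are re-adjusted after top-level hierarchical timing analysis.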

Hierarchical Timing Signoff

For MPW2-6 shuttle submissions, Dinesh used his custom top-level scripts, with Macro SPEF + standard cell .lib, to do hierarchical timing analysis. An example script is available at: https://github.com/dineshannayya/riscduino/blob/master/sta/scripts/caravel_timing.tcl

Note: Efabless MPW-2 silicon debug exposed an RCX extraction issue in the hierarchical design, and since then the Efabless team has revised the timing script. From MPW-7 onwards Dinesh used the default timing script. Read this for more information:

https://caravel-user-project.readthedocs.io/en/latest/#running-timing-analysis-on-existing-projects

Flow Summary

Here is a summary of the flow steps:

1. Design partitioning
2. Time budgeting and defining initial constraints
3. Floorplanning
   1. Macro placement and feedthroughs
   2. Fine-tuning SDC constraints
   3. Power network generation
4. Clock tree synthesis at top-level
   1. Balance clock skew
   2. Add repeaters as needed
5. Harden each Macro (RTL-GDSII implementation in OpenLane)
6. Top-level integration
   1. Hooking up Macros to top-level
7. Chip-level signoff
   1. Load Verilog files for all levels of hierarchy
   2. Load Macro and top-level SPEF files
   3. Run top-level, flat timing analysis for all three corners
   4. Make sure that there are no hold violations in any of the 9 corners: library (Fast/Typical/Slow) vs. SPEF (Max/Nom/Min)
   5. Analyze the max timing margin for each clock domain across each corner.
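The 9 signoff corners are the cross product of the three library corners and the three SPEF corners. A minimal sketch of enumerating them and checking hold results follows; the slack values and the `hold_violations` helper are hypothetical, and a real flow would read the slacks from timing reports.

```python
from itertools import product

LIB_CORNERS = ("fast", "typical", "slow")
SPEF_CORNERS = ("max", "nom", "min")


def signoff_corners():
    """Return all library x SPEF corner combinations as 'lib_spef' names."""
    return [f"{lib}_{spef}" for lib, spef in product(LIB_CORNERS, SPEF_CORNERS)]


def hold_violations(worst_hold_slack_ns):
    """Return the corners whose worst hold slack is negative.

    Raises KeyError if any of the 9 corners has not been analyzed.
    """
    missing = set(signoff_corners()) - set(worst_hold_slack_ns)
    if missing:
        raise KeyError(f"corners not analyzed: {sorted(missing)}")
    return sorted(c for c, slack in worst_hold_slack_ns.items() if slack < 0)


# Illustrative slacks: every corner clean except one.
slacks = {corner: 0.05 for corner in signoff_corners()}
slacks["fast_min"] = -0.02

print(len(signoff_corners()), hold_violations(slacks))
```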

Key Design Strategies to achieve good PPA

Dinesh customizes the flow implementation to leverage many improvements to OpenROAD’s clock tree synthesis (CTS), router (DRT) and power network creation (PDN) to further improve productivity and QoR. Here are some interesting techniques he uses to achieve good PPA with the given flow capabilities:

Better Power management -- Multiple power regions lead to better use of routing resources

Step-1: Macro Power Pitch/Width changed from default 153um/1.6um to 100um/6.2um

Step-2: Reduce top-level PDN pitch from 153um to 100um and increase width from 1.6um to 6.2um to enable an efficient 9 multi-via hook-up from top-level to Macro.

PDN with 2 via vs 4 vias hookups

Macros are now connected through 9 multi-cut vias instead of 2, for better reliability and lower resistance, which results in lower IR drop.

Step-3: Ensure that the feedthrough partition is hardened within met-3 so that it does not create a blockage for top-level met-4 and met-5 power stripe routing.

Since a repeater partition has distinct power hook-up requirements compared to the rest of the Macros, he defines a separate power domain for its power connections. Here is the PDN script:

https://github.com/dineshannayya/riscduino/blob/master/openlane/user_project_wrapper/pdn_cfg.tcl

Power grid on side blocks has thicker straps hence less IR drop.

Clock Tree Balancing

Currently, the OpenLane flow does not automatically balance clock skews across the Macros. Shown below is an example where each Macro has a different clock latency. Dinesh defines each Macro with 16-tap adjustable clock skew buffers. Each Macro is hardened with its skew adjusted to close timing at the top-level Caravel design.
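One way to picture the skew adjustment is to pick, for each Macro, the delay tap that lines its clock arrival up with the slowest Macro. The sketch below assumes uniform tap steps; the 0.1 ns step, the Macro names and the latencies are hypothetical, and only the 16-tap count comes from the text above.

```python
def choose_taps(latency_ns, tap_step_ns=0.1, n_taps=16):
    """Pick a delay tap per Macro so clock arrivals line up with the slowest Macro."""
    target = max(latency_ns.values())
    taps = {}
    for macro, latency in latency_ns.items():
        tap = round((target - latency) / tap_step_ns)
        # Clamp to the available taps (0 .. n_taps - 1); any remainder is residual skew.
        taps[macro] = min(tap, n_taps - 1)
    return taps


# Hypothetical per-Macro clock latencies in ns.
latencies = {"riscv": 1.2, "qspi": 0.7, "uart": 1.0}
print(choose_taps(latencies))
```

In the actual design the tap settings are driven by top-level timing analysis, not by latency alignment alone.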

Final Design Results

Using a customized, hierarchical flow, Dinesh was able to meet his design goals. The table below shows the user area utilization for the riscduino_qcore design, which has around 150K cells + 48 Kb SRAM (from the sky130 PDK). For the typical timing corner, the RISC-V Arduino core met timing at an fmax of 100 MHz.

Block	Total Cell	Combo	Seq	Utilization
RISC (4 Core)	94165	79675	14490	45%
QSPI	9038	7525	1513	42%
UART_I2C_USB_SPI	11880	9011	2869	42%
WB_HOST	6511	5359	1152	45%
WB_INTC	6674	5263	1411	20%
PINMUX	11923	9318	2605	35%
PERIPHERAL	5847	4786	1061	42%
BUS-REPEATER	922	922	0	20%

TOTAL	146960	121859	25101
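A quick consistency check of the table: the sketch below takes the total and combinational cell counts per block, derives the sequential count as total minus combinational, and confirms that the columns add up to the TOTAL row. The numbers are copied from the table; the dictionary layout and function name are just for illustration.

```python
# (total cells, combinational cells) per block, copied from the table.
BLOCKS = {
    "RISC (4 Core)": (94165, 79675),
    "QSPI": (9038, 7525),
    "UART_I2C_USB_SPI": (11880, 9011),
    "WB_HOST": (6511, 5359),
    "WB_INTC": (6674, 5263),
    "PINMUX": (11923, 9318),
    "PERIPHERAL": (5847, 4786),
    "BUS-REPEATER": (922, 922),
}
TOTAL_ROW = (146960, 121859, 25101)  # total, combinational, sequential


def column_sums(blocks):
    """Sum total and combinational cells; sequential is the difference."""
    total = sum(t for t, _ in blocks.values())
    combo = sum(c for _, c in blocks.values())
    return total, combo, total - combo


assert column_sums(BLOCKS) == TOTAL_ROW

total, combo, seq = column_sums(BLOCKS)
print(f"sequential share of cells: {100 * seq / total:.1f}%")
```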

Final Routed Design

The figure below shows the final routed GDSII for the RISCduino SCore, DCore and QCore designs.

Conclusion

Dinesh summarizes his usage experience of OpenROAD and OpenLane as below:

“I highly appreciate the time and effort taken by the OpenROAD team in developing a VLSI design flow based on open-source concepts. Each of the VLSI design stages from RTL-GDSII flow needs specialized technology knowledge and in depth implementation strategy with coordinated effort and strategy. I see a continuous improvement in the OpenLane tool over each MPW shuttle.

I am highly impressed by the OpenROAD team. Response to users in terms of tracking key GitHub issues and overall response time to fix is better than a commercial tool vendor support team. I look forward to a successful commercial tape-out through the OpenLane flow and wish the OpenROAD team the very best for their innovation and mission in OpenEDA flows- in particular in enabling this design flow success as a unix and arduino initiative.

In my technical career I have noticed multiple commercial tool vendors developed an automated RTL to GDS flow and none of these were successful. Main reason for the failure is that every company flow, project & user requirement are unique and each project needs some customization which cannot be easily mapped into a one single RTL to GDS flow. My suggestion to the OpenROAD team is to have clear industry standard handoffs between each stage so that users can effectively use Open-Source EDA tools with commercial tools and Custom Scripts in their commercial projects. I also would like to see missing functionality like Logic Equivalence checking (LEC) and DFT support (JTAG, MBIST, SCAN) added.”

About Dinesh Annayya

Dinesh is an expert SoC designer and has worked in the VLSI industry for more than 20 years, at companies including Cypress Semiconductor, Centillium and Transwitch. Currently he is working as a design manager in Intel India Bangalore Centre. His design work spans multiple foundries including TSMC, Intel, GlobalFoundries, UMC and SMIC and multiple technology nodes including 180nm, 130nm, 90nm, 65nm, 55nm, 45nm, 22nm and 10nm. He has submitted 28 GH issues that have resulted in critical bug fixes that led to significant enhancements to tool features and quality.

In the future, he plans to extend RISCduino with other add-on chips for advanced functionality and fast interconnectivity interfaces such as QUAD SPI.

AE-AV1 Encoder implementation: Using OpenROAD to achieve Real-time Throughput

sivaganesh — Fri, 09 Dec 2022 07:40:39 +0000

Tulio Pereira Bitencourt

OpenROAD is increasingly being used as the leading open-source EDA solution by a large number of users in industry and academia who are starting to explore and build ASIC designs for a range of today's mainstream applications. Video-on-demand (VoD) is a rapidly growing market that accounts for more than 80% of current internet traffic. Video streaming applications demand fast performance to deliver real-time video at high quality, low latency and lower design cost. AV1 supports higher video resolutions (e.g., 4K, 8K) to meet the requirements of larger video sizes and new video coding standards, but fails to meet real-time throughput.

The Problem

AV1, the Alliance for Open Media (AOMedia) video coding format, delivers good compression rates, but given its high complexity it does not meet real-time execution and throughput in software-only implementations.

In order to develop the next generation of the encoder that meets the ultra-high performance needs (8K@120fps) for MRTR (Maximum Real Time Resolution), Tulio and his team at Informatics Institute, Federal University of Rio Grande do Sul, sought zero-cost OpenEDA solutions to explore and design enhanced design architectures to meet their design goals.

OpenROAD for AE-AV1 Arithmetic encoder design

OpenROAD enables free, open access to tools for RTL-GDS flows and open PDKs, with run times within 24 hours. This was important for Tulio to explore multiple design architectures to meet his design goals, i.e., high performance and low cost (small die area), in the fastest possible time at multiple technology nodes.

The open, royalty-free AE-AV1 encoder implements arithmetic coding as a lossless data compression algorithm that improves upon its predecessor codecs (HEVC, VVC, VP9, etc.). It optimizes key variables that describe a numeric interval (Low, Range) to encode incoming symbols into a reduced bitstream based on the probabilities of their appearance.

The original AV1 lacked the ability to predict hardware implementation results since it relied heavily on dynamic arrays for an unknown set of input symbols. These unique and stringent requirements made OpenROAD the only viable solution to design AE-AV1 with good confidence in manufacturability.

Design Architecture

The team first developed a baseline design in RTL as a multi-stage pipeline, shown in the figure:

Stage 1: Receives symbols, the number of symbols in the alphabet and probabilities, and performs pre-calculations.

Stage 2: Updates Range and is the critical path. It is optimized by moving work out into Stage 1 (pre-calculations) and Stage 3 (Low updating). Stage 2 could not be accelerated further due to the self-feeding constraint of the Range variable.

Stage 3: Updates Low.

Stage 4: The hardware-friendly stage that implements carry propagation and stores the compressed stream in output registers.

The reason behind separating the updating process of Range and Low in two different stages is to avoid increasing the critical path and, hence avoid adding additional delays into AE-AV1.
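The dependency between Range and Low can be seen in a textbook arithmetic-coding loop. The sketch below is not the AE-AV1 datapath: it uses exact fractions and a hypothetical three-symbol alphabet, and it omits the fixed-width arithmetic, renormalization and carry propagation that a hardware encoder needs.

```python
from fractions import Fraction

# Hypothetical alphabet with fixed symbol probabilities that sum to 1.
PROBS = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}


def cumulative(probs):
    """Map each symbol to the total probability of the symbols before it."""
    cum, acc = {}, Fraction(0)
    for symbol, p in probs.items():
        cum[symbol] = acc
        acc += p
    return cum


def encode_interval(symbols, probs):
    """Narrow the interval [low, low + rng) once per incoming symbol."""
    cum = cumulative(probs)
    low, rng = Fraction(0), Fraction(1)
    for s in symbols:
        # Low needs the previous Range, so it can trail the Range update by a stage.
        low = low + rng * cum[s]
        # Range feeds back on itself every symbol: the self-feeding dependency.
        rng = rng * probs[s]
    return low, rng


print(encode_interval("ab", PROBS))
```

For the symbols "ab" the interval narrows to Low = 1/4 and Range = 1/8. Each new Range depends only on the previous Range and the symbol probability, while Low depends on the previous Range, which is consistent with placing the Low update in a later pipeline stage.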

Ease-of-Use: Easy Installation, Configuration for Rapid Exploration

“OpenROAD installation is fast and easy- docker based installation encapsulates the complexity of required packages and libraries. It is fantastic how easy it is to just execute a command and have the entire toolset installed and configured all at once, without requiring any intermediary step,” says Tulio.

“The scripts used for running the entire OpenROAD flow are extremely easy to use and straightforward to configure. The majority of the work, when one wants to get quick results, is just related to adding the targeted design into the OpenROAD ‘designs’ folder and editing the configuration file. Furthermore, upon designing an architecture, it should be a great idea for any researcher to just use the open-source solutions developed by the OpenROAD team to find the best possible configuration for the design just created, as well as to acquire results quickly to optimize parameters. OpenROAD goes from an RTL input, in my case, a bunch of Verilog files, to GDSII without any extra step necessary aside from triggering the flow.”

“The OpenROAD tools are extremely easy to use and require a very low time to set up. If one considers that a conventional tool requires a lot of infrastructure just to handle licenses, and even more to process the different tasks it supports, it is easy to conclude that running state-of-the-art paid EDA tools in a normal laptop would be unbearable. When running the OpenROAD flow, I used an older generation Dell Inspiron, which is not powerful and could barely handle the AV1 reference software (I had to boot my Linux OS without GUI for that). For OpenROAD, however, I executed everything on the same computer using an external hard-drive, which deprecates the performance even more. My computer did not struggle to run, and in almost no time the analyses were completed.”

To advance computational efficiency, OpenROAD leverages cloud resources to efficiently parallelize key stages in the design flow and distribute processes across multiple machines and CPUs.

Meeting Design Goals- High Throughput, High performance, Low Area

Achieving high performance at the least cost was the design goal; power was not considered a key PPA metric for this version of the encoder. OpenLane was initially used to explore design configurations and flow. However, Tulio chose OpenROAD-flow-scripts for its support of ASAP7 along with the other open PDKs (sky130, Nangate45) needed for exploration across technology nodes. OpenROAD-flow-scripts delivers the complete RTL-GDSII flow, including Yosys for synthesis, OpenSTA for timing analysis and optimization, and KLayout for DRC checking.

Rapid Design Exploration for optimal Area and Performance

Tulio was able to run several design experiments based on targeted design configurations for multiple frequencies and process technologies, including SkyWater 130nm (HS, HD), Nangate 45nm and the ASAP7 predictive PDK.

OpenROAD supports design exploration through an OpenLane Python script that automatically runs multiple user-defined experiments based on different synthesis strategies to optimize area and performance. The table below depicts a sample experiment showing different results for the design when optimizing gate count, area and worst path delay for a given process.

Results

The final design implementation of AE-AV1 using ASAP7 shows a significant improvement in gate count and frequency over the baseline AV1, with the target MRTR (Maximum Real Time Resolution) goal of 8K@120fps for real-time processing at the maximum possible AV1 resolution. The table below shows the exploration and implementation results across multiple open PDKs.

ASAP7 delivered the best PPA, with improvements in area (24.48%) and frequency (82.8%) compared to the Nangate 45nm PDK. Significant area and performance improvements were possible only at 45nm and lower nodes. The post-layout gate count (i.e., area) for all technologies was calculated as the actual area of each circuit divided by the area of the smallest two-input gate available in the PDK (commonly a NAND-2 gate).
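The gate-count normalization described here is a simple ratio, which makes results comparable across PDKs with very different cell sizes. A minimal sketch follows; the areas and frequencies used are placeholders for illustration, not figures from the paper.

```python
def gate_equivalents(circuit_area_um2, nand2_area_um2):
    """Express a circuit area as a count of smallest two-input gates."""
    return circuit_area_um2 / nand2_area_um2


def percent_change(baseline, new):
    """Signed percentage change from baseline to new (negative means a reduction)."""
    return 100.0 * (new - baseline) / baseline


# Placeholder values: a 1200 um2 circuit in a PDK whose NAND-2 occupies 0.5 um2.
print(gate_equivalents(1200.0, 0.5))

# Placeholder comparison of gate counts between two technologies.
print(percent_change(baseline=2000.0, new=1500.0))
```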

The final routed design implementation on ASAP7 is shown below.

Final Routed Design in ASAP7 in OpenROAD GUI

Tulio and his team were able to meet their design goals using OpenROAD-based flows, open PDKs and the GUI to explore and enhance the AV1 RTL design architecture and verify PPA at multiple technologies, all of which were available within a fully integrated, easy-to-use and open ecosystem. They achieved these results within a significantly shorter period of time than it would have taken with conventional EDA tools, and at zero tool and PDK cost. They published a paper showcasing this research (see References).

“The OpenROAD toolset has a very well-structured flow, which can be easily configured by adding a design and editing the configuration file, if one wants quick results, or changing multiple parameters for achieving better results. For someone who was not familiar with the OpenROAD flow, I was very happy to find out that it was extremely straightforward to use and to reach a RTL-to-GDSII flow. The way OpenROAD allows for certain parameters to be kept as default, or be changed according to the needs of the user is incredible and allows designers to reach impressive results with state-of-the-art PDKs,” concludes Tulio.

References

AE-AV1 publication: https://jics.org.br/ojs/index.php/JICS/article/view/564

Baseline AV1: https://ieeexplore.ieee.org/document/9800932

Effective Design Productivity and Performance Management using GitHub Notifications

sivaganesh — Fri, 26 Aug 2022 07:52:48 +0000

OpenROAD aims to continually improve the quality of the application and of its results, i.e., PPA for designs. One way to do this is to track CI both internally, within the project tools and design repository, and externally, by enabling users to manage their own design and CI environment.

OpenROAD uses METRICS2.1 to log design data and track results through continuous monitoring of key design parameters and their impact on QoR.

This blog shows you how to configure your repo to track GH based notifications and hence better manage your design goals and software environment.

Use Models for GitHub notifications

OpenROAD implements a set of automatic CI processes through GitHub Actions workflows that automatically trigger upon key events. You can choose to be notified about important events of interest such as design changes, software updates and potential QoR improvements. This will help you automate your design changes and ensure you are using the latest software version. You will also deepen your understanding of the design flow, tool issues and bugs through GitHub Participation and thus enhance your productivity. You can selectively control and manage the type and degree of notifications through Settings in your repository.

Here are a few examples of use models for such notifications:

1. Re-run a design flow due to a change in the software version.
2. Re-run a design flow due to changes in the design configuration or source code.
3. Use METRICS2.1 to automatically track results for better PPA, run times and overall flow efficiency.

GitHub Actions for 1 and 2 above are supported currently. Using METRICS2.1 to track QoR is in development.
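The decision behind the first two use models can be sketched as a comparison against a record of the last run. In GitHub Actions this is normally expressed through workflow triggers; the sketch below only illustrates the logic, and the record layout, version strings and function names are hypothetical.

```python
import hashlib


def digest(text):
    """Stable fingerprint of a text blob."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def record_run(tool_version, config_text, source_text):
    """Summarize what a flow run was based on."""
    return {
        "tool_version": tool_version,
        "config_digest": digest(config_text),
        "source_digest": digest(source_text),
    }


def rerun_reasons(last_run, tool_version, config_text, source_text):
    """List the reasons, if any, to re-run the design flow."""
    current = record_run(tool_version, config_text, source_text)
    reasons = []
    if current["tool_version"] != last_run["tool_version"]:
        reasons.append("software version changed")
    if current["config_digest"] != last_run["config_digest"]:
        reasons.append("design configuration changed")
    if current["source_digest"] != last_run["source_digest"]:
        reasons.append("source code changed")
    return reasons


# Hypothetical versions and file contents.
last = record_run("v1.0", "CLOCK_PERIOD=10", "module top(); endmodule")
print(rerun_reasons(last, "v1.1", "CLOCK_PERIOD=10", "module top(); endmodule"))
```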

The screenshot below shows examples of GitHub Actions workflows that trigger for cases (1) and (2) above.

Here’s a screenshot of an OpenROAD internal Jenkins dashboard that tracks key flow and design METRICS across multiple CI runs. You can see how these METRICS change across software updates. Note the improvements in clock latency, skew and setup time for this design across the last 3 runs.

Managing your CI using GitHub Actions or email notifications

You can use GitHub notification and email to track important events of interest to you such as PRs.

Here is an example of how to automatically update the OpenLane version through a GitHub Action: https://theopenroadproject.org/using-git-action/

Learn how to configure notifications, manage and filter subscriptions:

Here’s how to integrate GitHub email notifications into your repo: https://docs.GitHub.com/en/repositories/managing-your-repositorys-settings-and-features/managing-repository-settings/about-email-notifications-for-pushes-to-your-repository

Stay tuned for more updates on using GitHub Actions and METRICS to effectively manage your design environment and quality of results.

References

IEEE CEDA/DATC repo https://GitHub.com/ieee-ceda-datc/datc-rdf-METRICS4ML

METRICS2.1 paper, Proc. ACM/IEEE International Conference on Computer-Aided Design (ICCAD), 2021.