

  _____    _______   _        _        __  __        __      __  __       __ 
 |  __ \  |__   __| | |      | |      |  \/  |       \ \    / / /_ |     /_ |
 | |__) |    | |    | |      | |      | \  / |        \ \  / /   | |      | |
 |  _  /     | |    | |      | |      | |\/| |         \ \/ /    | |      | |
 | | \ \     | |    | |____  | |____  | |  | |          \  /     | |  _   | |
 |_|  \_\    |_|    |______| |______| |_|  |_|           \/      |_| (_)  |_|
                                                                             
                                                                             
                                              

Version 1.1

We have released RTLLM v1.1, which fixes some errors found in v1.0:

  1. Updated design_description.txt to better guide the LLM in generating RTL code.
  2. Provided a more comprehensive testbench.v to improve the accuracy of the test.
  3. Provided a more practical testing script, auto_run.py.

--13 Dec. 2023

RTLLM: An Open-Source Benchmark for Design RTL Generation with Large Language Model

Yao Lu, Shang Liu, Qijun Zhang, and Zhiyao Xie, "RTLLM: An Open-Source Benchmark for Design RTL Generation with Large Language Model," Asia and South Pacific Design Automation Conference (ASP-DAC) 2024. [paper]

Note: In our paper, the results are obtained based on RTLLM V1.0.

1. Documents

RTLLM is a benchmark for generating design RTL from natural language descriptions with large language models (under construction). This repository contains a total of 29 designs. Each design has its own folder, which includes several files:

  1. Design Description (design_description.txt):

    This file provides a natural language description of the design.

  2. Testbench (testbench.v):

    This file contains the testbench code used to simulate and test the design on Synopsys VCS.

    vcs testbench.v ../*.v
    
  3. Designer RTL (verified_verilog.v):

    This file contains the Verilog code that has been verified and confirmed to be functionally correct.

  4. LLM Generated Verilog (LLM_generated_verilog.v):

    This file contains the Verilog code generated by an LLM. Note that this code may not be verified and should be used with caution.

Please refer to the respective folders for each design to access the files mentioned above.
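The folder layout described above can be checked mechanically. The following is a minimal sketch, not part of the repository: the helper names missing_files and audit_benchmark are hypothetical, and the expected file names are taken from the list above, so they may need adjusting if the actual folders use different names.

```python
from pathlib import Path

# File names as listed in this README; adjust if the repository differs (assumption).
EXPECTED_FILES = ("design_description.txt", "testbench.v", "verified_verilog.v")


def missing_files(design_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from one design folder."""
    folder = Path(design_dir)
    return [name for name in expected if not (folder / name).is_file()]


def audit_benchmark(root):
    """Map each incomplete design folder under root to its missing file names."""
    report = {}
    for folder in sorted(Path(root).iterdir()):
        # Skip plain files and hidden folders such as .git.
        if folder.is_dir() and not folder.name.startswith("."):
            gaps = missing_files(folder)
            if gaps:
                report[folder.name] = gaps
    return report
```

Calling audit_benchmark on the repository root returns an empty dictionary when every design folder is complete, which makes it easy to spot designs whose testbench or reference RTL is absent.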

2. Run Makefile [1]

You can run the Makefile to test the functionality of the code.

Step 1. Replace #DESIGN_NAME# with the design name you need to test.

TEST_DESIGN = #DESIGN_NAME#

Step 2. Compile the Verilog file.

make vcs

Step 3. Functionality test

make sim

Step 4. View the results

===========Your Design Passed===========
or
===========Error===========
or
===========Test completed with */N failures===========

Step 5. Clear output files

make clean
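When many designs are tested, the result banners from Step 4 can be classified automatically. The sketch below is an illustration only: classify_sim_log is a hypothetical helper, and it assumes that "*/N" in the third banner stands for the number of failed cases over the total number of cases, printed as two integers.

```python
import re

# Banner fragments copied from Step 4 above.
PASS_MARKER = "Your Design Passed"
ERROR_MARKER = "===========Error==========="
# Assumption: "*/N" is printed as "<failed>/<total>" with integer values.
FAIL_PATTERN = re.compile(r"Test completed with\s+(\d+)\s*/\s*(\d+)\s+failures")


def classify_sim_log(log_text):
    """Return (status, failed, total) for one simulation log.

    status is "pass", "fail", "error", or "unknown" when no banner is found.
    """
    match = FAIL_PATTERN.search(log_text)
    if match:
        return ("fail", int(match.group(1)), int(match.group(2)))
    if PASS_MARKER in log_text:
        return ("pass", 0, None)
    if ERROR_MARKER in log_text:
        return ("error", None, None)
    return ("unknown", None, None)
```

The function takes the captured output of make sim as a string, so it can be combined with any way of running the simulator.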

3. Workflow

Fig.1 Complete RTL generation and evaluation workflow using this benchmark, including three straightforward stages.

  • In stage 1, users feed each natural language description 𝓛 into their target LLM 𝓕, generating the design RTL 𝒱 = 𝓕(𝓛). If an LLM solution requires additional prompt techniques 𝓟, it will switch the natural language description 𝓛 to actual input prompts 𝓛𝓟, with the output design RTL being 𝒱 = 𝓕(𝓛𝓟). If necessary, additional human engineers' efforts can also be introduced, generating 𝒱 = ℋ(𝓕(𝓛𝓟)).

  • In stage 2, the framework will test the functionality of the generated design RTL 𝒱 using our provided testbench 𝒯.

  • In stage 3, the generated design RTL 𝒱 is synthesized into a netlist to analyze the design qualities regarding PPA values. They will be compared with the design qualities of the provided reference designs 𝒱ₕ.

Fig.1: The workflow of adopting RTLLM for completely automated design RTL generation and evaluation. The user only needs to provide their LLM as input. It evaluates whether each generated design satisfies the syntax goal, functionality goal, and quality goal.
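The three stages can be read as a composition of functions. The sketch below is one possible way to express that composition, not the benchmark's own code: generate_rtl and evaluate are hypothetical names, and the LLM, prompt technique, human fix, testbench runner, and synthesis step are all passed in as plain callables that the user must supply.

```python
def generate_rtl(description, llm, prompt_technique=None, human_fix=None):
    """Stage 1: turn a description into RTL, optionally via a prompt technique and a human fix."""
    prompt = prompt_technique(description) if prompt_technique else description
    rtl = llm(prompt)
    if human_fix:
        rtl = human_fix(rtl)
    return rtl


def evaluate(rtl, run_testbench, synthesize=None):
    """Stages 2 and 3: functional test first, then PPA analysis only if the test passes."""
    functional = bool(run_testbench(rtl))
    ppa = synthesize(rtl) if (functional and synthesize) else None
    return {"functional": functional, "ppa": ppa}


if __name__ == "__main__":
    # Stub callables stand in for a real LLM and real EDA tools (illustration only).
    rtl = generate_rtl(
        "an 8-bit adder",
        llm=lambda prompt: f"// RTL for: {prompt}",
        prompt_technique=lambda text: f"Write Verilog for {text}",
    )
    print(evaluate(rtl, run_testbench=lambda code: code.startswith("//")))
```

Skipping synthesis for designs that fail the testbench is a design choice of this sketch; quality numbers for a functionally incorrect design would not be comparable with the reference.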


  • Description (design_description.txt) denoted as 𝓛: A natural language description of the target design's functionality. The criterion is that a human designer can write a correct design RTL 𝒱 after reading the description 𝓛. This description 𝓛 also includes an explicit indication of the module name and all input and output (I/O) signals with signal name and width. This pre-defined module and I/O signal information enables automatic functionality verification with our provided testbench.
  • Testbench (testbench.v) denoted as 𝒯: A testbench with multiple test cases, each with input values and correct output values. The testbench corresponds to the pre-defined module name and I/O signals in 𝓛. It can be applied to verify the correctness of design functionality.
  • Correct Design (designer_RTL.v) denoted as 𝒱ₕ: A reference design in Verilog, hand-crafted by human designers. By comparing with this reference design 𝒱ₕ, we can quantitatively evaluate the design qualities of the automatically generated design 𝒱. Also, these correct designs have all passed our proposed testbenches.

4. Experiments

Fig.2 summarizes the quantitative evaluation of both syntax and functionality correctness of all five evaluated LLMs using RTLLM.

  • Syntax Correctness: Number of generated design RTLs 𝒱 with correct syntax, out of the five trials.
  • Functionality Correctness: A success ✅ as long as there is one generated RTL successfully passing the testbench 𝒯, out of the ones already with correct syntax.

Fig.2: The Syntax and Functionality Correctness Verification for Different LLMs
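The two metrics can be computed per design from the outcome of each trial. This is a minimal sketch under the definitions given above; score_design is a hypothetical name, and each trial is assumed to be recorded as a pair of booleans (syntax correct, testbench passed).

```python
def score_design(trials):
    """Score one design from its trials.

    trials: list of (syntax_ok, passes_testbench) pairs, one per generated RTL.
    Returns (syntax_count, functional), where functional is True if at least
    one syntactically correct RTL also passes the testbench.
    """
    syntax_count = sum(1 for syntax_ok, _ in trials if syntax_ok)
    functional = any(syntax_ok and passed for syntax_ok, passed in trials)
    return syntax_count, functional


if __name__ == "__main__":
    # Five trials for one design: three compile, one of those passes the testbench.
    trials = [(True, False), (True, True), (False, False), (False, False), (True, False)]
    print(score_design(trials))
```

A trial that fails the syntax check never counts toward functionality, which matches the "out of the ones already with correct syntax" wording.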


Fig.3 summarizes the design qualities of the design RTL generated by different LLMs [2]. These quality values are measured on each post-synthesis netlist. We report the worst negative slack (WNS) as the timing metric. Fig.3 also presents the qualities of our designer-generated reference design 𝒱ₕ in RTLLM. All these reference designs are functionally correct.

Fig.3: The Design Qualities of Gate-Level Netlist, Synthesized with Design Compiler.

Footnotes

  1. We have recently provided an automated Python script (auto_run.py) that you can use as a one-click compilation for all designs after simple modification.

  2. The worst LLM, StarCoder, is not presented due to space limitations.


rtllm's Issues

Can you give more details about how to perform logic synthesis?

Thank you for your great work. I want to calculate PPA for the results generated by our own LLM, but the paper doesn't seem to give many details about how to perform logic synthesis with Design Compiler, which makes a fair comparison hard to achieve.
Could you share more details or procedures? Thanks.

Missing testbench.v files

The testbench.v files for some of the designs (e.g., traffic_light, radix2_div) are missing. Is there a way to get them?

make error

Hi! When I tried to run the Makefile, the following error happened.
[screenshot of the error]

It seems to be due to some environment issue, like the VCS version or something. Could you share some insights? Thanks.

How to calculate timing for combinational logic circuits

I am trying to calculate PPA for designs in RTLLM. Following the setting in the paper, I use DC to conduct logic synthesis.
For timing, I use PT to perform static timing analysis, but I don't know how to get timing for combinational logic circuits like the adder. Can you share some suggestions? Thanks.

Any pre-trained model?

Is there any pre-trained model for generating the Verilog code?

Or the source code of the LLM?

Missing design: risc_cpu

Hi,
It looks like the directory for 'risc_cpu' is missing. I'm just curious because all LLMs failed on this task.

Thanks!
