
  _____    _______   _        _        __  __        __      __  __       __ 
 |  __ \  |__   __| | |      | |      |  \/  |       \ \    / / /_ |     /_ |
 | |__) |    | |    | |      | |      | \  / |        \ \  / /   | |      | |
 |  _  /     | |    | |      | |      | |\/| |         \ \/ /    | |      | |
 | | \ \     | |    | |____  | |____  | |  | |          \  /     | |  _   | |
 |_|  \_\    |_|    |______| |______| |_|  |_|           \/      |_| (_)  |_|
                                                                             
                                                                             
                                              

Version 1.1

We have released RTLLM v1.1, which fixes some errors found in v1.0:

  1. Updated design_description.txt to better guide LLMs in generating RTL code.
  2. Provided a more comprehensive testbench.v to improve the accuracy of the test.
  3. Provided a more practical testing script, auto_run.py.

--13 Dec. 2023

RTLLM: An Open-Source Benchmark for Design RTL Generation with Large Language Model

Yao Lu, Shang Liu, Qijun Zhang, and Zhiyao Xie, "RTLLM: An Open-Source Benchmark for Design RTL Generation with Large Language Model," Asia and South Pacific Design Automation Conference (ASP-DAC) 2024.[paper]

Note: In our paper, the results are obtained based on RTLLM V1.0.

1. Documents

RTLLM is a benchmark for generating design RTL from natural language with large language models (under construction). This repository contains a total of 29 designs. Each design has its own folder, which includes several files:

  1. Design Description (design_description.txt):

    This file provides a natural language description of the design.

  2. Testbench (testbench.v):

    This file contains the testbench code used to simulate and test the design on Synopsys VCS.

    vcs testbench.v ../*.v
    
  3. Designer RTL (verified_verilog.v):

    This file contains the Verilog code that has been verified and confirmed to be functionally correct.

  4. LLM Generated Verilog (LLM_generated_verilog.v):

    This file contains the Verilog code generated by an LLM. Note that this code may not be verified and should be used with caution.

Please refer to the respective folders for each design to access the files mentioned above.
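
As a minimal sketch of how the per-design layout above could be checked before running any simulation, the Python helper below scans a benchmark root and reports which design folders lack design_description.txt or testbench.v. The function name find_incomplete_designs is hypothetical, and the assumption that every immediate subfolder of the root is one design folder is ours, not something the repository states.

```python
from pathlib import Path

# Files named in the list above that every design folder is expected to contain.
REQUIRED_FILES = ("design_description.txt", "testbench.v")


def find_incomplete_designs(root):
    """Map each design folder name under `root` to the required files it lacks.

    Assumption: every immediate subdirectory of `root` is one design folder.
    """
    missing = {}
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue
        absent = [name for name in REQUIRED_FILES if not (folder / name).is_file()]
        if absent:
            missing[folder.name] = absent
    return missing
```

An empty result means every folder contains both files; otherwise the returned dictionary names the folders to fix.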

2. Run Makefile [1]

You can run the Makefile to test the functionality of the code.

Step 1. Replace #DESIGN_NAME# with the design name you need to test.

TEST_DESIGN = #DESIGN_NAME#

Step 2. Compile the Verilog file.

make vcs

Step 3. Functionality test

make sim

Step 4. View the results

===========Your Design Passed===========
or
===========Error===========
or
===========Test completed with */N failures===========

Step 5. Clear output files

make clean
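
When the steps above are scripted, the result banners from Step 4 can be classified automatically. The following is a rough sketch rather than the logic of auto_run.py: the function name classify_sim_output is hypothetical, and it assumes that "*/N" in the third banner stands for two integers (for example "3/10") and that a failure count of zero counts as a pass.

```python
import re

PASS_BANNER = "===========Your Design Passed==========="
ERROR_BANNER = "===========Error==========="
# Assumption: "*/N" in the third banner is "<failures>/<total>", e.g. "3/10".
FAILURES_RE = re.compile(r"=+Test completed with (\d+)\s*/\s*(\d+) failures=+")


def classify_sim_output(text):
    """Return (status, failures, total) for one simulation log.

    status is "pass", "fail", "error", or "unknown" when no banner is found.
    """
    match = FAILURES_RE.search(text)
    if match:
        failures, total = int(match.group(1)), int(match.group(2))
        # Assumption: zero reported failures is treated as a pass.
        return ("pass" if failures == 0 else "fail", failures, total)
    if PASS_BANNER in text:
        return ("pass", 0, None)
    if ERROR_BANNER in text:
        return ("error", None, None)
    return ("unknown", None, None)
```

The text passed in would be the captured output of make sim; how that output is captured is left to the caller.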

3. Workflow

Fig.1 Complete RTL generation and evaluation workflow using this benchmark, including three straightforward stages.

  • In stage 1, users feed each natural language description 𝓛 into their target LLM 𝓕, generating the design RTL 𝒱 = 𝓕(𝓛). If an LLM solution requires additional prompt techniques 𝓟, these convert the natural language description 𝓛 into the actual input prompts 𝓛𝓟, with the output design RTL being 𝒱 = 𝓕(𝓛𝓟). If necessary, additional human engineers' efforts can also be introduced, generating 𝒱 = ℍ(𝓕(𝓛𝓟)).

  • In stage 2, the framework tests the functionality of the generated design RTL 𝒱 using our provided testbench 𝒯.

  • In stage 3, the generated design RTL 𝒱 is synthesized into a netlist to analyze the design qualities in terms of PPA values. These are compared with the design qualities of the provided reference designs 𝒱ₕ.
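
The composition in stage 1 can be sketched in a few lines of Python. This is only an illustration of 𝒱 = ℍ(𝓕(𝓛𝓟)) with the prompt technique and the human step both optional; the names generate_rtl and echo_llm are hypothetical, and echo_llm is a stand-in so the sketch runs without any model or API.

```python
def generate_rtl(description, llm, prompt_technique=None, human_fix=None):
    """Compose the stage-1 steps: V = H(F(L_P)), with P and H optional.

    description      -- natural language description L
    llm              -- callable F mapping a prompt string to Verilog text
    prompt_technique -- optional callable P turning L into the actual prompt
    human_fix        -- optional callable H applied to the LLM output
    """
    prompt = prompt_technique(description) if prompt_technique else description
    rtl = llm(prompt)
    return human_fix(rtl) if human_fix else rtl


def echo_llm(prompt):
    """Stand-in for a real model: returns a Verilog comment echoing the prompt."""
    return "// generated from: " + prompt
```

Calling generate_rtl(description, echo_llm) corresponds to the plain case 𝒱 = 𝓕(𝓛); passing prompt_technique or human_fix adds the other two steps.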

Fig.1: The workflow of adopting RTLLM for completely automated design RTL generation and evaluation. The user only needs to provide their LLM as input. It evaluates whether each generated design satisfies the syntax goal, functionality goal, and quality goal.


  • Description (design_description.txt) denoted as 𝓛: A natural language description of the target design's functionality. The criterion is that a human designer can write a correct design RTL 𝒱 after reading the description 𝓛. This description 𝓛 also explicitly states the module name and all input and output (I/O) signals with their names and widths. This pre-defined module and I/O signal information enables automatic functionality verification with our provided testbench.
  • Testbench (testbench.v) denoted as 𝒯: A testbench with multiple test cases, each with input values and correct output values. The testbench corresponds to the pre-defined module name and I/O signals in 𝓛. It can be applied to verify the correctness of design functionality.
  • Correct Design (designer_RTL.v) denoted as 𝒱ₕ: A reference Verilog design hand-crafted by human designers. By comparing with this reference design 𝒱ₕ, we can quantitatively evaluate the design qualities of the automatically generated design 𝒱. These correct designs have all passed our proposed testbenches.

4. Experiments

Fig.2 summarizes the quantitative evaluation of both syntax and functionality correctness of all five evaluated LLMs using RTLLM.

  • Syntax Correctness: Number of generated design RTLs 𝒱 with correct syntax, out of the five trials.
  • Functionality Correctness: Marked as a success ✅ as long as at least one generated RTL passes the testbench 𝒯, among those that already have correct syntax.
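
The two metrics above can be computed from per-trial outcomes as in the sketch below. The function name summarize_trials and the representation of a trial as a (syntax_ok, passed_testbench) pair are assumptions made for illustration and are not taken from the repository's scripts.

```python
def summarize_trials(trials):
    """Summarize the trials of one LLM on one design.

    trials -- list of (syntax_ok, passed_testbench) boolean pairs, one per trial.
    Returns (syntax_correct_count, functionality_success).
    """
    # Only trials with correct syntax are eligible for the functionality check.
    eligible = [passed for syntax_ok, passed in trials if syntax_ok]
    return len(eligible), any(eligible)
```

With five trials of which three compile and one of those passes the testbench, the function returns a syntax count of 3 and a functionality success of True.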

Fig.2: The Syntax and Functionality Correctness Verification for Different LLMs


Fig.3 summarizes the design qualities of the design RTL generated by different LLMs [2]. These quality values are measured on each post-synthesis netlist. We report the worst negative slack (WNS) as the timing metric. Fig.3 also presents the qualities of our designer-generated reference designs 𝒱ₕ in RTLLM. All these reference designs are functionally correct.

Fig.3: The Design Qualities of Gate-Level Netlist, Synthesized with Design Compiler.

Footnotes

  1. We have recently provided an automated Python script (auto_run.py) that, after simple modification, you can use for one-click compilation of all designs.

  2. The worst-performing LLM, StarCoder, is not presented due to space limitations.
