

LLM Validator

Setup Guide

Prerequisites

To run this project, you need Python 3.9, 3.10, or 3.11 installed on your machine.

Setup environment

Clone repository:

git clone git@github.com:Criss-Wang/llm-validator.git
cd llm-validator

[Optional] Create and activate a virtual environment:

virtualenv -p python3.9 venv
. ./venv/bin/activate

Install the pre-commit hook:

pre-commit install

Install the project dependencies:

pip install -r requirements.dev.txt
pip install -e .

Login to your Weights & Biases account:

wandb login

Initialize and export all the environment variables. We support various third-party model inference providers. Refer to .env.sample for the list of API keys required. Then create a .env file, fill in the API key value associated with each environment variable, and run

export $(grep -v '^#' .env | xargs -0)

to export the environment variables.
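
To confirm that the export worked, a small check can compare the variable names listed in .env.sample against the current environment. This is a minimal sketch, not part of the project: the helper name `missing_env_vars` is hypothetical, and it assumes .env.sample uses plain `NAME=value` lines with `#` comments.

```python
import os
from pathlib import Path


def missing_env_vars(sample_path: str) -> list[str]:
    """Return variable names listed in the sample file that are unset or empty."""
    missing = []
    for raw in Path(sample_path).read_text().splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumption: each remaining line looks like NAME=value (value may be empty).
        name = line.split("=", 1)[0].strip()
        if name and not os.environ.get(name):
            missing.append(name)
    return missing
```

Calling `missing_env_vars(".env.sample")` from the repository root after the export step should return an empty list when every key has a value.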

Basic usage

  1. Define your prompt under the prompts folder, in the following format:

     - name: prompt-name
       system:
         value: >
           system prompt content here
       user:
         value: >
           user prompt content here

  2. Import the validation dataset you'd like to use into the datasets folder.
  3. Create a config file under the configs folder. Refer to configs/code_generation/openai.json for how to structure your configuration in JSON format.
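
Before running a validation, it can help to check that a prompt file matches the format from the first step. The sketch below works on already-parsed data (the list of mappings a YAML loader would return for such a file). The function name `check_prompt_entries` is hypothetical, and treating both `system` and `user` as required is an assumption based only on the example format.

```python
from typing import Any


def check_prompt_entries(entries: list[dict[str, Any]]) -> list[str]:
    """Return human-readable problems; an empty list means the entries look fine."""
    problems = []
    for i, entry in enumerate(entries):
        label = entry.get("name") or f"entry #{i}"
        if not entry.get("name"):
            problems.append(f"{label}: missing 'name'")
        # Assumption: both roles are required and each carries a 'value' string.
        for role in ("system", "user"):
            section = entry.get(role)
            value = section.get("value") if isinstance(section, dict) else None
            if not isinstance(value, str) or not value.strip():
                problems.append(f"{label}: missing '{role}.value'")
    return problems


# Parsed form of the prompt format shown in step 1.
example = [
    {
        "name": "prompt-name",
        "system": {"value": "system prompt content here\n"},
        "user": {"value": "user prompt content here\n"},
    }
]
print(check_prompt_entries(example))  # prints []
```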

Advanced usage

Custom Client

You can add client/API providers, or even local endpoints, by implementing a Client defined under llm_validation.component.
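
The general shape of a custom client might look like the sketch below. The real interface in llm_validation.component is not shown in this guide, so the `Client` base class here is a local stand-in, and the `predict` method, its parameters, and `EchoClient` are all hypothetical. Check the actual base class before adapting it.

```python
from abc import ABC, abstractmethod


class Client(ABC):
    """Local stand-in for the project's Client base class (hypothetical interface)."""

    @abstractmethod
    def predict(self, system: str, user: str) -> str:
        """Return the model's response text for one prompt."""


class EchoClient(Client):
    """Toy client that echoes the user prompt instead of calling a model."""

    def __init__(self, prefix: str = "echo: "):
        self.prefix = prefix

    def predict(self, system: str, user: str) -> str:
        # A real client would send `system` and `user` to a provider
        # or a local endpoint here and return the generated text.
        return f"{self.prefix}{user}"


client = EchoClient()
print(client.predict("You are helpful.", "ping"))  # prints "echo: ping"
```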

Custom Metrics

You can add metrics by implementing a Metric or by inheriting from one of the five domains (Cost, Latency, Accuracy, Security, Stability) under llm_validation.component.
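
A custom metric that inherits from a domain class might be sketched as follows. The classes below are local stand-ins, not the project's code: the `compute` method, the `records` argument, the `latency_s` field, and the `LatencyMetric` and `MeanLatency` names are all hypothetical and only illustrate the inheritance pattern.

```python
from abc import ABC, abstractmethod
from statistics import mean


class Metric(ABC):
    """Local stand-in for the project's Metric base class (hypothetical interface)."""

    @abstractmethod
    def compute(self, records: list[dict]) -> float:
        """Reduce per-request records to a single score."""


class LatencyMetric(Metric):
    """Stand-in for the Latency domain; concrete metrics inherit from it."""


class MeanLatency(LatencyMetric):
    """Average response time in seconds across all records."""

    def compute(self, records: list[dict]) -> float:
        # Assumption: each record stores its latency under 'latency_s'.
        return mean(r["latency_s"] for r in records)


metric = MeanLatency()
print(metric.compute([{"latency_s": 1.0}, {"latency_s": 3.0}]))  # prints 2.0
```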

Tutorials
