ReLLM

Regular Expressions for Language Model Completions.

Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.

Get exact structure out of any language model completion with regular expressions.

Return specific syntactic structure (e.g. JSON or XML), or specific semantic structure (e.g. a date or a number), or even complete templates (e.g. a sentence with a blank to fill in).

How does it work? ReLLM filters out non-matching tokens before generation. At each step, ReLLM tests every candidate token against a partial regex match; for the candidates that could no longer lead to a match of the pattern, ReLLM masks the logits so that the language model cannot generate them.
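To make that concrete, here is a minimal, unoptimized sketch of the masking step (the allowed_token_mask helper is hypothetical, not ReLLM's actual API), assuming the regex package's partial-match support:

import regex
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def allowed_token_mask(pattern, generated_text, tokenizer, vocab_size):
    # Mark every token whose decoded text, appended to what has been
    # generated so far, is still a partial match of the pattern. The
    # regex package (unlike stdlib re) supports partial matching, i.e.
    # "this prefix could still grow into a full match". A real
    # implementation would cache the decoded vocabulary for speed.
    mask = torch.zeros(vocab_size, dtype=torch.bool)
    for token_id in range(vocab_size):
        candidate = generated_text + tokenizer.decode([token_id])
        if pattern.fullmatch(candidate, partial=True) is not None:
            mask[token_id] = True
    return mask

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
pattern = regex.compile(r"(Yes|No)")

ids = tokenizer("Is this a good demo? ", return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits[0, -1]  # next-token logits

mask = allowed_token_mask(pattern, "", tokenizer, logits.shape[0])
logits[~mask] = float("-inf")  # masked tokens can never be sampled
print(tokenizer.decode([int(torch.argmax(logits))]))  # e.g. "Yes", "No", or a prefix like "Y"

Repeating this at every decoding step, with generated_text accumulating the constrained output, gives the behavior described above.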

If you are looking for a hosted version of ReLLM, check out the Thiggle Regex Completion API at github.com/thiggle/api.

Installation

pip install rellm

The preliminary results are interesting: even for small models, constraining the token space with ReLLM can improve the quality of the completions, and the output becomes much easier to parse programmatically. Take a look at some of the examples below. For an example of parsing a context-free grammar (like JSON) with ReLLM, see r2d4/parserllm.

import regex
from transformers import AutoModelForCausalLM, AutoTokenizer

from rellm import complete_re

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

prompt = "ReLLM, the best way to get structured data out of LLMs, is an acronym for "
# Constrain the completion to four words starting with Re, L, L, and M.
pattern = regex.compile(r'Re[a-z]+ L[a-z]+ L[a-z]+ M[a-z]+')
# complete_re generates token by token, masking any token that would
# break the partial regex match.
output = complete_re(tokenizer=tokenizer,
                     model=model,
                     prompt=prompt,
                     pattern=pattern,
                     do_sample=True,
                     max_new_tokens=80)
print(output)
> Realized Logistic Logistics Model

Examples using GPT2 (124M parameters)

Prompt: ReLLM, the best way to get structured data out of LLMs, is an acronym for

Pattern: Re[a-z]+ L[a-z]+ L[a-z]+ M[a-z]+

ReLLM: Realized Logistic Logistics Model

Without ReLLM: Largest Largest Address Space (MELSP), which has its roots in the Internet network, at least when compared

Prompt: Return the first three letters of the alphabet in a json array:

Pattern: ["[a-z]", "[a-z]", "[a-z]"]

ReLLM: ["a", "b", "c"]

Without ReLLM: { "index": 0, "id":"1", "description":"", "text": "[{ "id": 0, "name":

Prompt: Fill in the sentence with an interesting story about the dentist:

Pattern: Today I'm going to the [a-z]+ to [a-z]+ because ([a-z]+ )*.

ReLLM: Today I'm going to the dentist to see because it is a very important day for me

Without ReLLM: 'My family bought me an appointment with a dentist when I was 15. The dentist gave me one a year and then I was told on

Prompt: Is this a good demo?

Pattern: (Yes|No)

ReLLM: No.

Without ReLLM: I don't know, but this is amazing! Even more amazing is how the design can take place on a small stage that uses LEDs. As

Prompt: Convert the date May 4, 2023 to the format mm/dd/yyyy:

Pattern: [0-9]{2}/[0-9]{2}/[0-9]{4}

ReLLM: 00/00/0045

Without ReLLM: mm:ss

A-Z, Z-A, W-H (0-9:9:19)

Z-R

Prompt: Jeff Dean is a

Pattern: (Programmer|Computer Scientist|AGI)

ReLLM: Computer Scientist

Without ReLLM: former national basketball champion and a former professional basketball player. He currently serves as general counsel for the NCAA Office of the Vice President for Academic Affairs.

Prompt: I can eat

Pattern: [0-9]{1,10} [a-z]* of [a-z]*

ReLLM: 800 calories of coffee

Without ReLLM: iced coffee here on the west side and do this, so can you?"

"Why, I don't understand. What did you mean by


rellm's Issues

How slow/fast is this method of calling generate()?

I noticed that one of the core parts of the strategy is to call generate() one token at a time, but I was wondering how slow/fast this is compared to using constrained beam search or something similar from HF.
I'm also curious what the speedup might be of implementing this in C++ rather than via a Python wrapper. ggerganov/llama.cpp#1773

I actually think your approach is better for my use case, because there are many tweaks you can make even on the grammar sampling (as evidenced by the discussion in the PR above) ... but I am curious what the performance impact is.
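For a rough sense of the overhead, here is a generic timing sketch with transformers (not a definitive benchmark; real numbers depend on hardware and on whether the KV cache is reused across calls):

import time
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
ids = tokenizer("ReLLM is", return_tensors="pt").input_ids

# Baseline: one generate() call producing 20 tokens.
t0 = time.perf_counter()
model.generate(ids, max_new_tokens=20, do_sample=False,
               pad_token_id=tokenizer.eos_token_id)
single = time.perf_counter() - t0

# Token-at-a-time loop: 20 generate() calls of one token each,
# re-feeding the growing sequence every step.
t0 = time.perf_counter()
cur = ids
for _ in range(20):
    cur = model.generate(cur, max_new_tokens=1, do_sample=False,
                         pad_token_id=tokenizer.eos_token_id)
print(f"single call: {single:.2f}s, "
      f"token-at-a-time: {time.perf_counter() - t0:.2f}s")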

GPU inference not working

Hi, thanks for your work.

When I try to do generation on a GPU, I get the following error inside logits_processor.py in transformers (I've tried placing the tensors on CUDA inside compile_re):

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

How do I use rellm on a GPU?
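For context, that is the standard PyTorch error when converting a GPU tensor to NumPy; a minimal reproduction and the usual fix (generic PyTorch, not a confirmed ReLLM patch):

import torch

t = torch.ones(3, device="cuda:0")
# t.numpy() raises: TypeError: can't convert cuda:0 device type tensor
# to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
arr = t.cpu().numpy()  # copy to host memory, then convert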

How to use on GPU?

Very interesting library @r2d4 !

I am trying to use the example in the README, but with the model on the GPU (as is required for many of the recent larger LLMs):

import regex
from transformers import AutoModelForCausalLM, AutoTokenizer

from rellm import complete_re

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

prompt = "ReLLM, the best way to get structured data out of LLMs, is an acronym for "
pattern = regex.compile(r'Re[a-z]+ L[a-z]+ L[a-z]+ M[a-z]+')

# THIS IS WHAT I'D LIKE TO DO
device = "cuda:0"
model.to(device)

output = complete_re(tokenizer=tokenizer, 
                     model=model, 
                     prompt=prompt,
                     pattern=pattern,
                     do_sample=True,
                     max_new_tokens=80)
print(output)

fails with

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)

Is it possible to use ReLLM with the model living on the GPU?
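For reference, plain transformers generation works on the GPU when both the model and the input tensors live on the same device; a library that tokenizes the prompt internally would need to do the same .to(...) on the tensors it creates. A generic sketch (standard transformers usage, not a confirmed ReLLM fix):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda:0" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# The input ids must be moved to the model's device, otherwise
# generate() raises the "found at least two devices" RuntimeError.
inputs = tokenizer("ReLLM is", return_tensors="pt").to(device)
output = model.generate(**inputs, max_new_tokens=20,
                        pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output[0]))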
