LLM Classifier - Instantly classify data with Lamini & Llama 2

Train a new classifier with just a prompt. No data needed -- but add data to boost, if you have it.

from lamini import LaminiClassifier

llm = LaminiClassifier()

prompts={
  "cat": "Cats are generally more independent and aloof than dogs, who are often more social and affectionate. Cats are also more territorial and may be more aggressive when defending their territory.  Cats are self-grooming animals, using their tongues to keep their coats clean and healthy. Cats use body language and vocalizations, such as meowing and purring, to communicate.",
  "dog": "Dogs are more pack-oriented and tend to be more loyal to their human family.  Dogs, on the other hand, often require regular grooming from their owners, including brushing and bathing. Dogs use body language and barking to convey their messages. Dogs are also more responsive to human commands and can be trained to perform a wide range of tasks.",
}

llm.prompt_train(prompts)

llm.save("models/my_model.lamini")

Then, predict!

llm.predict(["meow"])
>> ["cat"]

llm.predict(["meow", "woof"])
>> ["cat", "dog"]

Optionally, add your own data to improve the classifier. For example, suppose the LLM gets a prediction wrong:

llm.predict(["i like milk", "i like bones"])
>> ["dog", "cat"] # wrong!

You can correct the LLM by adding those examples as data, and the classifier will learn from them:

llm = LaminiClassifier()

llm.add_data_to_class("cat", "i like milk.")
llm.add_data_to_class("dog", ["i like bones"]) # list of examples is valid too

llm.prompt_train(prompts)

llm.predict(["i like milk", "i like bones"])
>> ["cat", "dog"] # correct!

If you add data for classes that aren't in your prompts, the classifier will include them as new classes and learn to predict them. However, without a prompt, it has no description to use to further boost those classes, so they rely on your data alone.

General guideline: if you have little or no data for a class, make sure to include a good prompt for it. As with prompt-engineering any LLM, good descriptions (e.g. with details and examples) help the LLM get it right.
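
One way to apply this guideline before training is to check which classes have data but no prompt. The helper below is a hypothetical sketch and not part of the lamini package; it only assumes that `prompts` and `data` are plain dictionaries keyed by class name, like the ones in the examples above.

```python
def classes_missing_prompts(prompts, data):
    """Return class names that have training examples but no prompt description."""
    return sorted(name for name in data if not prompts.get(name, "").strip())


# Hypothetical inputs: "dog" has an example but no description.
prompts = {"cat": "Cats are self-grooming animals that meow and purr."}
data = {"cat": ["i like milk"], "dog": ["i like bones"]}

missing = classes_missing_prompts(prompts, data)
print(missing)
```

Any class this reports will still be learned from its data, but it is a good candidate for a written prompt.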

Run now

./train.sh

We have some default classes. You can specify your own super easily like this:

./train.sh --class "cat: CAT_PROMPT" --class "dog: DOG_PROMPT"

The prompts are descriptions of your classes.
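
If your prompts already live in a Python dictionary, the command line can be assembled programmatically. This is a minimal sketch: `build_train_command` is a hypothetical helper, the `--class` values follow the `name: prompt` form shown above, and the `--train` values follow the `class_name:data` form from the `--help` output. Whether the scripts tolerate the space after the colon is an assumption taken from the example above.

```python
import shlex


def build_train_command(prompts, train_data=None, script="./train.sh"):
    """Build an argument list for train.sh from prompts and optional examples."""
    args = [script]
    for name, prompt in prompts.items():
        args += ["--class", f"{name}: {prompt}"]
    for name, examples in (train_data or {}).items():
        for example in examples:
            args += ["--train", f"{name}:{example}"]
    return args


command = build_train_command(
    {"cat": "Cats meow and purr.", "dog": "Dogs bark and fetch."},
    {"cat": ["i like milk"]},
)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(command))
```

Passing the list form to `subprocess.run(command)` avoids shell quoting issues entirely.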

./classify.sh 'woof'

You can get the probabilities for all the classes, in this case dog (62%) and cat (38%). These can help with gauging uncertainty.

{
 'data': 'woof',
 'prediction': 'dog',
 'probabilities': array([0.37996491, 0.62003509])
}

Here are our cat/dog prompts.

Cat prompt:

Cats are generally more independent and aloof. Cats are also more territorial and may be more aggressive when defending their territory.
Cats are self-grooming animals, using their tongues to keep their coats clean and healthy. Cats use body language and vocalizations,
such as meowing and purring, to communicate.  An example cat is whiskers, who is a cat who lives in a house with a human.
Another example cat is furball, who likes to eat food and sleep.  A famous cat is garfield, who is a cat who likes to eat lasagna.

Dog prompt:

Dogs are social animals that live in groups, called packs, in the wild. They are also highly intelligent and trainable.
Dogs are also known for their loyalty and affection towards their owners. Dogs are also known for their ability to learn and
perform a variety of tasks, such as herding, hunting, and guarding.  An example dog is snoopy, who is the best friend of
charlie brown.  Another example dog is clifford, who is a big red dog.

./classify.sh --data "I like to sharpen my claws on the furniture." --data "I like to roll in the mud." --data "I like to run any play with a ball." --data "I like to sleep under the bed and purr." --data "My owner is charlie brown." --data "Meow, human! I'm famished! Where's my food?" --data "Purr-fect." --data "Hiss! Who dared to wake me from my nap? I'll have my revenge... later." --data "I'm so happy to see you! Can we go for a walk/play fetch/get treats now?" --data "I'm feeling a little ruff today, can you give me a belly rub to make me feel better?"

{'data': 'I like to sharpen my claws on the furniture.',
 'prediction': 'cat',
 'probabilities': array([0.55363432, 0.44636568])}
{'data': 'I like to roll in the mud.',
 'prediction': 'dog',
 'probabilities': array([0.4563782, 0.5436218])}
{'data': 'I like to run any play with a ball.',
 'prediction': 'dog',
 'probabilities': array([0.44391914, 0.55608086])}
{'data': 'I like to sleep under the bed and purr.',
 'prediction': 'cat',
 'probabilities': array([0.51146226, 0.48853774])}
{'data': 'My owner is charlie brown.',
 'prediction': 'dog',
 'probabilities': array([0.40052991, 0.59947009])}
{'data': "Meow, human! I'm famished! Where's my food?",
 'prediction': 'cat',
 'probabilities': array([0.5172964, 0.4827036])}
{'data': 'Purr-fect.',
 'prediction': 'cat',
 'probabilities': array([0.50431873, 0.49568127])}
{'data': "Hiss! Who dared to wake me from my nap? I'll have my revenge... "
         'later.',
 'prediction': 'cat',
 'probabilities': array([0.50088163, 0.49911837])}
{'data': "I'm so happy to see you! Can we go for a walk/play fetch/get treats "
         'now?',
 'prediction': 'dog',
 'probabilities': array([0.42178513, 0.57821487])}
{'data': "I'm feeling a little ruff today, can you give me a belly rub to make "
         'me feel better?',
 'prediction': 'dog',
 'probabilities': array([0.46141002, 0.53858998])}
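
Several of the predictions above are near ties, so it can be useful to flag them for review. The sketch below measures the gap between the two highest probabilities and keeps the results that fall under a cutoff. The helper names and the 0.05 cutoff are assumptions chosen for illustration, and the arrays are written as plain lists.

```python
def confidence_margin(probabilities):
    """Gap between the two highest class probabilities (0 means a tie)."""
    top, runner_up = sorted(probabilities, reverse=True)[:2]
    return top - runner_up


def flag_uncertain(results, min_margin=0.05):
    """Return the results whose top two classes are closer than min_margin."""
    return [r for r in results if confidence_margin(r["probabilities"]) < min_margin]


# Three of the results shown above.
results = [
    {"data": "My owner is charlie brown.", "prediction": "dog",
     "probabilities": [0.40052991, 0.59947009]},
    {"data": "Purr-fect.", "prediction": "cat",
     "probabilities": [0.50431873, 0.49568127]},
    {"data": "I like to sleep under the bed and purr.", "prediction": "cat",
     "probabilities": [0.51146226, 0.48853774]},
]

for result in flag_uncertain(results):
    print(result["data"], round(confidence_margin(result["probabilities"]), 4))
```

Flagged inputs are good candidates to add back as data for the correct class.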

Installation

Clone this repo, and run the train.sh or classify.sh command line tools.

Requires Docker: https://docs.docker.com/get-docker

Set up your Lamini keys (free): https://lamini-ai.github.io/

git clone git@github.com:lamini-ai/llm-classifier.git

cd llm-classifier

Train a new classifier.

./train.sh --help

usage: train.py [-h] [--class CLASS [CLASS ...]] [--train TRAIN [TRAIN ...]] [--save SAVE] [-v]

options:
  -h, --help            show this help message and exit
  --class CLASS [CLASS ...]
                        The classes to use for classification, in the format 'class_name:prompt'.
  --train TRAIN [TRAIN ...]
                        The training data to use for classification, in the format 'class_name:data'.
  --save SAVE           The path to save the model to.
  -v, --verbose         Whether to print verbose output.

Classify your data.

./classify.sh --help

usage: classify.py [-h] [--data DATA [DATA ...]] [--load LOAD] [-v] [classify ...]

positional arguments:
  classify              The data to classify.

options:
  -h, --help            show this help message and exit
  --data DATA [DATA ...]
                        The training data to use for classification, any string.
  --load LOAD           The path to load the model from.
  -v, --verbose         Whether to print verbose output.

These command-line scripts just call Python inside Docker, so you don't have to set up an environment.

If you hate Docker, you can also run this from Python easily...

Python Library

Install it

pip install lamini

Instantiate a classifier

from lamini import LaminiClassifier

# Create a new classifier
classifier = LaminiClassifier()

Define classes using prompts

classes = { "SOME_CLASS" : "SOME_PROMPT" }

classifier.prompt_train(classes)

Or if you have some training examples (optional)

data = ["example 1", "example 2"]
classifier.add_data_to_class("SOME_CLASS", data)

# Don't forget to train after adding data, passing the same prompts as above
classifier.prompt_train(classes)

Classify your data

# Classify the data, passed as a list of strings
list_of_strings = ["some text to classify"]
prediction = classifier.predict(list_of_strings)

# Get the probabilities for each class
probabilities = classifier.predict_proba(list_of_strings)

Save your model

classifier.save("SOME_PATH")

Load your model

classifier = LaminiClassifier.load("SOME_PATH")
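
The `predict` and `predict_proba` calls above can be combined into one record per input. This is a sketch under one assumption: that `predict_proba` returns one row of per-class probabilities for each input string. `classify_with_confidence` and `StubClassifier` are hypothetical names; the stub only stands in for a trained `LaminiClassifier` so the example runs without Lamini keys.

```python
def classify_with_confidence(classifier, texts):
    """Pair each input with its predicted class and its highest class probability."""
    predictions = classifier.predict(texts)
    probabilities = classifier.predict_proba(texts)
    return [
        {"data": text, "prediction": label, "confidence": float(max(row))}
        for text, label, row in zip(texts, predictions, probabilities)
    ]


class StubClassifier:
    """Hypothetical stand-in exposing the same two methods as the classifier."""

    def predict(self, texts):
        return ["dog" if "woof" in text else "cat" for text in texts]

    def predict_proba(self, texts):
        return [[0.38, 0.62] if "woof" in text else [0.55, 0.45] for text in texts]


rows = classify_with_confidence(StubClassifier(), ["meow", "woof"])
for row in rows:
    print(row)
```

With a real model, pass the object returned by `LaminiClassifier.load("SOME_PATH")` in place of the stub.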

FAQ

How does it work?

The LLM classifier converts your prompts into a pile of data, using the Llama 2 LLM. It then finetunes another LLM to distinguish between each pile of data.

We use several specialized LLMs derived from Llama 2 to convert prompts into piles of training examples for each class. The code for this is available in the lamini python package if you want to look at it.
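
The data flow described above (prompt, then a pile of examples per class, then a classifier trained on those piles) can be illustrated with toy stand-ins. This is not Lamini's implementation: here the "generator" just splits the prompt into sentences instead of calling an LLM, and the "classifier" counts shared words instead of finetuning a model. All function names are hypothetical.

```python
from collections import Counter


def generate_examples(prompt):
    """Toy stand-in for the LLM example generator: one example per sentence."""
    return [sentence.strip() for sentence in prompt.split(".") if sentence.strip()]


def bag_of_words(text):
    """Lowercased word counts for a piece of text."""
    return Counter(text.lower().split())


def train(prompts):
    """Build one word-count profile per class from its generated examples."""
    model = {}
    for class_name, prompt in prompts.items():
        profile = Counter()
        for example in generate_examples(prompt):
            profile.update(bag_of_words(example))
        model[class_name] = profile
    return model


def predict(model, text):
    """Pick the class whose profile overlaps most with the input's words."""
    words = bag_of_words(text)
    scores = {name: sum(profile[word] for word in words) for name, profile in model.items()}
    return max(scores, key=scores.get)


toy_prompts = {
    "cat": "Cats meow and purr. Cats groom themselves.",
    "dog": "Dogs bark and fetch. Dogs like walks.",
}
toy_model = train(toy_prompts)
print(predict(toy_model, "i heard a bark"))
print(predict(toy_model, "purr purr"))
```

The real system replaces both stand-ins with LLMs, which is why it can handle inputs that share no words with the prompt.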

Is this perfect?

No, this is a weeknight hackathon project. Give us feedback and we will improve it. Some known issues:

  1. It doesn't use batching aggressively over classes, so training on many classes could be sped up by more than 100x.
  2. We are refining the LLM example generators. Send us any issues you find with your prompts and we can improve these models.

Why wouldn't I just use a normal classifier like BART, XGBoost, BERT, etc.?

With LaminiClassifier, you don't need to label any data. Labeling data sucks.

No fiddling with hyperparameters. Fiddle with prompts instead. Hopefully English is easier than attention_dropout_pcts.

Why wouldn't I just use an LLM directly?

A classifier always outputs a valid class. An LLM might answer the question "Is this talking about a cat?" with "Well... that depends on ....". Writing a parser sucks.

Added benefit: classifiers give you probabilities and can be calibrated: https://machinelearningmastery.com/threshold-moving-for-imbalanced-classification/
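
As a rough illustration of threshold moving, the sketch below only returns a chosen class when its probability clears a cutoff, and otherwise falls back to the best remaining class. `predict_with_threshold` is a hypothetical helper, and the `["cat", "dog"]` column order is inferred from the 'woof' output shown earlier (dog at 62% is the second entry).

```python
def predict_with_threshold(probabilities, class_names, positive_class, threshold=0.5):
    """Return positive_class only if its probability reaches the threshold."""
    by_name = dict(zip(class_names, probabilities))
    if by_name[positive_class] >= threshold:
        return positive_class
    # Otherwise fall back to the most likely of the remaining classes.
    others = {name: p for name, p in by_name.items() if name != positive_class}
    return max(others, key=others.get)


class_names = ["cat", "dog"]  # assumed to match the order of the probabilities
woof = [0.37996491, 0.62003509]  # the 'woof' result shown earlier

default_label = predict_with_threshold(woof, class_names, "dog")
strict_label = predict_with_threshold(woof, class_names, "dog", threshold=0.7)
print(default_label, strict_label)
```

Raising the threshold trades missed dogs for fewer false dog predictions; the right value depends on which mistake costs more in your application.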

Contributors

sharonzhou, greg1232, shreyagupta08, gdiamos, ninazwei90
