lennartpollvogt / ollama-instructor


Instruct and validate structured outputs from LLMs with Ollama.

License: MIT License

Python 100.00%

json-schema llm ollama pydantic local-llm validation instructor prompting json


ollama-instructor's Issues

ValidationError when the input seems fine

I am encountering a ValidationError that seems to come from the fact that the output sometimes also contains the parent class name. For example, if this were my model:

class Answer(BaseModel):
    a: str
    b: str

I'd get this output: {'Answer': {'a': 'A sentence', 'b': "Another sentence"}}
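A possible workaround on the caller's side, sketched under assumptions: if the parsed output is a single-key dict whose key is the model's class name, strip that wrapper before validating. The helper name `unwrap_model_name` is hypothetical and is not part of ollama-instructor.

```python
from typing import Any, Dict, Set


def unwrap_model_name(
    data: Dict[str, Any], model_name: str, field_names: Set[str]
) -> Dict[str, Any]:
    """Return the inner dict if `data` looks like {model_name: {...}}."""
    inner = data.get(model_name)
    is_wrapped = (
        len(data) == 1
        and isinstance(inner, dict)
        and model_name not in field_names  # never unwrap a genuine field
    )
    return inner if is_wrapped else data


# The wrapped output described in this issue
wrapped = {"Answer": {"a": "A sentence", "b": "Another sentence"}}
flat = unwrap_model_name(wrapped, "Answer", {"a", "b"})
print(flat)
```

The flattened dict could then be passed to `Answer.model_validate(flat)`. Output that is already flat is returned unchanged.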

Add reasoning capabilities to ollama-instructor

Allow reasoning when format = ''.
In this case the system prompt will always include an instruction for the LLM to return the JSON in a code block that starts and ends with three backticks.

What's needed?
– Two kinds of system prompts (old and new), plus an additional prompt for when the user supplies their own system prompt but chooses format = ''
– A new method to extract the JSON from the response (code block)
– Include the case of a missing code block in the retry feature
– Maybe an additional error guidance prompt that requests only the failed responses again (if multiple BaseModels were provided; only possible when reasoning capabilities are active [format=''])
– Provide the raw response within the response object (should already be the case)

Why?
– Enhance the quality of the response by allowing the LLM to reason
– Make a chat-like experience possible
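The extraction step from the list above could look roughly like this. It is a minimal sketch, not the library's implementation: the function name `extract_json_block` is hypothetical, and taking the last parseable block assumes the model reasons first and answers at the end.

```python
import json
import re
from typing import Any, Optional

FENCE = "`" * 3  # three backticks, built this way to keep the example readable
CODE_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_json_block(text: str) -> Optional[Any]:
    """Parse the last fenced block in `text` that holds valid JSON.

    Returns None when no block is found or none parses, which a
    retry loop could treat as a failed attempt.
    """
    for candidate in reversed(CODE_BLOCK.findall(text)):
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    return None


raw_response = (
    "Let me think about the two fields first.\n"
    + FENCE + "json\n"
    + '{"a": "A sentence", "b": "Another sentence"}\n'
    + FENCE
)
print(extract_json_block(raw_response))
```

Returning None instead of raising keeps the "code block is missing" case easy to route into the retry feature.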

Add multiple Pydantic BaseModels to ollama-instructor

Make it possible to provide a list of Pydantic BaseModels to chat_completion or chat_completion_with_stream.

How?
Make a request to the Ollama server for each Pydantic BaseModel in the list.
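The one-request-per-model idea can be sketched independently of the client. Everything here is an assumption for illustration: `complete_for_each` and `fake_send_request` are hypothetical names, and plain JSON schema dicts (as returned by `model_json_schema()`) stand in for the BaseModels so the sketch runs without a server.

```python
from typing import Any, Callable, Dict, List

Schema = Dict[str, Any]


def complete_for_each(
    schemas: List[Schema],
    send_request: Callable[[Schema], Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Issue one request per schema and collect results by schema title."""
    results: Dict[str, Dict[str, Any]] = {}
    for schema in schemas:
        results[schema["title"]] = send_request(schema)
    return results


def fake_send_request(schema: Schema) -> Dict[str, Any]:
    """Stand-in for a real call to the Ollama server: returns empty fields."""
    return {name: "" for name in schema["properties"]}


schemas = [
    {"title": "Caption", "properties": {"en": {}, "zh": {}}},
    {"title": "Answer", "properties": {"a": {}, "b": {}}},
]
responses = complete_for_each(schemas, fake_send_request)
print(responses)
```

Keying the results by title makes it straightforward to retry only the models whose responses failed validation.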

Few shot format when use llava to generate Different language caption of image.

Hi, thank you very much for providing such an attractive project.
I am trying to use llava:13b to generate captions in two languages.
This is what I do:

!pip install datasets  ollama-instructor

from datasets import load_dataset
ds = load_dataset("svjack/pokemon-blip-captions-en-zh")
ds = ds["train"]

from ollama_instructor.ollama_instructor_client import OllamaInstructorClient
from pydantic import BaseModel, Field

import base64
from io import BytesIO

def im_to_str(image):
    # Encode a PIL image as a base64 JPEG string for the "images" message field
    buffered = BytesIO()
    image.save(buffered, format="JPEG")
    img_str = base64.b64encode(buffered.getvalue()).decode("utf-8")
    return img_str

class Caption(BaseModel):
    en: str = Field(...,
            description="English caption of image"
        )
    zh: str =  Field(...,
            description="Chinese caption of image"
        )

hist = []
for i in range(8):
    hist.append(
        str(
            {"en": ds[i]["en_text"],
             "zh": ds[i]["zh_text"]}
        )
    )
hist_str = "\n".join(hist)
print(hist_str)

client = OllamaInstructorClient()
response = client.chat_completion_with_stream(
    model='llava:13b',
    pydantic_model=Caption,
    messages=[
            {
                "content": f'''
                You are an image-to-caption transformer.
                Describe the image content in English and Chinese respectively,
                while adhering to the following JSON schema: {Caption.model_json_schema()}
                The following are some sample outputs:
                {hist_str}
                ''',
                "role": "system"
            }
            ,{
                "content": "Describe the image in English and Chinese",
                "role": "user",
                "images": [im_to_str(ds[-1]["image"])]
            }
    ]
)

from IPython.display import clear_output
for chunk in response:
    clear_output(wait = True)
    print(chunk['message']['content'])

The image is

And I get this output:

{'en': 'An angry Pokémon with claws out and eyes wide open, possibly preparing to battle or defend itself.', 'zh': '一只冷战蜘蛛，眼睛大张，可能准备进入战斗或防御状态。'}

This output is not accurate enough.

But when I use it in a zero-shot manner, the output is accurate in meaning but does not match the required style or format. It outputs:

{'en': 'Crab', 'zh': '蟹'}

Can you help me with this? 😊

Also, could you open a Discord channel for this project so that we can improve it together? I'm also interested in the unfinished meeting examples in the examples directory. If they relate to group chat in a meeting, that would be amazing. 😊
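One detail in the few-shot setup above that may be worth checking, offered as a guess rather than a diagnosis: `str()` on a dict produces a Python repr with single quotes, which is not valid JSON, so the samples do not match the JSON format the schema asks for. A minimal sketch of serializing the samples with `json.dumps` instead; the helper name `format_few_shot` is hypothetical, and the sample reuses the zero-shot output from this post as placeholder data.

```python
import json
from typing import Dict, List


def format_few_shot(examples: List[Dict[str, str]]) -> str:
    """Serialize each example as one line of valid JSON.

    ensure_ascii=False keeps the Chinese text readable instead of
    escaping it to \\uXXXX sequences.
    """
    return "\n".join(json.dumps(example, ensure_ascii=False) for example in examples)


samples = [{"en": "Crab", "zh": "蟹"}]  # placeholder taken from the post above
hist_str = format_few_shot(samples)
print(hist_str)
```

Whether this changes the caption quality of llava:13b is untested here; it only makes the samples consistent with the requested output format.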

Add example for Pydantic's create_model for custom models

For applications it could be interesting to let users create their own custom JSON schemas. On-the-fly schemas can be achieved with Pydantic's create_model function.

Possible use cases:

– Image captioning and classification applications where users can specify what they want to extract from the images, in the languages and structure they need
– Text classification for a broader audience with their own preferred structure

Inspired by: https://x.com/nicolaygerold/status/1808945912154320921?s=46&t=OBuCN0UlLka3px_SzJc60Q
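A minimal sketch of what such an example might look like, assuming Pydantic v2. The helper `build_model` and the choice to make every field a required string are illustrative assumptions, not part of ollama-instructor; only `create_model`, `Field`, and `model_json_schema` are Pydantic's own.

```python
from typing import Dict, Type

from pydantic import BaseModel, Field, create_model


def build_model(name: str, fields: Dict[str, str]) -> Type[BaseModel]:
    """Build a model at runtime from {field_name: description}.

    Every field is a required string here; a real application would
    probably let users choose types as well.
    """
    definitions = {
        field_name: (str, Field(..., description=description))
        for field_name, description in fields.items()
    }
    return create_model(name, **definitions)


# A user-defined caption schema, mirroring the Caption model from the issue above
Caption = build_model(
    "Caption",
    {"en": "English caption of image", "zh": "Chinese caption of image"},
)
print(Caption.model_json_schema())
```

The resulting class could then be passed as `pydantic_model=Caption` in the same way as a statically declared model.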
