
Your Automatic Prompt Engineering Assistant for GenAI Applications

Home Page: https://yival.io/

License: Apache License 2.0

Python 92.97% CSS 0.47% JavaScript 1.33% HTML 0.11% Jupyter Notebook 3.41% Dockerfile 1.70%
ai prompt llm ai-experiments ai-toolkit promptengineering aigc generative-ai prompt-engineering fine-tuning

yival's Introduction

YiVal

⚡ Auto Prompting ⚡

👉 Follow us: Twitter | Discord

👉 Sponsored by Discord AIGC community: Discord

License: Apache 2.0 GitHub star chart Open Issues

What is YiVal?

YiVal is a state-of-the-art tool designed to streamline the tuning process for your GenAI app prompts and ANY configs in the loop. With YiVal, manual adjustments are a thing of the past: its data-driven and evaluation-centric approach ensures optimal prompts, precise RAG configurations, and fine-tuned model parameters. Empower your applications to achieve enhanced results, reduced latency, and minimized inference costs, effortlessly, with YiVal!

Problems YiVal is trying to tackle:

  1. Prompt Development Challenge: "I can't create a better prompt. A score of 60 for my current prompt isn't helpful at all🤔."
  2. Fine-tuning Difficulty: "I don't know how to fine-tune; the terminology and numerous fine-tune algorithms are overwhelming😵."
  3. Confidence and Scalability: "I followed tutorials to build agents with LangChain and LlamaIndex, but am I doing it right? Will the bot burn through my money when I launch? Will users like my GenAI app🤯?"
  4. Models and Data Drift: "Models and data keep changing; I worry a well-performing GenAI app now may fail later😰."
  5. Relevant Metrics and Evaluators: "Which metrics and evaluators should I focus on for my use case📊?"

Check out our quickstart guide!

Link to demo

TikTok title autotune

Installation

Docker Runtime

Install Docker and pull our image from DockerHub:

docker pull yival/release:latest

Run our image:

docker run -it yival/release:latest

VS Code with the Docker extension is recommended for running and development. If you are a developer using GPUs with PyTorch, or need JupyterLab for data science:

docker pull yival/release:cu12_torch_jupyter
docker run --gpus all -it -p 8888:8888 yival/release:cu12_torch_jupyter

Prerequisites

  • Python Version: Ensure you have Python 3.10 or later installed.
  • OpenAI API Key: Obtain an API key from OpenAI. Once you have the key, set it as an environment variable named OPENAI_API_KEY.
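Both prerequisites can be sanity-checked with a short script before running YiVal. This is just an illustrative sketch; `check_prerequisites` is a hypothetical helper, not part of YiVal:

```python
import os
import sys


def check_prerequisites() -> list:
    """Return a list of setup problems; an empty list means ready to go."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append(
            f"Python 3.10+ required, found {sys.version.split()[0]}"
        )
    if not os.environ.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY environment variable is not set")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    print("OK" if not issues else "\n".join(issues))
```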

Installation Methods

Using pip (Recommended for Users)

Install the yival package directly using pip:

pip install yival

Development Setup Using Poetry

If you're looking to contribute or set up a development environment:

  1. Install Poetry: If you haven't already, install Poetry.

  2. Clone the Repository, or use CodeSpace:

    2.1 Use Codespaces: the easiest way to get a YiVal environment. Click below to open a GitHub Codespace, then go to the next step.

    Open in GitHub Codespaces

    2.2 Clone the Repository

    git clone https://github.com/YiVal/YiVal.git
    cd YiVal
  3. Set up with Poetry: Initialize the Python virtual environment and install dependencies using Poetry. Make sure to run the command below in the /YiVal directory:

    poetry install --sync

Trying Out YiVal

After setting up, you can quickly get started with YiVal by generating datasets of random tech startup business names.

Steps to Run Your First YiVal Program

  1. Navigate to the yival Directory:

    cd /YiVal/src/yival
  2. Set OpenAI API Key: Replace $YOUR_OPENAI_API_KEY with your actual OpenAI API key.

    On macOS or Linux systems,

    export OPENAI_API_KEY=$YOUR_OPENAI_API_KEY

    On Windows systems (note that setx only takes effect in newly opened terminals),

    setx OPENAI_API_KEY $YOUR_OPENAI_API_KEY
  3. Define YiVal Configuration: Create a configuration file named config_data_generation.yml for automated test dataset generation with the following content:

    description: Generate test data
    dataset:
      data_generators:
        openai_prompt_data_generator:
          chunk_size: 100000
          diversify: true
          model_name: gpt-4
          input_function:
            description: >- # Description of the function
              Given a tech startup business, generate a corresponding landing
              page headline
            name: headline_generation_for_business
            parameters:
              tech_startup_business: str # Parameter name and type
          number_of_examples: 3
          output_csv_path: generated_examples.csv
      source_type: machine_generated
  4. Execute YiVal: Run the following command from within the /YiVal/src/yival directory:

    yival run config_data_generation.yml
  5. Check the Generated Dataset: The generated test dataset will be stored in generated_examples.csv.
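After step 5, you can peek at the generated dataset with plain Python. This is a minimal sketch; the exact columns in generated_examples.csv depend on your config:

```python
import csv
from pathlib import Path


def preview_dataset(path: str, limit: int = 5) -> list:
    """Read up to `limit` rows of a YiVal-generated CSV as dictionaries."""
    rows = []
    with open(path, newline="", encoding="utf-8") as f:
        for i, row in enumerate(csv.DictReader(f)):
            if i >= limit:
                break
            rows.append(dict(row))
    return rows


if __name__ == "__main__":
    csv_path = "generated_examples.csv"
    if Path(csv_path).exists():
        for row in preview_dataset(csv_path):
            print(row)
    else:
        print(f"{csv_path} not found; run the yival command first")
```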

Please refer to YiVal Docs Page for more details about YiVal!

Demo

(Video demo: Translation.Bot.mp4)
Use Case Demo Supported Features Github Link Video Demo Link
🐯 Craft your AI story with ChatGPT and MidJourney Multi-modal support: Design an AI-powered narrative using YiVal's multi-modal support of simultaneous text and images. It supports native and seamless Reinforcement Learning from Human Feedback(RLHF) and Reinforcement Learning from AI Feedback(RLAIF). Please watch the video above for this use case. Open In GitHub Open In Youtube
🌟 Evaluate performance of multiple LLMs with your own Q&A test dataset Conveniently evaluate and compare the performance of your model of choice against 100+ models, thanks to LiteLLM. Analyze model performance benchmarks tailored to your customized test data or use case. Open In GitHub Open In Youtube
🔥 Startup Company Headline Generation Bot Streamline generation of headlines for your startup with automated test data creation, prompt crafting, results evaluation, and performance enhancement via GPT-4. Open In GitHub Open In Youtube
🧳 Build a Customized Travel Guide Bot Leverage automated prompts inspired by the travel community's most popular suggestions, such as those from awesome-chatgpt-prompts. Open In GitHub Open In Youtube
📖 Build a Cheaper Translator: Use GPT-3.5 to teach Llama2 to create a translator with lower inference cost Using Replicate and GPT-3.5's test data, you can fine-tune a Llama2 translation bot. Benefit from 18x savings while experiencing only a 6% performance decrease. Open In GitHub Open In Youtube
🤖️ Chat with Your Favorite Characters - Dantan Ji from Till the End of the Moon Bring your favorite characters to life through automated prompt creation and character script retrieval. Open In GitHub Open In Youtube
🔍 Evaluate guardrails' performance in generating Python (.py) outputs Guardrails: where are my guardrails? 😭 Yival: I am here. ⭐️ The integrated evaluation experiment is carried out on 80 LeetCode problems in CSV form, once with guardrails and once with GPT-4 alone. With guardrails, accuracy drops from 0.625 to 0.55, latency increases by 44%, and cost increases by 140%. Guardrails still has a long way to go from demo to production. Open In GitHub Open In Youtube
🍨Visualize different foods around the world!🍱 Just give the place where the food belongs and the best season to taste it, and you can get a video of the season-specific food!🤩 Open In GitHub Open In Youtube
🎈 News article summary with CoD By integrating the "Chain of Density" method, evaluate the enhancer's ability in text summarization. 🎆 Using 3 articles generated by GPT-4 for evaluation, the coherence score increased by 20.03%, the attribution score increased by 25.18%, and average token usage dropped from 2054.6 to 1473.4 (-28.3%) 🚀. Open In GitHub Open In Youtube
🥐 Automated TikTok Title Generation Bot With only two input lines, you can easily create concise and polished TikTok video titles based on your desired target audience and video content summaries. This is powered by our auto-prompt feature: the process is automated, so you can input your requirements and enjoy the results hassle-free! Open In GitHub Open In Youtube

Contribution Guidelines

If you want to contribute to YiVal, be sure to review the contribution guidelines. We use GitHub issues for tracking requests and bugs. Please join YiVal's Discord channel for general questions and discussion. Join our collaborative community, where your unique expertise as researchers and software engineers is highly valued! Contribute to our project and be part of an innovative space where every line of code and research insight actively fuels advancements in technology, fostering a future that is intelligently connected and universally accessible.

Contributors


🌟 YiVal welcomes your contributions! 🌟

🥳 Thanks so much to all of our amazing contributors 🥳

Paper / Algorithm Implementation

Paper Author Topics YiVal Contributor Data Generator Variation Generator Evaluator Selector Enhancer Config
Large Language Models Are Human-Level Prompt Engineers Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han YiVal Evolver, Auto-Prompting OpenAIPromptDataGenerator OpenAIPromptVariationGenerator OpenAIPromptEvaluator, OpenAIEloEvaluator AHPSelector OpenAIPromptBasedCombinationEnhancer config
BERTScore: Evaluating Text Generation with BERT Tianyi Zhang, Varsha Kishore, Felix Wu YiVal Evaluator, bertscore, rouge @crazycth - - BertScoreEvaluator - - -
AlpacaEval Xuechen Li, Tianyi Zhang, Yann Dubois et al. YiVal Evaluator - - AlpacaEvalEvaluator - - config
Chain of Density Griffin Adams, Alexander R. Fabbri et al. Prompt Engineering - ChainOfDensityGenerator - - - config
Large Language Models as Optimizers Chengrun Yang, Xuezhi Wang et al. Prompt Engineering @crazycth - - - - optimize_by_prompt_enhancer config
LoRA: Low-Rank Adaptation of Large Language Models Edward J. Hu, Yelong Shen et al. LLM Finetune @crazycth - - - - sft_trainer config

yival's People

Contributors

bang0518, big-lele, crazycth, descpool, djvim, hhrhai, hothubby, kmigdol, kokeblom, larrywo, libresse, lovelinks, myfls, niaiji, oliverfeng, risakokudo, sanjole, smart-bear, theadrianbao, uni-zhuan, unscriptedguy, venchj, yanqd0, yiyunke, yueyingcoding, yujun-zou, zehongsong, zenowzh, zetianluo, zhychina


yival's Issues

Clear instruction to use Azure OpenAI LLM

Is your feature request related to a problem? Please describe.
I can't find how to use YiVal with an Azure OpenAI LLM in the docs or on GitHub.

Describe the solution you'd like
A clear example for using Azure OpenAI LLM

Describe alternatives you've considered
I've seen a page in the docs, but the examples there are for using a local model.

Hosted Yival ?

Is your feature request related to a problem? Please describe.
I've been using YiVal for a couple of days now; are you planning on building a hosted product?

If yes, I'd love to help out (I'm the maintainer of LiteLLM, https://github.com/BerriAI/litellm).

Comparing Unique Features and Competitive Advantages of YiVal and MetaGPT

Hello YiVal Team,

Firstly, congratulations on your fantastic work on the YiVal project. It is clear that this unique GenAI-Ops framework has been carefully designed with quality and utility in mind.

I have been utilizing your framework and appreciate the ability it offers to iteratively tune the Generative AI model metadata, parameters, prompts, and retrieval configurations. It's impressive that the users are allowed not only to select their test dataset generation and evaluation algorithms but also to choose the improvement strategies. This flexibility truly differentiates your work.

I've also been following the MetaGPT project, a multi-agent framework that empowers a GPT to operate within a software company, encouraging collaboration on more complex tasks. MetaGPT is particularly notable for its approach to orchestrate GPTs to carry out distinct roles within a software entity, transforming a one-line requirement into an extensive set of user stories, competitive analyses, requirements, APIs, and even data structures. It presents these outputs in its unique way of having GPTs fulfill roles equivalent to product managers, project managers, architects, and engineers. This recapitulation of a software company's operations and processes within the framework is interesting.

Given your familiarity with both YiVal and MetaGPT, I am curious to understand the prominent distinguishing features, key strengths, niche audience, or potential use cases, that YiVal offers relative to MetaGPT. What are the fundamental competitive advantages of YiVal over MetaGPT?

Looking forward to the clarifications and insights you can provide. Thank you for taking the time to address my query.

Best Regards,

Support Train Test Data split in Data Generation


In the current user flow, if we only use dataset generation, combinations, a custom function, and evaluation, it is fine to treat all data from dataset generation as test data and input for evaluation, e.g. the headline example.

If we take improvement into consideration, however, we cannot use 100% of the generated data for both evaluation and improvement. That would be like using the same data as both training data and testing data.

The suggested modification is to include a train/test split in the dataset generation. If not specified, the data is treated as 100% test data; otherwise, follow the config specification.
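The suggested modification could be prototyped as a helper like this. This is an illustrative sketch only; neither the function nor a `test_ratio` config field exists in YiVal today:

```python
import random


def train_test_split(rows: list, test_ratio: float = 1.0, seed: int = 42):
    """Split generated examples into (train, test).

    The default test_ratio of 1.0 reproduces the current behavior:
    all generated data is treated as test data.
    """
    if not 0.0 <= test_ratio <= 1.0:
        raise ValueError("test_ratio must be between 0 and 1")
    rng = random.Random(seed)  # deterministic split for reproducibility
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]
```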

Comparative Analysis of YiVal and DevOpsGPT: Unique Selling Points and Competitive Edges

Hello Contributors and Community,

I recently found that there are two very interesting projects, DevOpsGPT and YiVal, both of which are based on AI and specifically large language models, but seemingly aiming at two different aspects in the AI-ML deployment process. I wanted to open a discussion to understand the unique competitive advantages and features that YiVal holds in comparison to DevOpsGPT.

At the outset, let me provide a brief understanding of the two projects:

  • DevOpsGPT: An AI-Driven Software Development Automation Solution that combines Large Language Models with DevOps tools to convert natural language requirements into working software, thereby enhancing development efficiency and reducing communication costs. It generates code, performs validation, and can analyze existing project information and tasks.

  • YiVal: A GenAI-Ops framework aimed at iterative tuning of Generative AI model metadata, params, prompts, and retrieval configurations, driven by test dataset generation, evaluation algorithms, and improvement strategies. It streamlines prompt development, supports multimedia and multi-model input, and offers automated prompt generation and prompt-related artifact configuration.

Looking at both of these, it seems they provide unique features to cater to different needs in the AI development and deployment pipeline. However, I'm curious to further understand the unique selling points and specific competitive advantages of YiVal.

Here are a few questions that might be worth discussing:

  1. DevOpsGPT seems to convert natural language requirements into working software while YiVal seems focused on fine-tuning Generative AI with test dataset generation and improvement strategies. In what ways does YiVal outperform DevOpsGPT in facilitating a more robust and efficient machine learning model iteration and training process?

  2. One of the highlighted features of YiVal is its focus on Human(RLHF) and algorithm-based improvers along with the inclusion of a detailed web view. Can you provide a bit more insight into how these features are leveraged in YiVal and how they compare to DevOpsGPT's project analysis and code generation features?

  3. DevOpsGPT offers a feature to analyze existing projects and tasks, whereas YiVal emphasizes streamlining prompt development and multimedia/multimodel input. How does YiVal handle integration with existing models and datasets? Is there any scope for reverse-engineering or retraining established models with YiVal?

  4. In terms of infrastructure, how does YiVal compare to DevOpsGPT? Do they need similar resources for deployment and operation, or does one offer more efficiency?

  5. Lastly, how is the user experience on YiVal compared to DevOpsGPT? I see YiVal boasts a "non-code" experience for building Gen-AI applications, but how does this hold up against DevOpsGPT's efficient and understandable automated development process?

I'd appreciate any insights or thoughts on these points. Looking forward to stimulating discussions!

Documentation

  1. Move test to tests/ folder
  2. Setup Issue category and permissions
  3. Improve the contributing guide, add a section in the README.md
  4. Fix markdown lint

Resume an experiment

Is your feature request related to a problem? Please describe.
If an experiment stops for some reason (a network issue, etc., that causes YiVal to terminate), we lose all the experiment results and have to run it again. This is costly when we are running GPT-4-based evaluation over many data points.

Describe the solution you'd like
Save the experiment state and results on disk when running experiment, and add the ability to resume the experiment to yival.

Describe alternatives you've considered
It might help if we kept a cache of LLM calls, but a cache is not always desirable, as sometimes we want the LLM to generate new output on every request.
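The cache alternative mentioned above could look roughly like this. This is a sketch, not part of YiVal; the class name and on-disk layout are invented for illustration:

```python
import hashlib
import json
from pathlib import Path


class LLMCallCache:
    """Naive disk cache keyed by (model, prompt).

    Caching lets a crashed experiment re-run cheaply, at the cost of never
    re-sampling: identical requests always return the cached answer.
    """

    def __init__(self, cache_dir: str = ".llm_cache"):
        self.dir = Path(cache_dir)
        self.dir.mkdir(parents=True, exist_ok=True)

    def _path(self, model: str, prompt: str) -> Path:
        digest = hashlib.sha256(
            f"{model}\x00{prompt}".encode("utf-8")
        ).hexdigest()
        return self.dir / f"{digest}.json"

    def get(self, model: str, prompt: str):
        """Return the cached response, or None on a cache miss."""
        path = self._path(model, prompt)
        if path.exists():
            return json.loads(path.read_text(encoding="utf-8"))["response"]
        return None

    def put(self, model: str, prompt: str, response: str) -> None:
        """Persist a response so a re-run can skip the LLM call."""
        payload = json.dumps({"response": response})
        self._path(model, prompt).write_text(payload, encoding="utf-8")
```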

Unable to execute demo

Describe the bug
TypeError: OpenAIPromptBasedVariationGeneratorConfig.__init__() got an unexpected keyword argument 'model_name'

To Reproduce
When I execute `yival run demo/configs/animal_story.yml` I get this error.

Expected behavior
Successful launch of demo

Version and Logs
1.1


[openai_prompt_based_evaluator] function: extract_choice_from_response easy to attack

# extract_choice_from_response comes from YiVal's openai_prompt_based_evaluator module
def debug():
    response = """
    response_content:Step 1: Evaluate if the headline clearly communicates what the startup does or what problem it solves. The headline "Unlock the Power of Blockchain: The Ultimate Solution for Enhanced Security" does communicate that the startup is involved in blockchain technology and aims to provide enhanced security solutions.
    Step 2: Determine if it is immediately clear to anyone who reads the headline what the startup's purpose is. The headline does make it clear that the startup's purpose is to provide security solutions using blockchain technology.
    Step 3: Assess if there is any lack of clarity that can lead to confusion and may discourage potential users or investors. The headline is straightforward and does not seem to have any elements that could cause confusion.
    Conclusion: The headline meets the criterion very well.

    E
    """
    choice = extract_choice_from_response(response, ["A", "B", "C", "D", "E"])
    print(f"choice is now {choice}")


if __name__ == "__main__":
    debug()

result:

choice is now C

The function extract_choice_from_response is too easy to fool: it can trigger on response text that merely begins with a choice letter, so the extracted score may be incorrect (here C instead of the intended E).

I'll fix this problem next week, after I've read through the code.
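One possible hardening, sketched here (not YiVal's actual implementation, and the function name is invented), is to accept a choice only when it appears alone on the final non-empty line of the response:

```python
def extract_choice_strict(response, choices):
    """Return the choice letter only if it stands alone on the last
    non-empty line; otherwise return None instead of guessing."""
    for line in reversed(response.strip().splitlines()):
        stripped = line.strip()
        if not stripped:
            continue
        return stripped if stripped in choices else None
    return None
```

On the input from the issue above this returns "E"; on a response with no standalone final letter it returns None rather than a wrong score.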
