Comments (6)
DSPy has a small default for max tokens. Override it to get a longer response:
```python
lm = dspy.OpenAI(
    [...]
    max_tokens=4096,
)
```
from dspy.
Thanks @drawal1. We have that as a planned tutorial in the README (see section 4, within Advanced Demos).
It's pretty simple actually, but I agree with you that having a concrete example will go a long way. We'll see who on our end has cycles to do this quickly.
Btw I've been thinking of long-form answers for factoid questions, like "David Gregory inherited Kinnairdy castle. Kinnairdy castle has 5 floors. Therefore, the castle that David Gregory inherited has 5 floors."
You mention "how to" questions though. If you think that will look different from what I have in mind, do you have one full example question + answer?
Here is a how-to question from my specific domain with the exact answer and the truncated answer from DSPy:
Question: How do I upgrade my RadiantOne Platform?
DSPy truncated answer:
To upgrade your RadiantOne platform, please follow these steps:
- Determine your current version: Check which version of RadiantOne you are currently using. This information will help you determine the appropriate upgrade path.
- Access the upgrade guide: You can find the upgrade guide for upgrading from version 7.2 to 7.4 at this link: [RadiantOne Upgrade Guide v7.2 to v7.4](https://support.radiantlogic.com/hc/en-us/articles/790844406261
Exact answer:
To upgrade your RadiantOne platform, please follow these steps:
1. Determine your current version: Check which version of RadiantOne you are currently using. This information will help you determine the appropriate upgrade path.
2. Access the upgrade guide: You can find the upgrade guide for upgrading from version 7.2 to 7.4 at this link: RadiantOne Upgrade Guide v7.2 to v7.4. This guide provides step-by-step instructions on how to perform the upgrade.
4. Contact your Radiant Logic Account Representative: If you are using version 7.2 or 7.3 and require upgrade assistance, please reach out to your Radiant Logic Account Representative. They will be able to provide guidance and support for your specific upgrade needs.
Remember to always backup your data before proceeding with any upgrades to ensure the safety of your information.
If you have any further questions or need additional assistance, feel free to ask.
Examples of an upgrade are moving from v7.2 to v7.3, v7.2 to v7.4, or v7.3 to v7.4. The upgrade process is not applied with an update installer and may require a professional services engagement to assess the current usage/configuration of RadiantOne to determine an appropriate upgrade strategy and time requirement. All customers currently running v7.2 and v7.3 where ICS/Global Sync is deployed should contact their Radiant Logic Account Representative to have their configuration reviewed and the scope/complexity of their upgrade path assessed. If a customer has only RadiantOne FID deployed (no ICS/Global Sync) and can upgrade their own deployment, they can leverage the upgrade guides available in the Radiant Logic Knowledge Base: [Radiant Logic Knowledge Base](<https://support.radiantlogic.com>).
Here is a general QA example using ChatGPT (3.5). The challenge is how to replicate this kind of answer quality for our own documents.
Question: How do you build and train a deep neural network for playing video game pong from scratch?
Answer:
Building and training a deep neural network to play the video game Pong from scratch is a great way to learn about reinforcement learning and deep reinforcement learning. Pong is a simple but suitable environment for this task. Here's a step-by-step guide on how to do it using Python and popular libraries like TensorFlow and OpenAI Gym:
Environment Setup:
Install Python and necessary libraries like TensorFlow, NumPy, and Gym.
Install Gym with Atari support: pip install gym[atari].
Create the Environment:
Import the necessary libraries.
Create the Pong environment using Gym:
```python
import gym

env = gym.make('Pong-v0')
```
Preprocessing:
Preprocess the game frames to reduce the complexity of the input data. This usually involves resizing, converting to grayscale, and normalizing pixel values.
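As a concrete illustration of this step, here is a minimal preprocessing sketch. The crop offsets and the 80x80 target size are common choices for Atari Pong, not something specified in this thread:

```python
import numpy as np

def preprocess(frame):
    """Reduce a raw 210x160x3 Atari frame to a small, normalized grayscale image."""
    gray = frame.mean(axis=2)              # convert RGB to grayscale
    cropped = gray[34:194]                 # drop the score bar and bottom border (Pong-specific)
    small = cropped[::2, ::2]              # downsample 160x160 -> 80x80
    return (small / 255.0).astype(np.float32)  # normalize pixel values to [0, 1]
```

Note that the network sketched below expects 84x84x4 input; in practice you would resize to the shape your model uses and stack the last few preprocessed frames as channels so the agent can see motion.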
Neural Network Architecture:
Design the neural network architecture. Convolutional Neural Networks (CNNs) are commonly used for processing game frames.
Create a model that maps game frames to actions (move paddle up, down, or stay still).
```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (8, 8), strides=(4, 4), activation='relu', input_shape=(84, 84, 4)),
    tf.keras.layers.Conv2D(64, (4, 4), strides=(2, 2), activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(2, activation='softmax')
])
```
Define the Training Loop:
Define a training loop that uses reinforcement learning algorithms like Proximal Policy Optimization (PPO) or Deep Q-Networks (DQN). Here's a simplified example using PPO:
```python
from stable_baselines3 import PPO

model = PPO("CnnPolicy", env, verbose=1)
model.learn(total_timesteps=100000)
```
Training:
Train the neural network by interacting with the environment. The agent collects experience, updates the policy, and improves its performance over time.
Evaluation:
Evaluate the trained model by running it in the environment and measuring its performance.
Track metrics such as the average reward, win rate, and other relevant statistics.
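The evaluation step can be sketched as a plain rollout loop. Here `env` and `policy` are illustrative stand-ins for any Gym-style environment and any function mapping observations to actions; this uses the classic 4-tuple `step()` API (newer Gymnasium versions return a 5-tuple):

```python
def evaluate(env, policy, episodes=10):
    """Run the policy for several episodes and return the average total reward."""
    totals = []
    for _ in range(episodes):
        obs = env.reset()
        done, total = False, 0.0
        while not done:
            action = policy(obs)                        # choose an action from the observation
            obs, reward, done, info = env.step(action)  # older Gym 4-tuple step API
            total += reward
        totals.append(total)
    return sum(totals) / len(totals)
```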
Hyperparameter Tuning:
Experiment with different hyperparameters (e.g., learning rate, network architecture, training duration) to optimize performance.
Save and Load Model:
Save the trained model weights to disk so that you can use them later for inference without retraining.
```python
model.save("pong_model")
```
You can load the model later with `PPO.load("pong_model")`.
Deployment:
If desired, integrate the trained model into a game-playing application or use it to control a virtual game of Pong.
Iterate and Refine:
Iterate on the training process, making improvements based on evaluation results and insights.
Experiment with different algorithms and architectures for better performance.
Monitoring and Maintenance:
Continuously monitor the model's performance and behavior.
Revisit the model to retrain it with new data or improvements as needed.
Please note that training a deep reinforcement learning agent for Pong from scratch can be computationally intensive and time-consuming. Consider using GPU acceleration and distributed training if available. Additionally, it's helpful to have a solid understanding of reinforcement learning concepts and algorithms before embarking on this project.
@detaos that worked!
I will now test scenarios where the long-form answer is assembled from multiple short answers gathered from different contexts. I am not sure they fit within the DSP paradigm.
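One way to frame the multi-context case, independently of DSPy itself: answer the question once per retrieved context, then synthesize the short answers into a single long-form response. `answer_fn` and `synthesize_fn` below are hypothetical stand-ins for whatever modules produce the sub-answers and the final answer (e.g. two separate LM calls):

```python
def assemble_long_answer(question, contexts, answer_fn, synthesize_fn):
    """Gather one short answer per context, then merge them into a single response."""
    short_answers = [answer_fn(question, ctx) for ctx in contexts]
    # Hand all sub-answers to a final synthesis step (e.g. one more LM call).
    return synthesize_fn(question, short_answers)
```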
I guess the problem comes as a tradeoff with the long prompts that DSPy compiles. I switched to a 16k model to deal with this.