Comments (3)
Hi @Maxlinn, I'm not a team member, but I'll try to answer.
- These are "prompt engineering" techniques. They help the model (GPT-3.5 in this case) understand the format and the instruction: what do we want it to do?
`${a simple question for seeking the missing information}`, for example, looks like a placeholder with a description. The model sees this and "thinks" (following the pattern, not actually thinking): "OK, there should be a question here, based on the information I have and the information I'm missing." Thus, when we end the prompt with "Question: ", the model knows what we expect from it.
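A minimal sketch of that placeholder pattern (plain Python, not dspy's actual template code; the field names and example text are invented):

```python
# The template text, including the "${...}" descriptions, is what the
# model sees in few-shot demonstrations. At inference time the prompt
# simply stops at the field we want the model to produce.

TEMPLATE = """\
Context: ${sources that may contain the relevant facts}
Question: ${a simple question for seeking the missing information}"""

def build_inference_prompt(context: str) -> str:
    """End the prompt at 'Question:' so the model fills in that field."""
    return f"Context: {context}\nQuestion:"

prompt = build_inference_prompt("Paris is the capital of France.")
# The trailing "Question:" cues the model to generate a question next.
```

The point is only the cueing mechanism: the model has seen the labeled pattern, so a prompt ending at the label invites a completion of that field.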
---
- IMO these are just formatting splitters; they help the model recognize the boundaries between blocks and improve visual structure. I believe quality won't degrade if we remove them: the textual instructions are the most important part of the prompt.
---
- I'm not sure where these questions came from, but you can find very similar ones in the HotPotQA dataset. You can easily get hundreds of similar knowledge-based questions there.
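The splitter idea can be sketched like this (a toy assembly with invented demos; this is not dspy's internal code):

```python
# Few-shot demos joined with "---" lines, so each question/answer block
# is visually separated and the model can see where one example ends.
demos = [
    "Question: Who wrote Hamlet?\nAnswer: William Shakespeare",
    "Question: What is the capital of France?\nAnswer: Paris",
]

# One separator between the demos, one before the field to complete.
prompt = "\n---\n".join(demos) + "\n---\nQuestion:"
```

Any consistent delimiter would likely serve the same purpose; `---` just happens to be a familiar one.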
Hi @stalkermustang, many thanks for your careful explanation :)
Yes, I know the semantic purpose of each component of the prompt; I'm just wondering why it uses the `$` symbol (instead of something like `#`, `&`, `%`, or anything else) and why it uses braces as delimiters. It makes the prompt read more like a programming language than natural language.
For now I think the exact format is not of great importance: the attention mechanism will identify the useful parts of the prompt and ignore the others, and slight changes of symbols should not greatly affect the results.
As for the training question-answer pairs, I think they can be (semantically) searched from a huge QA dataset (like the HotPotQA you mentioned).
Thanks again for your response; I'd like to mark this issue as closed!
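The semantic-search idea can be sketched as follows (bag-of-words cosine similarity stands in for a real embedding model, and the QA pairs are invented, not from HotPotQA):

```python
from collections import Counter
from math import sqrt

# Toy QA corpus; in practice this would be thousands of dataset pairs.
corpus = [
    ("Who directed Jaws?", "Steven Spielberg"),
    ("What year did the Berlin Wall fall?", "1989"),
    ("Who wrote Pride and Prejudice?", "Jane Austen"),
]

def vectorize(text):
    """Crude term-count vector; a sentence embedding would go here."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def most_similar(query, pairs):
    """Return the QA pair whose question is closest to the query."""
    qv = vectorize(query)
    return max(pairs, key=lambda p: cosine(qv, vectorize(p[0])))

q, a = most_similar("Who directed the film Jaws?", corpus)
```

Swapping the term-count vectors for dense embeddings turns this into the usual retrieval-of-similar-examples setup.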
> the layout of references in the Context field also seems unfamiliar.

Do you mean the "context" section? The rows there are just numbers, titles (of wiki pages), and the associated wiki chunks (not whole pages, just parts). The model has access to this information during generation and can look up and copy from these passages, copying factual information such as names or dates.
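A rough sketch of such a numbered context section (the layout and passages here are invented, not dspy's exact format):

```python
# Each row: an index, a wiki page title, and a text chunk (not the full page).
passages = [
    ("Eiffel Tower", "The Eiffel Tower is a wrought-iron lattice tower in Paris."),
    ("Gustave Eiffel", "Gustave Eiffel's company designed and built the tower."),
]

def format_context(passages):
    """Render numbered title/chunk rows the model can copy facts from."""
    rows = [
        f"[{i}] {title} | {chunk}"
        for i, (title, chunk) in enumerate(passages, start=1)
    ]
    return "Context:\n" + "\n".join(rows)

print(format_context(passages))
```

The numbering lets the model (or a later verification step) refer back to a specific passage when citing a name or date.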