Comments (3)
Hi @Maxlinn, I'm not a team member, but I'll try to answer.
- These are "prompt engineering" techniques. They help the model (GPT-3.5 in this case) understand the format and the instruction: what do we want it to do?
`${a simple question for seeking the missing information}`, for example, looks like a placeholder with a description. The model sees this and "thinks" (following the pattern, not actually thinking): "OK, there should be a question here, based on the information I have and the information I'm missing." Thus, when we end the prompt with "Question: ", the model knows what we expect from it.
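A minimal sketch of that placeholder pattern (plain Python, not dspy's actual template code; the field names and example text are invented):

```python
# The template text, including the "${...}" descriptions, is what the
# model sees in few-shot demonstrations. At inference time the prompt
# simply stops at the field we want the model to produce.

TEMPLATE = """\
Context: ${sources that may contain the relevant facts}
Question: ${a simple question for seeking the missing information}"""

def build_inference_prompt(context: str) -> str:
    """End the prompt at 'Question:' so the model fills in that field."""
    return f"Context: {context}\nQuestion:"

prompt = build_inference_prompt("Paris is the capital of France.")
# The trailing "Question:" cues the model to generate a question next.
```

The point is only the cueing mechanism: the model has seen the labeled pattern, so a prompt ending at the label invites a completion of that field.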
---
- IMO these are just formatting splitters; they help the model recognize the boundaries between blocks and improve visual structure. I believe quality won't degrade if we remove them: the textual instructions are the most important part of the prompt.
---
- I'm not sure where these questions came from, but you can find very similar ones in the HotPotQA dataset. You can easily get hundreds of similar knowledge-based questions there.
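The splitter idea can be sketched like this (a toy assembly with invented demos; this is not dspy's internal code):

```python
# Few-shot demos joined with "---" lines, so each question/answer block
# is visually separated and the model can see where one example ends.
demos = [
    "Question: Who wrote Hamlet?\nAnswer: William Shakespeare",
    "Question: What is the capital of France?\nAnswer: Paris",
]

# One separator between the demos, one before the field to complete.
prompt = "\n---\n".join(demos) + "\n---\nQuestion:"
```

Any consistent delimiter would likely serve the same purpose; `---` just happens to be a familiar one.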
Hi @stalkermustang, many thanks for your careful explanation :)
Yes, I know the semantic purpose of each component of the prompt; I'm just wondering why it uses the `$` symbol (instead of something like `#`, `&`, `%`, or anything else) and why it uses braces as delimiters. It makes the prompt read more like a programming language than natural language.
For now I think the exact format is not of great importance: the attention mechanism will identify the useful parts of the prompt and ignore the others, and slight changes of symbols should not greatly affect the results.
As for the training question-answer pairs, I think they can be (semantically) searched from a huge QA dataset (like the HotPotQA you mentioned).
Thanks again for your response; I'd like to mark this issue as closed!
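The semantic-search idea can be sketched as follows (bag-of-words cosine similarity stands in for a real embedding model, and the QA pairs are invented, not from HotPotQA):

```python
from collections import Counter
from math import sqrt

# Toy QA corpus; in practice this would be thousands of dataset pairs.
corpus = [
    ("Who directed Jaws?", "Steven Spielberg"),
    ("What year did the Berlin Wall fall?", "1989"),
    ("Who wrote Pride and Prejudice?", "Jane Austen"),
]

def vectorize(text):
    """Crude term-count vector; a sentence embedding would go here."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def most_similar(query, pairs):
    """Return the QA pair whose question is closest to the query."""
    qv = vectorize(query)
    return max(pairs, key=lambda p: cosine(qv, vectorize(p[0])))

q, a = most_similar("Who directed the film Jaws?", corpus)
```

Swapping the term-count vectors for dense embeddings turns this into the usual retrieval-of-similar-examples setup.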
> the layout of references in the Context field also seems unfamiliar.

Do you mean the "context" section? The rows there are just numbers, titles (of wiki pages), and the associated wiki chunks (not whole pages, just parts). The model has access to this information during generation and can look up and copy from these passages, copying factual information such as names or dates.
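A rough sketch of such a numbered context section (the layout and passages here are invented, not dspy's exact format):

```python
# Each row: an index, a wiki page title, and a text chunk (not the full page).
passages = [
    ("Eiffel Tower", "The Eiffel Tower is a wrought-iron lattice tower in Paris."),
    ("Gustave Eiffel", "Gustave Eiffel's company designed and built the tower."),
]

def format_context(passages):
    """Render numbered title/chunk rows the model can copy facts from."""
    rows = [
        f"[{i}] {title} | {chunk}"
        for i, (title, chunk) in enumerate(passages, start=1)
    ]
    return "Context:\n" + "\n".join(rows)

print(format_context(passages))
```

The numbering lets the model (or a later verification step) refer back to a specific passage when citing a name or date.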