Comments (12)
I haven't tried with llama.cpp directly, but we do not change the EOS token. We do, however, use left padding and change the pad token, so maybe llama.cpp relies on that? See https://github.com/UnderstandLingBV/LLaMa2lang/blob/main/finetune_llama.py#L49
Alpaca does exactly the same, so it should work. We also do not stray from the standard LLaMa2 instruct prompt syntax - you can see a few examples by exploring one of our instruct datasets: https://huggingface.co/datasets/UnderstandLing/oasst1_es_threads?row=0
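For reference, a minimal sketch of what that padding setup typically looks like with a Hugging Face tokenizer (the model id and the choice of pad token below are illustrative, not necessarily what finetune_llama.py uses):

from transformers import AutoTokenizer

# Illustrative base model; substitute whichever model you are fine-tuning.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

# LLaMa2 ships without a pad token, so set one explicitly (here the unk token,
# purely as an example) and pad on the left so that, in a batch, generation
# starts right after each prompt instead of after a run of pad tokens.
tokenizer.pad_token = tokenizer.unk_token
tokenizer.padding_side = "left"

batch = tokenizer(["Hello", "A somewhat longer prompt"], padding=True, return_tensors="pt")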
As for the error levels: they differ per language, but we always train for 2 epochs, which has proven to be a generic optimum (3 or more deteriorates the model and 1 is too few).
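(For context, with the Hugging Face Trainer that simply means pinning num_train_epochs; everything else below is a placeholder, not the repo's actual configuration.)

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama2-finetuned",   # placeholder output path
    num_train_epochs=2,              # the generic optimum mentioned above
    per_device_train_batch_size=4,   # placeholder
    learning_rate=2e-4,              # placeholder
    logging_steps=10,
)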
@ErikTromp thanks for answering. I tried again in Colab and after training the model doesn't seem to emit the </s> token; instead it keeps producing [/INST] ... [/INST]
Did you get a different result after training?
Oh, BTW, I'm using unsloth for training.
Yeah, I only tried a few of the largest languages (NL, ES, DE), but here's some output from my notebook before I turned it into the .py script in the repo. Maybe it helps; if not, let me know so we can follow up:
model.eval()

# Dutch prompt: "You are a generic chatbot that always answers in Dutch. What is the capital of the Netherlands?"
input_text = "<s>[INST] <<SYS>> Je bent een generieke chatbot die altijd in het Nederlands antwoord geeft. <</SYS>> Wat is de hoofdstad van Nederland? [/INST]"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

output_sequences = model.generate(
    input_ids=inputs['input_ids'],
    max_length=200,
    repetition_penalty=1.2
)

generated_text = tokenizer.decode(output_sequences[0], skip_special_tokens=True)
print(generated_text)
# Output:
# [INST] <<SYS>> Je bent een generieke chatbot die altijd in het Nederlands antwoord geeft. <</SYS>> Wat is de hoofdstad van Nederland? [/INST] Amsterdam
I haven't had time yet to properly look into unsloth, but I did use Axolotl on other occasions. That one (perhaps unsloth too?) uses a bunch of custom tricks and non-standard merging and layer-alteration techniques (for example with flash attention) that actually modify the resulting model. If you then do not load the model back in with the same alterations, you will get random outcomes.
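To illustrate that last point, here is a rough sketch of loading a LoRA/PEFT adapter back onto the exact base model it was trained from (paths are placeholders, and this assumes a plain PEFT setup rather than unsloth's or Axolotl's own loaders):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Placeholder identifiers; use the base model and adapter from your own training run.
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf", torch_dtype=torch.float16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

# Attach the trained adapter and optionally merge it into the base weights
# so inference runs on the same effective model you trained.
model = PeftModel.from_pretrained(base_model, "path/to/your-adapter")
model = model.merge_and_unload()
model.eval()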
Hi @ErikTromp, it seems your result is not ending with </s> either? That's my result too, and when I run the inference without the language-specific request, it ends with </s> like it should. I suspect some of the oasst data is not correct?
That sounds like a data issue indeed. What model are you using?
LLaMa chat. I was trying to say ending with </s>, but GitHub changed it to "?".
If you look at the official LLaMa2 prompt template, you can see that </s> is only added once there is a second user prompt, to denote the end of the previous answer, so it being missing in my example is correct.
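For context, the multi-turn LLaMa2 chat format looks roughly like this (a sketch of the commonly documented template, with a hypothetical follow-up question, not code from this repo):

system = "Je bent een generieke chatbot die altijd in het Nederlands antwoord geeft."
user_1, answer_1 = "Wat is de hoofdstad van Nederland?", "Amsterdam"
user_2 = "En van Frankrijk?"  # hypothetical follow-up ("And of France?")

# First turn: no </s> appears anywhere yet.
prompt_turn_1 = f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user_1} [/INST]"

# Continued thread: the previous answer is closed with </s> before the next [INST] block.
prompt_turn_2 = (
    f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user_1} [/INST] {answer_1} </s>"
    f"<s>[INST] {user_2} [/INST]"
)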
I see. Did you get the </s> from inference? I can't seem to get it at all, so it keeps blabbering.
No, it needs to be added manually if you continue a thread.
Ok thanks
Just figured I'd get back on this one: we now use the official LLaMa2 chat templates by default everywhere (also when running inference). This should eliminate any errors from reconstructing the chat template outputs ourselves.
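For anyone landing here later, using the official template through the tokenizer typically looks like this (a sketch based on the Hugging Face apply_chat_template API, not necessarily the exact call used in this repo's inference script):

messages = [
    {"role": "system", "content": "Je bent een generieke chatbot die altijd in het Nederlands antwoord geeft."},
    {"role": "user", "content": "Wat is de hoofdstad van Nederland?"},
]

# The tokenizer builds the <s>[INST] <<SYS>> ... [/INST] prompt from its bundled chat template.
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))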