Comments (12)
I haven't tried with llama.cpp directly, but we do not change the EOS token. We do, however, use left padding and change the pad token, so maybe llama.cpp relies on that? See https://github.com/UnderstandLingBV/LLaMa2lang/blob/main/finetune_llama.py#L49
Alpaca does exactly the same, so it should work. We also do not stray from the standard LLaMa2 instruct prompt syntax - you can see a few examples by exploring one of our instruct datasets: https://huggingface.co/datasets/UnderstandLing/oasst1_es_threads?row=0
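For reference, a minimal sketch of what that padding setup typically looks like with a Hugging Face tokenizer (the model id and the choice of pad token below are illustrative, not necessarily what finetune_llama.py uses):

from transformers import AutoTokenizer

# Illustrative base model; substitute whichever model you are fine-tuning.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

# LLaMa2 ships without a pad token, so set one explicitly (here the unk token,
# purely as an example) and pad on the left so that, in a batch, generation
# starts right after each prompt instead of after a run of pad tokens.
tokenizer.pad_token = tokenizer.unk_token
tokenizer.padding_side = "left"

batch = tokenizer(["Hello", "A somewhat longer prompt"], padding=True, return_tensors="pt")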
As for the error levels: they differ per language, but we always train for 2 epochs, which has proven to be a generic optimum (3 or more deteriorates the model and 1 is too few).
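(For context, with the Hugging Face Trainer that simply means pinning num_train_epochs; everything else below is a placeholder, not the repo's actual configuration.)

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama2-finetuned",   # placeholder output path
    num_train_epochs=2,              # the generic optimum mentioned above
    per_device_train_batch_size=4,   # placeholder
    learning_rate=2e-4,              # placeholder
    logging_steps=10,
)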
@ErikTromp thanks for answering. I tried again in Colab and after training the model doesn't seem to emit the </s> token; instead it keeps producing [/INST] ... [/INST]
Did you get a different result after training?
Oh, BTW, I'm using unsloth for training.
Yeah, I only tried a few of the largest languages (NL, ES, DE), but here's some output from my notebook before I turned it into the .py script in the repo. Maybe it helps; if not, let me know so we can follow up:
model.eval()

# Dutch prompt: "You are a generic chatbot that always answers in Dutch. What is the capital of the Netherlands?"
input_text = "<s>[INST] <<SYS>> Je bent een generieke chatbot die altijd in het Nederlands antwoord geeft. <</SYS>> Wat is de hoofdstad van Nederland? [/INST]"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

output_sequences = model.generate(
    input_ids=inputs['input_ids'],
    max_length=200,
    repetition_penalty=1.2
)

generated_text = tokenizer.decode(output_sequences[0], skip_special_tokens=True)
print(generated_text)
# Output:
# [INST] <<SYS>> Je bent een generieke chatbot die altijd in het Nederlands antwoord geeft. <</SYS>> Wat is de hoofdstad van Nederland? [/INST] Amsterdam
I haven't had time yet to properly look into unsloth, but I did use Axolotl on other occasions. That one (perhaps unsloth too?) uses a bunch of custom tricks and non-standard merging and layer-alteration techniques (for example with flash attention) that actually modify the resulting model. If you then do not load the model back in with the same alterations, you will get random outcomes.
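To illustrate that last point, here is a rough sketch of loading a LoRA/PEFT adapter back onto the exact base model it was trained from (paths are placeholders, and this assumes a plain PEFT setup rather than unsloth's or Axolotl's own loaders):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Placeholder identifiers; use the base model and adapter from your own training run.
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf", torch_dtype=torch.float16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

# Attach the trained adapter and optionally merge it into the base weights
# so inference runs on the same effective model you trained.
model = PeftModel.from_pretrained(base_model, "path/to/your-adapter")
model = model.merge_and_unload()
model.eval()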
Hi @ErikTromp, it seems your result is not ending with </s> either? That's my result too, and when I run the inference without the language-specific request, it ends with </s> like it should. I suspect some of the oasst data is not correct?
That sounds like a data issue indeed. What model are you using?
LLaMa chat. I was trying to say ending with </s>, but GitHub changed it to "?".
If you look at the official LLaMa2 prompt template, you can see that </s> is only added once there is a second user prompt, to denote the end of the previous answer, so it being missing in my example is correct.
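For context, the multi-turn LLaMa2 chat format looks roughly like this (a sketch of the commonly documented template, with a hypothetical follow-up question, not code from this repo):

system = "Je bent een generieke chatbot die altijd in het Nederlands antwoord geeft."
user_1, answer_1 = "Wat is de hoofdstad van Nederland?", "Amsterdam"
user_2 = "En van Frankrijk?"  # hypothetical follow-up ("And of France?")

# First turn: no </s> appears anywhere yet.
prompt_turn_1 = f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user_1} [/INST]"

# Continued thread: the previous answer is closed with </s> before the next [INST] block.
prompt_turn_2 = (
    f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user_1} [/INST] {answer_1} </s>"
    f"<s>[INST] {user_2} [/INST]"
)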
I see. Did you get the </s> from inference? I can't seem to get it at all, so it keeps blabbering.
No, it needs to be added manually if you continue a thread.
Ok thanks
Just figured I'd get back on this one: we now use the official LLaMa2 chat templates by default everywhere (also when running inference). This should eliminate any errors from reconstructing the chat template outputs ourselves.
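For anyone landing here later, using the official template through the tokenizer typically looks like this (a sketch based on the Hugging Face apply_chat_template API, not necessarily the exact call used in this repo's inference script):

messages = [
    {"role": "system", "content": "Je bent een generieke chatbot die altijd in het Nederlands antwoord geeft."},
    {"role": "user", "content": "Wat is de hoofdstad van Nederland?"},
]

# The tokenizer builds the <s>[INST] <<SYS>> ... [/INST] prompt from its bundled chat template.
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))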