Comments (1)
Closing this, as I found out what I was misunderstanding. The CausalLM classes in Transformers automatically shift these values in the forward pass, so the input to the model class above is correct: the labels are shifted by one position inside the loss computation.
from stanford_alpaca.
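For anyone hitting the same confusion, here is a minimal pure-Python sketch of the shift described above. It is not the Transformers source: the function names `log_softmax` and `shifted_causal_lm_loss` are hypothetical, and it handles a single sequence with plain lists instead of batched tensors. It only illustrates the idea that position `i` of the logits is scored against token `i + 1` of the labels, so `labels` can be passed as an unshifted copy of `input_ids`.

```python
import math

# Label value skipped by the loss. -100 is the default ignore_index of
# PyTorch's cross-entropy; using it here is an assumption for illustration.
IGNORE_INDEX = -100


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    peak = max(row)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in row))
    return [x - log_norm for x in row]


def shifted_causal_lm_loss(logits, labels):
    """Mean next-token cross-entropy for one sequence (hypothetical helper).

    logits: per-position score rows, shape [seq_len][vocab_size]
    labels: token ids, same length as logits (typically a copy of input_ids)
    """
    shift_logits = logits[:-1]  # predictions made at positions 0 .. n-2
    shift_labels = labels[1:]   # targets are the tokens at positions 1 .. n-1
    losses = [
        -log_softmax(row)[target]
        for row, target in zip(shift_logits, shift_labels)
        if target != IGNORE_INDEX
    ]
    # Return 0.0 when every target is masked, to avoid dividing by zero.
    return sum(losses) / len(losses) if losses else 0.0


# Usage: labels are an unshifted copy of input_ids.
input_ids = [2, 0, 1]
labels = list(input_ids)
# Position 0 strongly predicts token 0, position 1 strongly predicts token 1.
# The last row is never scored because there is no next token to compare with.
logits = [[10.0, -10.0, -10.0], [-10.0, 10.0, -10.0], [0.0, 0.0, 0.0]]
loss = shifted_causal_lm_loss(logits, labels)
```

Because the shift happens inside the loss, shifting the labels again in the data pipeline would score each position against the token two steps ahead.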