Comments (4)
How about using int8 quantization or parallelformers library?
(Just a suggestion.. Note that I'm not a maintainer of this repo)
from biomedlm.
Have you tried making things bf16??
from biomedlm.
How about using int8 quantization or parallelformers library?
(Just a suggestion.. Note that I'm not a maintainer of this repo)
This approach yields:
Traceback (most recent call last):
  File "demo.py", line 9, in <module>
    model = GPT2LMHeadModel.from_pretrained("stanford-crfm/pubmedgpt").to(torch.int8).to(device)
  File "/home/vitor/Projects/pubmedgpt/venv_1/lib/python3.8/site-packages/transformers/modeling_utils.py", line 1682, in to
    return super().to(*args, **kwargs)
  File "/home/vitor/Projects/pubmedgpt/venv_1/lib/python3.8/site-packages/torch/nn/modules/module.py", line 912, in to
    raise TypeError('nn.Module.to only accepts floating point or complex '
TypeError: nn.Module.to only accepts floating point or complex dtypes, but got desired dtype=torch.int8
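The error comes from `nn.Module.to`, which only casts to floating point or complex dtypes, so `.to(torch.int8)` cannot quantize a model. Int8 quantization is usually requested at load time instead. A minimal sketch of that, assuming a `transformers` version that provides `BitsAndBytesConfig`, the `bitsandbytes` and `accelerate` packages, and a CUDA GPU (none of which are mentioned in this thread); `load_model_int8` is a hypothetical helper name:

```python
MODEL_ID = "stanford-crfm/pubmedgpt"  # checkpoint name taken from the traceback


def load_model_int8(model_id=MODEL_ID):
    """Load the checkpoint with 8-bit weights (a sketch, not the repo's method).

    Assumption: transformers (a version that provides BitsAndBytesConfig),
    bitsandbytes and accelerate are installed, and a CUDA GPU is available.
    """
    # Imported here so the heavy dependencies are only needed when called.
    from transformers import BitsAndBytesConfig, GPT2LMHeadModel

    quant_config = BitsAndBytesConfig(load_in_8bit=True)
    # Quantization happens while loading; do not call .to(torch.int8) or
    # .to(device) afterwards, since device_map already places the weights.
    return GPT2LMHeadModel.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",
    )
```

In `demo.py` this would replace the `from_pretrained(...).to(torch.int8).to(device)` line with `model = load_model_int8()`. I have not verified the output quality of this checkpoint in 8-bit.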
from biomedlm.
Have you tried making things bf16??
While bfloat16 yields:
Photosynthesis is \[*M*~0′~ = *I*~0′~ × *N*~0′~ × 0.5 × 255\] the light absorbed, \[*M
float16 yields:
Photosynthesis is \~520,000-fold more efficient in C~4~ plants than in C~3~ plants because CO~2~ is first incorporated into a C~4~ acid (malate or aspartate) by phospho
Apparently, float16 outputs less gibberish than bfloat16. Is this behavior related to the warning message?
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:28895 for open-end generation.
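The warning itself can be addressed by passing the tokenizer's `attention_mask` and an explicit `pad_token_id` to `generate`. Whether that explains the bfloat16 versus float16 difference is not established here. A minimal sketch, assuming the checkpoint's tokenizer loads with `GPT2Tokenizer` and a CUDA device is available; `generation_kwargs` and `generate_text` are hypothetical helper names:

```python
MODEL_ID = "stanford-crfm/pubmedgpt"  # checkpoint name taken from the traceback


def generation_kwargs(encoded, eos_token_id, max_length=50):
    """Build generate() arguments that include the mask and pad token id."""
    return {
        "input_ids": encoded["input_ids"],
        # The tokenizer already returns the mask; it only has to be passed on.
        "attention_mask": encoded["attention_mask"],
        # GPT-2 style tokenizers define no pad token, so reuse the EOS id.
        "pad_token_id": eos_token_id,
        "max_length": max_length,
    }


def generate_text(prompt, dtype_name="float16", device="cuda", model_id=MODEL_ID):
    """Generate a continuation in the given floating point dtype (sketch)."""
    # Imported here so the heavy dependencies are only needed when called.
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    dtype = getattr(torch, dtype_name)  # e.g. torch.float16 or torch.bfloat16
    tokenizer = GPT2Tokenizer.from_pretrained(model_id)
    model = GPT2LMHeadModel.from_pretrained(model_id).to(dtype).to(device)

    encoded = tokenizer(prompt, return_tensors="pt").to(device)
    kwargs = generation_kwargs(encoded, tokenizer.eos_token_id)
    with torch.no_grad():
        output_ids = model.generate(**kwargs)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Photosynthesis is", "bfloat16")` and `generate_text("Photosynthesis is", "float16")` would let the two dtypes be compared with the warning out of the picture.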
from biomedlm.