Comments (7)
I'm having similar issues. I am following the README instructions and getting poor evaluation and prediction results after fine-tuning. I downloaded the DNABERT6 pre-trained model and ran the fine-tune command for the prom-core task under section 3.3.
I get similar behaviour to @Z-Abbas' eval_results.txt: fine-tuning runs fine until, all of a sudden, the accuracy drops to 0.507 or 0.492 and gets stuck there, sometimes until the end of fine-tuning, leaving the final prediction accuracy around the same value. Sometimes this behaviour stops after about 1000-2000 steps, in which case the final prediction accuracy can be a bit better (0.7-0.85). After running fine-tuning from a fresh clone several times, the best accuracy I can achieve is ~0.85, but most attempts finish fine-tuning with accuracies of exactly 0.507/0.492 or around 0.75.
04/06/2021 00:40:20 - INFO - __main__ - ***** Eval results *****
04/06/2021 00:40:20 - INFO - __main__ - acc = 0.49248183814833585
04/06/2021 00:40:20 - INFO - __main__ - auc = 0.8266465752924059
04/06/2021 00:40:20 - INFO - __main__ - f1 = 0.3299750962191533
04/06/2021 00:40:20 - INFO - __main__ - mcc = 0.0
04/06/2021 00:40:20 - INFO - __main__ - precision = 0.24624091907416792
04/06/2021 00:40:20 - INFO - __main__ - recall = 0.5
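The metrics above are themselves a clue: mcc = 0.0 together with macro recall = 0.5 is the signature of a model that predicts a single class for every input. A minimal pure-Python check (the label counts below are made up to roughly match the 0.49 accuracy; `mcc` is a hand-rolled helper, not from the repo):

```python
import math

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels, with the usual
    convention that a zero denominator (e.g. constant predictions) gives 0."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

y_true = [0] * 49 + [1] * 51   # hypothetical slightly imbalanced labels
y_pred = [0] * 100             # model collapsed to predicting one class

print(mcc(y_true, y_pred))  # 0.0 -- single-class predictions zero out MCC
```

Accuracy for this collapsed predictor is just the majority-class fraction (0.49 here), which is why the eval accuracy lands on those two fixed values.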
I am using the repo exactly as provided (with the same dev.tsv and train.tsv). I have also noticed that the loss often stops improving after the first few hundred steps of the 3 epochs.
TL;DR: fine-tuning DNABERT several times yields poor accuracies; often it is exactly 0.492 or 0.507, while other times it falls between 0.7 and 0.85.
Thanks for your help.
from dnabert.
Hi,
We have recently updated the test data and performed many bug fixes. Please kindly see if the reported issue still occurs.
Thanks,
Jerry
Hi,
For the first question, could you please show me the command you are using to run the model?
For the second question, yes, you can use it. In our case, though, BERT and RoBERTa are essentially the same, and GPT is for sequence generation, so I am not sure it makes sense to use it in the DNA setting.
Hi,
I have a similar issue. I ran the example using the commands below:
3.3 Fine-tune with pre-trained model
export KMER=6
export MODEL_PATH='/home/zeeshan/DNABERT/6-new-12w-0/'
export DATA_PATH='/home/zeeshan/DNABERT/examples/sample_data/ft/prom-core/6/'
export OUTPUT_PATH='/home/zeeshan/DNABERT/examples/OUTPUT/'
- Prediction
export KMER=6
export MODEL_PATH='/home/zeeshan/DNABERT/examples/OUTPUT/'
export DATA_PATH='/home/zeeshan/DNABERT/examples/sample_data/ft/prom-core/6/'
export PREDICTION_PATH='/home/zeeshan/DNABERT/examples/predout/'
After running the training and prediction code, the result of the last (100th) evaluation is exactly the same as the prediction result. Am I doing something wrong, or is there an option to choose the best weights for prediction?
Please find the eval_results.txt file attached, along with a screenshot of the prediction result for your reference.
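On the "best weights" question, one workable pattern is to collect the per-checkpoint eval accuracies produced during training and point MODEL_PATH at the winner before running prediction. A hypothetical sketch (the helper and the step-to-accuracy numbers are invented for illustration, not part of the repo):

```python
# Hypothetical helper: given checkpoint step -> eval accuracy collected
# during training, pick which saved checkpoint to use for --do_predict.
def best_checkpoint(evals):
    """Return the checkpoint step with the highest eval accuracy."""
    return max(evals, key=evals.get)

evals = {400: 0.75, 800: 0.507, 1200: 0.84, 1600: 0.81}  # made-up numbers
step = best_checkpoint(evals)
print(f"checkpoint-{step}")  # prints "checkpoint-1200"
```

You would then set MODEL_PATH to that checkpoint's subdirectory instead of the final output directory, so prediction no longer just mirrors the last eval.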
Hi,
I think this task is relatively tricky. We found that the model may fail to converge in some cases, so please try different hyperparameter settings and random seeds.
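In code terms, "try a different random seed" just means re-running with a different value passed to the training script's seed option. A stdlib-only sketch of what seeding changes (in the real training script the torch/numpy analogues would also be seeded):

```python
import random

def set_seed(seed):
    """Fix the RNG state. A real training run would also call
    numpy.random.seed(seed) and torch.manual_seed(seed)."""
    random.seed(seed)

set_seed(42)
a = [random.random() for _ in range(3)]
set_seed(42)
b = [random.random() for _ in range(3)]
print(a == b)  # True -- the same seed reproduces the same run
```

So a run that collapses under one seed can be retried under another; only the initial RNG state differs between attempts.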
Hi,

> I think this task is relatively tricky. We found that the model may fail to converge in some cases. So please try different hyperparameter settings and random seed.

Can you give us an example? Personally, I find it a bit hard to follow your code, even for the simplest things like loading the model and predicting labels.
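For what it's worth, here is a minimal sketch of the kind of example being asked for: loading a fine-tuned checkpoint and predicting a label for one sequence. The checkpoint path is hypothetical, and DNABERT ships its own modified transformers fork, so the exact tokenizer/model classes may differ from the stock Hugging Face ones assumed here:

```python
def seq2kmer(seq, k=6):
    """Turn a raw DNA sequence into the space-separated overlapping k-mers
    DNABERT's tokenizer expects, e.g. "ACGTACG" -> "ACGTAC CGTACG" for k=6."""
    return " ".join(seq[i:i + k] for i in range(len(seq) - k + 1))

def predict_label(sequence, model_path, k=6):
    # Imported lazily so the k-mer helper above works without torch installed.
    # Assumption: the fine-tuned checkpoint loads via the stock HF auto classes.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSequenceClassification.from_pretrained(model_path)
    model.eval()
    inputs = tokenizer(seq2kmer(sequence, k), return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return logits.argmax(dim=-1).item()

# Usage (hypothetical path to a fine-tuned output directory):
# print(predict_label("ACGT" * 20, "examples/OUTPUT/"))
```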
@jerryji1993 @Zhihan1996 I am still facing the same issue even after cloning the latest repo. I am performing a binary classification task: I downloaded the pretrained model and fine-tuned it on my own data, but I am getting all-0 predictions and the MCC value is always 0. With the data available on GitHub, the model initially works and the accuracy is good, but after a few iterations the accuracy drops to 50%, MCC is 0, and I get all-0 predictions at the end. I have tried changing hyperparameters but am unable to get good results. Can you please advise, as my research is based on this model?