Comments (2)
Hi, I think I had this issue. If I remember correctly, the first line of my dev.tsv file was 'sequence \t label'. I assume yours is too, since the fine-tuning in the previous steps reads dev.tsv with pandas and automatically treats the first line as a header. With header=None, that header line is read as a data row instead. My solution was to change line 86
from
pd.read_csv(os.path.join(args.data_dir,"dev.tsv"),sep='\t',header=None)
to
pd.read_csv(os.path.join(args.data_dir,"dev.tsv"),sep='\t')
and comment out line 87
#dev.columns = ['sequence','label']
Hope this helps :)
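To illustrate the fix above, here is a minimal sketch of how pandas treats the first line of a TSV with and without header=None. The sample rows, the in-memory file, and the helper read_dev_tsv are hypothetical and not part of the DNABERT code; the helper also assumes integer labels.

```python
import io

import pandas as pd

# Hypothetical stand-in for a dev.tsv whose first line is a header row.
TSV_TEXT = "sequence\tlabel\nATGCAT TGCATG\t1\nCCGTAA CGTAAC\t0\n"

# header=None: the header line is read as an ordinary data row.
with_header_none = pd.read_csv(io.StringIO(TSV_TEXT), sep="\t", header=None)
with_header_none.columns = ["sequence", "label"]

# Default behaviour: the first line supplies the column names.
with_default = pd.read_csv(io.StringIO(TSV_TEXT), sep="\t")


def read_dev_tsv(source):
    """Hypothetical helper: load a dev.tsv with or without a header line."""
    df = pd.read_csv(source, sep="\t", header=None, names=["sequence", "label"])
    # Drop the first row if it is the header text rather than real data.
    if len(df) and str(df.iloc[0]["sequence"]).strip() == "sequence":
        df = df.iloc[1:].reset_index(drop=True)
    # Assumption: labels are integers.
    df["label"] = df["label"].astype(int)
    return df


print(with_header_none)
print(with_default)
print(read_dev_tsv(io.StringIO(TSV_TEXT)))
```

A helper like this accepts both the older header-less files and the newer ones with a header, so the call site does not need editing when the data format changes.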
from dnabert.
Hi @merouone and @mitchgill16,
Thanks for reporting this issue, and sorry about the delay in my response. The dev data we used previously had no header, but we later switched to a file that has one, which caused this bug. We have recently updated the test data and fixed this in the latest version of the code. Please try it out and see if the bug is fixed. Thank you!
Best,
Jerry