Comments (3)
Thanks for your interest in the code. On the leaderboard (https://visualcommonsense.com/leaderboard/), Entry #18 appears to be a third party that ran this code to reproduce the results.
Since you only have 1 GPU, please change the hyperparameter "gradient_accumulation_steps" from 5 to 20 and try again. Let me know how it goes; hopefully the performance will catch up.
from villa.
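For readers following along, the arithmetic behind this suggestion can be sketched as below. This is only an illustration, not code from the repository: the helper name scaled_accumulation_steps is hypothetical, and the 4-GPU reference setup is an assumption inferred from the 5 to 20 ratio (the thread does not state how many GPUs the original configuration used). The idea is that the number of examples contributing to each optimizer update stays the same if the product of GPU count and accumulation steps is kept constant.

```python
def scaled_accumulation_steps(ref_gpus: int, ref_steps: int, new_gpus: int) -> int:
    """Return the accumulation steps that keep (gpus * steps) constant.

    Hypothetical helper for illustration only. It assumes the per-GPU
    batch size is unchanged, so the effective batch per optimizer update
    is proportional to gpus * accumulation steps.
    """
    if ref_gpus < 1 or ref_steps < 1 or new_gpus < 1:
        raise ValueError("GPU counts and step counts must be positive")
    total = ref_gpus * ref_steps
    if total % new_gpus != 0:
        # No integer step count reproduces the reference effective batch.
        raise ValueError("reference product is not divisible by new_gpus")
    return total // new_gpus


# Assumed reference: 4 GPUs with gradient_accumulation_steps = 5.
# Moving to 1 GPU then calls for 20 accumulation steps.
print(scaled_accumulation_steps(ref_gpus=4, ref_steps=5, new_gpus=1))  # 20
```

If the reference setup used a different GPU count, the same function applies with that count substituted; only the 5 and the 20 come from the thread itself.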
Thanks for your reply! I will modify the configuration file and rerun the experiment, then report back with the results.
Hi, this strategy worked. After I adjusted the hyperparameter, I got the following results when training was about 88% complete (step 7000 of 8000).
88%|########7 | 7000/8000 [15:35:25<2:10:08, 7.81s/it]
[1,0]:09/14/2021 18:04:58 - INFO - main - ============Step 7000=============
[1,0]:09/14/2021 18:04:58 - INFO - main - 4481024 examples trained at 79 ex/s
[1,0]:09/14/2021 18:04:58 - INFO - main - ===========================================
[1,0]:09/14/2021 18:04:58 - INFO - main - start running validation...
[1,0]:09/14/2021 18:10:05 - INFO - main - validation finished in 307 seconds, score_qa: 75.16 score_qar: 78.46 score: 59.18
89%|########8 | 7100/8000 [15:53:25<1:57:14, 7.82s/it]
[1,0]:09/14/2021 18:22:58 - INFO - main - ============Step 7100=============
Thanks for your solution!