
Comments (10)

hnisonoff avatar hnisonoff commented on June 30, 2024

Can you also provide the code you used to compute recovery and perplexity? It is not clear to me how many ensembles you used or what step size you used. Thank you!

from grade_if.

ykiiiiii avatar ykiiiiii commented on June 30, 2024

Hi @hnisonoff,

The dataset in the chain_set_splits.json file is CATH 4.2, as used in Generative Models for Graph-Based Protein Design.

For how to compute the recovery rate and perplexity, I show an example in the notebook. To reproduce the paper results, set the step size to 100; I usually sample 50 times and ensemble the results.
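
To make the ensembling concrete, here is a small sketch (function names are illustrative, not the repo's API): average the per-residue probability distributions over all samples, then take the argmax and compare against the native sequence.

```python
import numpy as np

def recovery_rate(probs, true_seq):
    """Fraction of positions where the argmax residue matches the native one."""
    return float((probs.argmax(axis=-1) == true_seq).mean())

def ensemble_recovery(per_sample_probs, true_seq):
    """Average per-residue distributions over all samples before the argmax."""
    mean_probs = per_sample_probs.mean(axis=0)  # (length, n_residue_types)
    return recovery_rate(mean_probs, true_seq)

# Toy example: 50 samples, sequence length 4, 20 amino-acid types
rng = np.random.default_rng(0)
samples = rng.random((50, 4, 20))
samples /= samples.sum(axis=-1, keepdims=True)  # normalize to distributions
native = np.array([3, 11, 0, 7])
rr = ensemble_recovery(samples, native)
```

Averaging distributions before the argmax (rather than majority-voting discrete samples) keeps the per-residue confidence information, which is also what the perplexity computation needs.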

Hope it is clear to you now.


hnisonoff avatar hnisonoff commented on June 30, 2024

Hi @ykiiiiii thank you for the reply. I am still running into some issues. I trained the CATH model using the hyperparameters provided for CATH and then evaluated on the test set as you described above (step size 100, ensembles 50). I am getting a perplexity of 6.26 and a recovery of 53.5%. Can you help me reproduce the paper results?


ykiiiiii avatar ykiiiiii commented on June 30, 2024

Hi @hnisonoff, could you check whether the diverse mode is switched on? Just set diverse = True in the ddim_sample function. This maintains the same recovery rate but yields a flatter distribution, which results in a higher likelihood (lower perplexity).

You might also consider using larger step sizes, such as 250, which can further reduce perplexity.
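
To see why a flatter distribution lowers perplexity without changing recovery, recall that perplexity here is the exponential of the mean per-residue negative log-likelihood of the native sequence. A small illustration (my own sketch, not the repo's code): two predictors with identical argmax predictions, one sharp and one flat, evaluated on residues they both get wrong.

```python
import numpy as np

def perplexity(probs, true_seq, eps=1e-12):
    """exp(mean negative log-likelihood) of the native residues."""
    picked = probs[np.arange(len(true_seq)), true_seq]
    return float(np.exp(-np.log(picked + eps).mean()))

# Identical argmax predictions (so identical recovery), but the flat
# predictor reserves more probability mass for the native residue:
sharp = np.array([[0.98, 0.01, 0.01],
                  [0.01, 0.98, 0.01]])
flat  = np.array([[0.60, 0.20, 0.20],
                  [0.20, 0.60, 0.20]])
native = np.array([1, 0])  # neither argmax matches the native residue
# perplexity(flat, native) is 5.0, while perplexity(sharp, native) is ~100.
```

A sharp but wrong prediction is punished very heavily by the log-likelihood, so flattening the output distribution can cut perplexity dramatically while leaving the argmax, and hence the recovery rate, untouched.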

Hope it fixes your issue.


ykiiiiii avatar ykiiiiii commented on June 30, 2024

Hi @hnisonoff, I have also updated README.md with more comprehensive results for different parameter choices. Hope it helps!

Cheers,
Kai


hnisonoff avatar hnisonoff commented on June 30, 2024

Hi @ykiiiiii thank you for the updates. I was using diversity_model, and I am still unable to reproduce the results, both with your provided model weights (BLOSUM_3M_small.pt) and with my own model trained using the code provided in gradeif.py. My perplexity scores are much higher (>6) for step=100.

Can you please provide:

  1. The exact command needed to train the model.
  2. The exact code to reproduce the numbers in the README from a trained model.


hnisonoff avatar hnisonoff commented on June 30, 2024

@ykiiiiii I am actually able to match your reported results much more closely if I use the non-EMA model. It would still be helpful to have the actual code used to reproduce the results, though! Thank you!
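
For context on why the two checkpoints behave differently: an EMA checkpoint stores an exponential moving average of the weights seen during training rather than the final raw weights, so it lags behind recent updates. A minimal sketch of the idea using a plain parameter dict (the repo's actual checkpoint handling may differ):

```python
class EMA:
    """Track an exponential moving average of a {name: value} parameter dict."""
    def __init__(self, params, decay=0.999):
        self.decay = decay
        self.shadow = dict(params)  # initialized from the current weights

    def update(self, params):
        d = self.decay
        for name, value in params.items():
            # shadow <- decay * shadow + (1 - decay) * current
            self.shadow[name] = d * self.shadow[name] + (1 - d) * value

ema = EMA({"w": 0.0}, decay=0.9)
ema.update({"w": 1.0})  # shadow lags the raw weight: 0.1 instead of 1.0
```

With a high decay the EMA weights can differ noticeably from the raw weights late in training, which is consistent with the two checkpoints giving different perplexities at evaluation.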


hnisonoff avatar hnisonoff commented on June 30, 2024

One more question: I'm confused that none of the numbers in the README correspond to what is reported in the paper. It isn't clear which parameter settings you actually used to get the paper numbers. Thank you.


ykiiiiii avatar ykiiiiii commented on June 30, 2024

Hi @hnisonoff, I have added the script to compute recovery rate and perplexity in test_rr.py. Hope it helps.

The improvements were primarily due to a lower learning rate and the use of the AdamW optimizer. I didn't try many hyperparameters because I was up against the paper deadline. You can replicate the improved results using the configuration in our training script.


hnisonoff avatar hnisonoff commented on June 30, 2024

Thank you this is helpful!
