
Input pipeline (chemprop, closed, 8 comments)

wengong-jin commented on August 18, 2024
Input pipeline


Comments (8)

yangkevin2 commented on August 18, 2024

Hi Florian,

Most of the slow parts should already be parallelized (in particular, the main training loop should be run on a GPU with CUDA; it'll be very, very slow on a CPU). Is there a particular part that's especially slow for you?

Kevin
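
For readers landing here: a minimal sketch of running a training loop under CUDA in PyTorch. The model, optimizer, and data below are placeholder stand-ins, not chemprop's actual code; the point is only where the `.to(device)` calls go.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(300, 1).to(device)          # placeholder model
optimizer = torch.optim.Adam(model.parameters())
loss_fn = nn.MSELoss()

features = torch.randn(256, 300)              # one fake batch
targets = torch.randn(256, 1)

# Move each batch onto the same device as the model before the forward pass.
features, targets = features.to(device), targets.to(device)
loss = loss_fn(model(features), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```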


FloMru commented on August 18, 2024

OK, that's odd.
For a batch size of 256 I get around 1 it/s, and my 1080 Ti hovers around 10% utilization.
Because my dataset has around 4 million samples, you can imagine that I wait quite a while for an epoch to finish. :-)
If the slow parts are already parallelized, then the problem has to be on my side.
Do you have an idea what it could be?


yangkevin2 commented on August 18, 2024

Ah, we've never run it on anything quite so big. I guess that would explain why it's taking forever. So it takes ~4 hr per epoch?

Our code caches a lot of the computation during the first epoch, though, so the first epoch is the slowest epoch; subsequent epochs should be roughly 4x faster. (Though the cache for a dataset of that size would use something like half a terabyte of RAM... so if you end up having trouble with memory you can chunk your dataset using the --num_chunks option, which also turns off the caching.)
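
For illustration, the caching idea amounts to memoizing featurization so that only the first epoch pays the cost. A rough sketch with hypothetical names (not chemprop's actual implementation); the memory trade-off is that the cache grows with the number of unique molecules:

```python
# Hypothetical memoization sketch; `featurize` stands in for the expensive
# per-molecule graph construction.
FEATURE_CACHE = {}

def featurize(smiles: str):
    return len(smiles)  # placeholder for building a molecular graph

def featurize_cached(smiles: str):
    if smiles not in FEATURE_CACHE:
        FEATURE_CACHE[smiles] = featurize(smiles)  # first epoch pays the cost
    return FEATURE_CACHE[smiles]                   # later epochs are dict lookups
```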

We may also look into parallelizing some of the CPU computation that happens with each batch, if you're still running into trouble; just let us know. (We haven't done this parallelization yet because we usually just cache that computation during the first epoch.)


FloMru commented on August 18, 2024

> Ah, we've never run it on anything quite so big. I guess that would explain why it's taking forever. So it takes ~4 hr per epoch?

Yeah, roughly; it's more in the 3-hour range.

> Our code caches a lot of the computation during the first epoch, though, so the first epoch is the slowest epoch; subsequent epochs should be roughly 4x faster. (Though the cache for a dataset of that size would use something like half a terabyte of RAM... so if you end up having trouble with memory you can chunk your dataset using the --num_chunks option, which also turns off the caching.)

I should have mentioned that I already had to turn off the caching (unfortunately), because the dataset was using up all my memory.

> We may also look into parallelizing some of the CPU computation that happens with each batch, if you're still running into trouble; just let us know. (We haven't done this parallelization yet because we usually just cache that computation during the first epoch.)

I also started looking into it:
You call the featurization step (mol2graph) directly in the arguments of the encoder's forward pass (e.g., for the MPN it's in mpn.py, line 335).
For parallelization, wouldn't it be better to call this somewhere earlier?
Do you know a better place to start looking for good parallelization opportunities?
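
For illustration, one way to hoist featurization out of forward() is to run it in the DataLoader's collate function, so worker processes featurize batches while the GPU trains. A sketch with hypothetical names (mol2graph is stubbed here; this is not chemprop's actual API):

```python
import torch
from torch.utils.data import DataLoader, Dataset

def mol2graph(smiles_batch):
    return smiles_batch  # stub; real code builds molecular graphs

class SmilesDataset(Dataset):
    def __init__(self, smiles_list, targets):
        self.smiles_list, self.targets = smiles_list, targets

    def __len__(self):
        return len(self.smiles_list)

    def __getitem__(self, idx):
        return self.smiles_list[idx], self.targets[idx]

def collate(batch):
    smiles, targets = zip(*batch)
    # Featurize here, in a DataLoader worker, before the batch reaches the model.
    return mol2graph(list(smiles)), torch.tensor(targets)

loader = DataLoader(SmilesDataset(["CCO", "c1ccccc1"], [0.5, 1.2]),
                    batch_size=2, collate_fn=collate, num_workers=2)
```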


yangkevin2 commented on August 18, 2024

Yeah, from some profiling we've run in the past, I believe the mol2graph step is the slowest CPU-based step, so that's probably the best place to start. We can look into parallelizing it too.
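
As a sketch of what parallelizing that step could look like, a process pool can fan per-molecule featurization out across CPU cores (hypothetical helper names, not chemprop's code):

```python
from multiprocessing import Pool

def featurize_one(smiles: str):
    return len(smiles)  # placeholder for per-molecule graph construction

def featurize_parallel(smiles_list, workers: int = 4):
    # chunksize amortizes inter-process overhead across many small tasks
    with Pool(workers) as pool:
        return pool.map(featurize_one, smiles_list, chunksize=256)

if __name__ == "__main__":
    graphs = featurize_parallel(["CCO", "c1ccccc1", "CCN"] * 1000)
```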


yangkevin2 commented on August 18, 2024

Hi Florian,

Try pulling the master branch again. You can use the new option --no_cache to turn off the caching without having to hack it, and the new option --parallel_featurization to run the CPU-based featurization asynchronously with the model (which will probably become the default in the near future). We observed a ~75% speedup over the previous version with the cache turned off, running with this option on a dataset of about 100k. (This was rather surprising to us too; even though we typically cache from the second epoch onwards, it seems the featurization was still taking more time than we thought.) If you find that it's using too much RAM, you can just decrease the value of the flag --batches_per_queue_group, which would likely cost only a small performance hit. Hope this helps! And please let us know if anything goes wrong when using these new options.
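
To illustrate the idea behind asynchronous featurization (not chemprop's actual implementation): a background worker featurizes batches and pushes them onto a bounded queue while the main loop trains. The queue's maxsize plays the same RAM-bounding role that --batches_per_queue_group plays in the real flag.

```python
import queue
import threading

def featurize(batch):
    return batch  # stub standing in for mol2graph

def producer(batches, out_queue):
    for batch in batches:
        out_queue.put(featurize(batch))  # blocks while the queue is full
    out_queue.put(None)                  # sentinel: no more batches

batches = [["CCO"] * 256 for _ in range(10)]
q = queue.Queue(maxsize=4)  # smaller maxsize => less RAM held in flight
threading.Thread(target=producer, args=(batches, q), daemon=True).start()

while (featurized := q.get()) is not None:
    pass  # the training step would consume `featurized` here
```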

(There are a lot of other code changes since we finally merged our dev branch, but the basic interface should still be the same; most of the new code is just our experimental options. We just merged so that we could sync the branches before making some helpful engineering changes, like the one described above. Please let us know if you encounter any problems, though.)


FloMru commented on August 18, 2024

Hi Kevin,

That was fast!
Thanks a lot; I'll try the new code as soon as possible, but probably not until the new year.
I'll give you feedback, too.

I wish you happy holidays!

Best wishes,
Florian


yangkevin2 commented on August 18, 2024

Sounds good, happy holidays to you too!
