Comments (9)
I got it. I have also received complaints about GPUs being selected randomly, so I guess limiting which GPUs are visible to `ts` can be useful. If you don't mind waiting, I can work on it in a few days.
from task-spooler.
Hi @bermeitinger-b. You can pull to get the new features. To run multiple processes on one GPU, you can set the free-memory threshold (in percent) appropriately via `ts --set_gpu_free_perc`.
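A minimal sketch of how this might be used, assuming `--set_gpu_free_perc` takes the percentage as its next argument. The helper name `queue_shared` and the threshold value in the usage note are illustrative assumptions, not part of `ts`.

```shell
# Hypothetical helper: lower the free-memory threshold, then queue the same
# command n times and let ts pick a GPU for each job via -G 1.
# perc: threshold in percent; ts treats a GPU as free above this value.
queue_shared() {
    perc="$1"; n="$2"; shift 2
    ts --set_gpu_free_perc "$perc"
    i=0
    while [ "$i" -lt "$n" ]; do
        ts -G 1 "$@"
        i=$((i + 1))
    done
}
```

For example, `queue_shared 40 16 ./job.sh` would queue 16 jobs, each of which may start on any GPU that still has more than 40% of its memory free. The right threshold depends on how much memory each job actually uses.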
from task-spooler.
Hi @bermeitinger-b. If I understand correctly, `ts` cannot automatically select GPUs from a subset, because it uses NVML, which discovers all GPUs on the machine. In this case you can use `-g` to specify the GPU ID manually. May I ask what your specific use case is?
from task-spooler.
Thanks, let me clarify.
I'm running experiments on a machine that has 16 GPUs. I'm running a lot of tasks and use `ts` with `-S 24` to schedule them so that they are distributed among the GPUs. That is working very well.
But there are other users on the machine who also need access to the GPUs. So, I want to limit my user to specific GPUs (e.g. 0-8), of course with a reduced `-S`.
Fixing which job runs on which GPU with `-g` would make `ts` useless in my case. If I knew beforehand which task should run on which GPU, I could use a simple shell script per GPU that runs the tasks one after another, right?
That's the beauty of `ts`: as soon as one task finishes, it checks which GPU is free and runs the next task there, so there are no idle GPUs.
from task-spooler.
For now, you can use `-g` together with `-D` or `-W` to queue jobs for a specific GPU ID.
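The combination of `-g` and `-D` could be sketched as follows. This assumes that `ts` prints the new job's ID on standard output when a job is queued and that `-D` takes the ID of the job to wait for; `chain_on_gpu` is a hypothetical helper name.

```shell
# Hypothetical helper: queue commands one after another on a fixed GPU.
# Each job waits for the previous one to end via -D; swap in -W to
# require that the previous job finished successfully.
chain_on_gpu() {
    gpu="$1"; shift
    prev=""
    for cmd in "$@"; do
        if [ -z "$prev" ]; then
            prev=$(ts -g "$gpu" "$cmd")
        else
            prev=$(ts -g "$gpu" -D "$prev" "$cmd")
        fi
    done
}
```

Calling `chain_on_gpu 0 ./a.sh ./b.sh ./c.sh` would pin all three scripts to GPU 0 and run them in order. This still requires deciding up front which job goes to which GPU.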
from task-spooler.
Hi @bermeitinger-b. I prototyped this feature in this branch. Basically, you can set an env var like `TS_VISIBLE_DEVICES=0,1,2,3,4,5,6,7` before starting `ts` for the first time. If `ts` is already up, you can use the flag `--setenv TS_VISIBLE_DEVICES=0,1,2,3,4,5,6,7` to set the env var. Please see the Readme for more details.
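The two ways of setting the variable can be sketched as shell helpers. The function names are hypothetical; the flags and the variable name are the ones mentioned above, and the behaviour is as described for the prototype branch.

```shell
# Hypothetical helper: restart the ts server so that it only sees the
# listed GPUs, and set the number of slots at the same time.
restart_with_devices() {
    devices="$1"; slots="$2"
    ts -K                                         # stop the running server
    TS_VISIBLE_DEVICES="$devices" ts -S "$slots"  # first call starts a new server
}

# Hypothetical helper: change the visible GPUs of a server that is already up.
set_devices_live() {
    ts --setenv "TS_VISIBLE_DEVICES=$1"
}
```

For example, `restart_with_devices 0,1,2,3,4,5,6,7 8` restricts a fresh server to the first eight GPUs, while `set_devices_live 0,1,2,3` narrows a running server without restarting it.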
from task-spooler.
Thank you very much. I tried the branch and it seems to be working exactly as intended.
from task-spooler.
Does this new feature also limit the number of concurrent tasks?
My current approach to using GPUs 0-7 with 16 concurrent jobs (so 2 per GPU) would be:
- `ts -K` (to make sure)
- `TS_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 ts -S 16`
- `for i in $(seq 1 16); do ts -G 1 ./job.sh; done`
However, I'm not seeing 16 concurrent jobs but only 8. (The jobs are small enough that a GPU never gets 90% full.)
from task-spooler.
Hi @bermeitinger-b. This is intended. `-G` queues jobs until there is a free GPU to run them on. A GPU is deemed free if more than 90% of its memory is free. So far, the only way to run multiple processes on one GPU is to use `-g`.
I have thought about adding an option to set the free percentage manually, but it never materialized. If you can wait a couple of days, I can make a quick patch for this option.
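The "free GPU" rule described above can be approximated outside of `ts` for debugging. The sketch below is not the `ts` implementation (which uses NVML); it is a hypothetical `free_gpus` filter that applies the same percentage rule to lines of the form `index, free, total`, which is what `nvidia-smi --query-gpu=index,memory.free,memory.total --format=csv,noheader,nounits` prints.

```shell
# Hypothetical helper: print the indices of GPUs whose free memory exceeds
# a threshold in percent (default 90).
# Reads lines of the form "index, free_MiB, total_MiB" from stdin.
free_gpus() {
    threshold="${1:-90}"
    awk -F', *' -v t="$threshold" '$3 > 0 && ($2 * 100 / $3) > t { print $1 }'
}
```

Piping the `nvidia-smi` query above into `free_gpus 90` lists the GPUs that a 90% rule would consider free at that moment, which helps explain why a job that already occupies more than 10% of a GPU's memory blocks a second `-G` job from landing there.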
from task-spooler.