Comments (4)
https://github.com/NormanTUD/OmniOpt/tree/main/ax
Main script:
https://github.com/NormanTUD/OmniOpt/blob/main/ax/.omniopt.py
For anyone looking into the environment in which the problem appears: my general plan is to support invocations like this:
./omniopt --partition=alpha --experiment_name=example --mem_gb=1 --time=60 --worker_timeout=60 --max_eval=500 --num_parallel_jobs=500 --gpus=1 --follow --run_program=ZWNobyAiUkVTVUxUOiAlKHBhcmFtKSI= --parameter param range 0 1000 float
The goal is to run that optimization on our clusters and to use Ax/BoTorch internally for hyperparameter optimization. As a university, we have essentially unlimited resources at no cost, and we want to run as many workers in parallel as possible, so that we get the most out of the HPC system when searching for good hyperparameters for any type of problem, or when simply researching those areas (depending on what your program does).
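The --run_program value in the command above appears to be base64-encoded. As a minimal sketch (standard library only, nothing OmniOpt-specific), it can be decoded like this to see which command each worker would run; how the %(param) placeholder is then substituted is an assumption on my side and not shown here:

```python
import base64

# Value copied verbatim from the --run_program argument in the command above.
encoded = "ZWNobyAiUkVTVUxUOiAlKHBhcmFtKSI="

# Decode the base64 string back into the shell command it encodes.
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded)
```

The decoded string is an echo command that prints a RESULT: line containing the %(param) placeholder.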
At the top of the code there is a large comment listing some of the things I tried, though the list is far from complete.
We would really appreciate your help with this.
Yours sincerely
NormanTUD
from ax.
Hi @NormanTUD! Thanks so much for engaging with our tool - happy to help. Could you provide the logs from AxClient for your experiment? These logs usually contain information about the trial generation and generation strategy that will be helpful for us debugging the issue.
Also, good catch on "use_batch_trials" not having an effect. This code hasn't been open-sourced yet (hopefully soon!), so it isn't doing anything at this time. Let me raise an error to make that clearer.
from ax.
@NormanTUD -- added a PR that raises an error when use_batch_trials is set; it'll be live once we cut a new release :)
Let me know if you have the logs from AxClient for additional support. Thanks!
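One possible way to collect such logs into a file is sketched below using only Python's standard logging module. It assumes that Ax emits its messages through loggers named under "ax" (for example "ax.service.ax_client") and that those loggers propagate to the "ax" logger; the log file location is a hypothetical choice for illustration.

```python
import logging
import os
import tempfile

# Hypothetical log location; adjust to wherever the job can write.
log_path = os.path.join(tempfile.gettempdir(), "ax_debug.log")

# Assumption: Ax's loggers live under the "ax" namespace and propagate upward.
ax_logger = logging.getLogger("ax")
ax_logger.setLevel(logging.DEBUG)

handler = logging.FileHandler(log_path, mode="w")
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s")
)
ax_logger.addHandler(handler)

# Stand-in for a record that Ax itself would emit during trial generation.
logging.getLogger("ax.service.ax_client").info("example record")
handler.flush()
```

The resulting file can then be attached to the issue alongside the script's own debug output.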
from ax.
Hi,
thanks for your reply. I was on vacation and therefore didn't write any code. I am now collecting all the logs. Thanks for your patience. I will update this post when I have them.
First a bit of my own debugging code:
Update #1:
trial_index_to_param, _ = ax_client.get_next_trials(
    max_trials=1
)

print_debug(f"Got {len(trial_index_to_param.items())} new items (m = {m}, in range(0, {calculated_max_trials})).")
These lines are only executed when there are new jobs to be generated. For further testing, instead of setting max_trials= to the number of new trials, it is set to 1 and the call is made in a for loop, once for each new job. But sometimes I get this:
2024-03-26 11:14:13: Got 0 new items (m = 0, in range(0, 33)).
So it just returns 0 jobs.
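The pattern described above (one get_next_trials(max_trials=1) call per wanted job, with some calls coming back empty) can be sketched as follows. This is not the actual OmniOpt code: request_trials_one_by_one and FakeClient are hypothetical names, and FakeClient is only a stand-in that mimics an AxClient returning no trial on every third call so the snippet runs without Ax installed.

```python
from typing import Any, Dict, Tuple


def request_trials_one_by_one(client: Any, wanted: int) -> Tuple[Dict[int, dict], int]:
    """Ask `client` for `wanted` trials, one per call, and count empty replies."""
    collected: Dict[int, dict] = {}
    empty_calls = 0
    for _ in range(wanted):
        # Same call shape as in the snippet above: a dict of trials plus a flag.
        trials, _done = client.get_next_trials(max_trials=1)
        if not trials:
            empty_calls += 1  # the "Got 0 new items" case
            continue
        collected.update(trials)
    return collected, empty_calls


class FakeClient:
    """Hypothetical stand-in for AxClient: returns nothing on every third call."""

    def __init__(self) -> None:
        self.calls = 0
        self.next_index = 0

    def get_next_trials(self, max_trials: int):
        self.calls += 1
        if self.calls % 3 == 0:
            return {}, False
        index = self.next_index
        self.next_index += 1
        return {index: {"param": float(index)}}, False


trials, empty = request_trials_one_by_one(FakeClient(), wanted=6)
print(f"Got {len(trials)} trials, {empty} empty calls")
```

Counting the empty replies separately makes it easier to see how often the generation step yields nothing, compared with how many workers are actually started.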
This is the number of workers over time:
17
7
5
8
(No timestamps are given, though; each value is taken in one pass of the generation loop.)
It should be around 20, so 17 is fine for a snapshot taken while the jobs are starting, but over time the number drops considerably.
The only message from Ax that seems relevant is this:
ax.models.torch.botorch_modular.acquisition:
Encountered Xs pending for some Surrogates but observed for others. Considering
these points to be pending.
from ax.