Comments (15)
I see now that my suggestion was not correctly written in my comment, for some reason.
Anyway, I still think that we should split the namelist and date parts of the path and reverse their order:
<output_dir>/<namelist>/YYYYMMDD_HHMMSS/preproc/
This will make it a lot easier to search for outputs if you run multiple namelists multiple times.
If we don't want to add another nesting level, I think we should at least switch the order.
<output_dir>/<namelist>_YYYYMMDD_HHMMSS/preproc/
from esmvaltool.
Ready to merge (PR167)
The output of the perfmetrics namelist now looks like this:
<output_dir>/tmp
<output_dir>/tmp/ANNUAL_CYCLE_ta
<output_dir>/preproc
<output_dir>/preproc/CMIP5
<output_dir>/preproc/OBS
<output_dir>/work
<output_dir>/work/perfmetrics_main
<output_dir>/plots
<output_dir>/plots/perfmetrics_main
Don't forget to update your config-user.yml file!
from esmvaltool.
New plan:
Have two folders, one for truly temporary files and one for output.
Temporary files default to /tmp.
The output folder will have further subfolders generated automatically for plots, netCDF output, preprocessed files, etc.
from esmvaltool.
To create a temporary dir, it is possibly best to use the standard library function tempfile.mkdtemp
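A minimal sketch of that suggestion, using only the standard library. The `esmvaltool_` prefix and the file name are hypothetical, chosen for illustration; the thread does not specify them.

```python
import shutil
import tempfile
from pathlib import Path

# mkdtemp creates a uniquely named directory, readable and writable only by
# the creating user, under the system default temporary location
# (often /tmp on Linux).
# The prefix below is a hypothetical choice, not an agreed convention.
tmp_dir = Path(tempfile.mkdtemp(prefix="esmvaltool_"))
try:
    scratch_file = tmp_dir / "scratch.txt"
    scratch_file.write_text("temporary content\n")
    created = tmp_dir.is_dir() and scratch_file.is_file()
finally:
    # mkdtemp never cleans up by itself; the caller must remove the directory.
    shutil.rmtree(tmp_dir)
```

If automatic cleanup is preferred, `tempfile.TemporaryDirectory` can be used as a context manager instead.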
from esmvaltool.
At the moment, if the output directory already exists, it is moved to some other name and a new directory is created. This can cause nasty surprises for users, who suddenly find their files moved. Of course this can be remedied by generating a new config(-user).yml file for every run, but that is not very convenient.
A better setup would be the following:
Inside the run_dir defined in config.yml, create a directory with the name of the namelist and current datetime, e.g. for a run starting on 2017-09-25 15:48:03 UTC, create /path/to/run_dir/20170925_154803_namelist_MyVar and put all output that goes to run_dir in there.
For the other paths specified in config.yml:
- preproc_dir
- work_dir
- plot_dir
If they are relative paths, put them inside the directory mentioned above, e.g.
/path/to/run_dir/20170925_154803_namelist_MyVar/preproc_dir
/path/to/run_dir/20170925_154803_namelist_MyVar/work_dir
/path/to/run_dir/20170925_154803_namelist_MyVar/plot_dir
If they are absolute paths, create a namelist + current datetime subfolder inside them too, like so:
/path/to/preproc_dir/20170925_154803_namelist_MyVar
/path/to/work_dir/20170925_154803_namelist_MyVar
/path/to/plot_dir/20170925_154803_namelist_MyVar
This is convenient and minimizes the risk of overwriting files.
Edge case: it seems very unlikely that two namelists are started in the exact same second, but if this happens we can append _1, _2, etc. to the name, or raise an exception (TBD).
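The proposal above could be sketched roughly as follows. This is an illustration under stated assumptions, not the ESMValTool implementation: the function names (`session_name`, `resolve_dirs`, `make_unique`) are hypothetical, the config is assumed to be a plain dict, and the `_1`, `_2` suffix is only one of the two edge-case options mentioned.

```python
import os
from datetime import datetime, timezone


def session_name(namelist, now=None):
    """Return a name such as '20170925_154803_namelist_MyVar' (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    # Strip any directory part and file extension from the namelist path.
    stem = os.path.splitext(os.path.basename(namelist))[0]
    return f"{now:%Y%m%d_%H%M%S}_{stem}"


def resolve_dirs(run_dir, cfg, session):
    """Map each configured path to its per-run location."""
    session_dir = os.path.join(run_dir, session)
    resolved = {"run_dir": session_dir}
    for key in ("preproc_dir", "work_dir", "plot_dir"):
        path = cfg[key]
        if os.path.isabs(path):
            # Absolute path: add a per-run subfolder inside it.
            resolved[key] = os.path.join(path, session)
        else:
            # Relative path: place it inside the per-run directory.
            resolved[key] = os.path.join(session_dir, path)
    return resolved


def make_unique(path):
    """Create path; if it already exists, try path_1, path_2, ... instead."""
    candidate, counter = path, 0
    while True:
        try:
            os.makedirs(candidate)  # raises FileExistsError on a clash
            return candidate
        except FileExistsError:
            counter += 1
            candidate = f"{path}_{counter}"
```

Creating the directory and catching `FileExistsError`, rather than checking for existence first, avoids a race between two runs started in the same second.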
from esmvaltool.
I find the structure of the output directory a bit confusing. An inexperienced user may have problems finding the output.
I think we should have only 2 output paths in config (as originally suggested): one for the output (output_dir), containing the subdirs preproc, work and plots, and one for the temporary files (run_dir or tmp_dir), containing a subdir with the diagnostic id (I don't think we need to have another subdir interface_data here).
The directory structure should look like
<output_dir>/YYYYMMDD_HHMMSS_<namelist>/preproc/
<output_dir>/YYYYMMDD_HHMMSS_<namelist>/work/
<output_dir>/YYYYMMDD_HHMMSS_<namelist>/plots/
<tmp_dir>/YYYYMMDD_HHMMSS_<namelist>/<diag_id>/
There is no risk of overwriting files in work and plots, as every diagnostic in the namelist produces different ones. The advantage would be that the files in preproc could be recycled, for example when 2 variables in the same namelist are processed using the same preproc_id.
One problem I can foresee arises if we allow parallel execution of the diagnostics from the same namelist. But in that case we would anyway need to sort out in advance which diagnostics use the same preproc_id, to allow for recycling the preprocessed files, and also to manage the dependencies (i.e., diagnostics which need other diagnostics to be run first).
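As a rough illustration of that sorting step, the sketch below groups diagnostics by preproc_id and orders them by dependency using the standard library `graphlib` module (Python 3.9+). The diagnostic names, preproc ids and dependencies are invented for the example, and nothing here reflects how ESMValTool actually schedules tasks.

```python
from collections import defaultdict
from graphlib import TopologicalSorter

# Hypothetical data: name -> (preproc_id, diagnostics that must run first).
diagnostics = {
    "diag_a": ("pp_monthly", []),
    "diag_b": ("pp_monthly", ["diag_a"]),
    "diag_c": ("pp_zonal", ["diag_a", "diag_b"]),
    "diag_d": ("pp_zonal", []),
}

# Diagnostics sharing a preproc_id can reuse the same preprocessed files.
by_preproc = defaultdict(list)
for name, (preproc_id, _) in diagnostics.items():
    by_preproc[preproc_id].append(name)

# Build batches: everything within one batch could run in parallel.
sorter = TopologicalSorter({name: deps for name, (_, deps) in diagnostics.items()})
sorter.prepare()  # raises graphlib.CycleError on circular dependencies
batches = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())
    batches.append(ready)
    sorter.done(*ready)
```

With this input, `diag_a` and `diag_d` form the first batch, followed by `diag_b` and then `diag_c`.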
from esmvaltool.
We also have the issue that we cannot reuse previously preprocessed files with any of these structures...
... but anyway, I also think that the preproc dir should be in the output folder.
One small thing: I think we should also change the folder to the following structure
<output_dir>/<namelist>/YYYYMMDD_HHMMSS/
My output folder is accumulating too many folders, and it is becoming difficult to find which ones correspond to a given namelist.
from esmvaltool.
> We also have the issue that we cannot reuse previously preprocessed files with any of these structures...
Right, only within the same namelist. But I think this is OK for the moment: recycling of preproc files across different namelists shouldn't be a very common case (@axel-lauer ?).
> One small thing: I think we should also change the folder to the following structure
> <output_dir>/<namelist>/YYYYMMDD_HHMMSS/
Fine with me.
from esmvaltool.
As discussed in the telecon, we now go for 1 user-specified output path (output_dir), containing the following:
<output_dir>/YYYYMMDD_HHMMSS_<namelist>/preproc/
<output_dir>/YYYYMMDD_HHMMSS_<namelist>/work/
<output_dir>/YYYYMMDD_HHMMSS_<namelist>/plots/
<output_dir>/YYYYMMDD_HHMMSS_<namelist>/tmp/<diag_id>/
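A short sketch of how this layout could be created, assuming the namelist name and the diagnostic ids are already known. The function name `create_session_dirs` and its parameters are hypothetical, not taken from the ESMValTool code base.

```python
from datetime import datetime
from pathlib import Path


def create_session_dirs(output_dir, namelist, diag_ids, now=None):
    """Create <output_dir>/YYYYMMDD_HHMMSS_<namelist>/ with its subdirectories."""
    now = now or datetime.now()
    session = Path(output_dir) / f"{now:%Y%m%d_%H%M%S}_{namelist}"
    # exist_ok=False: fail loudly instead of reusing an existing run directory.
    session.mkdir(parents=True, exist_ok=False)
    for sub in ("preproc", "work", "plots"):
        (session / sub).mkdir()
    for diag_id in diag_ids:
        # One temporary subdirectory per diagnostic.
        (session / "tmp" / diag_id).mkdir(parents=True)
    return session
```

Because the timestamp comes first in the directory name, a plain alphabetical listing of output_dir is also chronological.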
The path is mandatory.
from esmvaltool.
Sounds good. I have a slight preference for the "combined" format, but not by much.
<output_dir>/<namelist>_YYYYMMDD_HHMMSS/
from esmvaltool.
I'd also like to avoid too much nesting.
from esmvaltool.