Coder Social home page Coder Social logo

Define API to set/get SCR params about scr HOT 8 CLOSED

llnl avatar llnl commented on August 27, 2024
Define API to set/get SCR params

from scr.

Comments (8)

adammoody avatar adammoody commented on August 27, 2024

@white238 , as the next task, I'd like to look at creating an API to set SCR parameters from within the code. I'd like to have one of our new SCR developers assist with that.

Do you have an idea of different parameters you'd like to set to get started?

from scr.

adammoody avatar adammoody commented on August 27, 2024

Some of these params are used only in the scripts (like SCR_RUNS), some are only in the library (like SCR_SET_SIZE), and some are in both (like SCR_CACHE_BASE). The ones used in both must be consistent so that the scripts and the library see the same values. For the scripts, some portion of the scripts run before the library and some portion runs after the library.

As a first task, we can list all params and where they must be valid. If a script needs a value, but the library runs first, we could have the library record its set of params in a file that the scripts can read after the library has run. If a script needs a value before the library runs, we probably should not allow the application to set that param via the API, since that would lead to inconsistent values.

Most of the params, but probably not all, are listed here:
https://scr.readthedocs.io/en/latest/users/config.html#scr-parameters

from scr.

rhaas80 avatar rhaas80 commented on August 27, 2024

I have a kind-of API implemented in https://github.com/rhaas80/scr/tree/rhaas/scr_param_set which does let you add parameters to be used by scr_get_params() (or any of the scr_param_get calls). This lets a code set all SCR parameters but right now, at least without some largish changes to SCR, will require that these calls are made before SCR_Init (since it calls scr_get_params()).
There's multiple things that are not nice about this:

  • it will not transfer these parameters to eg a daemon process
  • having to set them before SCR_Init is annoying
  • it is impossible to change parameters after SCR_Init (well one can change the parameters but b/c the data has been parsed into global variables changing parameters no longer does anything)

The current API follows: ECP-VeloC/AXL#38 ie there is a call

SCR_Config("<string>")

that basically takes a line of the configuration files. If there is an item in the string that does not have an equal sign then this is just a query and no value is set.
Since I am just getting started with adding to SCR I would rather get some feedback if this is at all the desired direction to go to or if there is a desire to have an API where the global scr variables are updated (which probably requires some sort of "listener" interface to update the globals when the parameter hashes are updated).
Right now there's no check for "valid" parameters, mostly b/c no such check exists inside of SCR anywhere right now.

from scr.

adammoody avatar adammoody commented on August 27, 2024

@rhaas80 , I may not have time to review your code in detail, but your description here sounds good. @gonsie mentioned that you had your own parser logic in there. You could perhaps simplify things if you can use the existing functions that are used to parse the SCR config files instead. It would be nice for the end user to support the same syntax in both places if possible.

https://github.com/rhaas80/scr/blob/ddfa45939b92ca3997b51153bba37f2b23c73cc3/src/scr_config.c#L265

I wouldn't worry about setting properties after SCR_Init. Perhaps return an error if someone tries to do that? For that matter, it could be nice to return an error in the case someone tries to set a parameter that can't be set using SCR_Config even before SCR_Init. For example, any parameter that has to be set before the user's job runs cannot be set with SCR_Config. Having said that, I think returning an error is a nice to have thing, rather than a need.

We'll also want to document which parameters can be set when. We could create a table to show whether the parameter can be set: as an environment variable, in the user config file, the system config file, and SCR_Config.

from scr.

rhaas80 avatar rhaas80 commented on August 27, 2024

@adammoody thank you. I was certainly not expecting a full review. Mostly I had the code written to make sure I had an understanding of what needed changing and to be able to point to an explicit implementation.
Since I am new to the SCR project I wanted to get a feeling about what kind of style of implementation is acceptable and a rough idea whether I was going in the right direction. This is both to avoid stepping on people's toe's as well as avoiding putting in a lot of work that needs removing.

I had thought about trying to re-use the file parsing code, but in the end ended up written my own parser which should (untested, so likely untrue) parse the same set of configuration lines (minus comments and minus expanding environment variables I think). The main reason I decided that I could not use the existing function (easily) is because that function is tied to read from a file eg in the whitespace skipping code https://github.com/rhaas80/scr/blob/ddfa45939b92ca3997b51153bba37f2b23c73cc3/src/scr_config.c#L277 .

There's also a difference in syntax because in the configuration files a line like

CKPT=1 STORE

is invalid but is is a valid query for the configuration function. And finally I did not want to make the existing config file reader use a (modified) version of my parser since such a change almost always ends up causing changes in corner case behaviour that someone relied on.

If indeed changing parameters after SCR_init is not required then this makes my life much easier. I will set up a hash similar to the exisiting no_user hash (or maybe that one is already what I need) to forbid some parameters from being set programmatically since eg they need to be available for scripts before / after the main code runs.

I will go ahead and collect information on which parameters can be set programmatically and will try and see what would be required to have a single function parse either lines from a configuration file or strings passed to the config functions.

from scr.

rhaas80 avatar rhaas80 commented on August 27, 2024

Parameters used by non-main code installed into bin:

  • grep -- '->get' $(grep -rl 'use .*scr_param.*')
	bin/scr_scavenge:my $param_buf_size = $param->get("SCR_FILE_BUF_SIZE");
	bin/scr_scavenge:my $param_crc = $param->get("SCR_CRC_ON_FLUSH");
	bin/scr_list_dir:  my $cachedesc = $param->get_hash("CACHE");
	bin/scr_list_dir:  push @bases, $param->get("SCR_CNTL_BASE");
	bin/scr_watchdog:my $param_timeout = $param->get("SCR_WATCHDOG_TIMEOUT");
	bin/scr_watchdog:my $param_timeout_pfs = $param->get("SCR_WATCHDOG_TIMEOUT_PFS");
	bin/scr_list_down_nodes:my $exclude = $param->get("SCR_EXCLUDE_NODES");
	bin/scr_list_down_nodes:  my $cntldirs = $param->get_hash("CNTLDIR");
	bin/scr_list_down_nodes:  my $cachedirs = $param->get_hash("CACHEDIR");
  • binaries that match scr_param_get:
    • scr_log_event
    • scr_log_transfer
    • scr_retries_halt
    • scr_halt
  • scr_log_event.c:
    • SCR_LOG_ENABLE
    • calls scr_log_init(), scr_log_job(), scr_log_event()
  • scr_log.c:
    • SCR_DB_DEBUG
    • SCR_DB_ENABLE
    • SCR_DB_HOST
    • SCR_DB_NAME
    • SCR_DB_PASS
    • SCR_DB_USER
  • scr_log_transfer.c:
    • SCR_LOG_ENABLE
    • calls scr_log_init(), scr_log_job(), scr_log_transfer()
  • scr_retries_halt.c:
    • SCR_HALT_SECONDS
    • calls scr_file_exists, scr_halt_read
  • scr_halt.c:
    • calls scr_file_is_readable, scr_open, scr_file_lock_read, scr_close, scr_file_unlock
  • scr_io.c:
    • no calls outside of file itself

scr_globals.h defines a good number of global variables, all prefixed by scr_ but does not seem to be used by any of the helper tools (anymore, removed use from scr_index.c). Either directly or indirectly.

  • scr_param.pm uses its own get() function:
    • SCR_CACHE_BASE
    • SCR_CACHE_SIZE
  • scr_param.pm directly uses ENV variables for:
    • SCR_PREFIX
    • SCR_CONF_FILE

Scripts only in scr but not installed on my workstation contain some more perl code, grep -r -- '->get' $PWD:

	TLCC/scr_scavenge.in:my $param_buf_size = $param->get("SCR_FILE_BUF_SIZE");
	TLCC/scr_scavenge.in:my $param_crc = $param->get("SCR_CRC_ON_FLUSH");
	TLCC/scr_list_down_nodes.in:my $exclude = $param->get("SCR_EXCLUDE_NODES");
	TLCC/scr_list_down_nodes.in:  my $cntldirs = $param->get_hash("CNTLDIR");
	TLCC/scr_list_down_nodes.in:  my $cachedirs = $param->get_hash("CACHEDIR");
	common/scr_list_dir.in:  my $cachedesc = $param->get_hash("CACHE");
	common/scr_list_dir.in:  push @bases, $param->get("SCR_CNTL_BASE");
	common/scr_list_dir:  my $cachedesc = $param->get_hash("CACHE");
	common/scr_list_dir:  push @bases, $param->get("SCR_CNTL_BASE");
	common/scr_watchdog:my $param_timeout = $param->get("SCR_WATCHDOG_TIMEOUT");
	common/scr_watchdog:my $param_timeout_pfs = $param->get("SCR_WATCHDOG_TIMEOUT_PFS");
	common/scr_watchdog.in:my $param_timeout = $param->get("SCR_WATCHDOG_TIMEOUT");
	common/scr_watchdog.in:my $param_timeout_pfs = $param->get("SCR_WATCHDOG_TIMEOUT_PFS");
	cray_xt/scr_scavenge.in:my $param_buf_size = $param->get("SCR_FILE_BUF_SIZE");
	cray_xt/scr_scavenge.in:my $param_crc = $param->get("SCR_CRC_ON_FLUSH");
	cray_xt/scr_scavenge.in:my $param_container = $param->get("SCR_USE_CONTAINERS");
	cray_xt/scr_list_down_nodes.in:my $exclude = $param->get("SCR_EXCLUDE_NODES");
	cray_xt/scr_list_down_nodes.in:  my $cntldirs = $param->get_hash("CNTLDIR");
	cray_xt/scr_list_down_nodes.in:  my $cachedirs = $param->get_hash("CACHEDIR");
	param_test:$value = $param->get_param('SCR_FLUSH');
	param_test:$value = $param->get_param('HOMEY');
	param_test:$value = $param->get_subparam('CACHEDIR', '/tmp', 'BYTES');
	LSF/scr_scavenge.in:my $param_buf_size = $param->get("SCR_FILE_BUF_SIZE");
	LSF/scr_scavenge.in:my $param_crc = $param->get("SCR_CRC_ON_FLUSH");
	LSF/scr_list_down_nodes.in:my $exclude = $param->get("SCR_EXCLUDE_NODES");
	LSF/scr_list_down_nodes.in:#  my $cntldirs = $param->get_hash("CNTLDIR");
	LSF/scr_list_down_nodes.in:#  my $cachedirs = $param->get_hash("CACHEDIR");
	PMIX/scr_scavenge.in:my $param_buf_size = $param->get("SCR_FILE_BUF_SIZE");
	PMIX/scr_scavenge.in:my $param_crc = $param->get("SCR_CRC_ON_FLUSH");
	PMIX/scr_scavenge.in:my $param_container = $param->get("SCR_USE_CONTAINERS");
	PMIX/scr_list_down_nodes.in:my $exclude = $param->get("SCR_EXCLUDE_NODES");
	PMIX/scr_list_down_nodes.in:  my $cntldirs = $param->get_hash("CNTLDIR");
	PMIX/scr_list_down_nodes.in:  my $cachedirs = $param->get_hash("CACHEDIR");

Kathryn's grep list of parameters

  • Common/scr_param.pm: SCR_PREFIX, SCR_CONF_FILE
  • Common/scr_prefix.pm: SCR_PREFIX
  • Common/scr_list_dir: CACHE, SCR_CNTL_BASE
  • Common/scr_watchdog: SCR_WATCHDOG_TIMEOUT, SCR_WATCHDOG_TIMEOUT_PFS
  • TLCC/scr_list_down_nodes: SCR_EXCLUDE_NODES, CNTLDIR, CACHEDIR
  • TLCC/scr_scavenge: SCR_FILE_BUF_SIZE, SCR_CRC_ON_FLUSH, SCR_USE_CONTAINERS
  • Cray_xt: scr_list_down_Nodes and scr_scavenge same as above
  • Cray_xt: scr_env: PBS_NUM_NODES,USER,PBS_JOBID
  • Common/scr_env: USER, SLURM_JOBID, SLURM_NODELIST

Compile time:

  • SCR_CNTL_BASE

What is not used in src:

  • SCR_WATCHDOG*, SCR_EXCLUDE_NODES

from scr.

adammoody avatar adammoody commented on August 27, 2024

@rhaas80 , how are things coming along on this work?

I'm working several codes teams to update the SCR calls in their applications. It's a prime opportunity to add the new config API as part of that effort.

from scr.

rhaas80 avatar rhaas80 commented on August 27, 2024

Addressed in merge request #184

from scr.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.