dctools.utilities.machine_profile
Runtime auto-tuning of Dask parallelism parameters based on hardware.
Called once at config-load time by dctools.utilities.args_config.load_args_and_config().
A parameter is filled in when all of the following hold:
- auto_tune: true (or key absent) in the root config, or the YAML value is the literal string "auto"
- The value is not already an explicit number (integers/floats are always kept)
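The rule above can be read as a small predicate. The following is a minimal sketch, not the module's actual code: the name `should_fill` is hypothetical, and `None` is assumed to stand for both an absent key and a YAML `null`. How other explicit strings are treated is not stated here, so the sketch leaves them untouched.

```python
AUTO = "auto"


def should_fill(value, auto_tune=True):
    """Return True if a parameter would be auto-filled (hypothetical helper).

    `value` is the raw YAML value; None stands for an absent key or null.
    `auto_tune` defaults to True, matching "key absent" in the root config.
    """
    # Explicit numbers are always kept, whatever auto_tune says.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return False
    # The literal string "auto" requests filling even when auto_tune is false.
    if value == AUTO:
        return True
    # Otherwise only unset values are filled, and only when auto_tune is on.
    return auto_tune and value is None
```

For example, `should_fill("auto", auto_tune=False)` is true, while `should_fill(3)` is false in every mode.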
The YAML can therefore be used in three modes:
Fully automatic (recommended): set auto_tune: true at the root and omit per-source parallelism keys (or set them to null / "auto"):

    auto_tune: true
    sources:
      - dataset: swot
        observation_dataset: true
        # n_parallel_workers, nthreads_per_worker, memory_limit_per_worker
        # are all filled automatically
Selective override: keep auto_tune: true but pin specific params:

    sources:
      - dataset: swot
        n_parallel_workers: 3   # fixed; everything else still auto-tuned
Fully manual: set auto_tune: false; only params set to the string "auto" are filled, everything else is kept as-is:

    auto_tune: false
    sources:
      - dataset: swot
        n_parallel_workers: 5             # kept as-is
        memory_limit_per_worker: "auto"   # ← filled from hardware
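To make the three modes concrete, here is a rough sketch that applies them to already-parsed configs (plain dicts). It is an illustration under stated assumptions, not the module's implementation: `fill_sources` and `TUNED` are hypothetical names, and the values in `TUNED` are placeholders standing in for whatever the real hardware probe would compute.

```python
PARALLELISM_KEYS = (
    "n_parallel_workers",
    "nthreads_per_worker",
    "memory_limit_per_worker",
)

# Placeholder values (assumption): stand-ins for hardware-derived settings.
TUNED = {
    "n_parallel_workers": 4,
    "nthreads_per_worker": 2,
    "memory_limit_per_worker": "4GB",
}


def fill_sources(config, tuned):
    """Fill parallelism keys of every source in place (hypothetical helper)."""
    auto_tune = config.get("auto_tune", True)  # absent key counts as true
    for source in config.get("sources", []):
        for key in PARALLELISM_KEYS:
            value = source.get(key)  # None for an absent key or YAML null
            # "auto" is always filled; unset values only when auto_tune is on.
            # Explicit numbers match neither test and are therefore kept.
            if value == "auto" or (auto_tune and value is None):
                source[key] = tuned[key]
    return config


# Mode 1: fully automatic
automatic = fill_sources(
    {"auto_tune": True,
     "sources": [{"dataset": "swot", "observation_dataset": True}]},
    TUNED,
)

# Mode 2: selective override (n_parallel_workers pinned to 3)
selective = fill_sources(
    {"sources": [{"dataset": "swot", "n_parallel_workers": 3}]},
    TUNED,
)

# Mode 3: fully manual (only the "auto" string is replaced)
manual = fill_sources(
    {"auto_tune": False,
     "sources": [{"dataset": "swot", "n_parallel_workers": 5,
                  "memory_limit_per_worker": "auto"}]},
    TUNED,
)
```

In the manual case, `nthreads_per_worker` stays absent from the source because nothing requested it.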
Functions
Fill auto-tuned parallelism parameters into config in place.